Integrate DataFrames
Pyoframe is built on DataFrames
Most other optimization libraries require you to convert your data from its DataFrame format to another format.[1] Not Pyoframe! DataFrames form the core of Pyoframe, making it easy to integrate large datasets into your models seamlessly and efficiently.
You are going to rebuild the previous example using a dataset, food_data.csv, instead of hard-coded values. This way, you can add as many vegetarian proteins as you like without needing to write more code. If you're impatient, skip to the end to see the final result.
food_data.csv
food          protein  cost
tofu_block    18       4
chickpea_can  15       3
Step 1: Load the data
Load food_data.csv using Polars or Pandas.
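As a minimal sketch of this step with Pandas (the library the full listing at the end of this page imports), the snippet below reads the dataset into a DataFrame named data. Two assumptions: the file is comma-separated, and its contents are inlined as a string here only so the snippet runs without the file on disk. With the real file you would call pd.read_csv("food_data.csv"), or pl.read_csv("food_data.csv") if you prefer Polars.

```python
import io

import pandas as pd

# Contents of food_data.csv, inlined so the snippet is self-contained.
# Assumption: the real file is comma-separated with this header row.
CSV_TEXT = """food,protein,cost
tofu_block,18,4
chickpea_can,15,3
"""

# With the file on disk, this would be: data = pd.read_csv("food_data.csv")
data = pd.read_csv(io.StringIO(CSV_TEXT))

print(data)
```

Each row is one food, and the food column is what the model will later be dimensioned over.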
Pandas vs. Polars: Which should I use?
Pyoframe works the same whether you're using Polars or Pandas, two similar libraries for manipulating data with DataFrames. We prefer using Polars because it is much faster (and generally better), but you can use whichever library you're most comfortable with.
Note that, internally, Pyoframe always uses Polars during computations to ensure the best performance. If you're using Pandas, your DataFrames will automatically be converted to Polars prior to computations. If needed, you can convert a Polars DataFrame back to Pandas using polars.DataFrame.to_pandas().
Step 2: Create the model
A pyoframe.Model instance is the foundation of your optimization model, onto which you add optimization variables, constraints, and an objective. In the full listing at the end of this page, it is created with m = pf.Model().
Step 3: Create a dimensioned variable
Previously, you created two variables: m.tofu_blocks and m.chickpea_cans. Instead, create a single variable dimensioned over the column food by passing data["food"] to pf.Variable, as the full listing at the end of this page does.
Printing the variable shows that it contains a food dimension with labels tofu_block and chickpea_can!
>>> m.Buy
<Variable 'Buy' lb=0 height=2>
┌──────────────┬───────────────────┐
│ food         ┆ variable          │
│ (2)          ┆                   │
╞══════════════╪═══════════════════╡
│ tofu_block   ┆ Buy[tofu_block]   │
│ chickpea_can ┆ Buy[chickpea_can] │
└──────────────┴───────────────────┘
Capitalize model variables
We suggest capitalizing model variables (i.e., m.Buy, not m.buy) to make it easy to distinguish what is and isn't a variable.
Step 4: Create the objective with .sum()
Previously, the objective was a hard-coded sum with one term for m.tofu_blocks and one for m.chickpea_cans. How do you make use of the dimensioned variable m.Buy instead?
First, multiply the variable by the cost.
>>> data[["food", "cost"]] * m.Buy
<Expression height=2 terms=2 type=linear>
┌──────────────┬─────────────────────┐
│ food         ┆ expression          │
│ (2)          ┆                     │
╞══════════════╪═════════════════════╡
│ tofu_block   ┆ 4 Buy[tofu_block]   │
│ chickpea_can ┆ 3 Buy[chickpea_can] │
└──────────────┴─────────────────────┘
As you can see, Pyoframe, with a bit of magic, converted the Variable into an Expression whose coefficients are the costs.
Second, notice that the Expression still has the food dimension: it really contains two separate expressions, one for tofu and one for chickpeas. An objective function must be a single expression (without dimensions), so sum over the food dimension.
>>> (data[["food", "cost"]] * m.Buy).sum("food")
<Expression terms=2 type=linear>
4 Buy[tofu_block] +3 Buy[chickpea_can]
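This works, and since food is the only dimension, you don't even need to specify it. Putting it all together, the objective is m.minimize = (data[["food", "cost"]] * m.Buy).sum(), as in the full listing at the end of this page.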
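To make the multiply-then-sum pattern concrete, here is a plain-Pandas analogy that evaluates the same cost expression for fixed quantities. This is only a sketch of the arithmetic, not how Pyoframe is implemented, and it does not use Pyoframe at all. The plan DataFrame and its quantity column are hypothetical stand-ins for the values that m.Buy would take.

```python
import pandas as pd

data = pd.DataFrame(
    {"food": ["tofu_block", "chickpea_can"], "protein": [18, 15], "cost": [4, 3]}
)

# Hypothetical purchase plan standing in for the values of m.Buy.
# The rows are deliberately in a different order than in `data`.
plan = pd.DataFrame({"food": ["chickpea_can", "tofu_block"], "quantity": [1, 2]})

# "Multiply": align rows on the shared `food` label, then multiply per label.
per_food = data[["food", "cost"]].merge(plan, on="food")
per_food["term"] = per_food["cost"] * per_food["quantity"]

# "Sum": collapse the `food` dimension into a single number.
total_cost = int(per_food["term"].sum())

print(per_food)
print(total_cost)
```

The point of the analogy is that terms are matched by their food label rather than by row position, and that summing over food leaves a single dimensionless value, which is what an objective needs.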
Step 5: Add the constraint
This is similar to how you created the objective, except now you're using protein and you turn the Expression into a Constraint with the >= operator.
Put it all together
import pandas as pd
import pyoframe as pf
data = pd.read_csv("food_data.csv")
m = pf.Model()
m.Buy = pf.Variable(data["food"], lb=0, vtype="integer")
m.minimize = (data[["food", "cost"]] * m.Buy).sum()
m.protein_constraint = (data[["food", "protein"]] * m.Buy).sum() >= 50
m.optimize()
So you should buy:
>>> m.Buy.solution
┌──────────────┬──────────┐
│ food         ┆ solution │
│ ---          ┆ ---      │
│ str          ┆ i64      │
╞══════════════╪══════════╡
│ tofu_block   ┆ 2        │
│ chickpea_can ┆ 1        │
└──────────────┴──────────┘
Notice that since m.Buy is dimensioned, m.Buy.solution returned a DataFrame with the solution for each of the labels.
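Because this problem is tiny, the reported solution can be cross-checked by brute force using only the Python standard library. The sketch below is independent of Pyoframe; the helper name total and the search bound of 0 to 4 items are choices made for this illustration, not part of the tutorial.

```python
from itertools import product

# Values from food_data.csv and the protein requirement in the constraint.
PROTEIN = {"tofu_block": 18, "chickpea_can": 15}
COST = {"tofu_block": 4, "chickpea_can": 3}
MIN_PROTEIN = 50


def total(values, plan):
    """Sum values[food] * quantity over every food in the plan."""
    return sum(values[food] * quantity for food, quantity in plan.items())


foods = list(PROTEIN)

# Searching 0..4 of each item is enough here: 4 chickpea cans alone are
# feasible at a cost of 12, while 5 of either item costs at least 15.
feasible = []
for quantities in product(range(5), repeat=len(foods)):
    plan = dict(zip(foods, quantities))
    if total(PROTEIN, plan) >= MIN_PROTEIN:
        feasible.append((total(COST, plan), plan))

# Pick the feasible plan with the lowest total cost.
best_cost, best_plan = min(feasible, key=lambda item: item[0])

print(best_plan, best_cost)
```

The cheapest feasible plan found this way is 2 tofu blocks and 1 chickpea can at a total cost of 11, with 51 units of protein, which agrees with m.Buy.solution above.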
Returning Pandas DataFrames
Pyoframe currently always returns Polars DataFrames, but you can easily convert them to Pandas using .to_pandas(). In the future, we plan to add support for automatically returning Pandas DataFrames. Upvote the issue if you'd like this feature.
[1] For example, Pyomo converts your DataFrames to individual Python objects, Linopy uses multi-dimensional matrices via xarray, and gurobipy requires Python lists, dictionaries and tuples. While gurobipy-pandas uses DataFrames, it only works with Gurobi!