Solving real-world optimisation problems - a crash course with PuLP

February 10, 2024 10 minute read

Contents

Some of the links in this article are affiliate links. This means if you click on the link and purchase or subscribe, I will receive a commission. All opinions and recommendations remain objective. You can also read the affiliates disclosure for more information.

I’ve read a few tutorials recently to refresh my knowledge on optimal resource allocation, and either the examples were too complex or delved too far into the maths. I also enrolled on a useful course from DataCamp called Supply Chain Analytics in Python. This article focuses more on the practical steps required for you to get started quickly with some good examples.

By the end of this article, you should be able to solve simple and intermediate optimisation problems using Python and PuLP.

This is a really useful skill for statisticians, data scientists, operational reseachers and business to make the best decisions, maximise profits, production, minimise time, costs and more.

We will start with a small example, then build up to more complex examples as we proceed. I ran all of the code contained in this article using Spyder IDE with Anaconda.

What is optimisation and linear programming?

Optimisation helps to find the best decision given some inputs, so aims to maximise or minimise an objective function, given a number of constraints
Linear programming (LP), also called linear optimisation, is a method to achieve the best outcome (such as maximum profit or minimal cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming is a special case of mathematical programming also known as mathematical optimisation.
PuLP is a library in Python to help with optimisation and linear programming tasks. PuLP stands for “Python. Linear Programming”

What are the steps to solving an optimisation problem?

An optimisation problem that uses linear programming (LP) and PuLP typically has the following steps / components:

Model - an initialised PuLP model
Decision variables - what you can control
Objective function - the goal to maximise or minimise like profit, cost, resources
Constraints - limitations to our solution like demand, capacity, time
Solve model - then view the most optimal outcome

Let's see these in action in our first example.

Exercise routine

Use LP to decide on an exercise routine to burn as many calories as possible.

	Pushup	Running
Minutes	0.2 per pushup	10 per mile
Calories	3 per pushup	130 per mile

Constraint = only 10 minutes to exercise

exercise.py

from pulp import LpProblem, LpVariable, LpMaximize, LpMinimize, LpStatus, lpSum, value

# 1. Initialise model
model = LpProblem("Maximize Calories Burnt", LpMaximize)

# 2. Define Decision Variables: pushups and running
pushup = LpVariable('Pushup', lowBound=0, upBound=None, cat="Continuous")
running = LpVariable('Running', lowBound=0, upBound=None, cat="Continuous")

# 3. Define objective function: calories per pushup or per mile
model += 3 * pushup + 130 * running

# 4. Define constraints: our model's limitations
model += 0.2 * pushup + 10 * running <= 10  # Time constraint is 10 minutes to exercise
model += pushup >= 0 + running >= 0         # Our results must be more than 0 pushups or miles ran (so not negative)

# 5. Solve model
model.solve()
print("Run = {} miles".format(running.varValue))
print("Pushups = {}".format(pushup.varValue))
print(f"Calories burnt: {(running.varValue * 130) + (pushup.varValue * 3)}")

Our workflow in this code consisted of:

Initialising the model with the help of PuLP using LpProblem and set our goal as LpMaximize - since we want to maximise calories burnt
Defining our two decision variables as either pushups or running and set the category as Continuous
Setting the objective function in mathematical form, which were calories per pushup (3) and calories per mile of running (130)
Setting the constraints which were 10 minutes to exercise, and not a negative result
Solve the model and output the results

The results printed were:

Run = 0.0 miles

Pushups = 50.0

Calories burnt: 150.0

This has computed all possible combinations and returned the most optimal decision in miliseconds! We can see the most optimal outcome is to perform 50 pushups which burns 150 calories and is under the 10 minute constraint (0.2 * 10 = 10)

I hope you can see the power here of quickly solving optimisation problems that would be very difficult to solve by hand accounting for all possible combinations.

Spyder IDE and PuLP

If you're using Spyder IDE you can explore PuLP's model classes in the variable explorer tab.

Glass manufacturing

We are tasked with planning the optimal production at a glass manufacturer to maximise profit. This manufacturer only produces wine and beer glasses:

there is a maximum production capacity of 60 hours
each batch of wine and beer glasses takes 6 and 5 hours respectively
the warehouse has a maximum capacity of 150 rack spaces
each batch of the wine and beer glasses takes 10 and 20 spaces respectively
the production equipment can only make full batches, no partial batches
Also, we only have orders for 6 batches of wine glasses. Therefore, we do not want to produce more than this. Each batch of the wine glasses earns a profit of $5 and the beer $4.5

resources.py

from pulp import LpProblem, LpVariable, LpMaximize, LpMinimize, LpStatus, lpSum, value

# 1. Initialise model
model = LpProblem("Maximize Glass Co. Profits", LpMaximize)

# 2. Define Decision Variables: wine and beer glasses
wine = LpVariable('Wine', lowBound=0, upBound=None, cat="Integer")
beer = LpVariable('Beer', lowBound=0, upBound=None, cat="Integer")

# 3. Define objective function: profit for both wine glasses and beer glass decision variables
model += 5 * wine + 4.5 * beer

# 4. Define constraints: our model's limitations
model += 10 * wine + 20 * beer <= 150   # Rack space cannot exceed 150
model += 6 * wine + 5 * beer <= 60      # Maximum production capacity is 60 hours
model += wine <= 6                      # Wine glasses cannot exceed 6 batches

# 5. Solve model
model.solve()
print("Produce {} batches of wine glasses".format(wine.varValue))
print("Produce {} batches of beer glasses".format(beer.varValue))

We followed the same pattern in this example, but defined more constraints. We also defined the category for our decision variables as Integer because we can only make full batches, no partial batches.

Given these constraints, we calculate the optimal production outcome to maximise profit is to produce 6 batches of wine and 4 batches or beer!

Produce 6.0 batches of wine glasses

Produce 4.0 batches of beer glasses

Warehouse stock allocation

Decide which warehouse to ship from to fulfil customer unit demand at the lowest cost.

This example is more complex so uses Python list comprehension to define many decision variables, objective functions and constraints quickly.

logistics.py

from pulp import LpProblem, LpVariable, LpMaximize, LpMinimize, LpStatus, lpSum, value

warehouses = ['New York', 'Atlanta']
customers = ['A', 'B', 'C']

costs = {
    ('New York', 'A'): 232,
    ('New York', 'B'): 255,
    ('New York', 'C'): 264,
    ('Atlanta',  'A'): 255,
    ('Atlanta',  'B'): 233,
    ('Atlanta',  'C'): 250
}

demand = {
    'A': 1500,
    'B': 900,
    'C': 800
}

# 1. Initialise model
model = LpProblem("Minimise_Transportation_Costs", LpMinimize)

# 2. Define 6 Decision Variables in a few lines of code using LpVarible.dicts
#    That's (2 warehouses * 3 customers)
key = [(w, c) for w in warehouses for c in customers]
shipments =  LpVariable.dicts('Shipments', key, lowBound=0, cat='Integer')

# 3. Define objective function: shipping costs
model += lpSum([costs[(w, c)] * shipments[(w, c)] 
                for w in warehouses for c in customers])

# 4. Define constraints: our model's limitations which is demand must be met for each customer
for c in customers:
    model += lpSum([shipments[(w, c)] for w in warehouses]) == demand[c]

# 5. Solve model
model.solve()
print("Status", LpStatus[model.status], "\n")

# 6. Print values for each decision variable - demand
print("Optimal units for each warehouse:")
for decision_variable in model.variables():
    print(decision_variable.name, "=", decision_variable.varValue)

# 7. Print value for the objective function - costs
print("\nObjective =", value(model.objective))

In this example we've created some dictionaries to hold our data for warehouses, customers, costs (warehouse to customer), and demand (units).

We then follow the same pattern but use list comprehension to define every combination of decision variables for warehouses and customers. We do the same thing to define all of our shipping costs.

Finally, we can define the constraints in that the shipments for each warehouse must meet demand and solve the model.

The output from PuLP gives us:

Status Optimal

Optimal units for each warehouse:

Shipments_('Atlanta',_'A') = 0.0

Shipments_('Atlanta',_'B') = 900.0

Shipments_('Atlanta',_'C') = 800.0

Shipments_('New_York',_'A') = 1500.0

Shipments_('New_York',_'B') = 0.0

Shipments_('New_York',_'C') = 0.0

Objective = 757700.0

We can see that to meet demand for:

customer B we need 900 units in Atlanta
customer C we need 800 units in Atlanta
customer A we need 1500 units in New York

This results in optimal shipping costs of 757,000 and we've solve a much bigger problem with many more variables.

C02 monitor allocation

Let's say we were tasked with allocating C02 monitors to schools in order to manage and monitor air quality similar to this real scenario.

We need to allocate them proportionally to have the greatest impact, with some left over for additional demand later. This is the longest example given there are a number of constraints to define.

monitors.py

"""
Allocates the optimal number of C02 monitors to schools given the constraints.
"""
from pulp import LpProblem, LpVariable, LpMaximize, LpMinimize, LpStatus, lpSum, value

# Objective function: number of monitors
available_monitors = 200

# Decision variables: a list of schools to allocate monitors
schools = ["School A", "School B", "School C", "School D"] 

# Constraints: dictionaries of size, rooms and pupil counts for each school
school_sizes = {"School A": 5000, "School B": 6000, "School C": 4000, "School D": 5500}  # in square feet
school_rooms = {"School A": 30, "School B": 40, "School C": 25, "School D": 35}
pupil_counts = {"School A": 2000, "School B": 3000, "School C": 1500, "School D": 2500}


def allocate_co2_monitors(schools, available_monitors, school_sizes, school_rooms, pupil_counts):
    # 1. Initialise model
    model = LpProblem("CO2_Monitor_Allocation", LpMinimize)

    # 2. Define the decision variables - the things we can control
    # .dicts creates a dictionary of LpVariables https://coin-or.github.io/pulp/technical/pulp.html#pulp.LpVariable.dicts
    monitors = LpVariable.dicts("Monitors", schools, lowBound=0, cat="Integer")

    # 3. Define the objective function: the thing we want to minimise or maximise so total number of monitors used per school
    # Passing a list to lpSum can add many decision variables at once
    model += lpSum(monitors)

    # 4. Define the constraints:

    # At least one monitor to each school
    for school in schools:
        model += monitors[school] >= 1 

    # The total number of allocated monitors should not exceed the available monitors
    model += lpSum(monitors) <= available_monitors - 20

    # There must be 1 monitor per 500 square feet
    for school in schools:
        model += monitors[school] >= school_sizes[school] / 500

    # There must be 1 monitor per 2 rooms
    for school in schools:
        model += monitors[school] >= school_rooms[school] / 2

    # There must be 1 monitor per 50 pupils
    for school in schools:
        model += monitors[school] >= pupil_counts[school] / 50 

    # 5. Solve the LP problem
    model.solve()

    # 6. Check the status of the solution
    if LpStatus[model.status] != "Optimal":
        print("Unable to find an optimal solution.")
        return None

    # 7. Get the model results
    allocation = {}
    for school in schools:
        allocation[school] = value(monitors[school])

    return allocation

allocation = allocate_co2_monitors(schools, available_monitors, school_sizes, school_rooms, pupil_counts)

if allocation:
    print("CO2 Monitor Allocation:")
    total_monitors_allocated = 0
    for school, monitors in allocation.items():
        total_monitors_allocated += int(monitors)
        print(f"{school}: {monitors} monitors")
    print(f"\nTotal monitors allocated: {total_monitors_allocated}")
    print(f"Total monitors leftover: {str(available_monitors - total_monitors_allocated)}")

Here we define the available monitors, and set our data for schools, school size, rooms, and pupil counts.

Following the same pattern, we initialise the model, and generate our decision variables from the schools list - our decision variable is what we can change so here it's the schools and how many monitors to assign to each.

Finally we add each of the constraints and solve:

At least one monitor to each school
The total number of allocated monitors should not exceed the available monitors
There must be 1 monitor per 500 square feet
There must be 1 monitor per 2 rooms
There must be 1 monitor per 50 pupils

These are reasonable assumptions for the constraints but we could change them if they are too strict. In this case, we have a solved model which gives:

CO2 Monitor Allocation:

School A: 40.0 monitors

School B: 60.0 monitors

School C: 30.0 monitors

School D: 50.0 monitors

Total monitors allocated: 180

Total monitors leftover: 20

Great! So from the 200 available monitors we have allocated 180 given the constraints with 20 leftover. I find this the most impressive example as solving this scenario without LP and PuLP would take so much more work!

Conclusion

Well done if you made it through all the examples. You should be now be able to solve simple and intermediate optimisation problems using Python and PuLP using this workflow. You just have to frame your problem as an LP problem and then modify your decision variables and constraints.

It is worth reminding ourselves that sometimes there won't be a solution to problem. In that case, we must revisit our inputs to loosen them a little if possible. Maybe the constraints are too strict and need to be made more forgiving. This is where operations meets analysis.

Always question the outputs and sense check them through solid quality assurance - find out more in the article Six tips for producing and assuring high quality analytical code.

I hope you enjoyed this article, you've hopefully added a seriously useful tool to your toolkit. You may also be interested in these articles on the site:

Improving Wi-Fi 2.4GHz and 5GHz speeds after Full Fibre (FTTP) upgrade

Reflections on digital streaming and reducing smartphone usage