Building a Mathematical Optimiser (in Zig): Part 1 - The Simplex Method
- Published on
- Joseph Miocevich · 17 min read
Introduction
We will be starting this series by learning the simplex method. We won't be diving into the tableau approach, although I highly recommend it for learning. Instead, we will be starting off with the matrix (revised) form: we'll solve a simple problem step by step, then code it up in Zig.
In this walkthrough, we'll solve a simple linear program step-by-step using the Revised Simplex Method, showing all matrix calculations explicitly. By the end, you'll understand how the algorithm navigates from one basic feasible solution to another until reaching optimality.
First some base assumptions:
- We will not be parsing a .mps file; that can come at a later time
- No optimisation: this will be a naive solution for now, and optimisation will get a dedicated post. However, we will structure the code so we are not backed into a corner later.
- A simple problem only; we are not covering MIP solutions or dual simplex (yet)
Recommended reading
Problem Formulation
For this example, we will solve the following problem. (We will hardcode its matrix interpretation in our code for now; we will build out a parser later.)
Minimize: z = -5x₁ - 3x₂
Subject to:
- 2x₁ + 3x₂ + x₃ = 18
- x₂ + x₄ = 4
- x₁ + x₅ = 6
- x₁, x₂, x₃, x₄, x₅ ≥ 0
Where x₃, x₄, x₅ are slack variables.
Matrix Representation
We can write this in matrix form as:
Minimize: c^T x
Subject to: Ax = b, x ≥ 0
Where:

c = [ -5  -3  0  0  0 ]^T,  b = [ 18  4  6 ]^T

A =
[ 2  3  1  0  0 ]
[ 0  1  0  1  0 ]
[ 1  0  0  0  1 ]
The Algorithm Pattern
Every iteration follows the same structure:
- Compute current solution: x_B = B⁻¹b
- Compute pricing: π = c_B^T B⁻¹
- Check optimality: c̄_j = c_j - πA_j ≥ 0 for all j?
- Select pivot: Choose entering (min c̄_j) and leaving (min ratio test)
- Update basis: Swap one variable
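The five steps above can be collected into a single loop. Here is a sketch of that skeleton (not runnable on its own — it leans on the helper functions we build later in this post, and `computeReducedCosts` / `ratioTest` stand in for the inline snippets shown below):

```zig
// Sketch of the revised simplex main loop.
var iter: usize = 0;
while (iter < max_iterations) : (iter += 1) {
    const B = extractBasisMatrix(3, 5, &A, basisIndex);   // basis columns of A
    const B_inv = try matrixInverse(3, &B);               // B⁻¹
    const x_B = matrixVectorMultiply(3, 3, &B_inv, &b);   // 1. x_B = B⁻¹b
    // 2. π = c_B^T B⁻¹
    // 3. reduced costs c̄_j = c_j − π·A_j; if all ≥ 0, stop: optimal
    // 4. entering = most negative c̄_j; leaving = min-ratio test on d = B⁻¹A_entering
    // 5. basisIndex[leaving_idx] = entering_var;  // swap one variable
}
```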
Geometric Interpretation
- Each basis represents a vertex of the feasible region
- Each iteration moves to an adjacent vertex with better objective value
- The algorithm never revisits the same vertex (in non-degenerate cases)
- At optimality, all directions lead to worse (or equal) objective values
Code: Problem Setup
const std = @import("std");
// Constraint matrix A (3 rows × 5 columns)
const A = [3][5]f64{
[_]f64{ 2.0, 3.0, 1.0, 0.0, 0.0 },
[_]f64{ 0.0, 1.0, 0.0, 1.0, 0.0 },
[_]f64{ 1.0, 0.0, 0.0, 0.0, 1.0 },
};
// Right-hand side vector b
const b = [_]f64{ 18.0, 4.0, 6.0 };
// Cost coefficients c
const c = [_]f64{ -5.0, -3.0, 0.0, 0.0, 0.0 };
The variables are partitioned into:
- Basic variables (x_B): Currently in the basis (non-zero)
- Non-basic variables (x_N): Not in basis (set to zero)
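To make the partition concrete, here is a tiny standalone example (separate from the solver) that scatters the basic values `x_B` into the full variable vector, leaving non-basic entries at zero:

```zig
const std = @import("std");

// Basic variables take their values from x_B; non-basic variables stay at 0.
pub fn main() void {
    const basis = [_]usize{ 2, 3, 4 }; // x₃, x₄, x₅ are basic
    const x_B = [_]f64{ 18.0, 4.0, 6.0 };
    var x = [_]f64{0.0} ** 5;
    for (basis, x_B) |idx, val| x[idx] = val; // scatter basic values
    std.debug.print("x = {any}\n", .{x}); // x₁ and x₂ remain 0 (non-basic)
}
```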
Initial Setup
The simplex method starts with an initial basic feasible solution. Here, the slack variables form an obvious starting basis:
Initial Basis: [x₃, x₄, x₅] = indices [2, 3, 4]
This means:
- x₃ = 18, x₄ = 4, x₅ = 6 (basic variables, non-zero)
- x₁ = 0, x₂ = 0 (non-basic variables, set to zero)
Code: Finding Initial Basis
// Find initial basis by looking for identity columns in A
fn findInitialBasis(comptime rows: usize, comptime cols: usize,
mat: *const [rows][cols]f64) ![rows]usize {
var basis: [rows]usize = undefined;
var basis_count: usize = 0;
var row_used: [rows]bool = [_]bool{false} ** rows;
for (0..cols) |col| {
var unit_row: usize = 0;
if (isUnitColumn(rows, cols, mat, col, &unit_row)) {
if (!row_used[unit_row]) {
basis[unit_row] = col;
row_used[unit_row] = true;
basis_count += 1;
}
}
}
if (basis_count != rows) {
return error.NoIdentityBasis;
}
return basis;
}
// Check if a column is a unit vector (one 1, rest 0s)
fn isUnitColumn(comptime rows: usize, comptime cols: usize,
mat: *const [rows][cols]f64, col: usize,
unit_row: *usize) bool {
var ones: usize = 0;
var one_position: usize = 0;
for (0..rows) |i| {
const val = mat[i][col];
if (@abs(val - 1.0) < 1e-10) {
ones += 1;
one_position = i;
} else if (@abs(val) > 1e-10) {
return false;
}
}
if (ones == 1) {
unit_row.* = one_position;
return true;
}
return false;
}
// Usage
var basisIndex = try findInitialBasis(3, 5, &A); // var, not const: the basis is updated each iteration
// Result: [2, 3, 4] (columns x₃, x₄, x₅)
Iteration 0: First Improvement
Step 1: Extract Basis Matrix B
The basis matrix B consists of columns from A corresponding to basic variables:

B = [ A₃  A₄  A₅ ] =
[ 1  0  0 ]
[ 0  1  0 ]
[ 0  0  1 ]

This is the identity matrix (by design for the initial basis).
Code: Extract Basis Matrix
// Extract columns from A corresponding to basis indices
fn extractBasisMatrix(comptime rows: usize, comptime cols: usize,
mat: *const [rows][cols]f64,
basis: [rows]usize) [rows][rows]f64 {
var B: [rows][rows]f64 = undefined;
for (0..rows) |i| {
for (0..rows) |j| {
B[i][j] = mat[i][basis[j]];
}
}
return B;
}
// Usage
const B = extractBasisMatrix(3, 5, &A, basisIndex);
// Result: 3×3 identity matrix
Step 2: Compute B⁻¹
Since B is already the identity matrix, its inverse is trivial:

B⁻¹ = I₃
Code: Matrix Inversion
// Compute matrix inverse using Gauss-Jordan elimination
fn matrixInverse(comptime n: usize, mat: *const [n][n]f64) ![n][n]f64 {
var aug: [n][n * 2]f64 = undefined;
// Create augmented matrix [A | I]
for (0..n) |i| {
for (0..n) |j| {
aug[i][j] = mat[i][j];
aug[i][j + n] = if (i == j) 1.0 else 0.0;
}
}
// Gauss-Jordan elimination
for (0..n) |i| {
const pivot = aug[i][i];
if (@abs(pivot) < 1e-10) {
return error.SingularMatrix;
}
// Scale row i
for (0..n * 2) |j| {
aug[i][j] /= pivot;
}
// Eliminate column i from other rows
for (0..n) |k| {
if (k != i) {
const factor = aug[k][i];
for (0..n * 2) |j| {
aug[k][j] -= factor * aug[i][j];
}
}
}
}
// Extract inverse from right half
var result: [n][n]f64 = undefined;
for (0..n) |i| {
for (0..n) |j| {
result[i][j] = aug[i][j + n];
}
}
return result;
}
// Usage
const B_inv = try matrixInverse(3, &B);
Step 3: Compute Basic Solution x_B = B⁻¹b
x_B = B⁻¹b = I₃ · [18, 4, 6]^T = [18, 4, 6]^T

Current solution: x = [0, 0, 18, 4, 6]
Objective value: z = (-5)(0) + (-3)(0) = 0
Code: Matrix-Vector Multiplication
// Multiply matrix by vector: result = mat × vec
fn matrixVectorMultiply(comptime rows: usize, comptime cols: usize,
mat: *const [rows][cols]f64,
vec: *const [cols]f64) [rows]f64 {
var result: [rows]f64 = undefined;
for (0..rows) |i| {
var sum: f64 = 0.0;
for (0..cols) |j| {
sum += mat[i][j] * vec[j];
}
result[i] = sum;
}
return result;
}
// Usage
const x_B = matrixVectorMultiply(3, 3, &B_inv, &b);
// Result: [18.0, 4.0, 6.0]
Step 4: Extract Basic Costs c_B
Code: Extract Basic Costs
// Extract cost coefficients for basic variables
var c_B: [3]f64 = undefined;
for (0..3) |i| {
c_B[i] = c[basisIndex[i]];
}
// Result: [0.0, 0.0, 0.0] for basis [2, 3, 4]
Step 5: Compute Simplex Multipliers π = c_B^T × B⁻¹
Code: Compute Simplex Multipliers
// Compute π = c_B^T × B^(-1)
var pi: [3]f64 = undefined;
for (0..3) |j| {
var sum: f64 = 0.0;
for (0..3) |i| {
sum += c_B[i] * B_inv[i][j];
}
pi[j] = sum;
}
// Result: [0.0, 0.0, 0.0]
Step 6: Compute Reduced Costs c̄_j = c_j - π × A_j
For each column j of A:
j=0 (x₁): c̄₀ = -5 - [0 0 0]·[2 0 1]^T = -5 - 0 = -5.0 ⬅️ Most negative!
j=1 (x₂): c̄₁ = -3 - [0 0 0]·[3 1 0]^T = -3 - 0 = -3.0
j=2 (x₃): c̄₂ = 0 - [0 0 0]·[1 0 0]^T = 0 - 0 = 0.0
j=3 (x₄): c̄₃ = 0 - [0 0 0]·[0 1 0]^T = 0 - 0 = 0.0
j=4 (x₅): c̄₄ = 0 - [0 0 0]·[0 0 1]^T = 0 - 0 = 0.0
Reduced costs: [-5.0, -3.0, 0.0, 0.0, 0.0]
Code: Compute Reduced Costs
// Helper: Extract column from matrix
fn extractColumn(comptime rows: usize, comptime cols: usize,
mat: *const [rows][cols]f64, col: usize) [rows]f64 {
var result: [rows]f64 = undefined;
for (0..rows) |i| {
result[i] = mat[i][col];
}
return result;
}
// Helper: Vector dot product
fn vectorDotProduct(comptime n: usize,
vec_a: *const [n]f64,
vec_b: *const [n]f64) f64 {
var sum: f64 = 0.0;
for (0..n) |i| {
sum += vec_a[i] * vec_b[i];
}
return sum;
}
// Compute reduced costs for all variables
var reduced_costs: [5]f64 = undefined;
var min_reduced_cost: f64 = 0.0;
var entering_var: usize = 0;
for (0..5) |j| {
const col_j = extractColumn(3, 5, &A, j);
const pi_A_j = vectorDotProduct(3, &pi, &col_j);
reduced_costs[j] = c[j] - pi_A_j;
// Track minimum (most negative)
if (reduced_costs[j] < min_reduced_cost) {
min_reduced_cost = reduced_costs[j];
entering_var = j;
}
}
// Result: [-5.0, -3.0, 0.0, 0.0, 0.0], entering_var = 0
Step 7: Check Optimality
Since we have negative reduced costs (c̄₀ = -5), we are not optimal.
The most negative reduced cost is -5.0, so x₀ (x₁) enters the basis.
Code: Check Optimality
// Check if optimal (all reduced costs ≥ 0)
if (min_reduced_cost >= -1e-10) {
std.debug.print("Optimal solution found!\n", .{});
// Construct full solution and report
break;
}
std.debug.print("Entering variable: x{}\n", .{entering_var});
Step 8: Compute Direction d = B⁻¹ × A₀
Extract column 0 from A:

A₀ = [2, 0, 1]^T

d = B⁻¹A₀ = [2, 0, 1]^T
This tells us: if we increase x₁ by θ, then x₃ decreases by 2θ, x₄ stays the same, and x₅ decreases by θ.
Code: Compute Direction
// Compute direction d = B^(-1) × A_entering
const A_entering = extractColumn(3, 5, &A, entering_var);
const d = matrixVectorMultiply(3, 3, &B_inv, &A_entering);
// Result: [2.0, 0.0, 1.0]
Step 9: Ratio Test (Find Leaving Variable)
We can increase x₁ until one of the basic variables hits zero:
i=0: x₃ → 0 ⟹ 18 - 2θ = 0 ⟹ θ = 9.0
i=1: x₄ → 0 ⟹ 4 - 0θ = 0 ⟹ θ = ∞ (d₁ = 0, skip)
i=2: x₅ → 0 ⟹ 6 - 1θ = 0 ⟹ θ = 6.0 ⬅️ Minimum!
Minimum ratio = 6.0 at index i=2, so x₄ (x₅) leaves the basis.
Code: Ratio Test
// Find leaving variable via minimum ratio test
var min_ratio: f64 = std.math.inf(f64);
var leaving_idx: usize = 0;
for (0..3) |i| {
if (d[i] > 1e-10) { // Only consider positive directions
const ratio = x_B[i] / d[i];
if (ratio < min_ratio) {
min_ratio = ratio;
leaving_idx = i;
}
}
}
if (min_ratio == std.math.inf(f64)) {
std.debug.print("Problem is unbounded!\n", .{});
return;
}
const leaving_var = basisIndex[leaving_idx];
// Result: leaving_idx = 2, leaving_var = 4 (x₅)
Step 10: Update Basis
Replace x₅ with x₁:
New Basis: [2, 3, 0] → [x₃, x₄, x₁]
Code: Update Basis
// Swap leaving variable with entering variable
basisIndex[leaving_idx] = entering_var;
// New basis: [2, 3, 0]
Iteration 1: Second Improvement
Now we repeat the process with the new basis [2, 3, 0].
Step 1: Extract Basis Matrix B

With basis [2, 3, 0] (columns for x₃, x₄, x₁):

B =
[ 1  0  2 ]
[ 0  1  0 ]
[ 0  0  1 ]
Step 2: Compute B⁻¹
Using Gauss-Jordan elimination:

B⁻¹ =
[ 1  0  -2 ]
[ 0  1   0 ]
[ 0  0   1 ]
Step 3: Compute Basic Solution x_B = B⁻¹b

x_B = [x₃, x₄, x₁]^T = [6, 4, 6]^T

Current solution: x = [6, 0, 6, 4, 0]
Objective value: z = (-5)(6) + (-3)(0) = -30

We've improved from z = 0 to z = -30!
Step 4: Extract Basic Costs c_B

c_B = [0, 0, -5] (the costs of x₃, x₄, x₁)
Step 5: Compute Simplex Multipliers π = c_B^T × B⁻¹

π = [0, 0, -5] · B⁻¹ = [0, 0, -5]
Step 6: Compute Reduced Costs
j=0 (x₁): c̄₀ = -5 - [0 0 -5]·[2 0 1]^T = -5 - (-5) = 0.0
j=1 (x₂): c̄₁ = -3 - [0 0 -5]·[3 1 0]^T = -3 - 0 = -3.0 ⬅️ Most negative!
j=2 (x₃): c̄₂ = 0 - [0 0 -5]·[1 0 0]^T = 0 - 0 = 0.0
j=3 (x₄): c̄₃ = 0 - [0 0 -5]·[0 1 0]^T = 0 - 0 = 0.0
j=4 (x₅): c̄₄ = 0 - [0 0 -5]·[0 0 1]^T = 0 - (-5) = 5.0
Reduced costs: [0.0, -3.0, 0.0, 0.0, 5.0]
Step 7: Check Optimality
Still not optimal (c̄₁ = -3.0 < 0).
x₁ (x₂) enters the basis.
Step 8: Compute Direction d = B⁻¹ × A₁

A₁ = [3, 1, 0]^T, so d = B⁻¹A₁ = [3, 1, 0]^T
Step 9: Ratio Test
i=0: x₃ → 0 ⟹ 6 - 3θ = 0 ⟹ θ = 2.0 ⬅️ Minimum!
i=1: x₄ → 0 ⟹ 4 - 1θ = 0 ⟹ θ = 4.0
i=2: x₁ → 0 ⟹ 6 - 0θ = 0 ⟹ θ = ∞ (d₂ = 0, skip)
Minimum ratio = 2.0 at index i=0, so x₂ (x₃) leaves the basis.
Step 10: Update Basis
Replace x₃ with x₂:
New Basis: [1, 3, 0] → [x₂, x₄, x₁]
Code Summary for Iteration 1
The code is identical to Iteration 0 - we simply run the same loop again with the updated basis. The key calculations are:
// Same code structure, now with basis = [2, 3, 0]
const B = extractBasisMatrix(3, 5, &A, basisIndex);
const B_inv = try matrixInverse(3, &B);
const x_B = matrixVectorMultiply(3, 3, &B_inv, &b);
// ... compute π, reduced costs, direction, ratio test, update basis
All functions remain the same - we just feed in the new basis!
Iteration 2: Checking Optimality
Step 1: Extract Basis Matrix B

With basis [1, 3, 0] (columns for x₂, x₄, x₁):

B =
[ 3  0  2 ]
[ 1  1  0 ]
[ 0  0  1 ]
Step 2: Compute B⁻¹

B⁻¹ ≈
[  0.333  0  -0.667 ]
[ -0.333  1   0.667 ]
[  0.000  0   1.000 ]

Or in exact fractions:

B⁻¹ =
[  1/3  0  -2/3 ]
[ -1/3  1   2/3 ]
[   0   0    1  ]
Step 3: Compute Basic Solution x_B = B⁻¹b

x_B = [x₂, x₄, x₁]^T = [2, 2, 6]^T

Current solution: x = [6, 2, 0, 2, 0]
Objective value: z = (-5)(6) + (-3)(2) = -30 - 6 = -36
Step 4: Extract Basic Costs c_B

c_B = [-3, 0, -5] (the costs of x₂, x₄, x₁)
Step 5: Compute Simplex Multipliers π

π = [-3, 0, -5] · B⁻¹ = [-1, 0, -3]
Step 6: Compute Reduced Costs
j=0 (x₁): c̄₀ = -5 - [-1 0 -3]·[2 0 1]^T = -5 - (-2-3) = 0.0 ✓
j=1 (x₂): c̄₁ = -3 - [-1 0 -3]·[3 1 0]^T = -3 - (-3) = 0.0 ✓
j=2 (x₃): c̄₂ = 0 - [-1 0 -3]·[1 0 0]^T = 0 - (-1) = 1.0 ✓
j=3 (x₄): c̄₃ = 0 - [-1 0 -3]·[0 1 0]^T = 0 - 0 = 0.0 ✓
j=4 (x₅): c̄₄ = 0 - [-1 0 -3]·[0 0 1]^T = 0 - (-3) = 3.0 ✓
Reduced costs: [0.0, 0.0, 1.0, 0.0, 3.0]
Step 7: Check Optimality
All reduced costs ≥ 0! We have reached the optimal solution!
Code: Reporting Optimal Solution
// Construct full solution vector from basic solution
var x: [5]f64 = [_]f64{0.0} ** 5; // Start with all zeros
for (0..3) |i| {
x[basisIndex[i]] = x_B[i]; // Set basic variables
}
// Compute objective value
var obj_value: f64 = 0.0;
for (0..5) |i| {
obj_value += c[i] * x[i];
}
std.debug.print("Solution: x = [{d:.4}, {d:.4}, {d:.4}, {d:.4}, {d:.4}]\n",
.{ x[0], x[1], x[2], x[3], x[4] });
std.debug.print("Objective value: {d:.4}\n", .{obj_value});
Optimal Solution
Decision Variables:
- x₁ = 6
- x₂ = 2
Slack Variables:
- x₃ = 0 (constraint 1 is tight)
- x₄ = 2 (constraint 2 has slack)
- x₅ = 0 (constraint 3 is tight)
Objective Value: z = -36
Since we're minimizing z = -5x₁ - 3x₂, this is equivalent to maximizing 5x₁ + 3x₂ = 36.
Verification
Let's verify the constraints:
- 2(6) + 3(2) + 0 = 12 + 6 = 18 ✓
- 2 + 2 = 4 ✓
- 6 + 0 = 6 ✓
All constraints satisfied!
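The hand check above can also be done in code. Here is a small standalone program (separate from the solver) that plugs x* = (6, 2, 0, 2, 0) back into Ax = b and the objective:

```zig
const std = @import("std");

// Verify the optimal point against the constraints and objective value.
pub fn main() void {
    const A = [3][5]f64{
        [_]f64{ 2, 3, 1, 0, 0 },
        [_]f64{ 0, 1, 0, 1, 0 },
        [_]f64{ 1, 0, 0, 0, 1 },
    };
    const b = [_]f64{ 18, 4, 6 };
    const c = [_]f64{ -5, -3, 0, 0, 0 };
    const x = [_]f64{ 6, 2, 0, 2, 0 };

    // Each constraint row: A_i · x must equal b_i
    for (A, b) |row, rhs| {
        var lhs: f64 = 0;
        for (row, x) |a_ij, x_j| lhs += a_ij * x_j;
        std.debug.assert(@abs(lhs - rhs) < 1e-9);
    }
    // Objective: c · x must equal -36
    var z: f64 = 0;
    for (c, x) |c_j, x_j| z += c_j * x_j;
    std.debug.assert(@abs(z + 36.0) < 1e-9);
    std.debug.print("verified: z = {d}\n", .{z});
}
```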
Conclusion
Hopefully this gives you a clear view of the steps needed to implement a simple simplex solver in Zig. I'm still learning both the simplex algorithm and Zig, so I don't expect this code to be optimal in the slightest.
What's Next?
- Implementing dual simplex
- Branch and bound
- Testing and verifying against known problems and solutions
- Cutting Plane
- Optimisation passes: dive into some proper performance analysis and assembly
- Python bindings
- .mps parser