type: COP

difficulty: easy

# Problem *Flow Shop Scheduling*

*From WikiPedia*: “There are n machines and m jobs. Each job contains exactly n operations. The ith operation of the job must be executed on the ith machine. No machine can perform more than one operation simultaneously. For each operation of each job, execution time is specified. Operations within one job must be performed in the specified order. The first operation gets executed on the first machine, then (as the first operation is finished) the second operation on the second machine, and so on until the nth operation. Jobs can be executed in any order, however. Problem definition implies that this job order is exactly the same for each machine. The problem is to determine the optimal such arrangement, i.e. the one with the shortest possible total job execution makespan.”

Here is an example of (no-wait) flow-shop scheduling with five jobs on two machines A and B. A comparison of total makespan is given for two different job sequences. Image from commons.wikimedia.org.

To build a COP (Constraint Optimization Problem) model, we need first to import the library PyCSP$^3$:

```
from pycsp3 import *
```

Then, we need some data. Here, this is a two-dimensional array of integers $\mathtt{durations}$ such that $\mathtt{durations}$[i][j] is the duration of operation/machine j for job i.

```
durations = [
[34, 2, 54, 61],
[15, 89, 70, 9],
[38, 19, 28, 87],
[95, 7, 34, 29]
]
```

From these data, we can compute a time horizon, i.e., a line with time points where jobs can be started. We also denote by $n$ and $m$ the number of jobs and the number of operations, respectively.

```
horizon = sum(sum(t) for t in durations) # time horizon
n = len(durations) # number of jobs
m = len(durations[0]) # number of operations
```

We can check the computed values.

```
print(f"horizon: {horizon} n: {n} m: {m}")
```

```
horizon: 671 n: 4 m: 4
```

We start our COP model with a two dimensional array $x$ of $n\times m$ variables, each one with the time horizon, i.e., ${0,1,2,\dots, 671}$, as domain.

```
# x[i][j] is the start time of the jth operation for the ith job
x = VarArray(size=[n, m], dom= range(horizon+1))
```

To control that everything is fine, we can display the structure of the array of variables, as well as the domain of the first variable (here, remember that all variables have the same domain).

```
print("Array of variables:", x)
print("Domain of variables: ", x[0][0].dom)
```

```
Array of variables: [
[x[0][0], x[0][1], x[0][2], x[0][3]]
[x[1][0], x[1][1], x[1][2], x[1][3]]
[x[2][0], x[2][1], x[2][2], x[2][3]]
[x[3][0], x[3][1], x[3][2], x[3][3]]
]
Domain of variables: 0..671
```

Concerning the constraints, we post a first group of constraints for ensuring that the starting times of all successive operations concerning a job are in increasing order, while ensuring that the durations of these operations is taken into account. We use the constraint Increasing that takes as first parameter a list of variables, and possibly as second parameter (named parameter *lengths*) a list of integers. It ensures that the value of the ith variable of the specified list plus the value of the ith integer is less than or equal to the value of the i+1th variable. This gives:

```
satisfy(
# operations must be ordered on each job
Increasing(x[i], lengths=durations[i]) for i in range(n)
);
```

Interestingly, by calling the function *solve()*, we can check that the problem is satisfiable (SAT). We can also display, per job, the values assigned to the variables in the solution that has been found. Here, we call the function *values()* that collects the values assigned to a specified list of variables.

```
if solve() is SAT:
for i in range(n):
print(f"Starting times of operations of job {i}: {values(x[i])}")
```

```
Starting times of operations of job 0: [0, 34, 36, 90]
Starting times of operations of job 1: [0, 15, 104, 174]
Starting times of operations of job 2: [0, 38, 57, 85]
Starting times of operations of job 3: [0, 95, 102, 136]
```

Note that the constraint Increasing is mainly an ease of use. Instead, one could have written:

```
satisfy(
# operations must be ordered on each job
[x[i][j] + durations[i][j] <= x[i][j+1] for j in range(m-1)]
for i in range(n)
);
```

Clearly, operations of different jobs on the same machine overlap. This is why we have to post a constraint NoOverlap per machine (kind of operation); to do this, we reason from the columns of both arrays $x$ and $\mathtt{durations}$.

```
satisfy(
# no overlap on resources
NoOverlap(origins=columns(x)[j], lengths=columns(durations)[j]) for j in range(m)
);
```

We can display the internal representation of the constraints that have been posted so far. Although this is a little bit technical, it allows us to control that constraints are correctly posted.

```
print(posted())
```

```
ordered(list:[x[0][0], x[0][1], x[0][2], x[0][3]], lengths:[34, 2, 54], operator:le)
ordered(list:[x[1][0], x[1][1], x[1][2], x[1][3]], lengths:[15, 89, 70], operator:le)
ordered(list:[x[2][0], x[2][1], x[2][2], x[2][3]], lengths:[38, 19, 28], operator:le)
ordered(list:[x[3][0], x[3][1], x[3][2], x[3][3]], lengths:[95, 7, 34], operator:le)
noOverlap(origins:[x[0][0], x[1][0], x[2][0], x[3][0]], lengths:[34, 15, 38, 95])
noOverlap(origins:[x[0][1], x[1][1], x[2][1], x[3][1]], lengths:[2, 89, 19, 7])
noOverlap(origins:[x[0][2], x[1][2], x[2][2], x[3][2]], lengths:[54, 70, 28, 34])
noOverlap(origins:[x[0][3], x[1][3], x[2][3], x[3][3]], lengths:[61, 9, 87, 29])
```

We can now check that we get a valid solution to our problem.

```
if solve() is SAT:
for i in range(n):
print(f"Starting times of operations of job {i}: {values(x[i])}")
```

```
Starting times of operations of job 0: [0, 34, 36, 90]
Starting times of operations of job 1: [34, 49, 138, 208]
Starting times of operations of job 2: [49, 138, 208, 236]
Starting times of operations of job 3: [87, 182, 236, 323]
```

Unfortunately, this solution is not totally satisfying, because the makespan (i.e., the time point at which the scheduling is finished) is rather large. We need to minimize it using the function *Maximum()*. We consider the ending time of each job (i.e., the ending time of the last operation of each job), and compute the maximal value. This is what we want to minimize as follows:

```
minimize(
# minimizing the makespan
Maximum(x[i][-1] + durations[i][-1] for i in range(n))
);
```

We can run again the solver, with this optimization task. Note that we need to check that the status returned by the solver is now OPTIMUM.

```
if solve() is OPTIMUM:
for i in range(n):
print(f"Starting times of operations of job {i}: {values(x[i])}")
print("Makespan: ", bound())
```

```
Starting times of operations of job 0: [38, 72, 85, 172]
Starting times of operations of job 1: [72, 87, 223, 293]
Starting times of operations of job 2: [0, 38, 57, 85]
Starting times of operations of job 3: [87, 182, 189, 233]
Makespan: 302
```

The makespan, obtained here by calling the function *bound()*, is 302 because the last operation starts at 293 with a duration equal to 9.

Finally, we give below the model in one piece. Here the data is expected to be given by the user (in a command line). Note that the notation [:, j] stands for the jth column of a two-dimensional array (list); here, we can use this notation because the arrays have specific types (the fact of externally loading the data guarantees this for the arry $\mathtt{durations}$).

```
from pycsp3 import *
durations = data
horizon = sum(sum(t) for t in durations) + 1
n, m = len(durations), len(durations[0])
# s[i][j] is the start time of the jth operation for the ith job
s = VarArray(size=[n, m], dom=range(horizon))
satisfy(
# operations must be ordered on each job
[Increasing(s[i], lengths=durations[i]) for i in range(n)],
# no overlap on resources
[NoOverlap(origins=s[:, j], lengths=durations[:, j]) for j in range(m)]
)
minimize(
# minimizing the makespan
Maximum(s[i][-1] + durations[i][-1] for i in range(n))
)
```