Repeated planning

1. Introduction to repeated planning

The problem facts used to create a solution may change before or during the execution of that solution. Delaying planning in order to lower the risk of problem facts changing is not ideal, as an incomplete plan is preferable to no plan.

The following examples demonstrate situations where planning solutions need to be altered due to unpredictable changes:

  • Unforeseen fact changes

    • An employee assigned to a shift calls in sick.

    • An airplane scheduled to take off has a technical delay.

    • One of the machines or vehicles break down.

      Unforeseen fact changes benefit from using backup planning.

  • Cannot assign all entities immediately

    Leave some unassigned. For example:

    • There are 10 shifts at the same time to assign but only nine employees to handle shifts.

      For this type of planning, use overconstrained planning.

  • Unknown long term future facts

    For example:

    • Hospital admissions for the next two weeks are reliable, but those for week three and four are less reliable, and for week five and beyond are not worth planning yet.

      This problem benefits from continuous planning.

  • Constantly changing problem facts

More CPU time results in a better planning solution.

Timefold allows you to start planning earlier, despite unforeseen changes, as the optimization algorithms support planning a solution that has already been partially planned. This is known as repeated planning.

2. Backup planning

Backup planning adds extra score constraints to create space in the planning for when things go wrong. That creates a backup plan within the plan itself.

An example of backup planning is as follows:

  1. Create an extra score constraint. For example:

    • Assign an employee as the spare employee (one for every 10 shifts at the same time).

    • Keep one hospital bed open in each department.

  2. Change the planning problem when an unforeseen event occurs.

    For example, if an employee calls in sick:

    • Delete the sick employee and leave their shifts unassigned.

    • Restart the planning, starting from that solution, which now has a different score.

The construction heuristics fills in the newly created gaps (probably with the spare employee) and the metaheuristics will improve it even further.

3. Overconstrained planning

When there is no feasible solution to assign all planning entities, it is preferable to assign as many entities as possible without breaking hard constraints. This is called overconstrained planning.

By default, Timefold assigns all planning entities, overloads the planning values, and therefore breaks hard constraints. There are two ways to avoid this:

  • Use nullable planning variables, so that some entities are unassigned.

  • Add virtual values to catch the unassigned entities.

3.1. Overconstrained planning with nullable variables

If we handle overconstrained planning with nullable variables, the overload entities will be left unassigned:

overconstrainedPlanning

To implement this:

  1. Add a score level (usually a medium level between the hard and soft level) by switching Score type.

  2. Make the planning variable nullable.

  3. Add a score constraint on the new level (usually a medium constraint) to penalize the number of unassigned entities (or a weighted sum of them).

3.2. Overconstrained planning with virtual values

In overconstrained planning it is often useful to know which resources are lacking. In overconstrained planning with virtual values, the solution indicates which resources to buy.

To implement this:

  1. Add an additional score level (usually a medium level between the hard and soft level) by switching Score type.

  2. Add a number of virtual values. It can be difficult to determine a good formula to calculate that number:

    • Do not add too many, as that will decrease solver efficiency.

    • Importantly, do not add too few as that will lead to an infeasible solution.

  3. Add a score constraint on the new level (usually a medium constraint) to penalize the number of virtual assigned entities (or a weighted sum of them).

  4. Optionally, change all soft constraints to ignore virtual assigned entities.

4. Continuous planning (windowed planning)

Continuous planning is the technique of planning one or more upcoming planning periods at the same time and repeating that process monthly, weekly, daily, hourly, or even more frequently. However, as time is infinite, planning all future time periods is impossible.

continuousPlanningEmployeeRostering

In the employee rostering example above, we re-plan every four days. Each time, we actually plan a window of 12 days, but we only publish the first four days, which is stable enough to share with the employees, so they can plan their social life accordingly.

continuousPlanningPatientAdmissionSchedule

In the hospital bed planning example above, notice the difference between the original planning of November 1st and the new planning of November 5th: some problem facts (F, H, I, J, K) changed in the meantime, which results in unrelated planning entities (G) changing too.

The planning window can be split up in several stages:

  • History

    Immutable past time periods. It contains only pinned entities.

    • Recent historic entities can also affect score constraints that apply to movable entities. For example, in nurse rostering, a nurse that has worked the last three historic weekends in a row should not be assigned to three more weekends in a row, because she requires a one free weekend per month.

    • Do not load all historic entities in memory: even though pinned entities do not affect solving performance, they can cause out of memory problems when the data grows to years. Only load those that might still affect the current constraints with a good safety margin.

  • Published

    Upcoming time periods that have been published. They contain only pinned and/or semi-movable planning entities.

    • The published schedule has been shared with the business. For example, in nurse rostering, the nurses will use this schedule to plan their personal lives, so they require a publish notice of for example 3 weeks in advance. Normal planning will not change that part of schedule.

      Changing that schedule later is disruptive, but were exceptions force us to do them anyway (for example someone calls in sick), do change this part of the planning while minimizing disruption with non-disruptive replanning.

  • Draft

    Upcoming time periods after the published time periods that can change freely. They contain movable planning entities, except for any that are pinned for other reasons (such as being pinned by a user).

    • The first part of the draft, called the final draft, will be published, so these planning entities can change one last time. The publishing frequency, for example once per week, determines the number of time periods that change from draft to published.

    • The latter time periods of the draft are likely change again in later planning efforts, especially if some of the problem facts change by then (for example nurse Ann doesn’t want to work on one of those days).

      Despite that these latter planning entities might still change a lot, we can’t leave them out for later, because we would risk painting ourselves into a corner. For example, in employee rostering we could have all our rare skilled employees working the last 5 days of the week that gets published, which won’t reduce the score of that week, but will make it impossible for us to deliver a feasible schedule the next week. So the draft length needs to be longer than the part that will be published first.

    • That draft part is usually not shared with the business yet, because it is too volatile and it would only raise false expectations. However, it is stored in the database and used as a starting point for the next solver.

  • Unplanned (out of scope)

    Planning entities that are not in the current planning window.

continuousPublishingWithRotation

4.1. Pinned planning entities

A pinned planning entity doesn’t change during solving. This is commonly used by users to pin down one or more specific assignments and force Timefold to schedule around those fixed assignments.

4.1.1. Pin down planning entities with @PlanningPin

To pin some planning entities down, add an @PlanningPin annotation on a boolean getter or field of the planning entity class. That boolean is true if the entity is pinned down to its current planning values and false otherwise.

  1. Add the @PlanningPin annotation on a boolean:

    @PlanningEntity
    public class Lecture {
    
        private boolean pinned;
        ...
    
        @PlanningPin
        public boolean isPinned() {
            return pinned;
        }
    
        ...
    }

In the example above, if pinned is true, the lecture will not be assigned to another period or room (even if the current period and rooms fields are null).

4.1.2. Configure a PinningFilter

Alternatively, to pin some planning entities down, add a PinningFilter that returns true if an entity is pinned, and false if it is movable. This is more flexible and more verbose than the @PlanningPin approach.

For example on the nurse rostering example:

  1. Add the PinningFilter:

    public class ShiftAssignmentPinningFilter implements PinningFilter<NurseRoster, ShiftAssignment> {
    
        @Override
        public boolean accept(NurseRoster nurseRoster, ShiftAssignment shiftAssignment) {
            ShiftDate shiftDate = shiftAssignment.getShift().getShiftDate();
            return nurseRoster.getNurseRosterInfo().isInPlanningWindow(shiftDate);
        }
    
    }
  2. Configure the PinningFilter:

    @PlanningEntity(pinningFilter = ShiftAssignmentPinningFilter.class)
    public class ShiftAssignment {
        ...
    }

4.2. Nonvolatile replanning to minimize disruption (semi-movable planning entities)

Replanning an existing plan can be very disruptive. If the plan affects humans (such as employees, drivers, …​), very disruptive changes are often undesirable. In such cases, nonvolatile replanning helps by restricting planning freedom: the gain of changing a plan must be higher than the disruption it causes. This is usually implemented by taxing all planning entities that change.

nonDisruptiveReplanning

In the machine reassignment example, the entity has both the planning variable machine and its original value originalMachine:

@PlanningEntity(...)
public class ProcessAssignment {

    private MrProcess process;
    private Machine originalMachine;
    private Machine machine;

    public Machine getOriginalMachine() {...}

    @PlanningVariable(...)
    public Machine getMachine() {...}

    public boolean isMoved() {
        return originalMachine != null && originalMachine != machine;
    }

    ...
}

During planning, the planning variable machine changes. By comparing it with the originalMachine, a change in plan can be penalized:

rule "processMoved"
    when
        ProcessAssignment(moved == true)
    then
        scoreHolder.addSoftConstraintMatch(kcontext, -1000);
end

The soft penalty of -1000 means that a better solution is only accepted if it improves the soft score for at least 1000 points per variable changed (or if it improves the hard score).

5. Real-time planning

To do real-time planning, combine the following planning techniques:

  • Backup planning - adding extra score constraints to allow for unforeseen changes.

  • Continuous planning - planning for one or more future planning periods.

  • Short planning windows.

    This lowers the burden of real-time planning.

As time passes, the problem itself changes. Consider the vehicle routing use case:

realTimePlanningVehicleRouting

In the example above, three customers are added at different times (07:56, 08:02 and 08:45), after the original customer set finished solving at 07:55, and in some cases, after the vehicles have already left.

Timefold can handle such scenarios with ProblemChange (in combination with pinned planning entities).

5.1. ProblemChange

While the Solver is solving, one of the problem facts or planning entities may be changed by an outside event. For example, an airplane is delayed and needs the runway at a later time.

Do not change the problem fact instances used by the Solver while it is solving (from another thread or even in the same thread), as that will corrupt it.

Add a ProblemChange to the Solver, which it executes in the solver thread as soon as possible. For example:

public interface Solver<Solution_> {

    ...

    void addProblemChange(ProblemChange<Solution_> problemChange);

    boolean isEveryProblemChangeProcessed();

    ...

}

Similarly, you can pass the ProblemChange to the SolverManager:

public interface SolverManager<Solution_, ProblemId_> {

    ...

    CompletableFuture<Void> addProblemChange(ProblemId_ problemId, ProblemChange<Solution_> problemChange);

    ...

}

and the SolverJob:

public interface SolverJob<Solution_, ProblemId_> {

    ...

    CompletableFuture<Void> addProblemChange(ProblemChange<Solution_> problemChange);

    ...

}

Notice the method returns CompletableFuture<Void>, which is completed when a user-defined Consumer accepts the best solution containing this problem change.

public interface ProblemChange<Solution_> {

    void doChange(Solution_ workingSolution, ProblemChangeDirector problemChangeDirector);

}

The ScoreDirector must be updated with any change on the problem facts of planning entities in a ProblemChange.

To write a ProblemChange correctly, it is important to understand the behavior of a planning clone.

A planning clone of a solution must fulfill these requirements:

  • The clone must represent the same planning problem. Usually it reuses the same instances of the problem facts and problem fact collections as the original.

  • The clone must use different, cloned instances of the entities and entity collections. Changes to an original Solution entity’s variables must not affect its clone.

5.1.1. Cloud balancing ProblemChange example

Consider the following example of a ProblemChange implementation in the cloud balancing use case:

    public void deleteComputer(final CloudComputer computer) {
        solver.addProblemChange((cloudBalance, problemChangeDirector) -> {
            CloudComputer workingComputer = problemChangeDirector.lookUpWorkingObject(computer);
            if (workingComputer == null) {
                throw new IllegalStateException("A computer " + computer + " does not exist. Maybe it has been already deleted.");
            }
            // First remove the problem fact from all planning entities that use it
            for (CloudProcess process : cloudBalance.getProcessList()) {
                if (process.getComputer() == workingComputer) {
                    problemChangeDirector.changeVariable(process, "computer",
                            workingProcess -> workingProcess.setComputer(null));
                }
            }
            // A SolutionCloner does not clone problem fact lists (such as computerList)
            // Shallow clone the computerList so only workingSolution is affected, not bestSolution or guiSolution
            ArrayList<CloudComputer> computerList = new ArrayList<>(cloudBalance.getComputerList());
            cloudBalance.setComputerList(computerList);
            // Remove the problem fact itself
            problemChangeDirector.removeProblemFact(workingComputer, computerList::remove);
        });
    }
  1. Any change in a ProblemChange must be done on the @PlanningSolution instance of scoreDirector.getWorkingSolution().

  2. The workingSolution is a planning clone of the BestSolutionChangedEvent's bestSolution.

    • The workingSolution in the Solver is never the same solution instance as in the rest of your application: it is a planning clone.

    • A planning clone also clones the planning entities and planning entity collections.

      Thus, any change on the planning entities must happen on the workingSolution instance passed to the ProblemChange.doChange(Solution_ workingSolution, ProblemChangeDirector problemChangeDirector) method.

  3. Use the method ProblemChangeDirector.lookUpWorkingObject() to translate and retrieve the working solution’s instance of an object. This requires annotating a property of that class as the @PlanningId.

  4. A planning clone does not clone the problem facts, nor the problem fact collections. Therefore the workingSolution and the bestSolution share the same problem fact instances and the same problem fact list instances.

    Any problem fact or problem fact list changed by a ProblemChange must be problem cloned first (which can imply rerouting references in other problem facts and planning entities). Otherwise, if the workingSolution and bestSolution are used in different threads (for example a solver thread and a GUI event thread), a race condition can occur.

5.1.2. Cloning solutions to avoid race conditions in real-time planning

Many types of changes can leave a planning entity uninitialized, resulting in a partially initialized solution. This is acceptable, provided the first solver phase can handle it.

All construction heuristics solver phases can handle a partially initialized solution, so it is recommended to configure such a solver phase as the first phase.

realTimePlanningConcurrencySequenceDiagram

The process occurs as follows:

  1. The Solver stops.

  2. Runs the ProblemChange.

  3. restarts.

    This is a warm start because its initial solution is the adjusted best solution of the previous run.

  4. Each solver phase runs again.

    This implies the construction heuristic runs again, but because little or no planning variables are uninitialized (unless you have a nullable planning variable), it finishes much quicker than in a cold start.

  5. Each configured Termination resets (both in solver and phase configuration), but a previous call to terminateEarly() is not undone.

    Termination is not usually configured (except in daemon mode); instead, Solver.terminateEarly() is called when the results are needed. Alternatively, configure a Termination and use the daemon mode in combination with BestSolutionChangedEvent as described in the following section.

5.2. Daemon: solve() does not return

In real-time planning, it is often useful to have a solver thread wait when it runs out of work, and immediately resume solving a problem once new problem fact changes are added. Putting the Solver in daemon mode has the following effects:

  • If the Solver's Termination terminates, it does not return from solve(), but blocks its thread instead (which frees up CPU power).

    • Except for terminateEarly(), which does make it return from solve(), freeing up system resources and allowing an application to shutdown gracefully.

    • If a Solver starts with an empty planning entity collection, it waits in the blocked state immediately.

  • If a ProblemChange is added, it goes into the running state, applies the ProblemChange and runs the Solver again.

To use the Solver in daemon mode:

  1. Enable daemon mode on the Solver:

    <solver xmlns="https://timefold.ai/xsd/solver" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="https://timefold.ai/xsd/solver https://timefold.ai/xsd/solver/solver.xsd">
      <daemon>true</daemon>
      ...
    </solver>

    Do not forget to call Solver.terminateEarly() when your application needs to shutdown to avoid killing the solver thread unnaturally.

  2. Subscribe to the BestSolutionChangedEvent to process new best solutions found by the solver thread.

    A BestSolutionChangedEvent does not guarantee that every ProblemChange has been processed already, nor that the solution is initialized and feasible.

  3. To ignore BestSolutionChangedEvents with such invalid solutions, do the following:

        public void bestSolutionChanged(BestSolutionChangedEvent<CloudBalance> event) {
            if (event.isEveryProblemChangeProcessed()
                    // Ignore infeasible (including uninitialized) solutions
                    && event.getNewBestSolution().getScore().isFeasible()) {
                ...
            }
        }
  4. Use Score.isSolutionInitialized() instead of Score.isFeasible() to only ignore uninitialized solutions, but do accept infeasible solutions too.

6. Multi-stage planning

In multi-stage planning, complex planning problems are broken down in multiple stages. A typical example is train scheduling, where one department decides where and when a train will arrive or depart and another department assigns the operators to the actual train cars or locomotives.

Each stage has its own solver configuration (and therefore its own SolverFactory):

multiStagePlanning

Planning problems with different publication deadlines must use multi-stage planning. But problems with the same publication deadline, solved by different organizational groups are also initially better off with multi-stage planning, because of Conway’s law and the high risk associated with unifying such groups.

Similarly to Partitioned Search, multi-stage planning leads to suboptimal results. Nevertheless, it might be beneficial in order to simplify the maintenance, ownership, and help to start a project.

Do not confuse multi-stage planning with multi-phase solving.