Constraints and Score: Overview

1. Score terminology

1.1. What is a score?

Every @PlanningSolution class has a score. The score is an objective way to compare two solutions. The solution with the higher score is better. The Solver aims to find the solution with the highest Score of all possible solutions. The best solution is the solution with the highest Score that the Solver has encountered during solving, which might be the optimal solution.

Timefold Solver cannot automatically know which solution is best for your business, so you need to tell it how to calculate the score of a given @PlanningSolution instance according to your business needs. If you forget or are unable to implement an important business constraint, the solution is probably useless:

[Figure: optimalWithIncompleteConstraints]

1.2. Formalize the business constraints

To implement a verbal business constraint, it needs to be formalized as a score constraint. Luckily, defining constraints in Timefold Solver is very flexible through the following score techniques:

  • Score signum (positive or negative): maximize or minimize a constraint type

  • Score weight: put a cost/profit on a constraint type

  • Score level (hard, soft, …​): prioritize a group of constraint types

  • Pareto scoring (rarely used)

Take the time to acquaint yourself with the first three techniques. Once you understand them, formalizing most business constraints becomes straightforward.

Do not presume that your business knows all its score constraints in advance. Expect score constraints to be added, changed or removed after the first releases.

1.3. Score constraint signum (positive or negative)

All score techniques are based on constraints. A constraint can be a simple pattern (such as Maximize the apple harvest in the solution) or a more complex pattern. A positive constraint is a constraint you want to maximize. A negative constraint is a constraint you want to minimize.

[Figure: positiveAndNegativeConstraints]

The image above illustrates that the optimal solution always has the highest score, regardless of whether the constraints are positive or negative.

Most planning problems have only negative constraints and therefore have a negative score. In that case, the score is the sum of the weight of the negative constraints being broken, with a perfect score of 0. For example, in vehicle routing, the score is the negative of the total distance driven by all vehicles.

Negative and positive constraints can be combined, even in the same score level.

When a constraint activates (because the negative constraint is broken or the positive constraint is fulfilled) on a certain planning entity set, it is called a constraint match.

1.4. Score constraint weight

Not all score constraints are equally important. If breaking one constraint is as bad as breaking another constraint x times, then those two constraints have a different weight (but they are in the same score level). For example, in vehicle routing, you can make one unhappy driver constraint match count as much as two fuel tank usage constraint matches:

[Figure: scoreWeighting]

Score weighting is easy in use cases where you can put a price tag on everything. In that case, the positive constraints maximize revenue and the negative constraints minimize expenses, so together they maximize profit. Alternatively, score weighting is also often used to create social fairness. For example, an employee who requests a free day pays a higher weight on New Year's Eve than on a normal day.

The weight of a constraint match can depend on the planning entities involved. For example, in vehicle routing, the weight of using an 18-wheeler truck to make a delivery will be higher than the weight of using a delivery van, because the van is cheaper to run.

Putting a good weight on a constraint is often a difficult analytical decision, because it is about making choices and trade-offs against other constraints. Different stakeholders have different priorities. Don’t waste time with constraint weight discussions at the start of an implementation. Instead, add a constraint configuration and allow users to change the weights through a UI. An inaccurate weight is less damaging than mediocre algorithms:

[Figure: scoreTradeoffInPerspective]

Most use cases use a Score with int weights, such as HardSoftScore.
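The weighting described in this section can be sketched in plain Java. This is a minimal illustration and not Timefold Solver API: the class name, the constants and the match counts are hypothetical, and the 2:1 ratio is taken from the unhappy driver example above.

```java
// Hypothetical sketch: combines the constraint matches of one score level into a single int.
final class SoftWeightSketch {

    // Assumed weights: one unhappy driver counts as much as two fuel tank usages.
    static final int UNHAPPY_DRIVER_WEIGHT = 2;
    static final int FUEL_TANK_USAGE_WEIGHT = 1;

    // Both constraints are negative, so every match lowers the score.
    static int softScore(int unhappyDriverMatches, int fuelTankUsageMatches) {
        return -(unhappyDriverMatches * UNHAPPY_DRIVER_WEIGHT
                + fuelTankUsageMatches * FUEL_TANK_USAGE_WEIGHT);
    }

    public static void main(String[] args) {
        System.out.println(softScore(1, 0)); // one unhappy driver: prints -2
        System.out.println(softScore(0, 2)); // two fuel tank usages: prints -2
        System.out.println(softScore(0, 0)); // nothing broken: prints 0
    }
}
```

Because both constraints end up in the same number, changing a weight only means changing one constant, which is why the weights are good candidates for user-editable configuration.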

1.5. Score constraint level (hard, soft, …​)

Sometimes a score constraint outranks another score constraint, no matter how many times the latter is broken. In that case, those score constraints are in different levels. For example, an employee cannot do two shifts at the same time (due to the constraints of physical reality), so this outranks all employee happiness constraints.

Most use cases have only two score levels, hard and soft. The levels of two scores are compared lexicographically. The first score level gets compared first. If those differ, the remaining score levels are ignored. For example, a score that breaks 0 hard constraints and 1000000 soft constraints is better than a score that breaks 1 hard constraint and 0 soft constraints.

[Figure: scoreLevels]

If there are two (or more) score levels, for example HardSoftScore, then a score is feasible if no hard constraints are broken.
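The lexicographic comparison and the feasibility rule can be illustrated with a small, self-contained sketch. The TwoLevelScore record below is hypothetical and only mimics the behavior described here. It is not the HardSoftScore class that ships with Timefold Solver.

```java
// Hypothetical two-level score, used only to illustrate lexicographic comparison.
record TwoLevelScore(int hard, int soft) implements Comparable<TwoLevelScore> {

    // Compare the hard level first. The soft level only breaks ties.
    @Override
    public int compareTo(TwoLevelScore other) {
        int hardComparison = Integer.compare(hard, other.hard);
        return hardComparison != 0 ? hardComparison : Integer.compare(soft, other.soft);
    }

    // A score is feasible if no hard constraint is broken.
    boolean isFeasible() {
        return hard >= 0;
    }
}
```

With this sketch, new TwoLevelScore(0, -1000000) compares as greater than new TwoLevelScore(-1, 0), which matches the example of one broken hard constraint outranking a million broken soft constraints.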

By default, Timefold Solver will always assign all planning variables a planning value. If there is no feasible solution, this means the best solution will be infeasible. To instead leave some of the planning entities unassigned, apply overconstrained planning.

For each constraint, you need to pick a score level, a score weight and a score signum. For example, -1soft has a score level of soft, a weight of 1 and a negative signum. Do not use a big constraint weight when your business actually wants different score levels. That hack, known as score folding, is broken:

[Figure: scoreFoldingIsBroken]
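A minimal sketch of why score folding fails, assuming a hypothetical folded weight of 1000 for every hard constraint match. The class and the numbers are made up for illustration.

```java
// Hypothetical sketch: emulates a hard level with a big weight inside a single int.
final class ScoreFoldingSketch {

    // Assumed "big" weight that is supposed to make hard matches dominate soft matches.
    static final int FOLDED_HARD_WEIGHT = 1000;

    static int foldedScore(int hardMatches, int softMatches) {
        return -(hardMatches * FOLDED_HARD_WEIGHT + softMatches);
    }

    public static void main(String[] args) {
        int infeasible = foldedScore(1, 0);  // -1000: breaks a hard constraint
        int feasible = foldedScore(0, 1001); // -1001: breaks only soft constraints
        // The folded score wrongly prefers the solution that breaks a hard constraint.
        System.out.println(infeasible > feasible); // prints true
    }
}
```

However large the folded weight is, enough soft matches eventually outweigh it. Separate score levels do not have this problem.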

Your business might tell you that your hard constraints all have the same weight, because they cannot be broken (so the weight does not matter). This is not true because if no feasible solution exists for a specific dataset, the least infeasible solution allows the business to estimate how many business resources they are lacking.

Furthermore, it will likely create a score trap. For example, in vehicle routing if a vehicle exceeds its capacity by 15 tons, it must be penalized three times as much as if it had only exceeded its capacity by five tons. (Possibly even exponentially weighted.)
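The proportional penalty from the capacity example can be sketched as follows. The class and method names are hypothetical. A linear penalty is assumed here, although the text notes that an exponential one is also possible.

```java
// Hypothetical sketch: hard penalty that grows with the amount of excess load.
final class CapacityPenaltySketch {

    // Returns 0 when the load fits, otherwise the negative excess in tons.
    static int capacityPenalty(int loadInTons, int capacityInTons) {
        int excess = Math.max(0, loadInTons - capacityInTons);
        return -excess;
    }

    public static void main(String[] args) {
        System.out.println(capacityPenalty(35, 20)); // 15 tons over: prints -15
        System.out.println(capacityPenalty(25, 20)); // 5 tons over: prints -5
        System.out.println(capacityPenalty(10, 20)); // fits: prints 0
    }
}
```

With a flat penalty per overloaded vehicle, a move that reduces the excess from 15 tons to 5 tons would not change the score at all. The proportional penalty rewards every step toward feasibility.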

Three or more score levels are also supported. For example: a company might decide that profit outranks employee satisfaction (or vice versa), while both are outranked by the constraints of physical reality.

To model fairness or load balancing, there is no need to use lots of score levels, even though Timefold Solver can handle many score levels.

Most use cases use a Score with two or three levels, such as HardSoftScore and HardMediumSoftScore.

1.6. Pareto scoring (AKA multi-objective optimization scoring)

Far less common is the use case of pareto optimization, which is also known as multi-objective optimization. In pareto scoring, score constraints are in the same score level, yet they are not weighted against each other. When two scores are compared, each of the score constraints is compared individually and the score with the most dominating score constraints wins. Pareto scoring can even be combined with score levels and score constraint weighting.

Consider this example with positive constraints, where we want to get the most apples and oranges. Since it is impossible to compare apples and oranges, we cannot weigh them against each other. Yet, even though we cannot compare them, we can state that two apples are better than one apple. Similarly, we can state that two apples and one orange are better than just one orange. So despite our inability to compare some Scores conclusively (at which point we declare them equal), we can find a set of optimal scores. Those are called pareto optimal.

[Figure: paretoOptimizationScoring]

Scores are considered equal far more often. It is left up to a human to choose the better out of a set of best solutions (with equal scores) found by Timefold Solver. In the example above, the user must choose between solution A (three apples and one orange) and solution B (one apple and six oranges). It is guaranteed that Timefold Solver has not found another solution which has more apples or more oranges or even a better combination of both (such as two apples and three oranges).

Pareto scoring is currently not supported in Timefold Solver.

A pareto Score's compareTo method is not transitive because it does a pareto comparison. For example: having two apples is greater than one apple. One apple is equal to one orange. Yet, two apples are not greater than one orange (but actually equal). Pareto comparison violates the contract of the interface java.lang.Comparable's compareTo method, but Timefold Solver’s systems are pareto comparison safe, unless explicitly stated otherwise in this documentation.
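The non-transitive comparison can be reproduced with a small sketch in plain Java. Since pareto scoring is not supported in Timefold Solver, everything below is hypothetical. The comparison is a simple dominance check: one score wins only if it is at least as good on every objective and the other is not.

```java
// Hypothetical sketch of a pareto (dominance) comparison with two unweighted objectives.
final class ParetoComparisonSketch {

    record FruitScore(int apples, int oranges) {}

    // Returns 1 if a dominates b, -1 if b dominates a, and 0 if neither dominates.
    static int paretoCompare(FruitScore a, FruitScore b) {
        boolean aNoWorse = a.apples() >= b.apples() && a.oranges() >= b.oranges();
        boolean bNoWorse = b.apples() >= a.apples() && b.oranges() >= a.oranges();
        if (aNoWorse && !bNoWorse) {
            return 1;
        }
        if (bNoWorse && !aNoWorse) {
            return -1;
        }
        return 0;
    }

    public static void main(String[] args) {
        FruitScore twoApples = new FruitScore(2, 0);
        FruitScore oneApple = new FruitScore(1, 0);
        FruitScore oneOrange = new FruitScore(0, 1);
        System.out.println(paretoCompare(twoApples, oneApple));  // prints 1
        System.out.println(paretoCompare(oneApple, oneOrange));  // prints 0
        System.out.println(paretoCompare(twoApples, oneOrange)); // prints 0
    }
}
```

Two apples beat one apple, one apple ties with one orange, yet two apples also tie with one orange. This is the break in transitivity that java.lang.Comparable does not allow.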

1.7. Combining score techniques

All the score techniques mentioned above can be combined seamlessly:

[Figure: scoreComposition]

1.8. Score interface

A score is represented by the Score interface, which naturally extends Comparable:

public interface Score<...> extends Comparable<...> {
    ...
}

The Score implementation to use depends on your use case. Your score might not efficiently fit in a single long value. Timefold Solver has several built-in Score implementations. Most use cases tend to use HardSoftScore.

[Figure: scoreClassDiagram]

All Score implementations also have an initScore (which is an int). It is mostly intended for internal use in Timefold Solver: it is the negative number of uninitialized planning variables. From a user’s perspective this is 0, unless a Construction Heuristic is terminated before it could initialize all planning variables (in which case Score.isSolutionInitialized() returns false).

The Score implementation (for example HardSoftScore) must be the same throughout a Solver runtime. The Score implementation is configured in the solution domain class:

@PlanningSolution
public class VehicleRoutePlan {
    ...

    @PlanningScore
    private HardSoftScore score;

}

1.9. Avoid floating point numbers in score calculation

Avoid the use of float or double in score calculation. Use BigDecimal or scaled long instead.

Floating point numbers (float and double) cannot represent a decimal number correctly. For example: a double cannot hold the value 0.05 correctly. Instead, it holds the nearest representable value. Arithmetic (including addition and subtraction) with floating point numbers, especially for planning problems, leads to incorrect decisions:

[Figure: scoreWeightType]

Additionally, floating point number addition is not associative:

System.out.println( ((0.01 + 0.02) + 0.03) == (0.01 + (0.02 + 0.03)) ); // prints false

This leads to score corruption.

Decimal numbers (BigDecimal) have none of these problems.

BigDecimal arithmetic is considerably slower than int, long or double arithmetic. In experiments we have seen the score calculation take five times longer.

Therefore, in many cases, it can be worthwhile to multiply all numbers for a single score weight by a power of ten, so the score weight fits in a scaled int or long. For example, if we multiply all weights by 1000, a fuelCost of 0.07 becomes a fuelCostMillis of 70 and no longer uses a decimal score weight.
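The scaling approach can be sketched as follows. The class name, the helper method and the choice of three decimal digits are assumptions made for this illustration. Only BigDecimal and its methods come from the Java standard library.

```java
import java.math.BigDecimal;

// Hypothetical sketch: stores decimal weights as scaled long values.
final class ScaledWeightSketch {

    // Assumed scale: three decimal digits, so every weight is stored in thousandths.
    static final int SCALE_DIGITS = 3;

    // Converts a decimal weight such as "0.07" into a scaled long such as 70.
    // Throws ArithmeticException if the weight needs more than SCALE_DIGITS decimals.
    static long toScaledLong(String decimalWeight) {
        return new BigDecimal(decimalWeight).movePointRight(SCALE_DIGITS).longValueExact();
    }

    public static void main(String[] args) {
        // With double, the grouping of the additions changes the result.
        System.out.println(((0.01 + 0.02) + 0.03) == (0.01 + (0.02 + 0.03))); // prints false

        // With scaled long values, the grouping does not matter.
        long a = toScaledLong("0.01");
        long b = toScaledLong("0.02");
        long c = toScaledLong("0.03");
        System.out.println(((a + b) + c) == (a + (b + c))); // prints true

        System.out.println(toScaledLong("0.07")); // prints 70
    }
}
```

The conversion uses BigDecimal only once, when the weights are loaded. The score calculation itself then runs on long arithmetic.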

2. Choose a score type

Depending on the number of score levels and type of score weights you need, choose a Score type. Most use cases use a HardSoftScore.

To properly write a Score to a database (with JPA/Hibernate) or to XML/JSON (with JAXB/Jackson), see the integration chapter.

2.1. SimpleScore

A SimpleScore has a single int value, for example -123. It has a single score level.

    @PlanningScore
    private SimpleScore score;

Variants of this Score type:

  • SimpleLongScore uses a long value instead of an int value.

  • SimpleBigDecimalScore uses a BigDecimal value instead of an int value.

2.2. HardSoftScore (Recommended)

A HardSoftScore has a hard int value and a soft int value, for example -123hard/-456soft. It has two score levels (hard and soft).

    @PlanningScore
    private HardSoftScore score;

Variants of this Score type:

  • HardSoftLongScore uses long values instead of int values.

  • HardSoftBigDecimalScore uses BigDecimal values instead of int values.

2.3. HardMediumSoftScore

A HardMediumSoftScore has a hard int value, a medium int value and a soft int value, for example -123hard/-456medium/-789soft. It has three score levels (hard, medium and soft). The hard level determines if the solution is feasible, and the medium level and soft level score values determine how well the solution meets business goals. Higher medium values take precedence over soft values irrespective of the soft value.

    @PlanningScore
    private HardMediumSoftScore score;

Variants of this Score type:

  • HardMediumSoftLongScore uses long values instead of int values.

  • HardMediumSoftBigDecimalScore uses BigDecimal values instead of int values.

2.4. BendableScore

A BendableScore has a configurable number of score levels. It has an array of hard int values and an array of soft int values. For example, with two hard levels and three soft levels, the score can be [-123/-456]hard/[-789/-012/-345]soft. In that case, it has five score levels. A solution is feasible if all hard levels are at least zero.

A BendableScore with one hard level and one soft level is equivalent to a HardSoftScore, while a BendableScore with one hard level and two soft levels is equivalent to a HardMediumSoftScore.

    @PlanningScore(bendableHardLevelsSize = 2, bendableSoftLevelsSize = 3)
    private BendableScore score;

The number of hard and soft score levels needs to be set at compilation time. It cannot be changed during solving.

Do not use a BendableScore with seven levels just because you have seven constraints. It is extremely rare to use a different score level for each constraint, because that means one constraint match on soft 0 outweighs even a million constraint matches of soft 1.

Usually, multiple constraints share the same level and are weighted against each other. Use score explanations to get the weight of individual constraints in the same level.
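To see why one level per constraint is rarely what you want, consider this sketch of a multi-level comparison. It does not use the real BendableScore class. The score is modeled as a plain int array ordered from the most important level to the least important one, and the helper names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: a multi-level score modeled as an int array, hard levels first.
final class BendableComparisonSketch {

    // Lexicographic comparison: the first level that differs decides the result.
    static int compareLevels(int[] a, int[] b) {
        return Arrays.compare(a, b);
    }

    // Feasible if none of the first hardLevelsSize levels is negative.
    static boolean isFeasible(int[] levels, int hardLevelsSize) {
        for (int i = 0; i < hardLevelsSize; i++) {
            if (levels[i] < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two hard levels followed by three soft levels.
        int[] oneMatchOnSoft0 = {0, 0, -1, 0, 0};
        int[] millionMatchesOnSoft1 = {0, 0, 0, -1000000, 0};
        // A single match on soft 0 is worse than a million matches on soft 1.
        System.out.println(compareLevels(millionMatchesOnSoft1, oneMatchOnSoft0) > 0); // prints true
    }
}
```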

Variants of this Score type:

  • BendableLongScore uses long values instead of int values.

  • BendableBigDecimalScore uses BigDecimal values instead of int values.

3. Calculate the Score

3.1. Score calculation types

There are several ways to calculate the Score of a solution in Java or another JVM language.

Every score calculation type can work with any Score definition (such as HardSoftScore or HardMediumSoftScore). All score calculation types are Object Oriented and can reuse existing Java code.

The score calculation must be read-only. It must not change the planning entities or the problem facts in any way. For example, it must not call a setter method on a planning entity in the score calculation.

Timefold Solver does not recalculate the score of a solution if it can predict it (unless an environmentMode assertion is enabled). For example, after a winning step is done, there is no need to calculate the score because that move was done and undone earlier. As a result, there is no guarantee that changes applied during score calculation actually happen.

To update planning entities when the planning variable changes, use shadow variables instead.

3.2. InitializingScoreTrend

The InitializingScoreTrend specifies how the Score will change as more and more variables are initialized (while the already initialized variables do not change). Some optimization algorithms (such as Construction Heuristics and Exhaustive Search) run faster if they have such information.

For the score (or each score level separately), specify a trend:

  • ANY (default): Initializing an extra variable can change the score positively or negatively. Gives no performance gain.

  • ONLY_UP (rare): Initializing an extra variable can only change the score positively. Implies that:

    • There are only positive constraints

    • And initializing the next variable cannot unmatch a positive constraint that was matched by a previously initialized variable.

  • ONLY_DOWN: Initializing an extra variable can only change the score negatively. Implies that:

    • There are only negative constraints

    • And initializing the next variable cannot unmatch a negative constraint that was matched by a previously initialized variable.

Most use cases only have negative constraints. Many of those have an InitializingScoreTrend that only goes down:

  <scoreDirectorFactory>
    <constraintProviderClass>...MyConstraintProvider</constraintProviderClass>
    <initializingScoreTrend>ONLY_DOWN</initializingScoreTrend>
  </scoreDirectorFactory>

Alternatively, you can also specify the trend for each score level separately:

  <scoreDirectorFactory>
    <constraintProviderClass>...MyConstraintProvider</constraintProviderClass>
    <initializingScoreTrend>ONLY_DOWN/ONLY_DOWN</initializingScoreTrend>
  </scoreDirectorFactory>
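One way an algorithm can benefit from an ONLY_DOWN trend is pruning: if the score of a partially initialized solution is already no better than the best complete solution found so far, initializing more variables cannot help. The sketch below illustrates that idea on a toy problem. It is an assumption about how such information can be used and not Timefold Solver's implementation. The cost table and all names are hypothetical.

```java
// Hypothetical sketch: depth-first search that optionally prunes using an ONLY_DOWN trend.
final class OnlyDownTrendSketch {

    // COSTS[task][machine]: assigning a task to a machine lowers the score by this cost.
    static final int[][] COSTS = {{1, 5}, {4, 2}, {3, 3}};

    final boolean prune;
    int bestScore = Integer.MIN_VALUE;
    int visitedNodes = 0;

    OnlyDownTrendSketch(boolean prune) {
        this.prune = prune;
    }

    // Assigns tasks one by one. partialScore is the score of the tasks assigned so far.
    void search(int task, int partialScore) {
        visitedNodes++;
        // Only negative constraints: the partial score is an upper bound for all extensions.
        if (prune && partialScore <= bestScore) {
            return;
        }
        if (task == COSTS.length) {
            bestScore = Math.max(bestScore, partialScore);
            return;
        }
        for (int cost : COSTS[task]) {
            search(task + 1, partialScore - cost);
        }
    }

    public static void main(String[] args) {
        OnlyDownTrendSketch full = new OnlyDownTrendSketch(false);
        full.search(0, 0);
        OnlyDownTrendSketch pruned = new OnlyDownTrendSketch(true);
        pruned.search(0, 0);
        System.out.println(full.bestScore + " in " + full.visitedNodes + " nodes");
        System.out.println(pruned.bestScore + " in " + pruned.visitedNodes + " nodes");
    }
}
```

Both searches find the same best score, but the pruned search visits fewer nodes. With a trend of ANY, the bound would not be valid and the pruning would be unsafe.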

3.3. Invalid score detection

When you put the environmentMode in FULL_ASSERT (or FAST_ASSERT), it will detect score corruption in the incremental score calculation. However, that will not verify that your score calculator actually implements your score constraints as your business desires. For example, one constraint might consistently match the wrong pattern. To verify the constraints against an independent implementation, configure an assertionScoreDirectorFactory:

  <environmentMode>FAST_ASSERT</environmentMode>
  ...
  <scoreDirectorFactory>
    <constraintProviderClass>...ConstraintProvider</constraintProviderClass>
    <assertionScoreDirectorFactory>
      <easyScoreCalculatorClass>...EasyScoreCalculator</easyScoreCalculatorClass>
    </assertionScoreDirectorFactory>
  </scoreDirectorFactory>

This way, the ConstraintProvider implementation is validated by the EasyScoreCalculator.

This works well to isolate score corruption, but to verify that the constraints implement the real business needs, a unit test with a ConstraintVerifier is usually better.
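The idea behind validating one score calculation against an independent one can be sketched in plain Java. The class below is hypothetical and does not use Timefold Solver types. It keeps an incrementally updated score and compares it with a from-scratch recalculation, which is conceptually what an assertion mode does.

```java
import java.util.Arrays;

// Hypothetical sketch: an incremental score checked against a from-scratch calculation.
final class ScoreAssertionSketch {

    private final int[] routeDistances;
    private int incrementalScore;

    ScoreAssertionSketch(int[] initialDistances) {
        routeDistances = initialDistances.clone();
        incrementalScore = calculateFromScratch();
    }

    // Incremental update: apply only the delta caused by the change.
    void changeDistance(int route, int newDistance) {
        incrementalScore += routeDistances[route] - newDistance;
        routeDistances[route] = newDistance;
    }

    // Independent, simple implementation: the negative of the total distance.
    int calculateFromScratch() {
        return -Arrays.stream(routeDistances).sum();
    }

    int incrementalScore() {
        return incrementalScore;
    }

    // Fails fast if the two implementations disagree.
    void assertScoreNotCorrupted() {
        int expected = calculateFromScratch();
        if (incrementalScore != expected) {
            throw new IllegalStateException(
                    "Score corruption: incremental " + incrementalScore + ", expected " + expected);
        }
    }
}
```

If both implementations share the same misunderstanding of a business rule, they still agree with each other. That is why a check like this isolates corruption but does not replace constraint tests.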