Explainability (Glassbox)

This page describes the metrics developed to explain what drove the optimized values to their current state. The agents compute these metrics but do not use them: they are meant only for the “glass box”.

Types of Metrics

There are two types of metrics:

  • Impact indicates the pressure that a given criterion applies on a given Value Finder.

    • It has two metrics: instant and total.

  • Induced movement indicates the variation of the Value Finder induced by this pressure.

    • It has two metrics: instant and total.

These two types of metrics are complementary, much like mean and standard deviation in statistics: no relevant conclusion can be drawn by looking at only one of them. They are computed for each Value Finder / neighbor pair (a neighbor being a criterion the VF is able to influence), and for the “probe” of each Value Finder (i.e. the action the VF takes on its own, without being pushed by any neighbor).

This page focuses only on these specific metrics, but keep in mind that two other metrics are also useful to the glass box:

  • Criterion dissatisfaction

  • Value Finder’s influence on a criterion

Unlike the metrics above, these two are actually used by agents to make decisions (see Multi-Agent Optimization Engine to read more about agents).

Impact

The impact of a criterion on a Value Finder is the fraction of the previous N decisions (by default N=10) in which this criterion was chosen to be helped. It can be read as a metric of the “share of responsibility” the criterion bears for the current value of the Value Finder. Hence, for a given Value Finder at any given time, the sum of the impacts of its neighbors equals one.
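As an illustration only, the definition above can be sketched in a few lines of Python. The function name `impacts`, the decision log, and the criterion labels are hypothetical and are not taken from the engine. The sketch simply assumes that each decision records which criterion was helped.

```python
from collections import Counter, deque

def impacts(helped_history, window=10):
    """Return {criterion: share of the last `window` decisions in which it was helped}."""
    # A bounded deque keeps only the most recent `window` decisions.
    recent = deque(helped_history, maxlen=window)
    counts = Counter(recent)
    return {criterion: n / len(recent) for criterion, n in counts.items()}

# Hypothetical decision log of one Value Finder: the criterion helped at each decision.
history = ["C1"] * 3 + ["C2", "C1", "C2", "C2", "C2", "C3", "C2", "C2", "C2", "C2"]
recent_impacts = impacts(history)
print(recent_impacts)  # {'C2': 0.8, 'C1': 0.1, 'C3': 0.1}
```

Only the last ten decisions count here, so the three early decisions in favor of C1 no longer contribute, and the shares add up to one.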

Impact is calculated at each lifecycle of the agent.

The total impact is also calculated. This is the same ratio, but over all the decisions since the start of the run. This way impact shows recent responsibilities whereas total impact shows long term responsibilities.

Induced Movement

The movement induced by criterion C on Value Finder V is the sum over the last N decisions (by default, N=10) of the changes decided by V when C was chosen to be helped. It can be interpreted as the “result” of the impact of C on V. For instance, C has pressured V 90% of the time, but by how much did it make V move? Induced movement answers this question.

Induced movement is calculated at each lifecycle.

The total induced movement is also calculated. It is the same metric, but calculated from the start of the run. Thus, the sum of total induced movements for a given Value Finder is always equal to the difference between its current value and its initial value.
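The following minimal sketch shows how instant and total induced movement could be tracked side by side. The class `MovementTracker`, its method names, and the sample decisions are hypothetical. Recording the probe's own moves under a "probe" key is also an assumption made for this example.

```python
from collections import defaultdict, deque

class MovementTracker:
    """Tracks instant (windowed) and total induced movement for one Value Finder."""

    def __init__(self, window=10):
        self.recent = deque(maxlen=window)  # last N (helped, change) pairs
        self.total = defaultdict(float)     # accumulated since the start of the run

    def record(self, helped, change):
        self.recent.append((helped, change))
        self.total[helped] += change

    def instant(self):
        movement = defaultdict(float)
        for helped, change in self.recent:
            movement[helped] += change
        return dict(movement)

# Hypothetical run; a small window of 3 keeps the example short.
tracker = MovementTracker(window=3)
initial_value = 10.0
value = initial_value
decisions = [("C1", 2.0), ("C1", 1.0), ("C2", -0.5), ("probe", 0.25), ("C2", -0.5)]
for helped, change in decisions:
    value += change
    tracker.record(helped, change)

print(tracker.instant())    # {'C2': -1.0, 'probe': 0.25}
print(dict(tracker.total))  # {'C1': 3.0, 'C2': -1.0, 'probe': 0.25}
```

In this sketch the total movements add up to 2.25, which is exactly `value - initial_value`, while the instant view has already forgotten the early moves attributed to C1.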

Automatic Detection of Conflictual Neighbors

Impact and induced movement can be used to detect conflictual neighbors, i.e. neighbors that simultaneously push the Value Finder in opposite directions. The next paragraph explains how the detection is done.

For a given Value Finder, among all its dissatisfied neighbors, those meeting these conditions are deemed conflictual:

  • Impact is greater than 0.05.

  • There is another dissatisfied neighbor with an induced movement of an opposite sign.

  • The two conditions above have held for a duration greater than a threshold inversely proportional to the impact.

The condition on impact filters out false positives that would arise from a one-off mistake when choosing the neighbor to help. The condition on duration filters out false positives caused by an expected change of helped neighbor, for instance when the convergence is pivoting around criteria.
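The three conditions can be sketched as follows. This is not the engine's implementation: `NeighborState`, `update_conflicts`, the step-based notion of duration, and the proportionality constant `duration_factor` are all assumptions introduced for illustration. Only the 0.05 impact threshold comes from the text above.

```python
from dataclasses import dataclass

@dataclass
class NeighborState:
    name: str
    dissatisfied: bool
    impact: float
    induced_movement: float
    opposed_steps: int = 0  # consecutive steps during which conditions 1 and 2 held

def opposes(a, b):
    """True when the two induced movements have strictly opposite signs."""
    return a.induced_movement * b.induced_movement < 0

def update_conflicts(neighbors, min_impact=0.05, duration_factor=1.0):
    """Advance each neighbor's streak by one step and return the conflictual names."""
    dissatisfied = [n for n in neighbors if n.dissatisfied]
    conflictual = []
    for n in neighbors:
        candidate = (
            n.dissatisfied
            and n.impact > min_impact
            and any(opposes(n, other) for other in dissatisfied if other is not n)
        )
        n.opposed_steps = n.opposed_steps + 1 if candidate else 0
        # Threshold inversely proportional to impact (duration_factor is assumed).
        if candidate and n.opposed_steps > duration_factor / n.impact:
            conflictual.append(n.name)
    return conflictual

# Hypothetical neighbors of one Value Finder, observed over three steps.
neighbors = [
    NeighborState("C1", True, 0.60, +1.5),
    NeighborState("C2", True, 0.35, -0.8),
    NeighborState("C3", True, 0.05, -0.1),  # impact not strictly above 0.05
    NeighborState("C4", False, 0.00, 0.0),  # satisfied, never considered
]
results = [update_conflicts(neighbors) for _ in range(3)]
print(results)  # [[], ['C1'], ['C1', 'C2']]
```

With these made-up numbers, C1 is flagged first because its higher impact gives it a shorter duration threshold, C2 follows one step later, and C3 is never flagged because its impact does not exceed 0.05.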

The main problem to solve before this detection can confidently be exposed to the end user is how to decide whether a neighbor is satisfied. The easy answer is to look at its dissatisfaction level: zero means satisfied. This works for the agents, which only need it to make adequate decisions, but it would produce too many false positives for a human user. Indeed, target criteria rarely stay at a zero dissatisfaction level, because a target is by definition narrow and the achievable precision depends on the Value Finders' settings, especially their minimal amplitude.

Note that this method does not distinguish between conflicts that can, and will, be solved during convergence and conflicts that result from an overconstrained problem and will leave the agents stuck.

Interpretation

While very informative, these metrics can be counterintuitive and should be interpreted with care. For instance, the total impact and total induced movement depend heavily on when the system is stopped: they convey different information depending on how long before or after finding a solution you look at them.

Indeed, an optimization run can usually be separated into two phases:

  • Pre-convergence – Global dissatisfaction decreases, local dissatisfactions may intersect with more or less oscillations, but the envelope of dissatisfaction curves is decreasing.

  • Post-convergence – Global dissatisfaction is stable, usually local dissatisfactions oscillate periodically.

Impacts and induced movements are very different during each of these phases. The criteria that bring the multi-agent system from its initial state to the solution are often not the same as the ones that make it stay at the solution.

If you look at the metrics after the multi-agent system has spent a long time post-convergence, the total impacts of the criteria that keep it at the solution will be very large, although they may have had nothing to do with bringing the system to this state in the first place. Conversely, if you look at the metrics right when the system converges, the criteria with the highest total impact will not necessarily be the ones currently keeping it in this state. It is important to keep these phases in mind when analyzing total impact and total induced movement.
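A small numerical illustration of this dilution effect, using made-up data: criterion "A" is helped most often before convergence, then "B" alone is helped afterwards. The function `total_impact` and the decision counts are hypothetical.

```python
from collections import Counter

def total_impact(history):
    """Share of all decisions since the start of the run in which each criterion was helped."""
    counts = Counter(history)
    return {criterion: n / len(history) for criterion, n in counts.items()}

# Hypothetical run: "A" drives the convergence, then "B" keeps the system at the solution.
pre_convergence = ["A"] * 80 + ["B"] * 20
post_convergence = ["B"] * 900

at_convergence = total_impact(pre_convergence)
long_after = total_impact(pre_convergence + post_convergence)

print(at_convergence)  # {'A': 0.8, 'B': 0.2}
print(long_after)      # {'A': 0.08, 'B': 0.92}
```

Stopped right at convergence, the total impact credits "A" with 80% of the decisions. After 900 more post-convergence decisions, the same metric shows only 8% for "A", even though nothing about the path to the solution has changed.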

For instance, convergence happens around step 1000 in the case shown in the figure below. It is particularly obvious on the impacts chart (top left). These charts are extracted from a Value Finder in a simple test case. Each curve on a given chart represents a neighbor.


If you only look at the final value of total impact (bottom left), you could think that the yellow neighbor had a limited impact (and this would look even worse with more timesteps), whereas it was the second most impactful neighbor during the pre-convergence phase. You also have to look at the total induced movement (bottom right) to realize how important it was.

In practice, during experiments and explorations with the multi-agent system, two of these four metrics are primarily used to explain a result:

  • Impact (top left) to know which neighbors are currently blocking the Value Finder.

  • Total induced movement (bottom right) to know which neighbor made the Value Finder move and by how much.

Induced movement (top right) is sometimes used to know when a neighbor made the Value Finder move. The chart efficiently shows pivots in the convergence (first, blue is helped, then bright pink, then yellow).

Total impact (bottom left) is only used when the system has been stopped right at convergence, to identify the weight of each criterion on each Value Finder.

Glossary

See this page.
