Note

Message Prioritization and Sequencers

October 31, 2024 - State Machine Replication, Sequencer - growing
2 mins

A priority message can skip ahead of messages already in flight through a sequencer only by breaking the global total order of messages. Once that order is broken, the state machines running on the sequencer are no longer fault-tolerant or correct. However, we can solve specific business problems without prioritizing messages or breaking the global total order by using a side channel.

Let’s examine a simplified trading system with three key components:

  1. a sequencer,
  2. a smart order router that directs execution requests, and
  3. an Algo trading engine that breaks trades into smaller requests and routes them through the smart order router.

Both the router and Algo engine operate as replicated state machines.

The problem with sequencers

Suppose the Algo engine malfunctions and floods the router with requests. Those requests cascade to the execution venues. To stop the flood, we appear to need a mechanism that lets emergency halt messages bypass the sequencer’s ordering rules.

However, bypassing the sequencer’s total ordering guarantee also violates the constraints of state machine replication, which requires every replica to process the same messages in the same order. When these constraints are broken, replicas can diverge, and the system loses its consistency guarantees, rendering it incorrect and no longer fault-tolerant.
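To make the divergence concrete, here is a minimal sketch in Python. The `RouterReplica` class, the `replay` helper, and the `HALT` and `ORDER` message kinds are hypothetical names invented for illustration; they are not taken from any real trading system. Two replicas receive the same set of messages, but one of them lets the halt message jump the queue.

```python
from dataclasses import dataclass, field


@dataclass
class RouterReplica:
    """Hypothetical smart order router modeled as a deterministic state machine."""
    halted: bool = False
    forwarded: list = field(default_factory=list)

    def apply(self, msg: tuple) -> None:
        kind, payload = msg
        if kind == "HALT":
            self.halted = True
        elif kind == "ORDER" and not self.halted:
            # Orders are only forwarded while the router is not halted.
            self.forwarded.append(payload)


def replay(log: list) -> RouterReplica:
    """Build a fresh replica and apply every message in the given order."""
    replica = RouterReplica()
    for msg in log:
        replica.apply(msg)
    return replica


# The order the sequencer assigned: HALT arrives after two orders.
sequenced_log = [("ORDER", "o1"), ("ORDER", "o2"), ("HALT", None), ("ORDER", "o3")]
# The same messages, but HALT was "prioritized" ahead of in-flight orders.
prioritized_log = [("HALT", None), ("ORDER", "o1"), ("ORDER", "o2"), ("ORDER", "o3")]

replica_a = replay(sequenced_log)
replica_b = replay(prioritized_log)

print(replica_a.forwarded)  # ['o1', 'o2']
print(replica_b.forwarded)  # []
print(replica_a == replica_b)  # False: the replicas have diverged
```

Both replicas saw exactly the same messages, yet they disagree about which orders were forwarded. After a failover, there would be no single answer to what the router actually did.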

So how do we solve this problem?

The solution to the problem with sequencers

To solve this problem, we go around the sequencer instead of reordering its stream: a side channel carries priority messages directly to the execution gateways. These gateways are not subject to the sequencer’s total ordering guarantee and are not built as state machines, so they can prioritize specific messages.

In addition, we need to calm the Algo trading engine down so that it stops flooding the smart order router. We do this by sequencing an administrative message that tells the Algo trading engine to stop sending requests to the smart order router. Because the priority message has disconnected the execution gateways, they decline any execution requests that still arrive. Those rejections are sent back through the sequencer, which lets us retain the total ordering guarantee and the state machine replication constraints.
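The flow can be sketched roughly as follows. This is a toy model under stated assumptions, not a real implementation: `Sequencer`, `ExecutionGateway`, `side_channel_halt`, and the `ACCEPTED`, `REJECTED`, and `ADMIN_STOP` message kinds are all hypothetical names chosen for illustration.

```python
class Sequencer:
    """Toy sequencer: appends every message to one totally ordered log."""

    def __init__(self):
        self.log = []

    def publish(self, msg: tuple) -> int:
        self.log.append(msg)
        return len(self.log) - 1  # sequence number assigned to the message


class ExecutionGateway:
    """Hypothetical gateway. It is not a replicated state machine,
    so it is free to react to out-of-band commands immediately."""

    def __init__(self, sequencer: Sequencer):
        self.sequencer = sequencer
        self.connected = True
        self.sent_to_venue = []

    def side_channel_halt(self) -> None:
        # Arrives outside the sequenced stream, so nothing queues ahead of it.
        self.connected = False

    def execute(self, request_id: str) -> None:
        if self.connected:
            self.sent_to_venue.append(request_id)
            self.sequencer.publish(("ACCEPTED", request_id))
        else:
            # The rejection goes back through the sequencer, so every
            # replica observes it at the same point in the total order.
            self.sequencer.publish(("REJECTED", request_id))


seq = Sequencer()
gateway = ExecutionGateway(seq)

gateway.execute("r1")                        # normal operation
gateway.side_channel_halt()                  # priority message via side channel
seq.publish(("ADMIN_STOP", "algo-engine"))   # sequenced administrative message
gateway.execute("r2")                        # in-flight requests still arrive
gateway.execute("r3")

print(gateway.sent_to_venue)  # ['r1']
print(seq.log)
# [('ACCEPTED', 'r1'), ('ADMIN_STOP', 'algo-engine'),
#  ('REJECTED', 'r2'), ('REJECTED', 'r3')]
```

Only the gateway reacts out of order. Everything the replicated state machines observe, including the rejections, still arrives through the single sequenced log.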


Changelog

  • November 9, 2024: Downgraded from Essay to Note
  • October 31, 2024: Initial version

The colors used in the diagrams in this post are sourced from a bar in Agra, India.