HangukQuant Research

Market Making - Tooling for Modelling Latency Requirements and Microstructural Behavior - I

a tad on hft tooling, with Python code

HangukQuant
Dec 04, 2025
∙ Paid

One of the biggest differences between professionals and amateurs is in how they approach conceiving trading strategies. A typical hft system can be anywhere between 10k~200k LOC, so what is everyone else doing in an hft firm?

Other than coming up with connectivity gateways, a large bulk of the time is spent on model iterations and challenging assumptions about active strategies. On the other hand, for most ‘retail traders’, a large proportion of time is spent on guesswork. For instance:

  • I should be fast enough since my software t2t is sub-10 micros.

  • This strategy worked on exchange A, so it should work on exchange B.

  • Network latency is the bulk of e2e latency so software optimisation is secondary.

These are all just assumptions based on hunches, and while most of them may hold, the reality of competitive market making is that all of the important assumptions about competition must be satisfied. And they rarely are.

The consequence is that it is often unfathomable to a ‘retail trader’ why said market making operation seems to bleed money, and in a fantastically consistent manner. They tweak position limits, change spreads, quote sizes, cancel thresholds and do everything conceivable without addressing any of the core issues. I have seen live, bleeding trading systems where algorithmic traders try to stay afloat when simple diagnostic tools would have convinced them to shut it down immediately.

Imagine a blind man rubbing his wife’s kneecap saying “are you close?”.

You can sort of think about trading strategies at short durations as the following:

A larger threshold is required for lower cadence strategies due to the reduced number of bets, and the shape is related to the law of active portfolio management. I will speak on where a trading strategy typically lies on the duration axis.
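One way to read the link to the law of active portfolio management is Grinold's approximation IR ≈ IC × √breadth: for a fixed target information ratio, the per-bet skill you need grows as the number of independent bets shrinks. The sketch below is only an illustration of that scaling. The function name, the target IR and the bet counts are hypothetical and not taken from the chart.

```python
import math


def required_ic(target_ir: float, bets_per_year: float) -> float:
    """Per-bet information coefficient needed under IR ~= IC * sqrt(breadth)."""
    return target_ir / math.sqrt(bets_per_year)


# Hypothetical cadences: fewer bets per year demand more edge per bet.
for bets in (100, 10_000, 1_000_000):
    print(f"{bets:>9} bets/yr -> required IC {required_ic(2.0, bets):.4f}")
```

The same intuition carries over to bps thresholds: a strategy that bets rarely needs each bet to clear a higher bar.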

At some duration delta, a market maker may have some internal fair pricing that gives the EV of a trade. That may look like the following:

The latency sensitivity of the strategy is typically a function of the fudge factors and competition in executing the trade. For instance, you can imagine that a triangular arbitrage on a single exchange has very little friction in terms of margin management and exchange-specific contract pricing. These will be extremely sensitive to latency conditions - every trader on the platform basically operates on the same bps thresholds.

Somewhere in the middle lies funding arbitrage. You can imagine that multi-exchange operations like this have fudge factors in terms of modelling (e.g. normalisation for variable funding mechanisms/margin/denominator assets) and execution (quoting, optimal leverage and margin problem, spot borrow). It is a kind-of statistical arbitrage problem where differences in the model imply competitors running similar strategies can have different fair prices - consequently these are less sensitive in terms of execution speed. If we take it a bit further, such as beta-neutral funding portfolios (crypto) or fundamental pairs trading (equities in the same GICS classification), the latency requirements can become a secondary operational concern.
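As a small illustration of the 'normalisation for variable funding mechanisms' fudge factor, here is a minimal sketch that puts funding rates paid on different intervals onto a common annualised basis. It assumes simple (non-compounded) annualisation, and the rates and intervals are hypothetical, not quotes from any venue. Two desks that choose different conventions here already disagree on the fair price.

```python
def annualised_funding(rate_per_interval: float, interval_hours: float) -> float:
    """Simple (non-compounded) annualised funding rate."""
    intervals_per_year = 24.0 / interval_hours * 365.0
    return rate_per_interval * intervals_per_year


# Hypothetical venues with different funding intervals.
venue_a = annualised_funding(0.0001, 8.0)   # 1 bp paid every 8 hours
venue_b = annualised_funding(0.00002, 1.0)  # 0.2 bp paid every hour

# The raw per-interval rates are not comparable; the annualised ones are.
spread = venue_b - venue_a
print(f"venue_a={venue_a:.4f} venue_b={venue_b:.4f} spread={spread:.4f}")
```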

At any given delta, though, a well calibrated model (in shorter term trading) often faces the same execution problem: the projected model’s EV half-life is extremely dependent on which band of the threshold it is currently in. A model that suggests it is in the taker-taker threshold (if well calibrated) should quickly converge to a less profitable band as other traders running similar strategies capitalise on that opportunity. It would be rather foolish to assume that such an outsized PNL basis is left hanging for more than a single second - a more reasonable conclusion is that the model’s estimation errors are correlated to some extraneous variable.

It should now be clear that different execution strategies incur different costs, and these result in different thresholds (again here different participants are subject to different costs). The latency requirements then depend on the correlated strategies of the other participants in relation to the size of the expected return.

It is not too far a step to conclude that understanding competition in a particular market (platform + instrument) for a particular strategy is crucial in deciding whether a trading strategy should be deployed in the first place.

That was a bit of a long rant, but that’s the whole point of this post. We shall take a look at how to model (both visually and quantitatively) the propagation latency of information between correlated assets. It will be a useful exercise for many strategies, including lead-lag/statistical/funding arbitrage type strategies, as well as cross-exchange market making and derivative pricing scenarios (e.g. option price movements on Deribit after a btcusdt move on Binance).

Today we will just look at writing a software harness for visually inspecting lead-lag behaviour on different markets. This can give useful information as to how traders react on different exchanges, which can be quite distinct based on factors such as the venue’s dominance, tick size (and hence queue priority), and (physical) distance to the leader exchange.

The idea is just to model the leader exchange as pushing triggers under this parameterisation - b, eta, m, d. The lagger reacts to these triggers.

\(b, \quad \eta, \quad m,\quad d\)

b - the minimum move, in basis points, on a single bba tick at time t.
eta - parameterises the sensitivity/beta of the response at the lagger to the b-sized move on the leader. Useful for comparing markets with unique betas or lead-lag correlation.
m - the maximum lookahead within which a response from the lagger venue is accepted. Responses after t + m are invalid.
d - debouncing parameter to ignore duplicated triggers on the source exchange up to t + d.
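To make the four parameters concrete, here is a minimal sketch of one way the trigger/reaction matching could be written. It is my own reading of the definitions above, not the harness from this post. The assumptions: ticks are (timestamp in ms, bba mid) tuples, a reaction is the first lagger move of at least eta * b bps in the trigger's direction, and that move is measured from the lagger price prevailing at the trigger. The names find_triggers and match_reactions are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Tick = Tuple[float, float]  # (timestamp in ms, mid price from the bba)


def find_triggers(leader: List[Tick], b: float, d: float) -> List[Tuple[float, int]]:
    """Return (timestamp, direction) for leader ticks that move at least b bps."""
    triggers = []
    last_trigger_ts = float("-inf")
    for (_, prev_px), (ts, px) in zip(leader, leader[1:]):
        move_bps = (px / prev_px - 1.0) * 1e4
        if abs(move_bps) < b:
            continue
        if ts - last_trigger_ts <= d:  # debounce: still inside (t, t + d]
            continue
        triggers.append((ts, 1 if move_bps > 0 else -1))
        last_trigger_ts = ts
    return triggers


def match_reactions(
    triggers: List[Tuple[float, int]], lagger: List[Tick], b: float, eta: float, m: float
) -> List[Tuple[float, int, Optional[float]]]:
    """For each trigger, latency of the first qualifying lagger move within t + m."""
    lagger_ts = [ts for ts, _ in lagger]
    out = []
    for t, direction in triggers:
        i = bisect_right(lagger_ts, t)  # index of first lagger tick strictly after t
        if i == 0:
            out.append((t, direction, None))  # no lagger reference price yet
            continue
        ref_px = lagger[i - 1][1]  # lagger price prevailing at the trigger
        latency = None
        for ts, px in lagger[i:]:
            if ts > t + m:  # responses after t + m are invalid
                break
            move_bps = (px / ref_px - 1.0) * 1e4
            if direction * move_bps >= eta * b:
                latency = ts - t
                break
        out.append((t, direction, latency))
    return out


# Toy data, purely illustrative.
leader = [(0, 100.00), (10, 100.02), (12, 100.04), (200, 100.04), (300, 100.01)]
lagger = [(0, 100.00), (30, 100.01), (45, 100.02), (320, 100.00), (500, 99.99)]

triggers = find_triggers(leader, b=1.5, d=50)
reactions = match_reactions(triggers, lagger, b=1.5, eta=1.0, m=100)
print(reactions)
```

The leader tick at 12ms is a qualifying move but falls inside the debounce window of the trigger at 10ms, so it is dropped. Plotting the latency column of the output against the trigger timestamps gives the kind of visual inspection described here.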

Here is what such a modelling exercise may look like: (code below)

In this case the tick size is identical. We can take a zoomed-in look:

And you can see that the reaction-trigger latencies occur well inside of 40ms between the two exchange timestamps. It doesn’t take too much effort to spin up two ec2 instances and run a ping test from Tokyo to Singapore:

64 bytes from 13.230.31.102: icmp_seq=454 ttl=117 time=68.8 ms
64 bytes from 13.230.31.102: icmp_seq=455 ttl=117 time=68.8 ms
64 bytes from 13.230.31.102: icmp_seq=456 ttl=117 time=68.7 ms
64 bytes from 13.230.31.102: icmp_seq=457 ttl=117 time=68.8 ms ...

That works out to roughly 34~35ms one-way between the two regions. Considering that binance fapi-streams often experience jitter up to high single-digit milliseconds, the latency budget is incredibly tight, and in fact a pure market making system running ultra low-latency algorithms between these two exchanges has a low to zero latency budget. Having access to fapi-mm, fstream-mm is essentially a minimum ticket.
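The arithmetic behind that budget can be sketched in a few lines. This assumes a symmetric path (one-way = RTT / 2) and takes the 40ms from the plot as an upper bound on the observed reaction latency. The 8ms jitter figure is a hypothetical stand-in for 'high single digits', and latency_budget_ms is an illustrative name.

```python
def latency_budget_ms(observed_reaction_ms: float, rtt_ms: float, jitter_ms: float = 0.0) -> float:
    """Time left for our own processing after one-way wire time and feed jitter."""
    one_way_ms = rtt_ms / 2.0  # assumes a symmetric network path
    return observed_reaction_ms - one_way_ms - jitter_ms


# Ping RTT of 68.8ms from the output above, reactions observed inside 40ms.
no_jitter = latency_budget_ms(observed_reaction_ms=40.0, rtt_ms=68.8)
with_jitter = latency_budget_ms(observed_reaction_ms=40.0, rtt_ms=68.8, jitter_ms=8.0)
print(f"budget without jitter: {no_jitter:.1f}ms, with 8ms jitter: {with_jitter:.1f}ms")
```

Since the reactions sit well inside 40ms rather than at it, the real budget is smaller than the no-jitter figure suggests.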

Let’s try another one, say Deribit btc-perps with primary servers in Equinix LD4 (London). Tick sizes are five times thicker, so multiple ‘triggers’ are consolidated into one ‘reaction’. This one is already zoomed-in, and you can see propagation latencies <75ms.
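The consolidation effect of a thicker tick can be seen with a toy example. The sketch below works in integer units of the leader's tick to avoid float rounding, and assumes the lagger simply floors the leader's price onto a grid five times coarser. Both the price path and that quoting rule are hypothetical simplifications.

```python
from typing import List


def count_moves(prices: List[int]) -> int:
    """Number of ticks where the price differs from the previous one."""
    return sum(1 for a, b in zip(prices, prices[1:]) if a != b)


def snap_to_grid(prices: List[int], grid: int) -> List[int]:
    """Floor each price onto a coarser grid (grid is in leader-tick units)."""
    return [p // grid * grid for p in prices]


# Leader walks up one tick at a time; the lagger grid is 5 leader ticks wide.
leader_path = [1000, 1001, 1002, 1003, 1004, 1005]
lagger_path = snap_to_grid(leader_path, grid=5)

print(count_moves(leader_path), "leader moves ->", count_moves(lagger_path), "lagger move")
```

Five leader moves collapse into a single lagger move, which is why several triggers can map onto one reaction when tick sizes differ.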

64 bytes from 13.230.31.102: icmp_seq=314 ttl=117 time=210 ms
64 bytes from 13.230.31.102: icmp_seq=315 ttl=117 time=210 ms
64 bytes from 13.230.31.102: icmp_seq=316 ttl=117 time=210 ms
64 bytes from 13.230.31.102: icmp_seq=317 ttl=117 time=210 ms...

A simple proxy between eu-west-2 and ap-northeast-1 gives >100ms on a one-way trip. The latency budget is negative. That can’t be right! Except, remember our previous post on appreciating hft network infra landscapes:

Nimble Market Maker - appreciating hft landscape (HangukQuant, Nov 25)

Between these two venues, information takes roughly 55~70ms cross-continent on a dedicated fiber path…
