high-frequency trading - reading the PMU for code profiling
lightweight benchmarking harness in Python
Optimizing code for ultra-low latency requires precise profiling tools that can read performance counters without introducing significant overhead. Linux’s perf is a fantastic tool, but it can be problematic for the micro-benchmarking scenarios common in HFT pipelines.
This article explores the Performance Monitoring Unit (PMU) and its role in code profiling, and introduces a lightweight Python-based controller for direct PMU access. Although the controller is demonstrated in Python for clarity, the underlying syscalls are straightforward to port to C or C++.
In the next post, we will use the tools written here to benchmark a custom bba parser against orjson.

