Recent Content Updates
November 2, 2024 - Failure Detection in Agrona Streaming Applications
October 31, 2024 - Sharing State in a Sequencer Environment
October 31, 2024 - The Problem with Sequencers
October 26, 2024 - Simple Binary Encoding: An Overview
February 4, 2024 - Read Heavy Aeron Cluster Workloads
January 28, 2024 - Increasing Replicated State Machine Throughput in Trading Systems
December 31, 2023 - Can We Reduce Coordination in Cloud Trading Venues?
October 19, 2023 - Is the cloud ready for low-latency messaging?
December 28, 2020 - Chain replication, CRAQ & Hermes
December 20, 2020 - Primary-Backup Replication
December 19, 2020 - Three Types of Sequencers
December 13, 2020 - Natural Batching
December 12, 2020 - Backpressure
December 12, 2020 - State Machine Replication
December 12, 2020 - State Machine Replication: A Brief History
Recent Stream Updates
An interesting post from the Google Project Zero team on how they used an LLM to find a real-world vulnerability in SQLite. I expect the cost implications of this approach will be significant, but it's a notable result.
Today, we're excited to share the first real-world vulnerability discovered by the Big Sleep agent: an exploitable stack buffer underflow in SQLite, a widely used open source database engine. We discovered the vulnerability and reported it to the developers in early October, who fixed it on the same day. Fortunately, we found this issue before it appeared in an official release, so SQLite users were not impacted.
The approach uses an agent model, which I expect will become a growing trend across businesses in the coming years.
It appears LLMs have developed quite the appetite for LeetCode-style puzzles, becoming remarkably good at solving them. Given that most LLM benchmarks test against similar puzzle-based tests, this overfitting is not that surprising. Unfortunately for actual software developers—who spend surprisingly little time reversing binary trees in production—this expertise is somewhat less useful.
Enter LiveCodeBench: a benchmark that continuously harvests fresh programming problems, ensuring LLMs can't simply memorize their way to algorithmic enlightenment while leaving real-world programming challenges in the dark.
LiveCodeBench is a holistic and contamination-free evaluation benchmark of LLMs for code that continuously collects new problems over time. Particularly, LiveCodeBench also focuses on broader code-related capabilities, such as self-repair, code execution, and test output prediction, beyond mere code generation
Another use of formal verification, with great claimed benefit, in the realm of distributed systems. I have yet to see anyone in finance using any formal verification techniques.
Design reviews, code audits, stress testing, and fault injection are all invaluable tools we use regularly, and always will. However, we've found that we need to supplement these techniques in order to confirm correctness in many cases. Subtle bugs can still escape detection, particularly in large-scale, fault-tolerant architectures. And some issues might even be rooted in the original system design, rather than implementation flaws. As our services have grown in scale and complexity, we've had to supplement traditional testing approaches with more powerful techniques based on math and logic. This is where the branch of artificial intelligence (AI) called automated reasoning comes into play.
Their use of automated reasoning focuses on code correctness and maintainability rather than performance optimization. While I initially expected it might automatically identify efficiency improvements, its primary value lies in making the codebase more robust and adaptable to changes.
The reason is that the bug fixes we make during the process of formal verification often positively impact the code's runtime. Automated reasoning also gives our builders confidence to explore additional optimizations that improve system performance even further. We've found that formally verified code is easier to update, modify, and operate, leading to fewer late-night log analysis and debugging sessions.
One other challenge I have with formal verification is that I have yet to see source code that is a perfect implementation of the formal specification.
When running multiple distributed services that need to snapshot their state, the tried and tested Chandy-Lamport snapshot algorithm remains the standard. This team claims to have developed a new protocol that extends distributed snapshots to cloud services with external interactions, allowing partial snapshots while retaining causal consistency.
This work presents Beaver, the first 'partial' snapshot protocol that extends the capability of distributed snapshots to cloud services with external interactions. Beaver provides the same basic abstraction as other snapshot protocols—for any event whose effects are observed in the snapshot, all other events that 'happened-before' are also included.
I will add the paper to my list of papers to dig deeper into. Aleksey Charapko also has this on his distributed systems reading list for Fall 2024.
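For context on what Beaver extends, the classic Chandy-Lamport algorithm is marker-based: a process records its local state when it first sees a marker, forwards markers on all of its outgoing channels, and records in-flight messages on each incoming channel until a marker arrives on that channel. Below is a minimal, illustrative Java sketch of that per-process logic; the types and names (Message, onMessage, persistSnapshot, and so on) are hypothetical and not taken from the Beaver paper.

```java
// A minimal sketch of the marker-based Chandy-Lamport protocol from the point of
// view of a single process. All names here (Message, Marker, the transport hooks)
// are hypothetical; this is illustrative only and not code from the Beaver paper.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ChandyLamportProcess {
    interface Message {}
    static final class Marker implements Message {}

    private final Set<String> incomingChannels;
    private final Set<String> outgoingChannels;

    private byte[] recordedLocalState;                          // local state captured when the snapshot starts
    private final Set<String> markerSeenOn = new HashSet<>();   // incoming channels whose recording is complete
    private final Map<String, List<Message>> inFlight = new HashMap<>(); // messages recorded per incoming channel

    ChandyLamportProcess(Set<String> incomingChannels, Set<String> outgoingChannels) {
        this.incomingChannels = incomingChannels;
        this.outgoingChannels = outgoingChannels;
    }

    /** Called by the snapshot initiator, or internally on the first marker received. */
    void startSnapshot(byte[] currentLocalState) {
        if (recordedLocalState != null) {
            return;                                             // snapshot already in progress
        }
        recordedLocalState = currentLocalState;
        incomingChannels.forEach(ch -> inFlight.put(ch, new ArrayList<>()));
        outgoingChannels.forEach(this::sendMarker);             // markers flush every outgoing channel
    }

    /** Called for every message delivered on an incoming channel. */
    void onMessage(String channel, Message message, byte[] currentLocalState) {
        if (message instanceof Marker) {
            startSnapshot(currentLocalState);                   // first marker: record state, forward markers
            markerSeenOn.add(channel);                          // this channel's in-flight recording is done
            if (markerSeenOn.containsAll(incomingChannels)) {
                persistSnapshot();                              // local contribution to the global snapshot
            }
        } else if (recordedLocalState != null && !markerSeenOn.contains(channel)) {
            inFlight.get(channel).add(message);                 // message was in flight when the snapshot began
        }
    }

    private void sendMarker(String channel) { /* hand a Marker to the messaging layer for this channel */ }

    private void persistSnapshot() { /* write recordedLocalState and inFlight to durable storage */ }
}
```

The part Beaver appears to address is what happens when some of those channels cross the service boundary to external parties that will never forward a marker back; the classic algorithm assumes a closed system of cooperating processes.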