- Google’s network load balancer
- distributed system on commodity servers
- no specialized physical hardware to deploy
- horizontally scalable
- load distribution via ECMP by network routers
- consistent hashing and connection tracking
- serving Google since 2008
- Downsides of hardware LBs
- limited vertical scale
- only 1+1 redundancy
- lack of flexibility + programmability
- expensive to upgrade

- larger clusters have hardware encapsulators between the Maglev machines and the router so they don’t need to be in the same L2 domain
- Linux kernel is circumvented for fast packet processing
- the steering module hashes each packet’s 5-tuple to pick a receive queue, each served by a packet rewriter thread (see the sketches after this list)
- rewriter threads check for a VIP match, then consult the connection table before falling back to consistent hashing
- 5-tuple hashing is preferred over round-robin so packets of a connection stay on one thread; round-robin is only the fallback (e.g. when a queue fills up)
- 813 Kpps needed for 1500-byte IP packets (10Gbps line rate)
- a processing speed of 9.06 Mpps needed for smaller packets
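A minimal Go sketch of the steering idea above: pick a receive queue by hashing the packet’s 5-tuple so a connection sticks to one packet thread, and fall back to round-robin only when the chosen queue is full. The names here (FiveTuple, Steering, the FNV hash) are illustrative assumptions, not details from the paper.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// FiveTuple identifies a connection: protocol, source/destination IP and port.
type FiveTuple struct {
	Proto            uint8
	SrcIP, DstIP     string
	SrcPort, DstPort uint16
}

// hash folds the 5-tuple into a single value; FNV is a stand-in hash function.
func (t FiveTuple) hash() uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d|%s|%d|%s|%d", t.Proto, t.SrcIP, t.SrcPort, t.DstIP, t.DstPort)
	return h.Sum64()
}

// Steering spreads incoming packets across per-thread receive queues.
type Steering struct {
	queues []chan FiveTuple // one queue per packet rewriter thread
	rr     int              // round-robin cursor, used only as a fallback
}

// Steer enqueues a packet on the queue chosen by its 5-tuple hash so that all
// packets of one connection reach the same packet thread; if that queue is
// full it falls back to round-robin over the other queues, and drops the
// packet if every queue is full.
func (s *Steering) Steer(pkt FiveTuple) bool {
	i := int(pkt.hash() % uint64(len(s.queues)))
	select {
	case s.queues[i] <- pkt:
		return true
	default: // chosen queue is full: connection affinity is sacrificed below
	}
	for n := 0; n < len(s.queues); n++ {
		s.rr = (s.rr + 1) % len(s.queues)
		select {
		case s.queues[s.rr] <- pkt:
			return true
		default:
		}
	}
	return false // all queues full: drop
}

func main() {
	s := &Steering{queues: make([]chan FiveTuple, 4)}
	for i := range s.queues {
		s.queues[i] = make(chan FiveTuple, 256)
	}
	pkt := FiveTuple{Proto: 6, SrcIP: "10.0.0.1", SrcPort: 12345, DstIP: "192.0.2.10", DstPort: 443}
	fmt.Println("steered:", s.Steer(pkt))
}
```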
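And a sketch of the per-packet decision a rewriter thread makes: check for a VIP match, try the connection table, and only then do a consistent-hash lookup, remembering the choice for later packets of the flow. The types, the map-based connection table, and the lookup-table shape are assumptions for illustration; the real forwarder would then GRE-encapsulate the packet toward the chosen backend.

```go
package main

import "fmt"

type Backend string

// VIPConfig holds the consistent-hash lookup table for one VIP
// (see the Maglev hashing sketch further down for how it is built).
type VIPConfig struct {
	lookup []Backend
}

// Packet is the little we need here: which VIP it targets and its 5-tuple hash.
type Packet struct {
	VIP      string
	FlowHash uint64
}

// PacketThread is one rewriter thread with its own connection table.
type PacketThread struct {
	vips      map[string]*VIPConfig // configured VIPs
	connTable map[uint64]Backend    // flow hash -> previously chosen backend
}

// Select returns the backend for a packet, or false if it should be dropped.
func (t *PacketThread) Select(p Packet) (Backend, bool) {
	cfg, ok := t.vips[p.VIP]
	if !ok {
		return "", false // no VIP match: drop
	}
	if b, ok := t.connTable[p.FlowHash]; ok {
		return b, true // known connection: keep it on the same backend
	}
	b := cfg.lookup[p.FlowHash%uint64(len(cfg.lookup))] // consistent-hash lookup
	t.connTable[p.FlowHash] = b                         // remember for the rest of the flow
	return b, true
}

func main() {
	t := &PacketThread{
		vips:      map[string]*VIPConfig{"198.51.100.1": {lookup: []Backend{"b1", "b2", "b3"}}},
		connTable: map[uint64]Backend{},
	}
	fmt.Println(t.Select(Packet{VIP: "198.51.100.1", FlowHash: 42}))
}
```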
Fast Packet Processing
- userspace app circumventing kernel
- shared packet pool with NIC
- ring queues of pointers into the packet pool
- NIC and forwarder advance the pointers in the ring, so packets are never copied (see the ring sketch after this list)
- backend selection stays consistent across Maglev machines via consistent hashing based on backend names (simplified algorithm!); see the hashing sketch after this list
- kernel bypass gives roughly a 5x performance improvement
- fragments are handled via a special backend pool containing all Maglevs, so every fragment of a datagram reaches the same Maglev machine
- special packet-tracer packets trigger Maglev to send debugging information back to the source
- steering module bottleneck before moving to 40Gbps
- 5 packet threads saturate 10Gbps NIC
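A toy version of the shared-pool/ring idea above: the ring carries only indices into a shared packet pool, so the NIC side and the forwarder side just advance cursors and packet bytes are never copied. Single-threaded for brevity (a real NIC/forwarder pair needs proper synchronization), and all names are made up.

```go
package main

import "fmt"

const (
	poolSize = 1024
	ringSize = 256 // power of two keeps the wrap-around cheap
)

// PacketPool is the packet memory shared between the NIC and the forwarder.
type PacketPool [poolSize][]byte

// RxRing holds indices into the pool, never the packet data itself.
type RxRing struct {
	slots      [ringSize]int
	head, tail uint32 // consumer and producer cursors
}

// Produce is the NIC side: publish the index of a freshly filled pool buffer
// and advance the producer cursor.
func (r *RxRing) Produce(idx int) bool {
	if r.tail-r.head == ringSize {
		return false // ring full
	}
	r.slots[r.tail%ringSize] = idx
	r.tail++
	return true
}

// Consume is the forwarder side: take the next index and advance the consumer
// cursor; the packet itself stays where it is in the pool.
func (r *RxRing) Consume() (int, bool) {
	if r.head == r.tail {
		return 0, false // ring empty
	}
	idx := r.slots[r.head%ringSize]
	r.head++
	return idx, true
}

func main() {
	var pool PacketPool
	pool[7] = []byte("raw packet bytes")

	ring := &RxRing{}
	ring.Produce(7) // "NIC" hands buffer 7 to the forwarder
	if idx, ok := ring.Consume(); ok {
		fmt.Printf("forwarder reads pool[%d] = %q without any copy\n", idx, pool[idx])
	}
}
```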
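A compact sketch of the (simplified) Maglev hashing from the paper: each backend derives a permutation of table slots from its name, and backends take turns claiming their next preferred empty slot until the lookup table is full. The FNV hash functions and the tiny table size are stand-ins chosen for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const M = 13 // lookup table size; a prime much larger than the backend count in practice

func h1(s string) uint64 { h := fnv.New64(); h.Write([]byte(s)); return h.Sum64() }
func h2(s string) uint64 { h := fnv.New64a(); h.Write([]byte(s)); return h.Sum64() }

// permutation is backend b's preference order over table slots:
// perm[j] = (offset + j*skip) mod M, with offset and skip derived from the name.
func permutation(name string) []uint64 {
	offset := h1(name) % M
	skip := h2(name)%(M-1) + 1 // 1..M-1, coprime with prime M, so all slots are covered
	perm := make([]uint64, M)
	for j := uint64(0); j < M; j++ {
		perm[j] = (offset + j*skip) % M
	}
	return perm
}

// populate builds the lookup table: in round-robin order each backend claims
// the next slot in its permutation that is still empty, until the table fills.
func populate(backends []string) []string {
	table := make([]string, M)
	if len(backends) == 0 {
		return table
	}
	perms := make([][]uint64, len(backends))
	next := make([]uint64, len(backends))
	for i, b := range backends {
		perms[i] = permutation(b)
	}
	filled := 0
	for filled < M {
		for i, b := range backends {
			c := perms[i][next[i]]
			for table[c] != "" { // skip slots already taken
				next[i]++
				c = perms[i][next[i]]
			}
			table[c] = b
			next[i]++
			filled++
			if filled == M {
				break
			}
		}
	}
	return table
}

func main() {
	table := populate([]string{"backend-a", "backend-b", "backend-c"})
	fmt.Println(table) // a packet's 5-tuple hash mod M indexes this table to pick its backend
}
```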
Questions