Link: https://code.fb.com/data-infrastructure/accelerate-large-scale-applications-with-bolt/
Abstract: Highly complex services, such as those here at Facebook, have large source code bases in order to deliver a wide range of features and functionality. Even after the machine code for one of these services is compiled, it can range from 10s to 100s of megabytes in size, which is often too large to fit in any modern CPU instruction cache. As a result, the hardware spends a considerable amount of processing time — nearly 30 percent, in many cases — getting an instruction stream from memory to the CPU. To address this issue, which is commonly known as instruction starvation, we have developed and deployed BOLT, a binary optimization and layout tool.