Code generation

Code generation or "codegen" is the part of the compiler that actually generates an executable binary. rustc uses LLVM for code generation.

NOTE: If you are looking for hints on how to debug code generation bugs, please see this section of the debugging chapter.

What is LLVM?

All of the preceding chapters of this guide have one thing in common: we never generated any executable machine code at all! With this chapter, all of that changes.

Like most compilers, rustc is composed of a "frontend" and a "backend". The "frontend" is responsible for taking raw source code, checking it for correctness, and getting it into a format X from which we can generate executable machine code. The "backend" then takes that format X and produces (possibly optimized) executable machine code for some platform. All of the previous chapters deal with rustc's frontend.

rustc's backend is LLVM, "a collection of modular and reusable compiler and toolchain technologies". In particular, the LLVM project contains a pluggable compiler backend (also called "LLVM"), which is used by many compiler projects, including the clang C compiler and our beloved rustc.

LLVM's "format X" is called LLVM IR. It is basically assembly code with additional low-level types and annotations added. These annotations are helpful for doing optimizations on the LLVM IR and outputted machine code. The end result of all this is (at long last) something executable (e.g. an ELF object or wasm).

There are a few benefits to using LLVM:

  • We don't have to write a whole compiler backend. This reduces implementation and maintenance burden.
  • We benefit from the large suite of advanced optimizations that the LLVM project has been collecting.
  • We automatically can compile Rust to any of the platforms for which LLVM has support. For example, as soon as LLVM added support for wasm, voila! rustc, clang, and a bunch of other languages were able to compile to wasm! (Well, there was some extra stuff to be done, but we were 90% there anyway).
  • We and other compiler projects benefit from each other. For example, when the Spectre and Meltdown security vulnerabilities were discovered, only LLVM needed to be patched.

Generating LLVM IR

TODO