The document discusses an architecture that makes it easy to implement high-performance virtual machines (VMs) for new languages. It introduces Truffle, a language implementation framework, and Graal, a compilation infrastructure. With Truffle and Graal, language implementers write an interpreter for their language, and reusable services provide a high-performance VM without requiring extensive compiler-optimization work. The guest language interpreter is partially evaluated and optimized based on profiling information gathered during execution.
2. Background
● A small number of languages have high-perf VMs
○ E.g. Java, C#
● Many languages have low-perf VMs
○ Ruby, Python, R, MATLAB, PHP…
● High-perf VMs are currently:
○ Highly complex
○ Biased towards a single language or family of languages
3. Big Idea
● Create architecture for easily implementing high-perf VMs for new languages
1. Language implementer writes interpreter
2. *A miracle occurs*
3. High-performance VM
4. Truffle and Graal
● Truffle: language implementation framework
● Graal: compilation infrastructure
● VMs for many guest languages implemented in a managed host language
○ Only guest-specific parts are written by language implementer
○ Reusable host services are provided by framework
● Guest language is (usually) dynamically typed
6. AST and Optimization
● Guest language is implemented as an AST interpreter
● Node rewriting: during interpretation, a node can replace itself w/ another, based on profiling feedback:
○ Dynamic type info
○ Execution freqs
○ Rate of node rewrites
● When compiler is invoked, it uses profiling info to partially evaluate the AST interpreter
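The node-rewriting idea above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Truffle API: every class and method name here (`Node`, `replaceWith`, `UninitializedAdd`, `IntAdd`, `GenericAdd`, `Var`, `Root`) is hypothetical, and integer overflow is ignored for brevity.

```java
// Hypothetical mini AST interpreter in plain Java; none of these names are Truffle's.
abstract class Node {
    Node parent;

    abstract Object execute();

    // Parents override this to swap one child for another.
    void replaceChild(Node oldChild, Node newChild) {
        throw new IllegalStateException("node has no children");
    }

    // Registers this node as the parent of `child`.
    final <T extends Node> T adopt(T child) {
        child.parent = this;
        return child;
    }

    // Node rewriting: this node replaces itself in its parent's child slot.
    final <T extends Node> T replaceWith(T newNode) {
        newNode.parent = parent;
        parent.replaceChild(this, newNode);
        return newNode;
    }
}

final class Root extends Node {
    Node body;
    Root(Node body) { this.body = adopt(body); }
    @Override Object execute() { return body.execute(); }
    @Override void replaceChild(Node oldChild, Node newChild) {
        if (body == oldChild) body = newChild;
    }
}

// A variable whose value (and therefore dynamic type) can change between runs.
final class Var extends Node {
    Object value;
    Var(Object value) { this.value = value; }
    @Override Object execute() { return value; }
}

abstract class AddNode extends Node {
    Node left, right;
    AddNode(Node left, Node right) {
        this.left = adopt(left);
        this.right = adopt(right);
    }
    // Children are evaluated once; the specialization only combines the results.
    @Override final Object execute() { return add(left.execute(), right.execute()); }
    abstract Object add(Object l, Object r);
    @Override void replaceChild(Node oldChild, Node newChild) {
        if (left == oldChild) left = newChild;
        else if (right == oldChild) right = newChild;
    }
}

// Initial state: no type information yet, so the first execution picks a specialization.
final class UninitializedAdd extends AddNode {
    UninitializedAdd(Node l, Node r) { super(l, r); }
    @Override Object add(Object l, Object r) {
        AddNode next = (l instanceof Integer && r instanceof Integer)
                ? new IntAdd(left, right)
                : new GenericAdd(left, right);
        return replaceWith(next).add(l, r);
    }
}

// Specialized for the int + int case observed so far.
final class IntAdd extends AddNode {
    IntAdd(Node l, Node r) { super(l, r); }
    @Override Object add(Object l, Object r) {
        if (l instanceof Integer && r instanceof Integer) {
            return (Integer) l + (Integer) r;
        }
        // Type assumption failed: rewrite to the generic node and let it answer.
        return replaceWith(new GenericAdd(left, right)).add(l, r);
    }
}

// Handles every case in this toy language; never rewrites again.
final class GenericAdd extends AddNode {
    GenericAdd(Node l, Node r) { super(l, r); }
    @Override Object add(Object l, Object r) {
        if (l instanceof Integer && r instanceof Integer) return (Integer) l + (Integer) r;
        if (l instanceof Number && r instanceof Number) {
            return ((Number) l).doubleValue() + ((Number) r).doubleValue();
        }
        return String.valueOf(l) + r;
    }
}
```

Building `new Root(new UninitializedAdd(new Var(1), new Var(2)))` and executing it once leaves an `IntAdd` in the tree; if one of the variables later holds a `Double`, the next execution swaps in a `GenericAdd`.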
7. AST and Deoptimization
● Parts of interpreter responsible for node rewriting are omitted from compilation
○ Branches that perform rewriting cause deoptimization
● Deoptimization: when type specialization fails, optimized code is discarded and execution reverts to interpreter
● More info is gathered during interpretation, followed by another round of partial evaluation, and so on
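The interpret / compile / deoptimize cycle described above can be simulated in a few lines of Java. This is only a toy model under assumptions: "compiled code" is represented by a lambda, the guard failure is signalled with a hypothetical `DeoptSignal` exception, and the compile threshold of 3 calls is arbitrary. None of these names come from Truffle or Graal.

```java
// Hypothetical signal thrown when a guard in "compiled" code fails.
final class DeoptSignal extends RuntimeException {
    DeoptSignal() { super(null, null, false, false); } // no message, no stack trace
}

// Stand-in for machine code produced by the compiler.
interface CompiledCode {
    Object run(Object a, Object b);
}

final class AddFunction {
    static final int COMPILE_THRESHOLD = 3; // arbitrary value for the sketch

    boolean sawOnlyInts = true; // profiling info gathered by the interpreter
    int interpretedCalls = 0;
    int deopts = 0;
    CompiledCode compiled;      // null while running in the interpreter

    Object call(Object a, Object b) {
        if (compiled != null) {
            try {
                return compiled.run(a, b);
            } catch (DeoptSignal d) {
                // Deoptimization: discard optimized code, resume in the interpreter.
                compiled = null;
                interpretedCalls = 0;
                deopts++;
            }
        }
        return interpret(a, b);
    }

    // Slow path: full semantics plus profiling.
    private Object interpret(Object a, Object b) {
        sawOnlyInts &= (a instanceof Integer && b instanceof Integer);
        if (++interpretedCalls >= COMPILE_THRESHOLD) {
            compiled = compile();
        }
        return genericAdd(a, b);
    }

    // "Compilation" uses the profile; the rewriting branch becomes a guard.
    private CompiledCode compile() {
        if (sawOnlyInts) {
            return (a, b) -> {
                if (!(a instanceof Integer && b instanceof Integer)) {
                    throw new DeoptSignal();
                }
                return (Integer) a + (Integer) b;
            };
        }
        return AddFunction::genericAdd;
    }

    static Object genericAdd(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) return (Integer) a + (Integer) b;
        return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }
}
```

After three integer calls the function is "compiled" with an int-only guard. A call such as `call(1, 2.5)` then trips the guard, drops the compiled code, and re-enters the interpreter, which gathers the new type information before the next round of compilation.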
9. Node Rewriting Conditions
● Guest language developer is responsible for fulfilling conditions of node rewriting:
○ Completeness: node provides rewrites for all cases it does not handle
○ Finiteness: after a finite # of node rewrites, node must end up in a state that handles full language semantics
○ Locality: a node can only rewrite itself and its subtree
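One way to reason about completeness and finiteness is to model a node's specializations as an ordered set of states in which rewrites only ever move forward. The sketch below does this for a hypothetical addition node; the `AddState` enum and its states are assumptions made for illustration, and locality is not modelled here.

```java
// Hypothetical specialization states for an addition node, ordered from
// most specific to most general.
enum AddState {
    UNINITIALIZED, INT, NUMERIC, GENERIC;

    // Which operand combinations this state can execute without rewriting.
    boolean handles(Object l, Object r) {
        switch (this) {
            case INT:     return l instanceof Integer && r instanceof Integer;
            case NUMERIC: return l instanceof Number && r instanceof Number;
            case GENERIC: return true;
            default:      return false; // UNINITIALIZED handles nothing
        }
    }

    // Completeness: some state at or after this one always handles the operands,
    // because GENERIC handles everything.
    // Finiteness: the result never has a smaller ordinal than this state, so at
    // most three rewrites can happen before GENERIC is reached.
    AddState rewriteFor(Object l, Object r) {
        for (AddState s : values()) {
            if (s.ordinal() >= ordinal() && s.handles(l, r)) {
                return s;
            }
        }
        return GENERIC; // not reached: GENERIC is always a candidate above
    }
}
```

For example, `AddState.NUMERIC.rewriteFor(1, 2)` stays at `NUMERIC` rather than going back to `INT`, which is what rules out rewrite cycles.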
10. Branch probabilities and loop freqs
● Guest lang interpreter author is also (optionally) responsible for providing:
○ Branch probabilities
○ Loop frequencies
● Interpreter can do profiling for optimizing compiler
○ If not, values are derived from host system during partial evaluation
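As a rough illustration of how an interpreter could gather these numbers itself, the sketch below counts branch outcomes and loop iterations. `BranchCounter` and `LoopCounter` are hypothetical helpers written for this example, not framework classes, and the 0.5 default for an unprofiled branch is an arbitrary assumption.

```java
// Hypothetical profiling helper: counts how often a condition was true.
final class BranchCounter {
    private long taken;
    private long notTaken;

    // Records the outcome and returns it, so it can wrap a condition in place.
    boolean profile(boolean condition) {
        if (condition) taken++; else notTaken++;
        return condition;
    }

    // Estimated probability that the branch is taken (0.5 if never executed).
    double takenProbability() {
        long total = taken + notTaken;
        return total == 0 ? 0.5 : (double) taken / total;
    }
}

// Hypothetical profiling helper: tracks average iterations per loop entry.
final class LoopCounter {
    private long entries;
    private long iterations;

    void enter() { entries++; }
    void iterate() { iterations++; }

    // Average trip count over all entries seen so far (0 if never entered).
    double averageIterations() {
        return entries == 0 ? 0.0 : (double) iterations / entries;
    }
}

final class ProfilingDemo {
    public static void main(String[] args) {
        BranchCounter isEven = new BranchCounter();
        LoopCounter loop = new LoopCounter();
        loop.enter();
        for (int i = 0; i < 10; i++) {
            loop.iterate();
            if (isEven.profile(i % 2 == 0)) {
                // guest-language "then" branch would run here
            }
        }
        System.out.println(isEven.takenProbability());  // 0.5
        System.out.println(loop.averageIterations());   // 10.0
    }
}
```

An interpreter node for `if` or `while` would hold one such counter and hand the resulting estimate to the compiler when the surrounding function is compiled.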
11. Related Work
● PyPy
○ Big idea: write interpreter in high-level (less dynamic) lang, then generate a custom VM and tracing dynamic compiler from the interpreter
○ Python interpreter, VM written in RPython -> translated to C
● Self-optimizing interpreters
○ Compilers can perform optimizations beyond reach of interpreters
● Compile to IR (e.g. LLVM), as in Rubinius
12. TruffleClojure
● TruffleClojure
○ Master's thesis, not ready for use
● Features:
○ Implements TCO (by avoiding Java bytecode)
○ Perf improvements via type specialization (e.g. of contains?)
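The slides do not say how TruffleClojure implements TCO. As background, JVM bytecode has no guaranteed tail-call instruction, so an interpreter that does not compile guest code to bytecode is free to handle tail calls itself. One generic technique is a trampoline, sketched below; `Bounce`, `Thunk`, and `sumTo` are hypothetical names for this illustration and are not taken from the thesis.

```java
// A deferred tail call.
interface Thunk<T> {
    Bounce<T> get();
}

// Either a finished value or a tail call that still has to be made.
final class Bounce<T> {
    final T value;
    final Thunk<T> next; // null when the computation is finished

    private Bounce(T value, Thunk<T> next) {
        this.value = value;
        this.next = next;
    }

    static <T> Bounce<T> done(T value) { return new Bounce<>(value, null); }
    static <T> Bounce<T> call(Thunk<T> next) { return new Bounce<>(null, next); }

    // The trampoline: runs tail calls in a loop, so the host stack stays flat.
    static <T> T run(Bounce<T> b) {
        while (b.next != null) {
            b = b.next.get();
        }
        return b.value;
    }
}

final class TailCallDemo {
    // Sums 1..n with an accumulator; the recursive call is in tail position.
    static Bounce<Long> sumTo(long n, long acc) {
        if (n == 0) return Bounce.done(acc);
        return Bounce.call(() -> sumTo(n - 1, acc + n));
    }

    public static void main(String[] args) {
        // One million nested calls would normally risk a StackOverflowError.
        System.out.println(Bounce.run(sumTo(1_000_000, 0))); // 500000500000
    }
}
```

Each "call" returns immediately with a description of the next step, and the loop in `run` performs it, so the depth of the guest-language recursion never shows up on the host stack.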