I’ve been exploring GraalVM lately, and it’s a fascinating technology. However, it took a deep dive into several articles to fully grasp its intricacies. Initially, it might appear to be a completely new Java Virtual Machine, but that’s not entirely accurate. It introduces specific enhancements to the existing Java VM rather than replacing it entirely. Components like the Garbage Collectors, Heap Memory structure, Class Loader, and bytecode format remain unchanged. As stated in its description, “GraalVM is a high-performance JDK distribution designed to accelerate the execution…” Downloading the Community edition essentially provides the familiar HotSpot/OpenJDK bundled with significant additions. The key components include:
- GraalVM Compiler: This is the heart of GraalVM and the primary focus for most users. It’s not a replacement for the Java source code compiler (javac), but rather a new Just-In-Time (JIT) compiler. Its role is to compile Java bytecode into native machine code at run time.
- GraalVM Native Image: This technology allows ahead-of-time compilation of Java applications into native executables. These native applications don’t rely on a traditional JVM, eliminating the need for an interpreter, JIT compiler, or even class loaders. However, they still require runtime components, known as “Substrate VM,” for tasks like garbage collection and thread scheduling. Whether Substrate VM can be classified as a true VM is debatable.
- Truffle Language Implementation Framework: This is a powerful component worthy of its own discussion. Truffle simplifies the creation of programming language implementations as interpreters (written in Java) using self-modifying Abstract Syntax Trees. This framework has enabled the development of highly efficient Ruby, Python, and JavaScript implementations that run on GraalVM. While these implementations can run on a standard JVM, they perform significantly better with GraalVM’s JIT compiler. This article provides a comprehensive and insightful resource on the topic.
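To make the Native Image workflow concrete, here is a minimal sketch. The Java program itself is ordinary; the build commands in the comments assume a GraalVM installation with the `native-image` tool on the PATH.

```java
// Hello.java - an ordinary Java program; nothing GraalVM-specific in the source.
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from a native executable");
    }
}
// Build and run (assuming GraalVM's native-image tool is installed):
//   javac Hello.java
//   native-image Hello      # ahead-of-time compiles the class into ./hello
//   ./hello                 # starts instantly, with no interpreter or JIT
```

The resulting binary embeds the Substrate VM runtime components (garbage collection, thread scheduling) but no bytecode interpreter or class loader.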
GraalVM Compiler in Depth
The Java HotSpot VM uses a hybrid approach to execute bytecode generated by javac and stored in .class files. Initially, it interprets the code, then employs a fast but only lightly optimizing JIT compiler (C1) to convert frequently used (hot) methods into native code. Subsequently, methods subjected to very high invocation rates (very hot methods) undergo JIT compilation by C2, a slower but highly optimizing JIT compiler. This process is well explained here.
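The tiered pipeline can be observed with a small experiment of my own, sketched below: a method is invoked enough times to cross the JIT invocation thresholds, and running the program with the standard HotSpot flag `-XX:+PrintCompilation` shows the method being compiled, first at the C1 tiers and later by C2 (or the GraalVM Compiler, when enabled).

```java
public class JitWarmup {
    // A small "hot" method: after enough invocations, the JVM's
    // tiered compilers kick in (C1 first, then C2).
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Call the method repeatedly so it crosses the JIT
        // invocation thresholds and becomes "very hot".
        for (int i = 0; i < 20_000; i++) {
            total += sumOfSquares(100);
        }
        System.out.println(total);
        // Run with: java -XX:+PrintCompilation JitWarmup
        // to watch sumOfSquares move through the compilation tiers.
    }
}
```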
While this hybrid model is effective, it’s not exclusive to the HotSpot JVM, as discussed here and here. Despite the sophistication of the C2 JIT compiler, particularly its use of profile-guided optimization, a team of brilliant minds sought to develop an even more powerful replacement. This led to the creation of the GraalVM Compiler, a cutting-edge speculative, profile-guided JIT compiler designed to supersede C2.
The HotSpot JVM incorporates the JVMCI (Java Virtual Machine Compiler Interface), enabling code to interact with a JIT compiler. Through JVMCI, bytecodes can be passed to a JIT compiler for compilation, and the resulting native code can be integrated into the VM. This interface allows plugging in different JIT compilers, including the GraalVM Compiler, effectively replacing the default C2 JIT.
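Conceptually, JVMCI defines a contract: the VM hands a method’s bytecode to a pluggable compiler and receives native code back. The toy interfaces below are simplified stand-ins of my own invention, not the real `jdk.vm.ci` API (which is far richer), but they capture the shape of that contract.

```java
// Simplified stand-ins for the real jdk.vm.ci types, for illustration only.
interface CompilationRequest {
    byte[] bytecode();  // the method's bytecodes, as stored in the .class file
}

interface JitCompiler {
    byte[] compile(CompilationRequest request);  // returns native machine code
}

public class JvmciSketch {
    public static void main(String[] args) {
        // The VM can plug in any implementation of the compiler interface;
        // with -XX:+UseJVMCICompiler, that implementation is the GraalVM Compiler.
        JitCompiler identity = request -> request.bytecode(); // placeholder "compiler"
        byte[] out = identity.compile(() -> new byte[] {0x60}); // 0x60 = iadd opcode
        System.out.println(out.length); // prints 1
    }
}
```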
While the GraalVM Compiler is included in standard OpenJDK and Oracle JDK distributions, activating it requires specific VM options: -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler. A more common approach is to install the GraalVM JDK distribution, which comes pre-configured with the GraalVM Compiler enabled alongside other associated technologies like Native Image, Truffle, and Substrate VM.
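To check which JIT configuration a running JVM actually has, the standard management API can be queried. The small utility below (my own sketch) prints the in-process compiler’s name and echoes any JVMCI-related flags the JVM was started with.

```java
import java.lang.management.ManagementFactory;

public class JitInfo {
    public static void main(String[] args) {
        // Name of the in-process JIT, e.g. "HotSpot 64-Bit Tiered Compilers".
        System.out.println("JIT: "
                + ManagementFactory.getCompilationMXBean().getName());
        // Echo any JVMCI-related flags this JVM was started with,
        // such as -XX:+EnableJVMCI or -XX:+UseJVMCICompiler.
        ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(a -> a.contains("JVMCI"))
                .forEach(System.out::println);
    }
}
```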
The GraalVM Compiler itself is written in Java, necessitating the conversion of its own bytecodes into native code. This process, known as bootstrapping, can be achieved through two primary methods. The first involves ahead-of-time compilation using Native Image. Alternatively, C1 can be utilized for initial compilation into non-optimized native code, allowing the compiler to self-optimize as its methods become frequently invoked.
This article highlights the GraalVM Compiler’s use of an Intermediate Representation (IR), and this one clarifies the IR’s role in relation to standard Java bytecode. Essentially, the process follows this sequence: Java bytecode -> IR -> machine code. The JIT compiler translates Java bytecode into an IR in static single assignment (SSA) form, represented as a graph encompassing both control flow and data flow. Each bytecode typically maps to one or more nodes in the IR graph (some bytecodes have no direct IR counterpart). Optimization then takes place on this IR graph.
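As a tiny illustration of the bytecode-to-IR mapping, consider the method below. The bytecodes in the comments are what javac actually emits for it; the IR node names are only rough approximations of the graph a compiler like Graal would build, not its exact output.

```java
public class AddExample {
    // javac compiles this method to the bytecodes:
    //   iload_0   // push parameter a onto the operand stack
    //   iload_1   // push parameter b
    //   iadd      // pop two ints, push their sum
    //   ireturn   // return the top of the stack
    // A graph-based JIT lifts these stack operations into SSA IR nodes
    // (roughly: two parameter nodes feeding an add node feeding a return
    // node), then optimizes and emits machine code from that graph.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

Note how the stack-based bytecode has no explicit data-flow edges; the graph IR makes them explicit, which is what enables the compiler’s optimizations.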
While I’ve primarily referred to the GraalVM Compiler as a replacement for the C2 JIT compiler, this article suggests that it replaces both C1 and C2. However, most resources, such as this one, explicitly state that it replaces only C2. This discrepancy requires further investigation.