6 Support for Non-Java Languages

This chapter describes the features of the Java Virtual Machine (JVM) that support non-Java languages.

Introduction to Non-Java Language Features

The Java Platform, Standard Edition (Java SE) enables the development of applications that have the following features:

  • They can be written once and run anywhere
  • They can be run securely because of the Java sandbox security model
  • They are easy to package and deliver

The Java SE platform provides robust support in the following areas:

  • Concurrency
  • Garbage collection
  • Reflective access to classes and objects
  • JVM Tool Interface (JVM TI): A native programming interface for use by tools. It provides a way both to inspect the state and to control the execution of applications running in the JVM.

Oracle's HotSpot JVM provides the following tools and features:

  • DTrace: A dynamic tracing utility that monitors the behavior of applications and the operating system.
  • Performance optimizations
  • PrintAssembly: A Java HotSpot option that prints assembly code for bytecoded and native methods.

The Java SE 7 platform enables non-Java languages to use the infrastructure and potential performance optimizations of the JVM. The key mechanism is the invokedynamic instruction, which simplifies the implementation of compilers and runtime systems for dynamically-typed languages on the JVM.

Static and Dynamic Typing

A programming language is statically-typed if it performs type checking at compile time. Type checking is the process of verifying that a program is type safe. A program is type safe if the arguments of all of its operations are the correct type.

Java is a statically-typed language. Type information is available for class and instance variables, method parameters, return values, and other variables when a program is compiled. The compiler for the Java programming language uses this type information to produce strongly typed bytecode, which can then be efficiently executed by the JVM at runtime.

The following example of a Hello World program demonstrates static typing. Every variable, parameter, and return value is declared with an explicit type (String[], String, Date, and void).


import java.util.Date;

public class HelloWorld {
    public static void main(String[] argv) {
        String hello = "Hello ";
        Date currDate = new Date();
        for (String a : argv) {
            System.out.println(hello + a);
            System.out.println("Today's date is: " + currDate);
        }
    }
}

A programming language is dynamically-typed if it performs type checking at runtime. JavaScript and Ruby are examples of dynamically typed languages. These languages verify at runtime, rather than at compile time, that values in an application conform to expected types. Typically, type information for these languages is not available when an application is compiled. The type of an object is determined only at runtime. In the past, it was difficult to efficiently implement dynamically-typed languages on the JVM.

The following is an example of the Hello World program written in the Ruby programming language:


#!/usr/bin/env ruby
require 'date'

hello = "Hello "
currDate = DateTime.now
ARGV.each do |a|
  puts hello + a
  puts "Date and time: " + currDate.to_s
end

In the example, every name is introduced without a type declaration. The main program is not located inside a holder type (the Java version requires the class HelloWorld). The Ruby equivalent of the Java for loop is the each method, which is called on the dynamically-typed ARGV variable. The body of the loop is contained in a block, a form of closure, which is a common feature in dynamic languages.

Statically-Typed Languages Are Not Necessarily Strongly-Typed Languages

Statically-typed programming languages can employ strong typing or weak typing. A programming language that employs strong typing specifies restrictions on the types of values supplied to its operations, and it prevents the execution of an operation if its arguments have the wrong type. A language that employs weak typing implicitly converts (or casts) the arguments of an operation if those arguments have the wrong or incompatible types.

Dynamically-typed languages can also employ strong typing or weak typing. For example, the Ruby programming language is dynamically-typed and strongly-typed. When a variable is initialized with a value of some type, the Ruby programming language does not implicitly convert that value into another data type.

In the following example, the Ruby programming language does not implicitly cast the number 2, which has an integer type (Fixnum in older Ruby versions, Integer since Ruby 2.4), to a string. Instead, evaluating a + 2 raises a TypeError.


a = "40"
b = a + 2
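For contrast, a statically-typed language can still convert operands implicitly. The following minimal sketch (the class name ImplicitConversion and the method name concat are illustrative only) shows that the Java + operator, when applied to a String and an int, converts the int to a string rather than rejecting the operation:

```java
class ImplicitConversion {

  // Applies + to a String and an int. The Java compiler inserts a
  // string conversion of the int operand before concatenating.
  static String concat(String a, int n) {
    return a + n;
  }

  public static void main(String[] args) {
    System.out.println(concat("40", 2)); // prints 402
  }
}
```

The types are still checked at compile time; the conversion is a rule of the + operator for String operands, not a sign that type checking was skipped.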

The Challenge of Compiling Dynamically-Typed Languages

Consider the following dynamically-typed method, addtwo, which adds any two numbers (which can be of any numeric type) and returns their sum:


def addtwo(a, b)
  a + b
end

Suppose your organization is implementing a compiler and runtime system for the programming language in which the method addtwo is written. In a strongly-typed language, whether typed statically or dynamically, the behavior of + (the addition operator) depends on the operand types. A compiler for a statically-typed language chooses the appropriate implementation of + based on the static types of a and b. For example, a Java compiler implements + with the iadd JVM instruction if the types of a and b are int.

A compiler for a dynamically-typed language must defer the choice until runtime. It cannot emit iadd, because that instruction requires the operand types to be statically known. Instead, the statement a + b is compiled as the method call +(a, b), where + is the method name. A method named + is permitted in the JVM but not in the Java programming language. If the runtime system for the dynamically-typed language is able to identify that a and b are variables of integer type, then the runtime system would prefer to call an implementation of + that is specialized for integer types rather than arbitrary object types.

The challenge of compiling dynamically-typed languages is how to implement a runtime system that can choose the most appropriate implementation of a method or function after the program has been compiled. Treating all variables as objects of Object type would not work efficiently; the Object class does not contain a method named +.
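To make the cost concrete, the following sketch shows one way a runtime system could implement + without any help from the JVM: every operand is passed as Object, and the implementation is chosen by testing types on every call. The class name GenericPlus and the set of supported types are assumptions made for illustration, not part of any real language runtime.

```java
class GenericPlus {

  // Chooses an implementation of + by inspecting the operand types at
  // runtime. The type tests are repeated on every single call.
  static Object plus(Object a, Object b) {
    if (a instanceof Integer && b instanceof Integer) {
      int x = (Integer) a;
      int y = (Integer) b;
      return x + y;                       // integer-specialized path
    }
    if (a instanceof Number && b instanceof Number) {
      // Fallback for other numeric types: add as doubles.
      return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }
    if (a instanceof String && b instanceof String) {
      return (String) a + (String) b;     // string concatenation
    }
    throw new IllegalArgumentException("unsupported operand types for +");
  }

  public static void main(String[] args) {
    System.out.println(plus(40, 2));      // prints 42
    System.out.println(plus(1.5, 2));     // prints 3.5
  }
}
```

This approach works, but the dispatch logic is opaque to the JVM, which is the inefficiency that custom call site linkage is meant to address.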

In Java SE 7 and later, the invokedynamic instruction enables the runtime system to customize the linkage between a call site and a method implementation. In this example, the invokedynamic call site is +. An invokedynamic call site is linked to a method by means of a bootstrap method, which is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site. Assuming the compiler emitted an invokedynamic instruction that invokes +, and assuming that the runtime system knows about the method adder(Integer,Integer), the runtime can link the invokedynamic call site to the adder method as follows:

IntegerOps.java


class IntegerOps {

  public static Integer adder(Integer x, Integer y) {
    return x + y;
  }
}

Example.java


import java.lang.invoke.*;

class Example {

  // Bootstrap method: called once by the JVM to link an invokedynamic call site.
  public static CallSite mybsm(
    MethodHandles.Lookup callerClass, String dynMethodName, MethodType dynMethodType)
    throws Throwable {

    // Look up the static method adder(Integer, Integer) declared in IntegerOps.
    // findStatic takes the declaring class and the simple method name.
    MethodHandle mh =
      callerClass.findStatic(
        IntegerOps.class,
        "adder",
        MethodType.methodType(Integer.class, Integer.class, Integer.class));

    // Adapt the handle if the call site expects a different method type.
    if (!dynMethodType.equals(mh.type())) {
      mh = mh.asType(dynMethodType);
    }

    // Permanently link the call site to the (possibly adapted) handle.
    return new ConstantCallSite(mh);
  }
}

In this example, the IntegerOps class belongs to the library that accompanies the runtime system for the dynamically-typed language.

The Example.mybsm method is a bootstrap method that links the invokedynamic call site to the adder method.

The callerClass object is a lookup object, which is a factory for creating method handles.

The MethodHandles.Lookup.findStatic method (called from the callerClass lookup object) creates a method handle for the static method adder.
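A lookup object can also be used outside a bootstrap method. The following sketch obtains one with MethodHandles.lookup(), creates a method handle with findStatic, and calls the method through the handle. The class FindStaticSketch and its local copy of adder are stand-ins introduced here so that the example is self-contained.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class FindStaticSketch {

  // Stand-in for IntegerOps.adder, kept in this class for self-containment.
  static Integer adder(Integer x, Integer y) {
    return x + y;
  }

  // Creates a method handle for adder with a lookup object, then calls
  // the method through the handle.
  static Integer addViaHandle(Integer x, Integer y) {
    try {
      MethodHandles.Lookup lookup = MethodHandles.lookup();
      MethodHandle mh = lookup.findStatic(
          FindStaticSketch.class,   // class that declares the method
          "adder",                  // simple method name, no class prefix
          MethodType.methodType(Integer.class, Integer.class, Integer.class));
      return (Integer) mh.invoke(x, y);
    } catch (Throwable t) {
      // Lookup failures and exceptions from the target both end up here.
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(addViaHandle(40, 2)); // prints 42
  }
}
```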

Note: This bootstrap method links an invokedynamic call site to only the code that is defined in the adder method. It assumes that the arguments given to the invokedynamic call site are Integer objects. A bootstrap method requires additional code to properly link invokedynamic call sites to the appropriate code to execute if the parameters of the bootstrap method (in this example, callerClass, dynMethodName, and dynMethodType) vary.

The java.lang.invoke.MethodHandles class and java.lang.invoke.MethodHandle class contain various methods that create method handles based on existing method handles. This example calls the asType method if the method type of the mh method handle does not match the method type specified by the dynMethodType parameter. This enables the bootstrap method to link invokedynamic call sites to Java methods whose method types don’t exactly match.
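As an illustration of deriving method handles from existing ones, the following sketch applies two such transformations to a handle for an adder method: MethodHandle.asType, which adapts the handle to a call site that uses primitive int values, and MethodHandles.insertArguments, which pre-binds the first argument. The class AdapterSketch and its helper methods are hypothetical names used only for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class AdapterSketch {

  // Stand-in for IntegerOps.adder.
  static Integer adder(Integer x, Integer y) {
    return x + y;
  }

  // Returns a handle of type (Integer,Integer)Integer for adder.
  static MethodHandle adderHandle() throws ReflectiveOperationException {
    return MethodHandles.lookup().findStatic(
        AdapterSketch.class, "adder",
        MethodType.methodType(Integer.class, Integer.class, Integer.class));
  }

  // asType: adapts the handle to type (int,int)int. Boxing of the
  // arguments and unboxing of the result are inserted by the adapter.
  static int addPrimitive(int x, int y) {
    try {
      MethodHandle asInts = adderHandle().asType(
          MethodType.methodType(int.class, int.class, int.class));
      return (int) asInts.invokeExact(x, y);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  // insertArguments: binds the first argument to 2, leaving a handle
  // of type (Integer)Integer.
  static Integer addTwo(Integer y) {
    try {
      MethodHandle plusTwo =
          MethodHandles.insertArguments(adderHandle(), 0, Integer.valueOf(2));
      return (Integer) plusTwo.invokeExact(y);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(addPrimitive(40, 2)); // prints 42
    System.out.println(addTwo(40));          // prints 42
  }
}
```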

The ConstantCallSite instance returned by the bootstrap method represents a call site to be associated with a distinct invokedynamic instruction. The target for a ConstantCallSite instance is permanent and can never be changed. In this case, only one Java method, adder, is a candidate for executing the call site. The target does not have to be a method written in Java. If several candidate methods are available to the runtime system, each handling different argument types, the mybsm bootstrap method could dynamically select the correct method based on the dynMethodType argument.
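The following sketch suggests what such a selection could look like. It assumes two candidate methods, addIntegers and addDoubles, and picks one by examining the first parameter type of dynMethodType. Because Java source code cannot emit an invokedynamic instruction with a custom bootstrap method, the sketch calls the bootstrap method by hand; the class SelectingBootstrap, the selection rule, and the helper linkAndInvoke are all assumptions made for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SelectingBootstrap {

  static Integer addIntegers(Integer x, Integer y) { return x + y; }

  static Double addDoubles(Double x, Double y) { return x + y; }

  // Bootstrap method: selects a target from the call site's method type.
  public static CallSite bsm(
      MethodHandles.Lookup lookup, String name, MethodType type)
      throws ReflectiveOperationException {
    Class<?> first = type.parameterType(0);
    boolean floating = first == Double.class || first == double.class;
    MethodHandle target = floating
        ? lookup.findStatic(SelectingBootstrap.class, "addDoubles",
            MethodType.methodType(Double.class, Double.class, Double.class))
        : lookup.findStatic(SelectingBootstrap.class, "addIntegers",
            MethodType.methodType(Integer.class, Integer.class, Integer.class));
    // Adapt the chosen handle to the exact type the call site requires.
    return new ConstantCallSite(target.asType(type));
  }

  // Stands in for the JVM: links a call site of the given type, then
  // invokes it once through the call site's invoker handle.
  static Object linkAndInvoke(MethodType type, Object a, Object b) {
    try {
      CallSite site = bsm(MethodHandles.lookup(), "+", type);
      return site.dynamicInvoker().invoke(a, b);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    MethodType ints =
        MethodType.methodType(Integer.class, Integer.class, Integer.class);
    MethodType doubles =
        MethodType.methodType(Double.class, Double.class, Double.class);
    System.out.println(linkAndInvoke(ints, 40, 2));       // prints 42
    System.out.println(linkAndInvoke(doubles, 1.5, 2.0)); // prints 3.5
  }
}
```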

The invokedynamic Instruction

You can use the invokedynamic instruction in implementations of compilers and runtime systems for dynamically typed languages on the JVM. The invokedynamic instruction enables the language implementer to define custom linkage. This contrasts with other JVM instructions such as invokevirtual, in which linkage behavior specific to Java classes and interfaces is hard-wired by the JVM.

Each instance of an invokedynamic instruction is called a dynamic call site. A dynamic call site starts in an unlinked state, with no method specified for the call site to invoke. The dynamic call site is linked to a method by means of a bootstrap method. A dynamic call site's bootstrap method is a method specified by the compiler for the dynamically-typed language. The method is called once by the JVM to link the site. The object returned from the bootstrap method permanently determines the call site's behavior.

The invokedynamic instruction contains a constant pool index (in the same format as for the other invoke instructions). This constant pool index references a CONSTANT_InvokeDynamic entry. This entry specifies the bootstrap method (a CONSTANT_MethodHandle entry), the name of the dynamically-linked method, and the argument types and return type of the call to the dynamically-linked method.

In the following example, the runtime system links the dynamic call site specified by the invokedynamic instruction (which is +, the addition operator) to the IntegerOps.adder method by using the Example.mybsm bootstrap method. The adder method and mybsm method are defined in The Challenge of Compiling Dynamically-Typed Languages (line breaks have been added for clarity):


invokedynamic   InvokeDynamic
  REF_invokeStatic:
    Example.mybsm:
      "(Ljava/lang/invoke/MethodHandles$Lookup;
        Ljava/lang/String;
        Ljava/lang/invoke/MethodType;)
      Ljava/lang/invoke/CallSite;":
    +:
      "(Ljava/lang/Integer;
        Ljava/lang/Integer;)
      Ljava/lang/Integer;";

Note: The bytecode examples use the syntax of the ASM Java bytecode manipulation and analysis framework.

Invoking a dynamically-linked method with the invokedynamic instruction involves the following steps:

  1. Defining the Bootstrap Method
  2. Specifying Constant Pool Entries
  3. Using the invokedynamic Instruction

Defining the Bootstrap Method

At runtime, the first time the JVM executes a particular invokedynamic instruction, it calls the bootstrap method. This method links the name that the invokedynamic instruction specifies with the code to execute (the target method), which is referenced by a method handle. The next time the JVM executes the same invokedynamic instruction, it does not call the bootstrap method; it invokes the linked method handle directly.
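The link-once behavior can be modeled in plain Java. The following sketch is only a single-threaded model of what the JVM does internally, not an actual invokedynamic instruction: a null field stands for the unlinked state, and a counter records how often the bootstrap method runs. The class LinkOnceModel and all of its members are hypothetical.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LinkOnceModel {

  static int bootstrapCalls = 0;   // how many times the bootstrap method ran
  private static CallSite site;    // null models the unlinked state

  static Integer adder(Integer x, Integer y) {
    return x + y;
  }

  // Bootstrap method: links the call site to adder.
  static CallSite bsm(MethodHandles.Lookup lookup, String name, MethodType type)
      throws ReflectiveOperationException {
    bootstrapCalls++;
    MethodHandle mh = lookup.findStatic(LinkOnceModel.class, "adder",
        MethodType.methodType(Integer.class, Integer.class, Integer.class));
    return new ConstantCallSite(mh.asType(type));
  }

  // Models one execution of the same invokedynamic instruction.
  static Integer execute(Integer a, Integer b) {
    try {
      if (site == null) {
        // First execution: call the bootstrap method to link the site.
        site = bsm(MethodHandles.lookup(), "+",
            MethodType.methodType(Integer.class, Integer.class, Integer.class));
      }
      // Every execution: invoke the linked method handle.
      return (Integer) site.dynamicInvoker().invoke(a, b);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(execute(40, 2));   // prints 42
    System.out.println(execute(1, 1));    // prints 2
    System.out.println(bootstrapCalls);   // prints 1
  }
}
```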

The bootstrap method's return type must be java.lang.invoke.CallSite. The CallSite object represents the linked state of the invokedynamic instruction and the method handle to which it is linked.

The bootstrap method takes three or more parameters, in the following order:

  • MethodHandles.Lookup object: A factory for creating method handles in the context of the invokedynamic instruction.
  • String object: The method name mentioned in the dynamic call site.
  • MethodType object: The resolved type signature of the dynamic call site.
  • Optionally, one or more additional static arguments to the invokedynamic instruction: These arguments, drawn from the constant pool, are intended to help language implementers safely and compactly encode additional metadata useful to the bootstrap method. In principle, the name and extra arguments are redundant because each call site could be given its own unique bootstrap method. However, such a practice is likely to produce large class files and constant pools.

See The Challenge of Compiling Dynamically-Typed Languages for an example of a bootstrap method.
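The following sketch shows a bootstrap method that declares one additional static argument after the three standard parameters. Here the extra argument is a String that names the implementation method to link; in a class file, its value would come from a constant pool entry referenced by the bootstrap method specifier. The class StaticArgBootstrap, the parameter implName, and the helper linkAndInvoke (which calls the bootstrap method by hand in place of the JVM) are assumptions made for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class StaticArgBootstrap {

  static Integer add(Integer x, Integer y) { return x + y; }

  static Integer multiply(Integer x, Integer y) { return x * y; }

  // The fourth parameter is an additional static argument. It lets many
  // call sites share one bootstrap method while linking to different code.
  public static CallSite bsm(
      MethodHandles.Lookup lookup, String name, MethodType type, String implName)
      throws ReflectiveOperationException {
    MethodHandle mh = lookup.findStatic(StaticArgBootstrap.class, implName,
        MethodType.methodType(Integer.class, Integer.class, Integer.class));
    return new ConstantCallSite(mh.asType(type));
  }

  // Stands in for the JVM: supplies the static argument, links, and invokes.
  static Integer linkAndInvoke(String implName, Integer a, Integer b) {
    try {
      MethodType type =
          MethodType.methodType(Integer.class, Integer.class, Integer.class);
      CallSite site = bsm(MethodHandles.lookup(), "op", type, implName);
      return (Integer) site.dynamicInvoker().invoke(a, b);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(linkAndInvoke("add", 40, 2));      // prints 42
    System.out.println(linkAndInvoke("multiply", 40, 2)); // prints 80
  }
}
```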

Specifying Constant Pool Entries

The invokedynamic instruction contains a reference to an entry in the constant pool with the CONSTANT_InvokeDynamic tag. This entry contains references to other entries in the constant pool and references to attributes. See the java.lang.invoke package documentation and The Java Virtual Machine Specification.

Example Constant Pool

The following example shows an excerpt from the constant pool for the class Example, which contains the bootstrap method Example.mybsm that links the method + with the Java method adder:


    class #159; // #47
    Utf8 "adder"; // #83
    Utf8 "(Ljava/lang/Integer;Ljava/lang/Integer;)Ljava/lang/Integer;"; // #84
    Utf8 "mybsm"; // #87
    Utf8 "(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;"; // #88
    Utf8 "Example"; // #159
    Utf8 "+"; // #166

    // ...

    NameAndType #83 #84; // #228
    Method #47 #228; // #229
    MethodHandle 6b #229; // #230
    NameAndType #87 #88; // #231
    Method #47 #231; // #232
    MethodHandle 6b #232; // #233
    NameAndType #166 #84; // #234
    Utf8 "BootstrapMethods"; // #235
    InvokeDynamic 0s #234; // #236

The constant pool entry for the invokedynamic instruction in this example contains the following values:

  • CONSTANT_InvokeDynamic tag
  • Unsigned short of value 0
  • Constant pool index #234.

The value, 0, refers to the first bootstrap method specifier in the array of specifiers that are stored in the BootstrapMethods attribute. Bootstrap method specifiers are not in the constant pool table. They are contained in this separate array of specifiers. Each bootstrap method specifier contains an index to a CONSTANT_MethodHandle constant pool entry, which is the bootstrap method itself.

The following example shows an excerpt from the same class file that shows the BootstrapMethods attribute, which contains the array of bootstrap method specifiers:


  [3] { // Attributes

    // ...

    Attr(#235, 6) { // BootstrapMethods at 0x0F63
      [1] { // bootstrap_methods
        {  //  bootstrap_method
          #233; // bootstrap_method_ref
          [0] { // bootstrap_arguments
          }  //  bootstrap_arguments
        }  //  bootstrap_method
      }
    } // end BootstrapMethods
  } // Attributes

The constant pool entry for the bootstrap method mybsm method handle contains the following values:

  • CONSTANT_MethodHandle tag
  • Unsigned byte of value 6
  • Constant pool index #232.

The value, 6, is the REF_invokeStatic subtag. See Using the invokedynamic Instruction for more information about this subtag.
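The chain of indexes from the CONSTANT_InvokeDynamic entry to the bootstrap method can be followed mechanically. The following sketch models the relevant entries from the excerpts above as plain Java maps and walks them. It is a simplified model for illustration, not a class file parser; the class ConstantPoolWalk and its data layout are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class ConstantPoolWalk {

  static final Map<Integer, String> UTF8 = new HashMap<>();
  static final Map<Integer, int[]> ENTRIES = new HashMap<>();
  // BootstrapMethods attribute: one specifier, which refers to entry #233.
  static final int[] BOOTSTRAP_METHODS = { 233 };

  static {
    UTF8.put(87, "mybsm");
    UTF8.put(159, "Example");
    UTF8.put(166, "+");
    ENTRIES.put(47, new int[] { 159 });       // class #159
    ENTRIES.put(231, new int[] { 87, 88 });   // NameAndType #87 #88
    ENTRIES.put(232, new int[] { 47, 231 });  // Method #47 #231
    ENTRIES.put(233, new int[] { 6, 232 });   // MethodHandle 6b #232
    ENTRIES.put(234, new int[] { 166, 84 });  // NameAndType #166 #84
    ENTRIES.put(236, new int[] { 0, 234 });   // InvokeDynamic 0s #234
  }

  // Follows InvokeDynamic -> specifier -> MethodHandle -> Method.
  static String bootstrapMethodOf(int indyIndex) {
    int specifierIndex = ENTRIES.get(indyIndex)[0];
    int handleIndex = BOOTSTRAP_METHODS[specifierIndex];
    int methodIndex = ENTRIES.get(handleIndex)[1];
    int[] method = ENTRIES.get(methodIndex);  // { class, NameAndType }
    String className = UTF8.get(ENTRIES.get(method[0])[0]);
    String methodName = UTF8.get(ENTRIES.get(method[1])[0]);
    return className + "." + methodName;
  }

  // Follows InvokeDynamic -> NameAndType -> name.
  static String callSiteNameOf(int indyIndex) {
    int nameAndTypeIndex = ENTRIES.get(indyIndex)[1];
    return UTF8.get(ENTRIES.get(nameAndTypeIndex)[0]);
  }

  public static void main(String[] args) {
    System.out.println(bootstrapMethodOf(236)); // prints Example.mybsm
    System.out.println(callSiteNameOf(236));    // prints +
  }
}
```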

Using the invokedynamic Instruction

The following example shows how the bytecode uses the invokedynamic instruction to call the mybsm bootstrap method, which links the dynamic call site (+, the addition operator) to the adder method. This example uses the + method to add the numbers 40 and 2 (line breaks have been added for clarity):


bipush  40;
invokestatic    Method java/lang/Integer.valueOf:"(I)Ljava/lang/Integer;";
iconst_2;
invokestatic    Method java/lang/Integer.valueOf:"(I)Ljava/lang/Integer;";
invokedynamic   InvokeDynamic
  REF_invokeStatic:
    Example.mybsm:
      "(Ljava/lang/invoke/MethodHandles$Lookup;
        Ljava/lang/String;
        Ljava/lang/invoke/MethodType;)
      Ljava/lang/invoke/CallSite;":
    +:
      "(Ljava/lang/Integer;
        Ljava/lang/Integer;)
      Ljava/lang/Integer;";

The first four instructions push the integers 40 and 2 onto the stack and box them in the java.lang.Integer wrapper type. The fifth instruction invokes a dynamic method. This instruction refers to a constant pool entry with a CONSTANT_InvokeDynamic tag:


REF_invokeStatic:
  Example.mybsm:
    "(Ljava/lang/invoke/MethodHandles$Lookup;
      Ljava/lang/String;
      Ljava/lang/invoke/MethodType;)
    Ljava/lang/invoke/CallSite;":
  +:
    "(Ljava/lang/Integer;
      Ljava/lang/Integer;)
    Ljava/lang/Integer;";

Four bytes follow the CONSTANT_InvokeDynamic tag in this entry:

  • The first two bytes form an index into the array of bootstrap method specifiers, which selects the specifier for this call site's bootstrap method:

    
    REF_invokeStatic:
      Example.mybsm:
        "(Ljava/lang/invoke/MethodHandles$Lookup;
          Ljava/lang/String;
          Ljava/lang/invoke/MethodType;)
        Ljava/lang/invoke/CallSite;"
    

    The bootstrap method specifier is not in the constant pool table. It is contained in a separate array defined by a class file attribute named BootstrapMethods. The bootstrap method specifier contains an index to a CONSTANT_MethodHandle constant pool entry, which identifies the bootstrap method itself.

    Three bytes follow the CONSTANT_MethodHandle tag in that constant pool entry:

    • The first byte is the REF_invokeStatic subtag. This means that the method handle refers to a static method; here, that method is the static bootstrap method mybsm, which in turn links the dynamic call site with the static Java adder method.

    • The next two bytes form a reference to a CONSTANT_Methodref entry that represents the method for which the method handle is to be created:

      
      Example.mybsm:
        "(Ljava/lang/invoke/MethodHandles$Lookup;
          Ljava/lang/String;
          Ljava/lang/invoke/MethodType;)
        Ljava/lang/invoke/CallSite;"
      

      In this example, the fully qualified name of the bootstrap method is Example.mybsm. The argument types are MethodHandles.Lookup, String, and MethodType. The return type is CallSite.

  • The next two bytes form a reference to a CONSTANT_NameAndType entry:

    
    +:
      "(Ljava/lang/Integer;
        Ljava/lang/Integer;)
      Ljava/lang/Integer;"
    

    This constant pool entry specifies the method name (+), the argument types (two Integer instances), and return type of the dynamic call site (Integer).

In this example, the dynamic call site is presented with boxed integer values, which exactly match the type of the eventual target, the adder method. In practice, the argument and return types don’t need to exactly match. For example, the invokedynamic instruction could pass either or both of its operands on the JVM stack as primitive int values. Either or both operands could be untyped Object values. The invokedynamic instruction could receive its result as a primitive int value, or an untyped Object value. In any case, the dynMethodType argument to mybsm accurately describes the method type that is required by the invokedynamic instruction.

The adder method itself could also be declared with primitive or untyped arguments or return values. The bootstrap method is responsible for making up any difference between the dynMethodType and the type of the adder method. As shown in the code for mybsm, this is easily done with an asType call on the target method handle.