Annotations on Java types
JSR 308 working document
The JSR 308 webpage is https://checkerframework.org/jsr308/. This document is available in PDF at https://checkerframework.org/jsr308/java-annotation-design.pdf.
We propose an extension to Java's annotation system [Bra04a] that permits annotations to appear on any use of a type, not just on class/method/field/variable declarations, as is the case in Java SE 6. Such a generalization removes arbitrary limitations of Java's annotation system, and it enables new uses of annotations. The proposal also notes a few other possible extensions to annotations (see Section 6).
This document specifies the syntax of extended Java annotations, but it makes no commitment as to their semantics. As with Java's existing annotations [Bra04a], the semantics is dependent on annotation processors (compiler plug-ins), and not every annotation is necessarily sensible in every location where it is syntactically permitted to appear. This proposal is compatible with existing annotations, such as those specified in JSR 250, “Common Annotations for the Java Platform” [Mor06], and JSR 305, “Annotations for Software Defect Detection” [Pug06].
This proposal does not change the compile-time, load-time, or run-time semantics of Java. It does not change the abilities of Java annotation processors [Dar06]. The proposal merely makes annotations more general — and thus more useful for their current purposes, and also usable for new purposes that are compatible with the original vision for annotations [Bra04a].
JSR 308 “Annotations on Java Types” [EC06] has the goal of refining the ideas presented here. This proposal will serve as a starting point for the JSR 308 expert group, but the expert group has the freedom to explore other approaches or to modify the proposal outlined here.
This document first motivates annotations on types by presenting one possible use, type qualifiers (section 2). Then, it proposes minor changes to the Java language annotation syntax (section 3) and class file format (section 4). Finally, section 5 describes the modifications to the JDK necessitated by the Java and class file changes.
One example use of annotations on types is to create custom type qualifiers in the Java language. Type qualifiers are modifiers that provide extra information about a type or variable; they can be thought of as a form of subtyping. A designer can define new type qualifiers using Java annotations, and can provide plug-ins to check their semantics (for instance, by issuing lint-like warnings during compilation). A programmer can then use these type qualifiers throughout a program to obtain additional guarantees at compile time about the program. A system for custom type qualifiers requires extensions to Java's annotation system, described in this document; the existing Java SE 6 annotations are inadequate. Similarly to type qualifiers, other pluggable type systems [Bra04b] and similar lint-like checkers also require these extensions to Java's annotation system.
Type qualifiers can help prevent errors and make possible a variety of program analyses. Since they are user-defined, developers can create and use the type qualifiers that are most appropriate for their software.
Our key goal is to create a type qualifier system that is compatible with the Java language, VM, and toolchain. Previous proposals for Java type qualifiers are incompatible with the existing Java language and tools, are too inexpressive, or both. The use of annotations for custom type qualifiers has a number of benefits over new Java keywords or special comments. First, Java already implements annotations, and Java SE 6 features a framework for compile-time annotation processing. This allows our system to build upon existing stable mechanisms and integrate with the Java toolchain, and it promotes the maintainability and simplicity of our modifications. Second, since annotations do not affect the runtime semantics of a program, applications written with custom type qualifiers are backward-compatible with the vanilla JDK. No modifications to the virtual machine are necessary.
The ability to place annotations on arbitrary occurrences of a type improves the expressiveness of annotations, which has many benefits for Java programmers. Here we mention just one use that is enabled by extended annotations, namely the creation of type qualifiers.
As an example of how our system might be used, consider a @NonNull type qualifier that signifies that a variable should never be assigned null [Det96, Eva96, DLNS98, FL03, CMM05]. The code in figure 1 demonstrates @NonNull in a sample method. A programmer can annotate any use of a type with the @NonNull annotation. A compiler plug-in would check that a @NonNull variable is never assigned a possibly-null value, thus enforcing the @NonNull type system.
    @NonNullDefault
    class DAG {

        Set<Edge> edges;

        // ...

        List<Vertex> getNeighbors(@Interned @Readonly Vertex v) @Readonly {
            List<Vertex> neighbors = new LinkedList<Vertex>();
            for (Edge e : edges)
                if (e.from() == v)
                    neighbors.add(e.to());
            return neighbors;
        }
    }
@Readonly is another example of a useful type qualifier [BE04, TE05, GF05, KT01, SW01, PBKM00]. Similar to C's const, an object's internal state may not be modified through references that are declared @Readonly. A type qualifier designer would create a plug-in to check the semantics of @Readonly. For instance, a method may only be called on a @Readonly object if the method was declared with a @Readonly receiver. @Readonly's immutability guarantee can help developers avoid accidental modifications, which are often manifested as run-time errors.
Additional examples of useful type qualifiers abound. We mention just a few others. C uses the const, volatile, and restrict type qualifiers. Type qualifiers YY for two-digit year strings and YYYY for four-digit year strings helped to detect, then verify the absence of, Y2K errors [EFA99]. Type qualifiers can indicate data that originated from an untrustworthy source [PØ95, VS97]; examples for C include user vs. kernel indicating user-space and kernel-space pointers in order to prevent attacks on operating systems [JW04], and tainted for strings that originated in user input and that should not be used as a format string [STFW01]. A localizable qualifier can indicate where translation of user-visible messages should be performed; similarly, annotations can indicate other properties of a string's contents, such as its format or encoding. An interned qualifier can indicate which objects have been converted to canonical form and thus may be compared via object equality. Type qualifiers such as unique and unaliased can express properties about pointers and aliases [Eva96, CMM05]; other qualifiers can detect and prevent deadlock in concurrent programs [FTA02, AFKT03]. Flow-sensitive type qualifiers [FTA02] can express typestate properties such as whether a file is in the open, read, write, readwrite, or closed state, and can guarantee that a file is opened for reading before it is read, etc. The Vault language's type guards and capability states are similar [DF01].
Our system extends Java to allow annotations on any use of a type (whether explicit or implicit, as in the case of method receivers). In Java SE 6, annotations can be written on method parameters and the declarations of classes, methods, fields, and local variables. Our system additionally allows annotations to be written in the following locations, with the given syntax. (The specific annotation names, such as @NonNull, are examples only; this document does not propose any annotations, merely specifying where they can appear in Java code.)
for method receivers:
    public int size() @Readonly { ... }
for generic type arguments:
    Map<@NonNull String, @NonEmpty List<@Readonly Document>> files;
for arrays:
    Document[@Readonly][] docs4 = new Document[@Readonly 2][12];
    Document[][@Readonly] docs5 = new Document[2][@Readonly 12];
  This syntax permits independent annotations for each distinct level of array, and for the elements.
for typecasts:
    myString = (@NonNull String) myObject;
for type tests:
    boolean isNonNull = myString instanceof @NonNull String;
for object creation:
    new @NonEmpty @Readonly List<String>(myNonEmptyStringSet)
for type parameter bounds:
    class Folder<F extends @Existing File> { ... }
for class inheritance:
    class UnmodifiableList<T> implements @Readonly List<@Readonly T> { ... }
for throws clauses:
    void monitorTemperature() throws @Critical TemperatureException { ... }
These annotations are necessary in order to fully specify the signatures and the implementations of Java classes and methods.
There is no need for new syntax for annotations on return types, because Java already permits an annotation to appear before a method return type. Currently, such annotations are interpreted as on the method declaration — for example, the @Deprecated annotation indicates that the method is deprecated. The person who defines the annotation decides whether an annotation that appears before the return value applies to the method declaration or to the return type. (A third possibility is that the annotation is not sensible in this position, in which case the annotation processing plug-in should issue a warning or error.)
We now expand on how some of the annotations on types may be used. Each of these uses is either impossible or extremely inconvenient in the absence of the new locations for annotations proposed in this document. For brevity, we do not give examples of uses for every type annotation. It is worthwhile to permit annotations on all uses of types (even those for which no immediate use is apparent) for consistency, expressiveness, and support of unforeseen future uses.
As with annotations on formal parameters (which are already permitted by Java), annotations on the receiver may not affect compile-time resolution of overloading nor run-time resolution of overriding.
Generic collection classes are declared one level at a time, so it is easy to annotate each level individually. It is desirable that the syntax for arrays be equally expressive, for uniformity.
The primary purpose of annotations on type casts is to prevent, not to enable, run-time checks. For example, an annotation processor could prohibit code such as this:
    @Readonly Object x = new Date();
    ... (Date) x ...   // annotation processor error to cast away @Readonly
This cast uses x as a non-@Readonly object, which changes its type and would require a run-time mechanism to enforce type safety. By contrast, an annotation processor would permit a use such as
    @Readonly Object x = new Date();
    ... (@Readonly Date) x ...   // legal; use of @Readonly has no run-time effect
which preserves the annotation part of the type and thus guarantees type safety without run-time checks.
Another potential use for annotations on type casts is — like ordinary Java casts — to provide the compiler with information that is beyond the ability of its typing rules. An annotation processing tool could trust such type casts, perhaps issuing a warning to remind users to verify their safety by hand. An alternative approach would be to check the type cast dynamically, as Java casts are, but we do not endorse such an approach, because annotations are not intended to change the run-time behavior of a Java program.
As with type cast annotations, the presence of an annotation on a type test does not imply a run-time test. The intention is that an annotation processor would require the annotation parts of the type and of the argument (the expression being tested) to be the same. This permits the idiom
if (x instanceof T) { ... (T) x ... }
to be used with the same type T in both occurrences; by contrast, using different types in the type test and the type cast would be confusing.
Because annotations are not intended to change the run-time behavior of a Java program, this proposal does not endorse dynamically evaluating the type test to determine whether a qualifier is present on a type. In the implementation, there is no run-time representation of the annotations on an object's type, so such a test would be impossible to implement, even if it were desired; there is no danger that a qualifier on the type test will change the run-time behavior.
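The cast and type-test idioms described above can be tried with a compiler that implements annotations on type uses. The sketch below assumes Java SE 8 or later, where an annotation is accepted in these positions if it is declared with @Target(ElementType.TYPE_USE). The Readonly annotation and the CastAndTestDemo class are hypothetical, and no checker is attached, so the annotations are accepted but not enforced.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.Date;

// Hypothetical qualifier; TYPE_USE (an assumption: Java SE 8+) lets it appear on any use of a type.
@Target(ElementType.TYPE_USE)
@interface Readonly {}

final class CastAndTestDemo {
    /** Returns x viewed as a read-only Date, or null if x is not a Date. */
    static @Readonly Date asReadonlyDate(@Readonly Object x) {
        // The same qualified type appears in the type test and in the cast.
        if (x instanceof @Readonly Date) {
            // The qualifier adds nothing at run time; only the ordinary Date check happens.
            return (@Readonly Date) x;
        }
        return null;
    }
}
```

The method returns the very object it was given when that object is a Date, which is consistent with the qualifier having no run-time representation.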
These annotations also provide a convenient way to alias otherwise cumbersome types. For instance, a programmer might declare
final class MyStringMap extends @Readonly Map<@NonNull String, @NonEmpty List<@NonNull @Readonly String>> {}
so that MyStringMap may be used in place of the full, unpalatable supertype.
Not every type system (or other system using annotations) may utilize every possible annotation location. For example, a system that specifies signatures but infers types for implementations [GF05] may not need annotations on typecasts, object creation, local variables, or certain other locations. Other systems may forbid top-level (non-type-argument, non-array) annotations on object creation (new) expressions, such as new @Interned Object(). However, the annotation system proposed here is expressive enough to handle arbitrary type qualifiers.
Consider the following annotated arrays.
    @Readonly Document[] docs1;
    Document[@Readonly] docs2;
    @Readonly Document[][] docs3 = new Document[2][12];
    Document[@Readonly][] docs4 = new Document[@Readonly 2][12];
    Document[][@Readonly] docs5 = new Document[2][@Readonly 12];
This syntax permits independent annotations for each distinct level of array, and for the elements.
There are two (incompatible) ways to interpret the syntax, and we must choose one of them. Here, we present both.
Option 1:
An annotation before the entire array type binds to the member type that it abuts; @Readonly Document[][] docs4 can be interpreted as (@Readonly Document)[][] docs4.
An annotation within brackets refers to the array that is accessed using those brackets.
The type of elements of @A Object[@B][@C] is @A Object[@C].
For example, the declarations above have the following meanings:
docs1 is a mutable one-dimensional array of immutable Documents.
docs2 is an immutable one-dimensional array of mutable Documents.
docs3 is a mutable array, whose elements are mutable one-dimensional arrays of immutable Documents.
docs4 is an immutable array, whose elements are mutable one-dimensional arrays of mutable Documents.
docs5 is a mutable array, whose elements are immutable one-dimensional arrays of mutable Documents.
Option 2:
An annotation before the entire array type refers to the (reference to the) top-level array itself; @Readonly Document[][] docs4 indicates that the array is non-modifiable (not that the Documents in it are non-modifiable).
An annotation within brackets applies to the elements that are accessed using those brackets.
The type of elements of @A Object[@B][@C] is @B Object[@C].
For example, the declarations above have the following meanings:
docs1 is an immutable one-dimensional array of mutable Documents.
docs2 is a mutable one-dimensional array of immutable Documents.
docs3 is an immutable array, whose elements are mutable one-dimensional arrays of mutable Documents.
docs4 is a mutable array, whose elements are immutable one-dimensional arrays of mutable Documents.
docs5 is a mutable array, whose elements are mutable one-dimensional arrays of immutable Documents.
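The difference between the two options can be made concrete with a small model. The sketch below is only an illustration (the AnnotatedArray class and its methods are hypothetical, not part of the proposal): it records the annotation written before the type and the annotation inside each bracket pair, and computes the element type under each reading.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of an annotated array type such as "@A Object[@B][@C]". */
final class AnnotatedArray {
    final String prefix;          // annotation written before the whole type
    final String base;            // name of the non-array member type
    final List<String> brackets;  // annotation inside each bracket pair, left to right

    AnnotatedArray(String prefix, String base, List<String> brackets) {
        this.prefix = prefix;
        this.base = base;
        this.brackets = new ArrayList<String>(brackets);
    }

    /** Option 1: the prefix stays on the member type; the first bracket annotation
     *  described the array itself and is dropped. */
    AnnotatedArray elementTypeOption1() {
        return new AnnotatedArray(prefix, base, brackets.subList(1, brackets.size()));
    }

    /** Option 2: the prefix described the array itself and is dropped; the first
     *  bracket annotation described its elements and becomes the new prefix. */
    AnnotatedArray elementTypeOption2() {
        return new AnnotatedArray(brackets.get(0), base, brackets.subList(1, brackets.size()));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(prefix).append(' ').append(base);
        for (String b : brackets) {
            sb.append('[').append(b).append(']');
        }
        return sb.toString();
    }
}
```

For @A Object[@B][@C], the first method yields @A Object[@C] and the second yields @B Object[@C], matching the element-type rule stated for each option.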
Java annotations (including the extended annotations) must be stored in the class file for two reasons. First, they may be part of the interface of a class and, if so, must be available to the compiler (really, to the type-checking plug-in [Dar06]) when compiling clients of the class. Second, since class files may originate from any source, the information may be useful in other contexts, such as compile-time verification.
This document proposes conventions for storing the annotations described in section 3, as well as for storing local variable annotations, which are permitted in Java syntax but currently discarded by the compiler. Class files already store annotations in the form of “attributes” [Bra04a, LY]. JVMs ignore unknown attributes. For backward compatibility, we use new attributes for storing the type annotations. In other words, our proposal merely reserves the names of a few attributes and specifies their layout. Our proposal does not alter the way that existing annotations on classes, methods, method parameters, and fields are stored in the class file. Class files generated from programs that use no new annotations will be identical to those generated by a standard Java SE 6 (that is, pre-extended-annotations) compiler.
In Java SE 6, annotations are stored in the class file in attributes of the classes, fields, or methods they target. Attributes are sections of the class file that associate data with a program element (a method's bytecodes, for instance, are stored in a Code attribute). The RuntimeVisibleAnnotations attribute is used for annotations that are accessible at runtime using reflection, and the RuntimeInvisibleAnnotations attribute is used for annotations that are not accessible at runtime. These attributes contain arrays of annotation structure elements, which in turn contain arrays of element_value pairs. The element_value pairs store the names and values of an annotation's arguments.
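The choice between the two attributes follows from an annotation's retention policy. A minimal sketch using only the Java SE 6 API (the annotation and class names are hypothetical): an annotation declared with RetentionPolicy.RUNTIME is visible to reflection, while one declared with RetentionPolicy.CLASS is kept in the class file but is not.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical qualifier that reflection can see (RuntimeVisibleAnnotations).
@Retention(RetentionPolicy.RUNTIME)
@interface VisibleQualifier {}

// Hypothetical qualifier kept in the class file only (RuntimeInvisibleAnnotations).
@Retention(RetentionPolicy.CLASS)
@interface InvisibleQualifier {}

@VisibleQualifier
@InvisibleQualifier
final class RetentionDemo {
    /** Reports whether reflection can see the given annotation on this class. */
    static boolean visible(Class<? extends Annotation> annotationType) {
        return RetentionDemo.class.isAnnotationPresent(annotationType);
    }
}
```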
Our proposal introduces two new attributes: RuntimeVisibleTypeAnnotations and RuntimeInvisibleTypeAnnotations. These attributes are structurally identical to the RuntimeVisibleAnnotations and RuntimeInvisibleAnnotations attributes described above with one exception: rather than an array of annotation elements, RuntimeVisibleTypeAnnotations and RuntimeInvisibleTypeAnnotations contain an array of extended_annotation elements, which are described in section 4.1 below.
The Runtime[In]visibleTypeAnnotations attributes store annotations written in the new locations described in section 3, and on local variables. For annotations in the types of a field, the field_info structure (see JVMS 4.6) corresponding to that field stores the Runtime[In]visibleTypeAnnotations attributes. In all other cases, the method_info structure (see JVMS 4.7) that corresponds to the annotations' containing method stores the Runtime[In]visibleTypeAnnotations attributes.
The extended_annotation structure has the following format, which adds target_type and reference_info to the annotation structure defined in JVMS 4.8.16:
    extended_annotation {
        u2 type_index;
        u2 num_element_value_pairs;
        {
            u2 element_name_index;
            element_value value;
        } element_value_pairs[num_element_value_pairs];
        u1 target_type;     // new in our proposal
        {
            ...
        } reference_info;   // new in our proposal
    }
The following sections describe the fields of the extended_annotation structure that differ from annotation.
The target_type field denotes the type of program element that the annotation targets. As described above, annotations in any of the following locations are written to Runtime[In]visibleTypeAnnotations attributes in the class file:
The corresponding values for each of these cases are shown in Figure 2. Some locations are assigned numbers even though annotations in those locations are prohibited or are actually written to Runtime[In]visibleAnnotations or Runtime[In]visibleParameterAnnotations. While those locations will never appear in a target_type field, including them in the enumeration may be convenient for software that processes extended annotations. They are marked * in Figure 2.
    Annotation Target                            target_type Value
    typecast                                     0x00
    typecast generic/array                       0x01
    type test (instanceof)                       0x02
    type test (instanceof) generic/array         0x03
    object creation (new)                        0x04
    object creation (new) generic/array          0x05
    method receiver                              0x06
    method receiver generic/array                0x07*
    local variable                               0x08
    local variable generic/array                 0x09
    method return type                           0x0A*
    method return type generic/array             0x0B
    method parameter                             0x0C*
    method parameter generic/array               0x0D
    field                                        0x0E*
    field generic/array                          0x0F
    class type parameter bound                   0x10
    class type parameter bound generic/array     0x11
    method type parameter bound                  0x12
    method type parameter bound generic/array    0x13
    class extends/implements                     0x14
    class extends/implements generic/array       0x15
    exception type in throws                     0x16
    exception type in throws generic/array       0x17*
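One way for software that processes extended annotations to represent the values in Figure 2 is an enumeration of the base targets. The sketch below is an assumption about how such a tool might be organized (the TargetType name and its methods are hypothetical). It relies only on the pattern visible in Figure 2, where every base target has an even value and its generic/array variant is that value plus one.

```java
/** target_type values for the base targets in Figure 2 (constant names are hypothetical). */
enum TargetType {
    TYPECAST(0x00),
    INSTANCEOF(0x02),
    NEW(0x04),
    METHOD_RECEIVER(0x06),
    LOCAL_VARIABLE(0x08),
    METHOD_RETURN(0x0A),
    METHOD_PARAMETER(0x0C),
    FIELD(0x0E),
    CLASS_TYPE_PARAMETER_BOUND(0x10),
    METHOD_TYPE_PARAMETER_BOUND(0x12),
    CLASS_EXTENDS_IMPLEMENTS(0x14),
    THROWS(0x16);

    final int value;

    TargetType(int value) {
        this.value = value;
    }

    /** In Figure 2 each generic/array variant is the base value plus one. */
    int genericOrArrayValue() {
        return value + 1;
    }

    /** True if the u1 value denotes a generic/array variant (the odd values in Figure 2). */
    static boolean isGenericOrArray(int targetType) {
        return (targetType & 1) == 1;
    }

    /** Maps a u1 value back to its base target, ignoring the generic/array bit. */
    static TargetType fromValue(int targetType) {
        for (TargetType t : values()) {
            if (t.value == (targetType & ~1)) {
                return t;
            }
        }
        throw new IllegalArgumentException("unknown target_type: " + targetType);
    }
}
```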
The reference_info field is used to reference the annotation's target in bytecode. The contents of the reference_info field is determined by the value of target_type.
When the annotation's target is a typecast, an instanceof expression, or a new expression, reference_info has the following structure:
{ u2 offset; } reference_info;
The offset field denotes the offset (i.e., within the bytecodes of the containing method) of the checkcast bytecode emitted for the typecast, the instanceof bytecode emitted for the type tests, or of the new bytecode emitted for the object creation expression.
For annotated typecasts, the attribute may be attached to a checkcast bytecode, or to any other bytecode. The rationale for this is that the Java compiler is permitted to omit checkcast bytecodes for typecasts that are guaranteed to be no-ops. For example, a cast from String to @NonNull String may be a no-op for the underlying Java type system (which sees a cast from String to String). If the compiler omits the checkcast bytecode, the @NonNull attribute would be attached to the (last) bytecode that creates the target expression instead. This approach permits code generation for existing compilers to be unaffected.
One technical challenge is that two differently-annotated expressions might be combined via common subexpression elimination.
When the annotation's target is a local variable, reference_info has the following structure:
{ u2 start_pc; u2 length; u2 index; } reference_info;
The start_pc and length fields specify the variable's live range in the bytecodes of the local variable's containing method (from offset start_pc to offset start_pc + length). The index field stores the local variable's index in that method. These fields are similar to those of the optional LocalVariableTable attribute defined in JVMS 4.8.13.
Storing local variable annotations in the class file raises certain challenges. For example, live ranges are not isomorphic to local variables. Further, a local variable with no live range may not appear in the class file (but it is also irrelevant to the program).
When the annotation's target is a method receiver, reference_info is empty.
When the annotation's target is a bound of a type parameter of a class or method, reference_info has the following structure:
{ u1 param_index; u1 bound_index; } reference_info;
param_index specifies the index of the type parameter, while bound_index specifies the index of the bound. Consider the following example:
<T extends @A Object & @B Comparable, U extends @C Cloneable>
Here @A has param_index 0 and bound_index 0, @B has param_index 0 and bound_index 1, and @C has param_index 1 and bound_index 0.
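The index pairs can be enumerated with the existing reflection API, as a rough illustration of how param_index and bound_index are counted. The Bounded class and the BoundIndexes helper are hypothetical, and the bounds are plain (unannotated) types, since Java SE 6 does not accept annotations in this position.

```java
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with two type parameters; T has two bounds, U has one.
class Bounded<T extends Number & Comparable<T>, U extends Cloneable> {}

final class BoundIndexes {
    /** Lists {param_index, bound_index} for every bound of every type parameter of c. */
    static List<int[]> of(Class<?> c) {
        List<int[]> result = new ArrayList<int[]>();
        TypeVariable<?>[] params = c.getTypeParameters();
        for (int p = 0; p < params.length; p++) {
            Type[] bounds = params[p].getBounds();
            for (int b = 0; b < bounds.length; b++) {
                result.add(new int[] {p, b});
            }
        }
        return result;
    }
}
```

For Bounded, the pairs come out in declaration order: the two bounds of T first, then the single bound of U.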
When the annotation's target is a type in an extends or implements clause, reference_info has the following structure:
{ u1 type_index; } reference_info;
type_index specifies the index of the type in the clause: -1 (255) is used if the annotation is on the superclass type, and the value i is used if the annotation is on the ith superinterface type.
When the annotation's target is a type in a throws clause, reference_info has the following structure:
{ u1 type_index; } reference_info;
type_index specifies the index of the exception type in the clause: the value i denotes an annotation on the ith exception type.
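As a rough sketch of how a class file reader might decode the structures above, the following reads reference_info for the non-generic targets whose layout is spelled out in this section. The ReferenceInfoReader class is hypothetical. Class file data is big-endian, which DataInputStream matches, and target_type values whose layout is not covered here are rejected rather than guessed at.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReferenceInfoReader {
    /** Decodes reference_info for a non-generic target_type into named fields. */
    static Map<String, Integer> read(int targetType, byte[] bytes) {
        Map<String, Integer> fields = new LinkedHashMap<String, Integer>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            switch (targetType) {
                case 0x00: case 0x02: case 0x04:   // typecast, instanceof, new
                    fields.put("offset", in.readUnsignedShort());
                    break;
                case 0x06:                         // method receiver: empty
                    break;
                case 0x08:                         // local variable
                    fields.put("start_pc", in.readUnsignedShort());
                    fields.put("length", in.readUnsignedShort());
                    fields.put("index", in.readUnsignedShort());
                    break;
                case 0x10: case 0x12:              // class or method type parameter bound
                    fields.put("param_index", in.readUnsignedByte());
                    fields.put("bound_index", in.readUnsignedByte());
                    break;
                case 0x14: case 0x16:              // extends/implements, throws
                    fields.put("type_index", in.readUnsignedByte());
                    break;
                default:
                    throw new IllegalArgumentException(
                        "layout not covered by this sketch: " + targetType);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }
}
```

Reading the single byte 0xFF for an extends/implements target gives type_index 255, the unsigned form of the -1 that marks the superclass.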
When the annotation's target is a generic type argument or array type, reference_info contains what it normally would for the raw type (e.g., offset for an annotation on a type argument in a typecast), plus the following fields at the end:
u2 location_length; u1 location[location_length];
The location_length field specifies the number of elements in the variable-length location field. location encodes which type argument or array element the annotation targets. Specifically, the ith item in location denotes the index of the type argument or array dimension at the ith level of the hierarchy. Figure 3 shows the values of the location_length and location fields for the annotations in a sample field declaration.
Declaration: @A Map<@B Comparable<@C Object[@D][@E][@F]>, @G List<@H Document>>
    Annotation    location_length    location
    @A            not applicable
    @B            1                  0
    @C            2                  0, 0
    @D            3                  0, 0, 0
    @E            3                  0, 0, 1
    @F            3                  0, 0, 2
    @G            1                  1
    @H            2                  1, 0
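The location values in Figure 3 can be reproduced by treating the declaration as a tree and recording the child indexes on the path to each annotation. This is a sketch under that assumption (the TypeNode class is hypothetical); modeling the array levels @D, @E, and @F as children 0, 1, and 2 of @C is what makes the paths agree with the figure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tree node: one annotated type plus its type arguments or array levels. */
final class TypeNode {
    final String annotation;
    final List<TypeNode> children;

    TypeNode(String annotation, TypeNode... children) {
        this.annotation = annotation;
        this.children = Arrays.asList(children);
    }

    /** Maps each annotation to its location: the child indexes on the path from the root. */
    static Map<String, List<Integer>> locations(TypeNode root) {
        Map<String, List<Integer>> out = new LinkedHashMap<String, List<Integer>>();
        walk(root, new ArrayList<Integer>(), out);
        return out;
    }

    private static void walk(TypeNode node, List<Integer> path, Map<String, List<Integer>> out) {
        out.put(node.annotation, new ArrayList<Integer>(path));
        for (int i = 0; i < node.children.size(); i++) {
            path.add(i);
            walk(node.children.get(i), path, out);
            path.remove(path.size() - 1);  // remove by index to step back up one level
        }
    }

    /** The declaration from Figure 3. */
    static TypeNode figure3() {
        return new TypeNode("@A",
            new TypeNode("@B",
                new TypeNode("@C", new TypeNode("@D"), new TypeNode("@E"), new TypeNode("@F"))),
            new TypeNode("@G", new TypeNode("@H")));
    }
}
```

The root annotation @A gets an empty path in this model, which corresponds to the "not applicable" entry in the figure: it targets the raw type and needs no location.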
The implementation of extended annotations builds on the existing framework for Java annotations.
The syntax extensions described in section 3 require the javac Java compiler to accept annotations in the proposed locations and to add them to the program's AST. The relevant AST node classes must also be modified to store these annotations.
When generating code, the compiler must emit the attributes described in section 4.
Similar modifications need to be made to other compilers, IDEs, and related tools, such as Eclipse, IDEA, and ASM (https://asm.ow2.io).
The java.lang.reflect.* APIs give access to annotations on classes, the signatures of methods, etc. They must be updated to give the same access to the new extended annotations. For example, to parallel the existing Method.getAnnotations (for the return value) and Method.getParameterAnnotations (for the formal parameters), we would add Method.getReceiverAnnotations (for the receiver this). We do not plan to provide access to annotations on casts, type parameter names, or other implementation details. Suppose that a method is declared as:
@NonEmpty List<@Interned String> foo(@NonNull List<@Opened File> files) @Readonly {...}
Then Method.getAnnotations() returns the @NonEmpty annotation, just as in Java SE 6, and likewise Method.getParameterAnnotations() returns the @NonNull annotation. New method Method.getReceiverAnnotations() returns the @Readonly annotation.
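The two existing calls can be exercised today. The sketch below uses only the Java SE 6 reflection API and annotates only the locations that Java SE 6 already accepts; the annotation declarations and the ReflectionDemo class are hypothetical. The proposed receiver accessor does not exist in Java SE 6 and is therefore not shown.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.List;

// Hypothetical qualifiers, retained at run time so that reflection can see them.
@Retention(RetentionPolicy.RUNTIME) @interface NonEmpty {}
@Retention(RetentionPolicy.RUNTIME) @interface NonNull {}

final class ReflectionDemo {
    // Annotated only before the return type and on the formal parameter.
    @NonEmpty
    static List<String> foo(@NonNull List<String> files) {
        return files;
    }

    private static Method fooMethod() {
        try {
            return ReflectionDemo.class.getDeclaredMethod("foo", List.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Simple name of the first annotation written before the return type. */
    static String declarationAnnotation() {
        Annotation[] found = fooMethod().getAnnotations();
        return found[0].annotationType().getSimpleName();
    }

    /** Simple name of the first annotation on the first formal parameter. */
    static String parameterAnnotation() {
        Annotation[][] found = fooMethod().getParameterAnnotations();
        return found[0][0].annotationType().getSimpleName();
    }
}
```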
The JSR-269 annotation processing API must be modified so that the process method (currently invoked only on class, field, and method annotations) is also invoked on annotations on typecasts, receivers, type arguments, and local variables. Additionally, the Tree API, which exposes the AST (including annotations) to authors of compile-time plug-ins, must be updated to reflect the modifications made to the internal AST node classes described in section 3.
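For reference, a JSR 269 processor as it exists in Java SE 6 looks roughly like the following sketch; under this proposal the same process method would also be handed annotations in the new locations. The class name is hypothetical, and the processor only reports the annotated elements it is given rather than checking any qualifier semantics.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

/** Hypothetical processor that notes every element carrying an annotation it is given. */
@SupportedAnnotationTypes("*")
final class QualifierNotingProcessor extends AbstractProcessor {
    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element e : roundEnv.getElementsAnnotatedWith(annotation)) {
                // Attach the note to the annotated element so the compiler reports its position.
                processingEnv.getMessager().printMessage(
                    Diagnostic.Kind.NOTE, "found @" + annotation.getSimpleName(), e);
            }
        }
        return false;  // do not claim the annotations; other processors may also see them
    }
}
```

To use it with javac -processor, the class would need to be declared public in a source file of its own.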
No modifications to the virtual machine are necessary.
A separate document, “Custom type qualifiers via annotations on Java types”, explores implementation strategies for annotation-checking plug-ins. It is not germane to this proposal, both because this proposal does not concern itself with annotation semantics and because writing such plug-ins does not require any changes beyond those described here.
A separate document, “Annotation Index File Specification”, describes a textual format for annotations that is independent of .java or .class files. This textual format can represent annotations for libraries that cannot or should not be modified. We have built or are building a variety of tools for manipulating annotations, including extracting annotations from and inserting annotations in .java and .class files. That file format is not part of this proposal for extending Java's annotations; it is better viewed as an implementation detail of our tools.
The Expert Group will consider whether the proposal should extend annotations in a few other ways that are not directly related to annotations on types. This is especially true if the additional changes are small, there is no better time to make them, and the new syntax would permit unanticipated future uses. Two examples follow, for which the proposal does not currently include a detailed design.
Array-valued annotations can be clumsy to write:
    @Resources({
        @Resource(name = "db1", type = DataSource.class),
        @Resource(name = "db2", type = DataSource.class)
    })
    public class MyClass { ... }
Likewise, it may be desirable for some (but not all) annotations to be specified more than once at a single location. A cleaner syntax like
    @Resource(name = "db1", type = DataSource.class)
    @Resource(name = "db2", type = DataSource.class)
    public class MyClass { ... }
may be desirable for both purposes.
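For comparison, the container form that Java SE 6 requires can be declared and read back as follows. The Resource and Resources annotations here are hypothetical stand-ins declared locally, and Object.class replaces DataSource.class to keep the sketch self-contained.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical stand-ins for the @Resource and @Resources annotations used above.
@Retention(RetentionPolicy.RUNTIME)
@interface Resource {
    String name();
    Class<?> type();
}

@Retention(RetentionPolicy.RUNTIME)
@interface Resources {
    Resource[] value();
}

@Resources({
    @Resource(name = "db1", type = Object.class),
    @Resource(name = "db2", type = Object.class)
})
final class MyClass {
    /** Reads the resource names back out of the container annotation. */
    static String[] resourceNames() {
        Resource[] resources = MyClass.class.getAnnotation(Resources.class).value();
        String[] names = new String[resources.length];
        for (int i = 0; i < resources.length; i++) {
            names[i] = resources[i].name();
        }
        return names;
    }
}
```

A client has to unwrap the container by hand, which is part of what makes the array-valued form clumsy.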
Annotations on blocks or loops, which are not currently permitted by Java, could be useful for properties such as atomicity/concurrency. Such an extension would require defining both Java syntax and a convention for storing the information in the class file.
The JSR for annotations on Java types should be included under the Java SE 7 umbrella JSR (which lists the JSRs that are part of the Java SE 7 release). However, it should be a separate JSR because it needs a separate expert group. The expert group will have overlap with any others dealing with other added language features that might be annotatable (such as method-reference types or closures), to check impact.
The specification and the TCK will be freely available, most likely licensed under terms that permit arbitrary use. The reference implementation is built on javac, but once javac is open-sourced, we will be able to share the reference implementation more widely.
To ease the transition from standard Java SE 6 code to code with the extended annotations, the reference implementation recognizes the extended annotations when surrounded by comment markers:
/*@Readonly*/ Object x;
This permits use of both standard Java SE 6 tools and the new annotations even before Java SE 7 is released. However, it is not part of the proposal, and the final Java SE 7 implementation will not recognize the new annotations when embedded in comments.
We thank Joshua Bloch, Gilad Bracha, Alex Buckley, Wayne Carr, Bruce Chapman, Joe Darcy, Jeff Foster, Neal Gafter, David Greenfieldboyce, Evan Ireland, Sacha Labourey, Doug Lea, Todd Millstein, R. Matthew McCutchen, Ted Neward, Jens Palsberg, Bill Pugh, Jaime Quinonez, Matthew Tschantz, and Eugene Vigdorchik for their comments and suggestions. We welcome additional feedback.
This document was translated from LATEX by HEVEA.