Contents:
The Checker Framework is an innovative programming tool that prevents bugs at development time, before they escape to production.
Java's type system prevents some bugs, such as int count =
"hello";
. However, it does not prevent other bugs, such as null
pointer dereferences, concurrency errors, disclosure of private
information, incorrect internationalization, out-of-bounds indices, etc.
Pluggable type-checking replaces a
programming language's built-in type system with a more powerful,
expressive one.
We have created over 20 new type systems, and other people have created over 30 more. A type system is not just a bug-finding tool: it is a verification tool that gives a guarantee that no errors (of certain types) exist in your program. Even though it is powerful, it is easy to use. It follows the standard typing rules that programmers already know, and it fits into their workflow.
The Checker Framework is popular: it is used daily at Amazon, Google, Meta, Oracle, Uber, on Wall Street, and in other companies from big to small. It is attractive to programmers who care about their craft and the quality of their code. The Checker Framework is the motivation for Java's type annotations feature. It has received multiple awards. With this widespread use, there is a need for people to help with the project: everything from bug fixes, to new features, to case studies, to integration with other tools. We welcome your contribution!
Why should you join this project? It's popular, so you will have an impact. It makes code more robust and secure, which is a socially important purpose. You will get to scratch your own itch by creating tools that solve problems that frustrate you. It is accessible even to junior software engineers and undergraduates. (Many undergraduate students have published scientific papers, such as Jason Waataja, Vlastimil Dort, Gene Kim, Siwakorn Srisakaokul, Stephanie Dietzel, Nathaniel Mote, Brian Walker, Eric Spishak, Jaime Quinonez, Matthew Papi, Mahmood Ali, and Telmo Correa; and even more have made significant contributions to the tool.) Finally, we have a lot of fun on this project!
Prerequisites: You should be very comfortable with the Java programming language and its type system. You should know how a type system helps you and how it can hinder you. You should be willing to dive into a moderately-sized codebase. You should understand fundamental object-oriented programming concepts, such as behavioral subtyping: subtyping theory permits argument types to change contravariantly (even though Java forbids it for reasons related to overloading), whereas return types may change covariantly both in theory and in Java.
Potential projects: Most of this document lists potential projects. The projects are grouped roughly from easiest to most challenging.
To get started, first do a case study of using the Checker Framework: that is, run the Checker Framework on some program. If you have already done so, you can skip this section. Otherwise, a case study gives you experience in using the Checker Framework, and it may reveal bugs in either the Checker Framework or in the program it is analyzing.
Why should you start with a case study? Before you can contribute to any project, you must understand the tool from a user point of view, including its strengths, weaknesses, and how to use it. Using the Checker Framework is the best way to learn about it and determine whether you would enjoy contributing to it.
What is the purpose of a case study?
The primary result of your case study is that you will discover bugs in the
subject program, or you will verify that it has no bugs (of some particular
type). If you find bugs in open-source code,
and let us know when they are resolved.
Another outcome of your case study is that you may discover bugs, limitations,
or usability problems in the Checker Framework. Please
report them.
We'll try to fix them, or they might give you inspiration for
improvements you would like to make to the Checker Framework.
You might want to start with a small program that you wrote, then repeat the process with a larger open-source program or library.
import
statements. Doing so bloats the size of the
diffs and makes it hard to understand the essential changes.
if
statement that always succeeds, just
to suppress a warning. Convince yourself that both branches can
execute, or else don't add the if
statement.
@SuppressWarnings
annotation,
write
it on the smallest possible scope and
explain
why the checker warning is a false positive and you are certain
the code is safe.
The subject line should be descriptive (not just "Case study", but "Nullness case study of Apache Commons Exec library"). You should give us access to
You can also try to fix problems that you find and submit a pull request, but that is not a requirement to get started, because not all problems are good for new contributors.
We are very happy to answer your questions, and we are eager to interact with you! It's OK to have questions, and your questions can lead to improvements in the documentation and the tool.
Before you ask a question, read this file and the "Troubleshooting" section of the Checker Framework manual (including "How to report problems"), and also search in the Checker Framework manual for the answer. Don't send us a message that says nothing but “please guide me” or “tell me how to fix this issue from the issue tracker”.
When you ask a question, please tell us what you have tried, tell us what went wrong or where you got stuck, and ask a concrete technical question that will help you get past your problem. If you can do that, then definitely ask your question, because we don't want you to be stuck or frustrated.
When you send email, please use standard email etiquette, such as: avoid all-caps; use a descriptive subject line; don't put multiple different topics in a single email message; start a new thread with a new subject line when you change the topic; don't clutter discussions with irrelevant remarks; don't use screenshots (unless there is a problem with a GUI), but instead cut-and-paste the output or code into your message; if you are making a guess, clearly indicate that it is a guess and your grounds for it. Bug reports should be complete and should usually be reported to the issue tracker.
Here are some possible focuses for a project:
This document gives a few suggestions in each category.
These projects evaluate a recently-written type system or a feature used by multiple type systems. Using the type systems on real code is our most important source of new ideas and improvements. Many people have started out “just” doing a case study but have ended up making deep, fundamental contributions and even publishing scientific papers about their discoveries.
One possible outcome is to identify weaknesses in the type-checker so that we can improve it. Another possible outcome is to provide evidence that the type-checker is effective and convince more users to adopt it. You will probably also discover defects (bugs) in the codebase being type-checked.
Java defines six formats for the string representation of a type. (This is a design mistake, but it is too late to fix it now.) Because they all differ, it is an error to use one in place of another. But because some of them are very similar to others, such as differing only for nested classes, those errors might escape to production and lead to incorrect behavior or crashes.
The Signature String Checker ensures that string representations of types are used correctly. It has discovered bugs in the JDK, ASM, BCEL, and in clients of them.
Here are some possible case studies.
java.lang.classfile
.
Annotate it to determine that it is internally consistent and to enable
type-checking of its clients, such as that
in Daikon.
Some challenging aspects of this case study are:
An unsigned integer's bits are interpreted differently than a signed integer's bits. It is meaningless to add a signed and an unsigned integer — the result will be nonsense bits. The same is true of printing and of other numeric operators such as multiplication and comparison.
We have a prototype compile-time verification tool that detects and prevents these errors. The goal of this project is to perform case studies to determine how often programmers make signedness errors (our initial investigation suggests that this is common!) and to improve the verification tool.
The research questions are:
The methodology is:
A good way to find projects that use unsigned arithmetic is to find a library that supports unsigned arithmetic, then search on GitHub for projects that use that library.
Here are some relevant libraries.
Integer
and Long
, these include
compareUnsigned
,
divideUnsigned
,
parseUnsignedInt
,
remainderUnsigned
, and
toUnsignedLong
.
DataInputStream
, ObjectInputStream
,
and RandomAccessFile
have readUnsignedByte
.
Arrays
has compareUnsigned
.
The JDK is already annotated; search for @Unsigned
within
https://github.com/typetools/jdk.
@Unsigned
within
https://github.com/typetools/guava.
Another possibility is to find Java projects that could use an unsigned arithmetic library but do not. For example, bc-java defines its own unsigned libraries, and some other programs might do direct bit manipulation.
A type system is useful because it prevents certain errors. The downside of a type system is the effort required to write the types. Type inference is the process of writing the types for a program.
The Checker Framework includes a whole-program inference that inserts type qualifiers in the user's program. It works well on some programs, but needs more enhancements to work well on all programs.
By default, the Checker Framework is unsound in several circumstances. “Unsound” means that the Checker Framework may report no warning even though the program can misbehave at run time.
The reason that the Checker Framework is unsound is that we believe that enabling these checks would cause too many false positive warnings: warnings that the Checker Framework issues because it cannot prove that the code is safe (even though a human can see that the code is safe). Having too many false positive warnings would irritate users and lead them not to use the checker at all, or would force them to simply disable those checks.
We would like to do studies of these command-line options to see whether our concern is justified. Is it prohibitive to enable sound checking? Or can we think of enhancements that would let us turn on those checks that are currently disabled by default?
There is no need to annotate new code for this project. Just use existing annotated codebases, such as those that are type-checked as part of the Checker Framework's Azure Pipeline. In other words, you can start by enabling Azure Pipelines for your fork and then changing the default behavior in a branch. The Azure Pipelines job will show you what new warnings appear.
Android uses its own annotations that are similar to some in the Checker
Framework. Examples include the
Android
Studio support annotations,
including @NonNull
, @IntRange
, @IntDef
,
and others.
The goal of this project is to implement support for these annotations.
That is probably as simple as creating aliased annotations
by calling method addAliasedTypeAnnotation()
in AnnotatedTypeFactory.
Then, do a case study to show the utility (or not) of pluggable type-checking, by comparison with how Android Studio currently checks the annotations.
These projects annotate a library, so that it is easier to type-check clients of the library. Another benefit is that this may find bugs in the library. It can also give evidence for the usefulness of pluggable type-checking, or point out ways to improve the Checker Framework.
When type-checking a method call, the Checker Framework uses the method declaration's annotations. This means that in order to type-check code that uses a library, the Checker Framework needs an annotated version of the library.
The Checker Framework comes with a few annotated libraries. Increasing this number will make the Checker Framework even more useful, and easier to use.
After you have chosen a library, fork the library's source code, adjust its build system to run the Checker Framework, and add annotations to it until the type-checker issues no warnings.
Before you get started, be sure to read How to get started annotating legacy code. More generally, read the relevant sections of the Checker Framework manual.
There are several ways to choose a library to annotate:
When annotating a library, it is important to type-check both the library and at least one client that uses it. Type-checking the client will ensure that the library annotations are accurate.
Whatever library you choose, you will need to deeply understand its source code. You will find it easier to work with a library that is well-designed and well-documented.
You should choose a library that is not already annotated. There are two exceptions to this.
The Checker Framework ships with extensive annotations for the JDK. These annotations are useful in type-checking any program, since all programs use the JDK to some extent. The JDK annotations need to be updated from JDK 21 to JDK 24. In particular, any new methods and classes that have been recently introduced have no annotations. They need to be annotated.
This case study involves many different type systems rather than just one.
This project would have high impact because the JDK is so widely used and its annotations are so heavily depended on by the Checker Framework.
Guava is already partially annotated with nullness annotations — in part by Guava's developers, and in part by the Checker Framework team. However, Guava does not yet type-check without errors. Doing so could find more errors (the Checker Framework has found nullness and indexing errors in Guava in the past) and would be a good case study to learn the limitations of the Nullness Checker.
The Checker Framework is shipped with about 20 type-checkers. Users can create a new checker of their own. However, some users don't want to go to that trouble. They would like to have more type-checkers packaged with the Checker Framework for easy use.
Each of these projects requires you to design a new type system, implement it, and perform case studies to demonstrate that it is both usable and effective in finding/preventing bugs.
The lightweight
ownership mechanism of the Resource Leak Checker is not implemented as
a type system, but it should be. That would enable writing ownership
annotations on generic type arguments, like List<@Owning
Socket>
. It would also enable changing the Resource Leak Checker
so that non-@Owning
formal parameters do not have
their @MustCall
annotation erased.
We have some notes on possible implementation strategies.
The Nullness Checker issues a false positive warning for this code:
import java.util.PriorityQueue; import org.checkerframework.checker.nullness.qual.NonNull; public class MyClass { public static void usePriorityQueue(PriorityQueue<@NonNull Object> active) { while (!(active.isEmpty())) { @NonNull Object queueMinPathNode = active.peek(); } } }
The Checker Framework does not determine that active.peek()
returns a non-null value in this context.
The contract of peek()
is that it returns a non-null value if the queue is not empty and the queue contains no null values.
To handle this code precisely, the Nullness Checker needs to know, for each queue, whether it is empty. This is analogous to how the Nullness Checker tracks whether a particular value is a key in a map.
It should be handled the same way: by adding a new subchecker, called the Nonempty Checker, to the Nullness Checker. The Nonempty Checker already exists in the Checker Framework, though it is not advertised.
When you are done, the Nullness Checker should issue only the // ::
diagnostics from checker/tests/nullness/IsEmptyPoll.java
— no more and no fewer.
You can test that by running the Nullness Checker on the file, and when you are done you should delete the // @skip-test
line so that the file is run as part of the Checker Framework test suite.
The best approach may be to have the Nullness Checker run the Nonempty Checker as a subchecker but suppress the Nonempty Checker warnings. That is, the Nullness Checker takes advantage of information that the Nonempty Checker verifies, without caring about code that the Nonempty Checker cannot verify. The Optional Checker takes this approach in its integration with the Nonempty Checker.
The Checker Framework contains a checker named the Nonempty Checker (code,
tests
which can be run via gradlew NonemptyTest
).
Its types are:
@UnknownNonEmpty
— the queue might or might not be empty
@NonEmpty
— the queue is definitely non-empty
This type-checker is not yet publicized in the Checker Framework manual. The reason is that the type-checker's false positive rate is too high. The key task for this project is to determine the cause of these false positives and find ways to eliminate them.
For information about what needs to be done, see issue #399.
NoSuchElementException
A Java program that uses an Iterator
can
throw NoSuchElementException
if the program
calls next()
on the Iterator
but
the Iterator
has no more elements to iterate over. Such
exceptions even occur in production code (for example,
in Eclipse's
rdf4j).
We would like a compile-time guarantee that this run-time error will never
happen. Our analysis will statically determine whether
the hasNext()
method would return true. The basic type system
has two type qualifiers: @HasNext
is a subtype
of @UnknownHasNext
.
A variable's type is @HasNext
if the program
calls hasNext()
and it returns true. Implementing this is
easy (see
the dataflow
section in
the "How
to create a new checker" chapter). The analysis can also permit some
calls to next()
even if the programmer has not
called hasNext()
. For example, a call to next()
is permitted on a newly-constructed iterator that is made from a non-empty
collection. (This special case could build upon
the Non-Empty Checker mentioned above.)
There are probably other special cases, which experimentation will reveal.
Parts of this are already implemented, but it needs to be enhanced.
In particular, it depends on the
@SideEffectsOnly
annotation
mentioned elsewhere in this document.
Once
case studies have demonstrated its effectiveness, then it can be released
to the world, and a scientific paper can be written.
Many security vulnerabilities result from use of untrusted data without sanitizing it first. Examples include SQL injection, cross-site scripting, command injection, and many more. Other vulnerabilities result from leaking private data, such as credit card numbers.
We have built a generalized taint analysis that can address any of these problems. However, because it is so general, it is not very useful. A user must customize it for each particular problem.
The goal of this project is to make those customizations, and to evaluate their usefulness. A specific research question is: "To what extent is a general taint analysis useful in eliminating a wide variety of security vulnerabilities? How much customization, if any, is needed?"
The generalized taint analysis is the Checker Framework's a Tainting Checker. It requires customization to a particular domain:
@Tainted
and @Untainted
qualifiers
to something more specific (such as @Private
or @PaymentDetails
or @HtmlQuoted
), and
The first part of this project is to make this customization easier to do — preferably, a user will not have to change any code in the Checker Framework (the Subtyping Checker already works this way). As part of making customization easier, a user should be able to specify multiple levels of taint — many information classification hierarchies have more than two levels. For example, the US government separates information into four categories: Unclassified, Confidential, Secret, and Top Secret.
The second part of this project is to provide several examples, and do case studies showing the utility of compile-time taint checking.
Possible examples include:
@PrivacySource
and @PrivacySink
annotations used by the Meta Infer
static analyzer.
For some microbenchmarks, see the Juliette test suite for Java from CWE.
In Java, some objects do not fully implement their interface; they
throw UnsupportedOperationException
for some operations. One
example is unmodifiable collections. They throw the exception when a
mutating operation is called, such
as add
, addAll
, put
, remove
,
etc.
The goal of this project is to
design a compile-time verification tool to track which operations might not be supported.
This tool will issue a warning whenever
an UnsupportedOperationException
might occur at run time.
This helps programmers to avoid run-time exceptions (crashes) in their Java programs.
The research questions include:
UnsupportedOperationException
? What design is effective?
UnsupportedOperationException
exceptions
pervasive in Java programs? Is it possible to eliminate them?
The methodology is:
Here is a possible design, as a pluggable type system.
@Unmodifiable | @Modifiable
In other words, the @Unmodifiable
type qualifier is a
supertype of @Modifiable
. This means that a @Modifiable
List
can be used where an @Unmodifiable List
is
expected, but not vice versa.
@Modifable
is the default, and
methods such
as Arrays.asList
and Collections.emptyList
must be annotated to return the less-capable supertype.
Overflow is when 32-bit arithmetic differs from ideal arithmetic. For
example, in Java the int
computation 2,147,483,647 + 1 yields
a negative number, -2,147,483,648. The goal of this project is to detect
and prevent problems such as these.
One way to write this is as an extension of the Constant Value Checker, which already keeps track of integer ranges. It even already checks for overflow, but it never issues a warning when it discovers possible overflow. Your variant would do so.
This problem is so challenging that there has been almost no previous
research on static approaches to the problem. (Two relevant papers are
IntScope:
Automatically Detecting Integer Overflow Vulnerability in x86 Binary Using
Symbolic Execution and
Integer Overflow
Vulnerabilities Detection in Software Binary Code.) Researchers are
concerned that users will have to write a lot of annotations indicating the
possible ranges of variables, and that even so there will be a lot of false
positive warnings due to approximations in the conservative analysis.
For example, will every loop that contains i++
cause a warning that i
might overflow?
That would not be acceptable: users would just disable the check.
You can convince yourself of the difficulty by manually analyzing programs to see how clever the analysis has to be, or manually simulating your proposed analysis on a selection of real-world code to learn its weaknesses. You might also try it on good and bad binary search code.
One way to make the problem tractable is to limit its scope: instead of
being concerned with all possible arithmetic overflow, focus on a specific
use case.
As one concrete application,
the Index
Checker is currently unsound in the presence of integer overflow. If
an integer i
is known to be @Positive
, and 1 is
added to it, then the Index Checker believes that its type
remains @Positive
. If i
was
already Integer.MAX_VALUE
, then the result is negative —
that is, the Index Checker's approximation to it is unsound.
This project involves removing this unsoundness by implementing a type system to track when an
integer value might overflow — but this only matters for values that
are used as an array index.
That is, checking can be restricted to computations that involve an operand
of type @IntRange
).
Implementing such an analysis would permit the Index Checker
to extend its guarantees even to programs that might overflow.
This analysis is important for some indexing bugs in practice.
Using the Index Checker, we found 5 bugs in Google
Guava related to overflow. Google marked these as high priority and
fixed them immediately. In practice, there would be a run-time exception
only for an array of size approximately Integer.MAX_INT
.
You could write an extension of the Constant Value Checker, which already keeps track of integer ranges and even determines when overflow is possible. It doesn't issue a warning, but your checker could record whether overflow was possible (this could be a two-element type system) and then issue a warning, if the value is used as an array index. Other implementation strategies may be possible.
Here are some ideas for how to avoid the specific problem
of issuing a warning about potential overflow for every i++
in
a loop (but maybe other approaches are possible):
i == Integer.MAX_VALUE
before
incrementing. This wide-scale, disruptive code change is not
acceptable.
@ArrayLenRange(0, Integer.MAX_VALUE-1)
rather
than @UnknownVal
, which is equivalent
to @ArrayLenRange(0, Integer.MAX_VALUE-1)
. Now, every
array construction requires the client to establish that the length is
not Integer.MAX_VALUE
. I don't have a feel for whether
this would be unduly burdensome to users.
Sometimes, the best way to avoid a checker warning is to use an annotated library method. Consider this code:
@FqBinaryName String fqBinaryName = ...; @ClassGetName String componentType = fqBinaryName.substring(0, fqBinaryName.indexOf('['));
The Signature
String Checker issues a warning, because it does not reason about
arbitrary string manipulations. The code is correct, but it is in bad
style. It is confusing to perform string manipulations to convert between
different string representations. It is clearer and less error-prone (the
above code is buggy when fqBinaryName
is not an array type!)
to use a library method, and the checker accepts this code because the
library method is appropriately annotated:
import org.plumelib.reflection.Signatures; ... @ClassGetName String componentType = Signatures.getArrayElementType(fqBinaryName);
However, users may not know about the library method. Therefore, the
Checker Framework should issue a warning message, along with the error
message, notifying users of the library method. For example, the Signature
String Checker would heuristically mention
the Signatures.getArrayElementType()
method when it issues an error about string manipulation where some input
is
a FqBinaryName
and the output is annotated
as ClassGetName
.
It would behave similarly for other library methods.
The EISOP Checker Framework is an "unfriendly fork" of the Checker Framework. That is, it incorporates improvements from the Checker Framework, but its developers do not make pull requests to the Checker Framework: they only incorporate their improvements and bug fixes in their own version.
The goal of this project is to port EISOP's improvements to the Checker Framework. This has two positive effects. First, it enhances the Checker Framework. Second, it reduces the differences between the two projects, making it more feasible to merge them in the future.
One challenge is that we have discovered that some EISOP enhancements are buggy and should not be incorporated into the Checker Framework. Another challenge is that some EISOP enhancements do not conform to the Checker Framework's code quality and testing standards. Nonetheless, most are worthy of including in the Checker Framework.
Compiler writers have come to realize that clarity of error messages is as important as the speed of the executable (1, 2, 3, 4). This is especially true when the language or type system has rich features.
The goal of this project is to improve a compiler's error messages. Here are some distinct challenges:
@IndexFor("a")
annotation is syntactic sugar for
@NonNegative @LTLengthOf("a")
, and those types are
the ones that currently appear in error messages.
It might be good to show simpler types or ones that the user wrote.
The Checker Framework uses JavaParser to parse Java code. However, JavaParser is buggy and poorly maintained. Furthermore, JavaParser does not handle the latest version of Java. As of this writing, Checker Framework users cannot run whole-program inference, or write stub files, for projects that use Java 24 features — because JavaParser cannot yet parse Java 24 files.
A better Java parser exists: the one in javac! The goal of this project is to replace every use of JavaParser by a use of javac's parser.
Here are three uses of JavaParser in the Checker Framework. Replacing each of them is a different project. The projects are listed from easiest to hardest.
The Checker Framework
uses ajava
files to specify types for unannotated libraries, notably in
whole-program
inference. Every .ajava
file is a
legal .java
file, so it should be straightforward to use
javac's parser instead of JavaParser to read ajava files and query their contents.
Programmers can write Java expressions within annotations, such as in dependent types. The Checker Framework currently uses JavaParser to parse these expressions. We would rather use javac's parser instead.
This project may resolve some of the issues that are labeled "JavaExpression".
Here are two different approaches to replace JavaParser for dependent types.
The Checker Framework parses a Java expression string written by a
programmer into a
JavaExpression
.
This parsing happens in class
JavaExpressionParseUtil
,
in
method parse()
.
You will change the implementation, but not the API (the interface),
of JavaExpressionParseUtil
. You will change the line
expr = JavaParserUtil.parseExpression(expressionWithParameterNames, currentSourceVersion);
to call the javac parser instead (currently, expr
is a
JavaParser datatype), and you will rewrite the
ExpressionToJavaExpressionVisitor
nested class so that it
converts a javac parse tree into a JavaExpression
.
Note that JavaExpressionParseUtil
is also used by class
StringToJavaExpression
.
The approach in "Continue to use the JavaExpression class" (just above) has
these steps: (1) parse a string into a javac tree, (2) convert the javac
tree into a
JavaExpression
,
and (3) use the JavaExpression
throughout the Checker
Framework codebase.
Is the JavaExpression
class needed? Would it be possible to
skip the transformation from a javac tree into
a JavaExpression
, and instead have the Checker Framework
codebase use a javac tree instead?
Here is a possible sequence of steps for this project:
JavaExpression
contains some methods that a standard AST,
like javac's, lacks; an example
is isUnassignableByOtherCode
. Refactor these methods from
instance methods of JavaExpression
into static methods
in JavaExpressions
, making JavaExpression
more like a standard AST that can be replaced by JavaParser classes.
This change can be pull-requested on its own.
JavaExpression
by a use of the javac class
class com.sun.tools.javac.tree.JCTree.JCExpression.html
.
You also need to decide how to store the type
field
of JavaExpression
, when JavaExpression
is eliminated.
Replace every use of a subclass of JavaExpression
(listed in the
"Direct Known Subclasses" section of
the JavaExpression
API documentation) by a use of a
subclass
of JCTree.JCExpression.html
.
For example, replace every use
of MethodCall
by JCTree.JCMethodInvocation
.
JavaExpressionParseUtil
class; ExpressionToJavaExpressionVisitor
will definitely
be deleted. The deletion of code is an advantage of this approach.
Stub files, like
ajava files, specify types for unannotated libraries. A challenge is that
a stub file is not necessarily a legal Java file. A stub file looks like
the concatenation of multiple Java files (for example, it can have
multiple package
declarations). Furthermore, a stub file may
omit method bodies, replacing the body by ";".
Currently, the Checker Framework uses a modified version of JavaParser, called StubParser or AnnotationFileParser. The modified version can parse stub files.
Using javac to parse stub files would likewise require some modifications or wrappers. One possibility would be to (1) split a stub file into many small files, (2) make those files legal Java (or at least parsable), and (3) parse those files. There may be other implementation strategies as well.
The Checker Framework's dataflow framework (manual here) implements flow-sensitive type refinement (local type inference) and other features. It is used in the Checker Framework and also in Error Prone, NullAway, and elsewhere.
There are a number of open issues — both bugs and feature requests — related to the dataflow framework. The goal of this project is to address as many of those issues as possible, which will directly improve all the tools that use it.
The Checker Framework contains
a @SideEffectFree
annotation that improves the precision of its type-checking. The
@SideEffectFree
annotation means that a method has no
side effects. In some cases, it is enough to know that some particular
variable was not modified — it is not necessary that no
variable was modified.
This project would introduce a @SideEffectsOnly
annotation. @SideEffectsOnly
indicates all the expressions
whose value can possibly be modified by a particular method. This will
make the Checker Framework more precise in many cases.
A common and well-known cause of false positives from Checker Framework checkers is calls to impure methods, which unrefine dataflow facts. For example, a simple example is the following, which triggers a false positive in the Resource Leak Checker (RLC):
@EnsuresCalledMethods(value={“this.f1”, “this.f2”}, methods={“close”}) void foo() { try { f1.close(); } catch (Exception e) { } try { f2.close(); } catch (Exception e) { } }
The call f2.close()
unrefines the inferred dataflow fact
that f1.close()
has already been called, because it's possible
that f2.close()
re-assigns f1
. Therefore, the
RLC reports that the @EnsuresCalledMethods
annotation doesn’t
verify because f1
might not be closed at the moment of
procedure exit. Programmers who see the error message are mystified,
because so far as they can see, foo
closes
both f1
and f2
.
Non-experts (even competent engineers!) consistently don’t recognize that purity is even an issue in dataflow until it’s pointed out to them. The goal of this project is to provide better error reporting, when an error message is reported only because unrefinement has removed a dataflow fact. For each warning that could have been avoided by having a purity annotation on some called method, report that fact to the user.
Here is a potential implementation design.
BaseTypeVisitor
, record the
“required” type and the expression that requires it.
f1
.
…postcondition of foo() is not satisfied. found : @CalledMethods({}) Socket required: @CalledMethods({“close”}) Socket caused by: this.f1 has type @CalledMethods({“close”}) Socket before a possible side-effect from the call to f2.close(), which has an implicit @Impure annotation (no purity annotation found for java.lang.AutoCloseable#close()): try { f2.close(); } catch (Exception e) { } ^
A side effect analysis (or inference) reports what side effects a procedure may perform, such as what variable values it may modify. A side effect analysis is essential to other program analyses. A program analysis makes estimates about the current values of expressions. When a method call occurs, the analysis has to throw away most of its estimates, because the method call might change any variable. (This process of discarding information is called "unrefinement".) However, if the method is known to have no side effects, then the analysis doesn't need to throw away its estimates, and the analysis is more precise. Thus, an improvement to the foundational side effect analysis can improve many other program analyses.
The goal of this project is to evaluate existing side effect analysis algorithms and implementations, in order to determine what is most effective and to improve them. The research questions include:
The methodology is to collect existing side effect analysis tools (two examples are Soot and Geffken); run them on open-source projects; examine the result; and then improve them.
Currently, type annotations are only displayed in Javadoc if they are explicitly written by the programmer. However, the Checker Framework provides flexible defaulting mechanisms, reducing the annotation overhead. This project will integrate the Checker Framework defaulting phase with Javadoc, showing the signatures after applying defaulting rules.
There are other type-annotation-related improvements to Javadoc that can be explored, e.g. using JavaScript to show or hide only the type annotations currently of interest.
This section is relevant only to Google Summer of Code Students.
To apply, you will submit a single PDF through the Google Summer of Code website. This PDF should contain two main parts. We suggest that you number the parts and subparts to ensure that you don't forget anything, and to ensure that we don't overlook anything when reading your application. You might find it easiest to create multiple PDFs for the different parts, then concatenate them before uploading to the website, but how you create your proposal is entirely up to you.
The proposal should have a descriptive title, both in the PDF and in the GSoC submission system. Don't use a title like "Checker Proposal" or "Proposal for GSoC". Don't distract from content with gratuitous graphics.
List the tasks or subparts that are required to complete your project. This will help you discover a part that you had forgotten. We do not require a detailed timeline, because you don't yet know enough to create one.
If you want to do a case study, say what program you will do your case study on.
If you want to create a new type system (whether one proposed on this webpage or one of your own devising), then your proposal should include the type system's user manual. You don't have to integrate it in the Checker Framework repository (in other words, use any word processor or text editor you want to create a PDF file you will submit), but you should describe your proposed checker's parts in precise English or simple formalisms, and you should follow the suggested structure.
If you want to do exactly what is already listed on this page, then just say that (but be specific about which one!), and it will not hurt your chances of being selected. However, show us what progress you have made so far. You might also give specific ideas about extensions, about details that are not mentioned on this webpage, about implementation strategies, and so forth.
Never literally cut-and-paste text that was not written by you, because that would be plagiarism. If you quote from text written by someone else, give proper credit. Don't submit a proposal that is just a rearrangement of text that already appears on this page or in the Checker Framework manual, because it does not help us to assess your likelihood of being successful.
.zip
file or
provide a GitHub URL.
You should have shared the case study as soon as you finished it or as soon as you had a question that is not answered in the manual; don't wait until you submit your proposal, because that does not give us a chance to help you with feedback.
The best way to impress us is by doing a thoughtful job in the case study. The case study is even more important than the proposal text, because it shows us your abilities. The case study may result in you submitting issues against the issue tracker of the program you are annotating or of the Checker Framework. Pull requests against our GitHub project are a plus but are not required: good submitted bugs are just as valuable as bug fixes! You can also make a good impression by correctly answering questions from other students on the GSoC mailing list.
Some GSoC projects have a requirement to fix an issue in the issue tracker. We do not, because it is unproductive. Don't try to start fixing issues before you understand the Checker Framework from the user point of view, which will not happen until you have completed a case study on an open-source program. You may discuss your ideas with us by sending mail to checker-framework-gsoc@googlegroups.com.
Get feedback! Feel free to ask questions to make your application more competitive. We want you to succeed. Historically, students who start early and get feedback are most successful. You can submit a draft proposal via the Google Summer of Code website, and we will review it. We do not receive any notification when you submit a draft proposal, so if you want feedback, please tell us that. Also, we can only see draft proposals; we cannot see final proposals until after the application deadline has passed.
Please do not violate the guidelines in the How to get help and ask questions section of this document. If you do so, you are disqualified you from participating in GSoC, because it shows that you do not read instructions, and you haven't thought about the problem nor tried to solve it.