This part of the tutorial shows how the Checker Framework can detect and help correct missing input validation.
The RegexExample.java
program is called with two arguments: a regular expression and a
string. The program prints the text from the string that matches
the first capturing group in the regular expression.
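The tutorial does not reproduce the source of RegexExample.java here. As a rough orientation only, a program with the behavior described above might look like the following sketch. The class name RegexExampleSketch, the helper firstGroup, and the choice of matches() over find() are assumptions made for illustration, not the tutorial's actual code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for RegexExample; the tutorial's real source may differ. */
class RegexExampleSketch {

    /** Returns the text matched by capturing group 1, or null if the input does not match. */
    static String firstGroup(String regex, String input) {
        // No validation yet: throws PatternSyntaxException if regex is malformed.
        Pattern pat = Pattern.compile(regex);
        Matcher mat = pat.matcher(input);
        // Assumption: the whole input must match; the real program might use find().
        if (mat.matches()) {
            // Throws IndexOutOfBoundsException if the pattern has no group 1.
            return mat.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java RegexExampleSketch <regex> <string>");
            System.exit(1);
        }
        String result = firstGroup(args[0], args[1]);
        System.out.println(result == null ? "No match" : "Group 1: " + result);
    }
}
```

The two commented calls, Pattern.compile and mat.group(1), are the places where unvalidated user input can cause an exception.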
Compile the program:
$ javac RegexExample.java
Run the program with a valid regular expression and a matching string:
$ java RegexExample '[01]?\d-([0123]?\d)-\d{4}+' '01-24-2013'
Group 1: 24
Run the program with an invalid regular expression and any string:
$ java RegexExample '[01]?[\d-[0123]?\d-\d{4}+' '01-24-2013'
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 24
[01]?[\d-[0123]?\d-\d{4}+
                        ^
    at java.util.regex.Pattern.error(Pattern.java:1924)
    at java.util.regex.Pattern.clazz(Pattern.java:2493)
    at java.util.regex.Pattern.sequence(Pattern.java:2030)
    at java.util.regex.Pattern.expr(Pattern.java:1964)
    at java.util.regex.Pattern.compile(Pattern.java:1665)
    at java.util.regex.Pattern.<init>(Pattern.java:1337)
    at java.util.regex.Pattern.compile(Pattern.java:1022)
    at RegexExample.main(RegexExample.java:13)
Good programming style dictates that the user should not see a stack trace, even if the user supplies invalid input.
The Regex Checker prevents, at compile time, use of syntactically invalid regular expressions and access of invalid capturing groups. In other words, it prevents you from writing code that would throw certain exceptions at run time. Next run the Regex Checker to see how it could have spotted this issue at compile time.
$ javacheck -processor org.checkerframework.checker.regex.RegexChecker RegexExample.java
RegexExample.java:13: error: [argument.type.incompatible] incompatible types in argument.
        Pattern pat = Pattern.compile(regex);
                                      ^
  found   : String
  required: @Regex String
RegexExample.java:18: error: [group.count.invalid] invalid groups parameter 1. Only 0 groups are guaranteed to exist for mat.
            System.out.println("Group 1: " + mat.group(1));
                                                       ^
2 errors
The "incompatible types" error indicates that variable regex is not of type @Regex String, which is required for strings passed to Pattern.compile().
The "group.count.invalid" error indicates that no groups are guaranteed to exist at run time for the Matcher mat, so the call mat.group(1) is not known to be safe.
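Each of the two checker errors corresponds to an exception that unchecked code can throw at run time. The following sketch uses only java.util.regex to trigger both failures in isolation. The class name RegexFailureDemo and its two helper methods are hypothetical and exist only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical demo of the two run-time failures the Regex Checker guards against. */
class RegexFailureDemo {

    /** Returns "ok" if regex compiles, otherwise the syntax error description. */
    static String compileFailure(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            // This is the failure behind the "incompatible types" error.
            return e.getDescription();
        }
    }

    /** Returns the requested group, or a marker string if it cannot be read. */
    static String groupFailure(String regex, String input, int group) {
        Matcher mat = Pattern.compile(regex).matcher(input);
        if (!mat.matches()) {
            return "no match";
        }
        try {
            return mat.group(group);
        } catch (IndexOutOfBoundsException e) {
            // This is the failure behind the "group.count.invalid" error.
            return "no such group";
        }
    }

    public static void main(String[] args) {
        System.out.println(compileFailure("[abc"));          // malformed: unclosed character class
        System.out.println(groupFailure("\\d+", "123", 1));  // valid regex, but it has no group 1
    }
}
```

The first call shows a string that is not a regular expression at all; the second shows a valid regular expression that has fewer capturing groups than the code assumes.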
The right way to fix the problems is for the code to issue a user-friendly message at run time. It should verify the user input using the RegexUtil.isRegex(String, int) method, which checks that the string is a syntactically valid regular expression with at least the given number of capturing groups. If the check fails, the program should print an error message and halt. If it succeeds, the program should proceed as before.
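To make the check concrete, here is a minimal plain-JDK sketch of the kind of test that RegexUtil.isRegex(String, int) performs. It is not the library's implementation; the class name RegexValidationSketch and the method isRegexSketch are hypothetical stand-ins so that the example runs without the Checker Framework on the classpath.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical stand-in illustrating a syntax-plus-group-count check. */
class RegexValidationSketch {

    /**
     * Returns true if s compiles as a regular expression and declares
     * at least the requested number of capturing groups.
     */
    static boolean isRegexSketch(String s, int groups) {
        Pattern p;
        try {
            p = Pattern.compile(s);
        } catch (PatternSyntaxException e) {
            return false; // not a syntactically valid regular expression
        }
        // groupCount() depends only on the pattern, not on the input text.
        return p.matcher("").groupCount() >= groups;
    }

    public static void main(String[] args) {
        System.out.println(isRegexSketch("(a)b", 1)); // valid, one group
        System.out.println(isRegexSketch("ab", 1));   // valid, but no groups
        System.out.println(isRegexSketch("(ab", 1));  // unclosed group
    }
}
```

Because the sketch is an ordinary method, the Regex Checker would not treat its result as evidence that a string is a @Regex String. In real code, call RegexUtil.isRegex itself so that the checker can use the result.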
You need to make two changes to RegexExample.java to correctly handle invalid user input. At the top of the file, add
import org.checkerframework.checker.regex.RegexUtil;
After variable regex is defined but before it is used, add
if (!RegexUtil.isRegex(regex, 1)) {
    System.out.println("Input is not a regular expression \"" + regex + "\": " + RegexUtil.regexException(regex).getMessage());
    System.exit(1);
}
Run the Regex Checker again:
$ javacheck -processor org.checkerframework.checker.regex.RegexChecker RegexExample.java
There should be no warnings. This shows that the code will not throw a PatternSyntaxException at run time.
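Run the program again with an invalid regular expression to verify that it now prints a user-friendly message instead of a stack trace. Because the program now uses RegexUtil, checker.jar must be on the classpath.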
$ java -cp ".:$CHECKERFRAMEWORK/checker/dist/checker.jar" RegexExample '[01]?[\d-\([0123]?\d\)-\d{4}+' '01-24-2013'
Input is not a regular expression "[01]?[\d-\([0123]?\d\)-\d{4}+": Illegal character range near index 24
(If this issues an error such as
Exception in thread "main" java.lang.UnsupportedClassVersionError: RegexExample : Unsupported major.minor version 52.0
then change the javacheck alias to pass the -source 7 -target 7 command-line arguments and rerun the javacheck command.)
For a full discussion of the Regex Checker, please see the Regex Checker chapter of the Checker Framework manual.