Three fully functional language interpreters built from scratch using ANTLR4 — covering the full pipeline from formal grammar specification to working runtime execution in Java.
Each interpreter was implemented as part of a Programming Languages course at the University of Verona. The project spans esoteric, imperative, and joke languages, and demonstrates hands-on mastery of compiler front-end theory: lexing, parsing, AST traversal via the Visitor pattern, runtime environments, and type systems.
An interpreter for ArnoldC, the language where every keyword is an Arnold Schwarzenegger movie quote.
IT'S SHOWTIME
HEY CHRISTMAS TREE result
YOU SET US UP 0
GET TO THE CHOPPER result
HERE IS MY INVITATION 3
GET UP 4
ENOUGH TALK
TALK TO THE HAND result
YOU HAVE BEEN TERMINATED
What it supports:
- Named methods with typed arguments and return values (`LISTEN TO ME VERY CAREFULLY` / `HASTA LA VISTA, BABY`)
- Integer, boolean, and string types with a clean `Value` hierarchy (`IntValue`, `BoolValue`, `StrValue`)
- Full arithmetic: `+`, `-`, `*`, `/`, `%`, written as `GET UP`, `GET DOWN`, `YOU'RE FIRED`, etc.
- Comparisons and logical operators (`AND` = `KNOCK KNOCK`, `OR` = `CONSIDER THAT A DIVORCE`)
- `if`/`else` and `while` control flow
- Method calls with argument passing and return value capture
- A sample program suite including FizzBuzz, ASCII art, and utility methods
Architecture: ArnoldC.g4 → ANTLR-generated ArnoldCLexer/ArnoldCParser → IntArnoldC visitor → Conf runtime environment + Method registry
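The assignment block in the sample program (`HERE IS MY INVITATION 3` followed by `GET UP 4`) reads as a running accumulator: the first operand is loaded, then each operation is applied in the order written. The sketch below illustrates that evaluation order, assuming the interpreter follows the original ArnoldC semantics. `ArnoldAccumulator`, `Step`, and `evaluate` are hypothetical names for illustration and are not classes from this repository; only the three keywords named above are handled.

```java
import java.util.List;

final class ArnoldAccumulator {
    /** One operation line: "GET UP 4" becomes new Step("GET UP", 4). Hypothetical type. */
    record Step(String keyword, int operand) {}

    /** Loads the initial operand, then applies each step strictly in source order. */
    static int evaluate(int initial, List<Step> steps) {
        int acc = initial;
        for (Step s : steps) {
            switch (s.keyword()) {
                case "GET UP" -> acc += s.operand();       // +
                case "GET DOWN" -> acc -= s.operand();     // -
                case "YOU'RE FIRED" -> acc *= s.operand(); // *
                default -> throw new IllegalArgumentException("Unknown keyword: " + s.keyword());
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        // HERE IS MY INVITATION 3 / GET UP 4
        System.out.println(evaluate(3, List.of(new Step("GET UP", 4)))); // prints 7

        // Steps apply left to right: (3 + 4) * 2
        System.out.println(evaluate(3, List.of(
                new Step("GET UP", 4),
                new Step("YOU'RE FIRED", 2)))); // prints 14
    }
}
```

Because each step updates a single accumulator, there is no precedence to resolve: the second call yields 14, not 11.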
A complete interpreter for Brainfuck, one of the most minimal Turing-complete languages ever designed.
The entire language is 8 symbols:
| Symbol | Operation |
|---|---|
| `>` | Move tape pointer right |
| `<` | Move tape pointer left |
| `+` | Increment current cell |
| `-` | Decrement current cell |
| `.` | Output current cell as ASCII char |
| `,` | Read one byte of input into current cell |
| `[` | Jump past matching `]` if current cell is zero |
| `]` | Jump back to matching `[` if current cell is non-zero |
Implementation highlights:
- Sparse tape modeled as a `HashMap<Integer, Integer>`: infinite in both directions, zero-initialized
- Loop semantics handled recursively through the parse tree, with the ANTLR grammar naturally encoding nesting
- Any non-command character is silently skipped via the `EXTRA : . -> skip` lexer rule
Architecture: Brainfuck.g4 → ANTLR-generated parser → Brainfuck.java visitor with tape state
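The combination of a sparse tape and recursive loop handling can be sketched without ANTLR. The example below is an illustration under stated assumptions, not the repository's `Brainfuck.java`: it builds a small nested tree by hand (standing in for the ANTLR parse tree), omits the `,` input command, and does not wrap cell values. `MiniBrainfuck`, `Node`, and `run` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MiniBrainfuck {
    /** Either a single command, or a loop ('[') carrying its nested body. */
    record Node(char command, List<Node> body) {}

    private final Map<Integer, Integer> tape = new HashMap<>(); // sparse: missing cells read as 0
    private final StringBuilder output = new StringBuilder();
    private int pointer = 0; // may go negative
    private int pos = 0;     // parse position in the source

    /** Parses until end of input or an unconsumed ']'; recursion mirrors loop nesting. */
    private List<Node> parse(String src) {
        List<Node> nodes = new ArrayList<>();
        while (pos < src.length() && src.charAt(pos) != ']') {
            char c = src.charAt(pos++);
            if (c == '[') {
                List<Node> body = parse(src);
                if (pos >= src.length()) throw new IllegalArgumentException("Unmatched '['");
                pos++; // consume the matching ']'
                nodes.add(new Node('[', body));
            } else if ("><+-.".indexOf(c) >= 0) {
                nodes.add(new Node(c, List.of()));
            } // any other character is skipped, like a comment
        }
        return nodes;
    }

    /** Executes a node list; a loop re-executes its body while the current cell is non-zero. */
    private void execute(List<Node> nodes) {
        for (Node n : nodes) {
            int cell = tape.getOrDefault(pointer, 0);
            switch (n.command()) {
                case '>' -> pointer++;
                case '<' -> pointer--;
                case '+' -> tape.put(pointer, cell + 1);
                case '-' -> tape.put(pointer, cell - 1);
                case '.' -> output.append((char) cell);
                case '[' -> {
                    while (tape.getOrDefault(pointer, 0) != 0) execute(n.body());
                }
                default -> throw new IllegalStateException("Unexpected node: " + n.command());
            }
        }
    }

    /** Runs a program and returns everything it printed. */
    static String run(String src) {
        MiniBrainfuck bf = new MiniBrainfuck();
        List<Node> program = bf.parse(src);
        if (bf.pos != src.length()) throw new IllegalArgumentException("Unmatched ']'");
        bf.execute(program);
        return bf.output.toString();
    }

    public static void main(String[] args) {
        // 8 * 8 = 64 in cell 1, plus 1 = 65, the ASCII code for 'A'
        System.out.println(run("++++++++[>++++++++<-]>+.")); // prints A
    }
}
```

Since loops are tree nodes rather than jump targets, no bracket-matching table is needed at run time; the nesting is resolved once, during parsing.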
An interpreter for Imp, a textbook-style imperative language used in formal semantics courses to illustrate operational and denotational semantics.
x = 5;
while (x > 0) {
out(x);
x = x - 1
}
What it supports:
- Integer (`NAT`), boolean, string, and array types
- Full expression grammar with correct operator precedence (including right-associative `^` for exponentiation)
- Variable assignment and indexed array assignment (`arr[i] = expr`)
- `if`/`else`, `while`, `skip` (no-op), and sequencing via `;`
- `out(expr)` for output; `tostr(expr)` for type coercion; `.` for string concatenation
- String escape sequences (`\n`, `\t`, `\"`, etc.)
- A rich set of sample programs: factorial, array squaring, operator precedence tests, string manipulation
Architecture: Imp.g4 → IntImp visitor → Conf environment (variable store + array cell map)
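One way to picture an environment that combines a variable store with an array cell map is two hash maps, the second keyed by array name plus index. The sketch below is an assumption about the general shape, not the repository's `Conf.java`: `ConfSketch`, `Cell`, and all method names are hypothetical, and values are plain `int`s rather than the interpreter's value classes.

```java
import java.util.HashMap;
import java.util.Map;

final class ConfSketch {
    /** Key for one array cell; records supply equals/hashCode, so they work as map keys. */
    record Cell(String array, int index) {}

    private final Map<String, Integer> variables = new HashMap<>();
    private final Map<Cell, Integer> cells = new HashMap<>();

    void set(String name, int value) {
        variables.put(name, value);
    }

    int get(String name) {
        Integer v = variables.get(name);
        if (v == null) throw new IllegalStateException("Undefined variable: " + name);
        return v;
    }

    void setCell(String array, int index, int value) {
        cells.put(new Cell(array, index), value);
    }

    int getCell(String array, int index) {
        Integer v = cells.get(new Cell(array, index));
        if (v == null) throw new IllegalStateException("Undefined cell: " + array + "[" + index + "]");
        return v;
    }

    public static void main(String[] args) {
        ConfSketch conf = new ConfSketch();
        // Mirrors: x = 5; while (x > 0) { out(x); x = x - 1 }
        conf.set("x", 5);
        while (conf.get("x") > 0) {
            System.out.println(conf.get("x")); // prints 5, 4, 3, 2, 1 on separate lines
            conf.set("x", conf.get("x") - 1);
        }
        // Mirrors an indexed assignment such as arr[2] = 9
        conf.setCell("arr", 2, 9);
        System.out.println(conf.getCell("arr", 2)); // prints 9
    }
}
```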
source file
│
▼
Lexer (.g4 token rules)
│ tokenizes input
▼
Parser (.g4 grammar rules)
│ builds parse tree
▼
Visitor (IntXxx.java)
│ walks the tree, executes semantics
▼
Runtime (Conf.java + value/* classes)
│ stores variables, tracks state
▼
Output
All three interpreters follow the same structure: a .g4 grammar, ANTLR-generated lexer/parser/visitor base classes, and a hand-written visitor that implements the language's operational semantics.
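The tree-plus-visitor structure described above can be illustrated with a hand-written miniature that needs no ANTLR runtime. In the sketch below, the `Node` records stand in for ANTLR's generated parse-tree contexts and `Visitor<T>` stands in for the generated visitor interface; every name (`VisitorSketch`, `Num`, `Add`, `Mul`, `Evaluator`, `eval`) is hypothetical and chosen only for illustration.

```java
final class VisitorSketch {
    /** Tree nodes only know how to dispatch to a visitor. */
    interface Node {
        <T> T accept(Visitor<T> v);
    }

    /** One visit method per node type, analogous to a generated visitor interface. */
    interface Visitor<T> {
        T visitNum(Num n);
        T visitAdd(Add n);
        T visitMul(Mul n);
    }

    record Num(int value) implements Node {
        public <T> T accept(Visitor<T> v) { return v.visitNum(this); }
    }

    record Add(Node left, Node right) implements Node {
        public <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    record Mul(Node left, Node right) implements Node {
        public <T> T accept(Visitor<T> v) { return v.visitMul(this); }
    }

    /** The semantics live here, outside the node classes. */
    static final class Evaluator implements Visitor<Integer> {
        public Integer visitNum(Num n) { return n.value(); }
        public Integer visitAdd(Add n) { return n.left().accept(this) + n.right().accept(this); }
        public Integer visitMul(Mul n) { return n.left().accept(this) * n.right().accept(this); }
    }

    static int eval(Node tree) {
        return tree.accept(new Evaluator());
    }

    public static void main(String[] args) {
        // Tree for 3 + 4 * 2, with the multiplication nested deeper than the addition
        Node tree = new Add(new Num(3), new Mul(new Num(4), new Num(2)));
        System.out.println(eval(tree)); // prints 11
    }
}
```

A second visitor (for example, a pretty-printer) could be added without touching the node classes, which is the property that lets the interpreters leave the generated code unmodified.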
Each module is self-contained with its own ANTLR4 runtime JAR. Commands are for Windows PowerShell; on Linux/macOS, replace `;` with `:` in classpaths and `\` with `/` in source paths.
cd ArnoldC
javac -cp ".;antlr4-runtime-4.13.1.jar" -d out src\value\*.java src\*.java
java -cp "out;src;antlr4-runtime-4.13.1.jar" Main prog

cd Brainfuck
javac -cp ".;antlr4-runtime-4.13.1.jar" -d out src\lab3\*.java
java -cp "out;src;antlr4-runtime-4.13.1.jar" lab3.Main lab3/prog

cd Imp
javac -cp ".;antlr4-runtime-4.13.1.jar" -d out src\less5\value\*.java src\less5\*.java
java -cp "out;src;antlr4-runtime-4.13.1.jar" less5.Main less5/factorial

Note: The ANTLR-generated source files (Lexer, Parser, Visitor) are committed to each `src/` directory for convenience; no separate code generation step is required to run the interpreters.
ANTLR4-interpreters/
├── ArnoldC/
│ ├── src/
│ │ ├── ArnoldC.g4 # Grammar
│ │ ├── IntArnoldC.java # Interpreter (visitor)
│ │ ├── Conf.java # Runtime environment
│ │ ├── Method.java # Method model
│ │ ├── value/ # Type hierarchy (Int, Bool, Str, Com)
│ │ ├── prog # Sample programs
│ │ └── ArnoldC{Lexer,Parser,BaseVisitor,Visitor}.java # Generated
│ └── antlr4-runtime-4.13.1.jar
├── Brainfuck/
│ ├── src/lab3/
│ │ ├── Brainfuck.g4 # Grammar
│ │ ├── Brainfuck.java # Interpreter (visitor + tape)
│ │ ├── prog # Sample program
│ │ └── Brainfuck{Lexer,Parser,BaseVisitor,Visitor}.java # Generated
│ └── antlr4-runtime-4.13.1.jar
├── Imp/
│ ├── src/less5/
│ │ ├── Imp.g4 # Grammar
│ │ ├── IntImp.java # Interpreter (visitor)
│ │ ├── Conf.java # Variable + array environment
│ │ ├── value/ # Type hierarchy (Nat, Bool, Str, Arr, Com)
│ │ ├── factorial, Square, strings, exp, ... # Sample programs
│ │ └── Imp{Lexer,Parser,BaseVisitor,Visitor}.java # Generated
│ └── antlr4-runtime-4.13.1.jar
├── assets/
│ └── banner.svg
└── docs/
└── FILE_ANALYSIS.md
Typed value hierarchy over raw Java types. The ArnoldC and Imp interpreters use a sealed Value class hierarchy rather than raw Object maps. This makes type errors visible at the interpreter level and keeps visitor return types uniform.
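A sealed hierarchy of this kind can be sketched in a few lines of Java 17. The class names `IntValue`, `BoolValue`, and `StrValue` come from the ArnoldC feature list; everything else here (`ValueSketch`, the record fields, `asInt`, `add`) is an assumption made for illustration, and the repository's actual classes may be shaped differently.

```java
final class ValueSketch {
    /** Closed set of runtime values: no other implementation can be added elsewhere. */
    sealed interface Value permits IntValue, BoolValue, StrValue {}

    record IntValue(int value) implements Value {}
    record BoolValue(boolean value) implements Value {}
    record StrValue(String value) implements Value {}

    /** Unwraps an integer, or reports a type error raised by the interpreter itself. */
    static int asInt(Value v) {
        if (v instanceof IntValue i) {
            return i.value();
        }
        throw new IllegalStateException(
                "Type error: expected integer, got " + v.getClass().getSimpleName());
    }

    /** Every operation takes and returns Value, so visitor methods share one return type. */
    static Value add(Value left, Value right) {
        return new IntValue(asInt(left) + asInt(right));
    }

    public static void main(String[] args) {
        System.out.println(add(new IntValue(3), new IntValue(4))); // prints IntValue[value=7]
        try {
            add(new StrValue("three"), new IntValue(4));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints Type error: expected integer, got StrValue
        }
    }
}
```

With a raw `Object` store, the second call would surface as a `ClassCastException` somewhere inside the interpreter; here the failure is a deliberate, language-level type error.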
Sparse tape for Brainfuck. Rather than a fixed-size array, the tape is a HashMap<Integer, Integer>. The tape is therefore effectively unbounded in both directions (limited only by the range of int indices), and unvisited cells, which read as zero, take no memory.
Grammar-encoded operator precedence in Imp. The Imp.g4 grammar handles full arithmetic precedence — including right-associative exponentiation (<assoc=right> exp POW exp) — directly in ANTLR rule ordering, avoiding any precedence climbing logic in the interpreter itself.
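To see why the associativity of exponentiation matters, take `2 ^ 3 ^ 2`. Grouped from the right it is 2 ^ (3 ^ 2) = 2 ^ 9 = 512; grouped from the left it would be (2 ^ 3) ^ 2 = 8 ^ 2 = 64. The sketch below computes both groupings with plain loops. It does not reproduce the grammar or the interpreter, and `PowAssoc`, `pow`, `foldRight`, and `foldLeft` are hypothetical helper names.

```java
final class PowAssoc {
    /** Integer power by repeated multiplication; assumes exp >= 0 and no overflow. */
    static long pow(long base, long exp) {
        long result = 1;
        for (long i = 0; i < exp; i++) {
            result *= base;
        }
        return result;
    }

    /** Right-associative grouping: a ^ (b ^ (c ...)). Requires at least one operand. */
    static long foldRight(long... operands) {
        long acc = operands[operands.length - 1];
        for (int i = operands.length - 2; i >= 0; i--) {
            acc = pow(operands[i], acc);
        }
        return acc;
    }

    /** Left-associative grouping: ((a ^ b) ^ c) ... Requires at least one operand. */
    static long foldLeft(long... operands) {
        long acc = operands[0];
        for (int i = 1; i < operands.length; i++) {
            acc = pow(acc, operands[i]);
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(foldRight(2, 3, 2)); // prints 512
        System.out.println(foldLeft(2, 3, 2));  // prints 64
    }
}
```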
Newline as statement separator in ArnoldC. The ArnoldC grammar treats NL as a significant token (not whitespace), which matches the original language's line-oriented structure and makes the grammar more faithful to its specification.
- Java 17+
- ANTLR4 4.13.1: grammar specification and parser generation
- Visitor pattern: parse-tree interpretation without modifying generated classes
- IntelliJ IDEA: project structure (`.idea/` metadata in each module)
- ANTLR4 — ANother Tool for Language Recognition
- lhartikk/ArnoldC — original ArnoldC specification
- Nipkow & Klein, Concrete Semantics — theoretical foundation for Imp
Released under the MIT License.