smoltuna/ANTLR4-interpreters

ANTLR V4 Language interpreters

Three fully functional language interpreters built from scratch using ANTLR4 — covering the full pipeline from formal grammar specification to working runtime execution in Java.

Each interpreter was implemented as part of a Programming Languages course at the University of Verona. The project spans esoteric, imperative, and joke languages, and demonstrates hands-on mastery of compiler front-end theory: lexing, parsing, AST traversal via the Visitor pattern, runtime environments, and type systems.


What's Inside

🔴 ArnoldC

An interpreter for ArnoldC, the language where every keyword is an Arnold Schwarzenegger movie quote.

IT'S SHOWTIME
  HEY CHRISTMAS TREE result
  YOU SET US UP 0
  GET TO THE CHOPPER result
    HERE IS MY INVITATION 3
    GET UP 4
  ENOUGH TALK
  TALK TO THE HAND result
YOU HAVE BEEN TERMINATED

What it supports:

  • Named methods with typed arguments and return values (LISTEN TO ME VERY CAREFULLY / HASTA LA VISTA, BABY)
  • Integer, boolean, and string types with a clean Value hierarchy (IntValue, BoolValue, StrValue)
  • Full arithmetic: +, -, *, /, % — written as GET UP, GET DOWN, YOU'RE FIRED, etc.
  • Comparisons and logical operators (AND = KNOCK KNOCK, OR = CONSIDER THAT A DIVORCE)
  • if/else and while control flow
  • Method calls with argument passing and return value capture
  • A sample program suite including FizzBuzz, ASCII art, and utility methods

Architecture: ArnoldC.g4 → ANTLR-generated ArnoldCLexer/ArnoldCParser → IntArnoldC visitor → Conf runtime environment + Method registry


🟢 Brainfuck

A complete interpreter for Brainfuck, one of the most minimal Turing-complete languages ever designed.

The entire language is 8 symbols:

Symbol  Operation
>       Move tape pointer right
<       Move tape pointer left
+       Increment current cell
-       Decrement current cell
.       Output current cell as ASCII char
,       Read one byte of input into current cell
[       Jump past matching ] if current cell is zero
]       Jump back to matching [ if current cell is non-zero

Implementation highlights:

  • Sparse tape modeled as a HashMap<Integer, Integer> — infinite in both directions, zero-initialized
  • Loop semantics handled recursively through the parse tree, with the ANTLR grammar naturally encoding nesting
  • Any non-command character is silently skipped via the EXTRA : . -> skip lexer rule

Architecture: Brainfuck.g4 → ANTLR-generated parser → Brainfuck.java visitor with tape state
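
The two highlights above (sparse tape, recursive loop handling) can be illustrated with a minimal sketch. It is not the repository's Brainfuck.java: the class name BrainfuckSketch and its methods are hypothetical, and it recurses over the raw program text rather than an ANTLR parse tree. Cells are plain ints with no byte wrap-around, and the , command is ignored to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ANTLR visitor: it recurses over the raw program
// text instead of a parse tree, but the loop handling has the same shape.
final class BrainfuckSketch {
    private final Map<Integer, Integer> tape = new HashMap<>(); // sparse, zero by default
    private final StringBuilder out = new StringBuilder();
    private int ptr = 0;

    // Runs a program and returns everything it printed.
    static String run(String program) {
        BrainfuckSketch bf = new BrainfuckSketch();
        bf.exec(program, 0, program.length());
        return bf.out.toString();
    }

    // Executes program[from, to); a loop body is handled by a recursive call.
    private void exec(String p, int from, int to) {
        for (int i = from; i < to; i++) {
            switch (p.charAt(i)) {
                case '>' -> ptr++;
                case '<' -> ptr--;
                case '+' -> tape.merge(ptr, 1, Integer::sum);
                case '-' -> tape.merge(ptr, -1, Integer::sum);
                case '.' -> out.append((char) cell());
                case '[' -> {
                    int close = matching(p, i);
                    while (cell() != 0) exec(p, i + 1, close);
                    i = close; // resume after the matching ']'
                }
                default -> { } // ',' and non-command characters are ignored here
            }
        }
    }

    // Unvisited cells (including negative indices) read as zero.
    private int cell() { return tape.getOrDefault(ptr, 0); }

    // Finds the index of the ']' that closes the '[' at position open.
    private static int matching(String p, int open) {
        int depth = 0;
        for (int i = open; i < p.length(); i++) {
            if (p.charAt(i) == '[') depth++;
            else if (p.charAt(i) == ']' && --depth == 0) return i;
        }
        throw new IllegalArgumentException("unmatched [ at " + open);
    }
}
```

For example, BrainfuckSketch.run("++++++++[>++++++++<-]>+.") returns "A": the loop adds 8 to the second cell eight times, one more increment gives 65, and . prints it as a character.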


🔵 Imp

An interpreter for Imp, a textbook-style imperative language used in formal semantics courses to illustrate operational and denotational semantics.

x = 5;
while (x > 0) {
  out(x);
  x = x - 1
}

What it supports:

  • Integer (NAT), boolean, string, and array types
  • Full expression grammar with correct operator precedence (including right-associative ^ for exponentiation)
  • Variable assignment and indexed array assignment (arr[i] = expr)
  • if/else, while, skip (no-op), and sequencing via ;
  • out(expr) for output; tostr(expr) for type coercion; . for string concatenation
  • String escape sequences (\n, \t, \", etc.)
  • A rich set of sample programs: factorial, array squaring, operator precedence tests, string manipulation

Architecture: Imp.g4 → IntImp visitor → Conf environment (variable store + array cell map)
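
As a rough illustration of a "variable store + array cell map" environment, here is a minimal sketch. The class ConfSketch and all of its methods are hypothetical and not taken from the repository's Conf.java. It stores plain ints for brevity, whereas the real interpreter works with typed values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a Conf-style environment; values are plain ints here
// for brevity, whereas the README describes a typed value hierarchy.
final class ConfSketch {
    private final Map<String, Integer> vars = new HashMap<>();
    private final Map<String, Map<Integer, Integer>> arrays = new HashMap<>();

    // x = value
    void set(String name, int value) { vars.put(name, value); }

    // Reads x, failing loudly if it was never assigned.
    int get(String name) {
        Integer v = vars.get(name);
        if (v == null) throw new IllegalStateException("undefined variable: " + name);
        return v;
    }

    // arr[index] = value; the cell map for arr is created on first write.
    void setCell(String array, int index, int value) {
        arrays.computeIfAbsent(array, k -> new HashMap<>()).put(index, value);
    }

    // Reads arr[index], failing loudly if the cell was never written.
    int getCell(String array, int index) {
        Map<Integer, Integer> cells = arrays.get(array);
        if (cells == null || !cells.containsKey(index)) {
            throw new IllegalStateException("undefined cell: " + array + "[" + index + "]");
        }
        return cells.get(index);
    }

    public static void main(String[] args) {
        // Mirrors the countdown program: x = 5; while (x > 0) { out(x); x = x - 1 }
        ConfSketch conf = new ConfSketch();
        conf.set("x", 5);
        while (conf.get("x") > 0) {
            System.out.println(conf.get("x"));
            conf.set("x", conf.get("x") - 1);
        }
    }
}
```

Keeping array cells in a separate map keyed by index is one way to support arr[i] = expr without declaring array sizes up front; whether the repository does exactly this is an assumption.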


Interpreter Pipeline (all three modules)

source file
    │
    ▼
 Lexer (.g4 token rules)
    │  tokenizes input
    ▼
 Parser (.g4 grammar rules)
    │  builds parse tree
    ▼
 Visitor (IntXxx.java)
    │  walks the tree, executes semantics
    ▼
 Runtime (Conf.java + value/* classes)
    │  stores variables, tracks state
    ▼
 Output

All three interpreters follow the same structure: a .g4 grammar, ANTLR-generated lexer/parser/visitor base classes, and a hand-written visitor that implements the language's operational semantics.


Build & Run

Each module is self-contained with its own ANTLR4 runtime JAR. Commands are for Windows PowerShell; on Linux/macOS, replace ; with : in classpaths and \ with / in source paths.

ArnoldC

cd ArnoldC
javac -cp ".;antlr4-runtime-4.13.1.jar" -d out src\value\*.java src\*.java
java -cp "out;src;antlr4-runtime-4.13.1.jar" Main prog

Brainfuck

cd Brainfuck
javac -cp ".;antlr4-runtime-4.13.1.jar" -d out src\lab3\*.java
java -cp "out;src;antlr4-runtime-4.13.1.jar" lab3.Main lab3/prog

Imp

cd Imp
javac -cp ".;antlr4-runtime-4.13.1.jar" -d out src\less5\value\*.java src\less5\*.java
java -cp "out;src;antlr4-runtime-4.13.1.jar" less5.Main less5/factorial

Note: The ANTLR-generated source files (Lexer, Parser, Visitor) are committed to each src/ directory for convenience — no separate code generation step is required to run the interpreters.


Repository Structure

ANTLR4-interpreters/
├── ArnoldC/
│   ├── src/
│   │   ├── ArnoldC.g4          # Grammar
│   │   ├── IntArnoldC.java     # Interpreter (visitor)
│   │   ├── Conf.java           # Runtime environment
│   │   ├── Method.java         # Method model
│   │   ├── value/              # Type hierarchy (Int, Bool, Str, Com)
│   │   ├── prog                # Sample programs
│   │   └── ArnoldC{Lexer,Parser,BaseVisitor,Visitor}.java  # Generated
│   └── antlr4-runtime-4.13.1.jar
├── Brainfuck/
│   ├── src/lab3/
│   │   ├── Brainfuck.g4        # Grammar
│   │   ├── Brainfuck.java      # Interpreter (visitor + tape)
│   │   ├── prog                # Sample program
│   │   └── Brainfuck{Lexer,Parser,BaseVisitor,Visitor}.java # Generated
│   └── antlr4-runtime-4.13.1.jar
├── Imp/
│   ├── src/less5/
│   │   ├── Imp.g4              # Grammar
│   │   ├── IntImp.java         # Interpreter (visitor)
│   │   ├── Conf.java           # Variable + array environment
│   │   ├── value/              # Type hierarchy (Nat, Bool, Str, Arr, Com)
│   │   ├── factorial, Square, strings, exp, ...  # Sample programs
│   │   └── Imp{Lexer,Parser,BaseVisitor,Visitor}.java      # Generated
│   └── antlr4-runtime-4.13.1.jar
├── assets/
│   └── banner.svg
└── docs/
    └── FILE_ANALYSIS.md

Key Design Decisions

Typed value hierarchy over raw Java types. ArnoldC and Imp each use a sealed Value class hierarchy (their value/ packages) rather than raw Object maps; Brainfuck needs only integer cells. This makes type errors visible at the interpreter level and keeps visitor return types uniform.
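
A minimal sketch of what such a hierarchy might look like, assuming Java 17 sealed interfaces and records. The names IntValue, BoolValue, and StrValue come from the ArnoldC section; the shape of Value, the ValueSketch class, and the add helper are hypothetical, and the repository's value/ classes may be organized differently.

```java
// Hypothetical sketch: the repository's actual value/* classes may look different.
sealed interface Value permits IntValue, BoolValue, StrValue {}

record IntValue(int v) implements Value {}
record BoolValue(boolean v) implements Value {}
record StrValue(String v) implements Value {}

final class ValueSketch {
    // Adds two values, rejecting non-integer operands at the interpreter level
    // instead of letting a ClassCastException surface from raw Object handling.
    static Value add(Value left, Value right) {
        if (left instanceof IntValue a && right instanceof IntValue b) {
            return new IntValue(a.v() + b.v());
        }
        throw new IllegalStateException(
                "type error: + expects two integers, got " + left + " and " + right);
    }

    public static void main(String[] args) {
        System.out.println(add(new IntValue(3), new IntValue(4))); // prints IntValue[v=7]
    }
}
```

Because every visit method can return Value, the visitor's generic return type stays the same across expressions of different types, which is the uniformity the paragraph above refers to.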

Sparse tape for Brainfuck. Rather than a fixed-size array, the tape is a HashMap<Integer, Integer>. The tape is therefore effectively unbounded in both directions, and unvisited cells read as zero without occupying memory, so programs that move left of the starting cell or far to the right keep working where a fixed-size array would run out of bounds.

Grammar-encoded operator precedence in Imp. The Imp.g4 grammar handles full arithmetic precedence — including right-associative exponentiation (<assoc=right> exp POW exp) — directly in ANTLR rule ordering, avoiding any precedence climbing logic in the interpreter itself.
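
To show what right associativity means for evaluation, here is a small hand-written recursive-descent sketch. It is explicitly not how the repository does it (there ANTLR's <assoc=right> annotation and rule ordering do the work). The PowSketch class, its method names, and the restriction to +, *, ^ over non-negative integers are assumptions made for illustration.

```java
// Hypothetical hand-written parser; the repository relies on ANTLR's
// <assoc=right> annotation instead of code like this.
final class PowSketch {
    private final String src;
    private int pos = 0;

    private PowSketch(String src) { this.src = src.replace(" ", ""); }

    // Evaluates an expression made of non-negative integers, +, * and ^.
    static long eval(String expr) {
        PowSketch p = new PowSketch(expr);
        long v = p.sum();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    // sum := product ('+' product)*     lowest precedence, left-associative
    private long sum() {
        long v = product();
        while (peek('+')) { pos++; v += product(); }
        return v;
    }

    // product := power ('*' power)*     binds tighter than '+'
    private long product() {
        long v = power();
        while (peek('*')) { pos++; v *= power(); }
        return v;
    }

    // power := number ('^' power)?      recursing on the right operand makes
    //                                   a ^ b ^ c parse as a ^ (b ^ c)
    private long power() {
        long base = number();
        if (!peek('^')) return base;
        pos++;
        long exp = power();
        long result = 1;
        for (long i = 0; i < exp; i++) result *= base;
        return result;
    }

    private long number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
        return Long.parseLong(src.substring(start, pos));
    }

    private boolean peek(char c) { return pos < src.length() && src.charAt(pos) == c; }
}
```

With right associativity, PowSketch.eval("2^3^2") is 2^(3^2) = 512; a left-associative reading would give (2^3)^2 = 64.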

Newline as statement separator in ArnoldC. The ArnoldC grammar treats NL as a significant token (not whitespace), which matches the original language's line-oriented structure and makes the grammar more faithful to its specification.


Tech Stack

  • Java 17+
  • ANTLR4 4.13.1 — grammar specification and parser generation
  • Visitor pattern — parse-tree interpretation without modifying generated classes
  • IntelliJ IDEA — project structure (.idea/ metadata in each module)

License

Released under the MIT License.
