FORGE is a comprehensive test harness for CI-first, reproducible, and observable performance and scale testing of AI/ML workloads, specifically targeting OpenShift platforms. It is developed and maintained by the Red Hat PSAP (Performance and Scale for AI Platforms) team.
FORGE enables systematic performance and scale testing of AI/ML workloads with:
- Reproducible testing: Consistent test environments and methodologies
- Observable results: Comprehensive metrics collection and visualization
- CI/CD integration: Automated testing pipelines for continuous validation
- Scalability analysis: Performance characteristics across different scales
- OpenShift optimization: Tailored for container orchestration platforms
FORGE works in cooperation with Fournos to provide a complete testing ecosystem for AI workloads.
- core: Fundamental framework components (DSL, launcher, CI entrypoint, notifications)
- caliper: Artifact post-processing engine for parsing, visualization, and KPI analysis
- fournos_launcher: Integration with Fournos for orchestration
```bash
# Clone the repository
git clone https://github.com/openshift-psap/forge.git
cd forge

# Install core dependencies
pip install -e .

# Install with optional backends
pip install -e '.[caliper]'

# Install development dependencies
pip install -e '.[dev]'
```

FORGE provides a containerized development environment using the forge_launcher:
```bash
# Check environment status
./bin/forge_launcher status

# Build the FORGE container image
./bin/forge_launcher build

# Create/recreate the development container
./bin/forge_launcher recreate

# Enter the containerized development environment
./bin/forge_launcher enter

# Run commands directly in the container
./bin/forge_launcher enter "python -m pytest"
```

```bash
./projects/skeleton/orchestration/ci.py --help
./projects/skeleton/orchestration/ci.py prepare
./projects/skeleton/orchestration/ci.py test
```

```bash
./projects/skeleton/orchestration/cli.py --help
./projects/skeleton/orchestration/cli.py precleanup
./projects/skeleton/orchestration/cli.py prepare
./projects/skeleton/orchestration/cli.py test
```

- DSL (Domain Specific Language): Test definition and configuration
- CI Integration: Continuous integration entrypoints
- Notification System: Alert and reporting mechanisms
- Image Management: Container orchestration support
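The orchestration entrypoints shown above run stages in a fixed order. As an illustration only, a minimal Python sketch of chaining those stages and aborting on the first failure (the stage names mirror the cli.py commands above; the wrapper itself is hypothetical, not part of FORGE):

```python
import subprocess

def run_stage(cli: str, stage: str) -> None:
    """Run one orchestration stage; raises CalledProcessError on failure."""
    print(f">>> {cli} {stage}")
    subprocess.run([cli, stage], check=True)

def run_pipeline(cli: str = "./projects/skeleton/orchestration/cli.py",
                 stages=("precleanup", "prepare", "test")) -> None:
    # Stages run in order; the first non-zero exit aborts the pipeline.
    for stage in stages:
        run_stage(cli, stage)
```

In CI, the same sequencing is typically handled by the ci.py entrypoint itself; this sketch just makes the stage ordering and fail-fast behavior explicit.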
- Parse: Traverse and parse test artifact trees
- Visualize: Generate plots and HTML reports from unified models
- KPI Management: Generate, import, export, and analyze key performance indicators
- Multi-backend Support: Export to OpenSearch, S3, and MLflow
- AI Evaluation: Export AI evaluation metrics in JSON format
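Caliper's actual KPI schema is not spelled out here; as an illustration of the generate/export workflow, a minimal sketch of serializing KPI records to JSON (the field names and the example KPI are assumptions, not Caliper's real model):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class KPI:
    """Hypothetical KPI record; field names are illustrative only."""
    name: str
    value: float
    unit: str
    labels: dict

def export_kpis(kpis, path):
    # Serialize the records as a JSON list of objects.
    with open(path, "w") as f:
        json.dump([asdict(k) for k in kpis], f, indent=2)

# Example record (hypothetical metric name and labels)
kpis = [KPI("time_to_first_token", 38.2, "ms", {"model": "example-model"})]
```

A flat JSON structure like this is straightforward to index into backends such as OpenSearch or to log as MLflow metrics, which is the role the multi-backend exporters play.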
```
forge/
├── projects/                # Main project modules
│   ├── caliper/             # Artifact post-processing
│   ├── core/                # Framework core components
│   ├── matrix_benchmarking/ # Performance dashboards
│   ├── llm_d/               # LLM deployment tools
│   ├── fournos_launcher/    # Fournos integration
│   └── skeleton/            # Project templates
├── docs/                    # Documentation
├── specs/                   # Technical specifications
├── bin/                     # Executable scripts
├── tests/                   # Test suites
└── vaults/                  # Configuration vaults
```
Core Requirements:
- Python 3.12+
- Click (CLI framework)
- PyYAML (configuration)
- JSONSchema (validation)
- Pydantic (data models)
Optional Backends:
- OpenSearch: For KPI indexing and search (`opensearch-py`)
- MLflow: For experiment tracking (`mlflow`)
Visualization:
- Plotly/Dash: Interactive dashboards
- Pandas: Data processing
```bash
# Run unit tests
pytest projects/core/tests/

# Run with coverage
pytest --cov=projects projects/core/tests/

# Run integration tests
pytest -m integration

# Run performance tests (slow)
pytest -m slow
```

```bash
# Install pre-commit hooks
pre-commit install

# Run code formatting
ruff format projects/
ruff check projects/
```

- Ruff: An extremely fast Python linter and code formatter
- Target: Python 3.12+ compatibility
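Pre-commit hooks like the ones above are driven by a repository-level config; a hedged sketch of what a Ruff-based `.pre-commit-config.yaml` might look like (the pinned revision is a placeholder, and the repository's actual hook set may differ):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0  # placeholder; pin to the revision the project actually uses
    hooks:
      - id: ruff         # lint
      - id: ruff-format  # format
```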
- Specifications: `/specs/` - Detailed technical specifications
- Quickstart Guides: `/specs/*/quickstart.md` - Getting started guides
- API Documentation: Auto-generated from docstrings
- Documentation: `/docs/` - Usage examples and tutorials
- Fournos: Job orchestration and execution platform
Licensed under the Apache License 2.0. See LICENSE for details.
Maintained by: Red Hat PSAP Team (psap@redhat.com)
Key Contributors:
- Kevin Pouget (@kpouget)
- Alberto Perdomo (@albertoperdomo2)
- See OWNERS for the complete list
For issues, feature requests, or contributions, please use the GitHub issue tracker.
Keywords: testing, performance, scale, openshift, ai, mlops, benchmarking, ci-cd