What is Scenario?

Scenario is an Agent Testing Framework based on simulations

Engineers and Researchers use Scenario in combination with evals to guarantee the 3 levels of agent quality:

Level 1: Unit tests
Traditional unit and integration software tests to guarantee that e.g. the agent tools are working correctly from a software point of view
Level 2: Evals, Finetuning and Prompt Optimization
Measuring the performance of individual non-deterministic components of the agent, for example maximizing RAG accuracy with evals, or approximating human preference with GRPO
Level 3: Agent Simulations
End-to-end testing of the agent in different scenarios and edge cases, guaranteeing the whole agent achieves more than the sum of its parts, simulating a wide range of situations

Scenario is a must-have tool for agent development, it allows you to change your agent prompts, tools and structure while making sure regressions don't happen, it does not require a dataset, and it's compatible with all AI Agent Frameworks.

Getting Started

If you are new to Scenario, you can start by writing your first scenario, then learn how to integrate your agent and dive deeper into the core concepts.

Getting Started

Your First Scenario

Agent Integration

Integrate your agent with Scenario

Core Concepts

Learn the core concepts and capabilities

Why Scenario?

Scenario is the most advanced and flexible agent testing framework, the library's agnostic design makes it incredibly simple to learn and use. Here are some of the key features:

Test real agent behavior by simulating users in different scenarios and edge cases
Evaluate and judge at any point of the conversation, powerful multi-turn control
Combine it with any LLM eval framework or custom evals, agnostic by design
Integrate your Agent by implementing just one call() method
Available in Python, TypeScript and Go

Read more about the reasoning behind Scenario in Simulation-Based Testing.

Dive deeper

Writing Scenarios

Learn how to create effective scenario tests

Scripted Simulations

Precise control over the conversation flow

Debug Mode

Step through scenarios interactively

Configuration

Project config, env vars, and logging

Test Runner Integration

Integrate with Vitest and Jest