What is Scenario?

Engineers and Researchers use Scenario in combination with evals to guarantee the 3 levels of agent quality:
-
Level 1: Unit tests
Traditional unit and integration software tests to guarantee that e.g. the agent tools are working correctly from a software point of view -
Level 2: Evals, Finetunning and Prompt Optimization
Measuring the performance of individual non-deterministic components of the agent, for example maximizing RAG accuracy with evals, or approximating human preference with GRPO -
Level 3: Agent Simulations
End-to-end testing of the agent in different scenarios and edge cases, guaranteeing the whole agent achieves more than the sum of its parts, simulating a wide range of situations
Scenario is a must-have tool for agent development, it allows you to change your agent prompts, tools and structure while making sure regressions don't happen, it does not require a dataset, and it's compatible with all AI Agent Frameworks.
Getting Started
If you are new to Scenario, you can start by writing your first scenario, then learn how to integrate your agent and dive deeper into the core concepts.
Getting Started
Your First Scenario
Agent Integration
Integrate your agent with Scenario
Core Concepts
Learn the core concepts and capabilities
Why Scenario?
Scenario is the most advanced and flexible agent testing framework, the library's agnostic design makes it incredibly simple to learn and use. Here are some of the key features:
- Test real agent behavior by simulating users in different scenarios and edge cases
- Evaluate and judge at any point of the conversation, powerful multi-turn control
- Combine it with any LLM eval framework or custom evals, agnostic by design
- Integrate your Agent by implementing just one
call()
method - Available in Python, TypeScript and Go
Read more about the reasoning behind Scenario in Simulation-Based Testing.