Skip to content

Multimodal Use Cases – Overview

Many modern agents must process more than just text. Scenario supports tests where your agent receives images, files, audio, and other modalities – individually or combined.

This section shows how to structure such tests, common pitfalls, and best-practices for reliable evaluation.

Available Guides

  • Images – complete working example with fixture image and judge criteria.
  • Image Generationcoming soon
  • Filescoming soon (PDF, CSV, etc.).
  • Audiocoming soon (voice notes, phone calls, etc.).

Next Steps

Pick a modality from the list above and follow the dedicated guide.