Mindrift

Evaluation Scenario Writer - AI Agent Testing Specialist

Location
Job Type
Contract
Salary
Varies based on expertise, skills assessment, location, project needs, and other factors. Rates up to $40/hour. (Estimated)
Posted
1/23/2026
Career Level
Mid-Senior Level
Qualification
Bachelor's Degree in Computer Science or related field preferred
3+ years of software development experience

Job Description

Crafting Effective AI Agent Evaluation Scenarios

As an Evaluation Scenario Writer, you'll play a crucial role in assessing the performance of AI agents. While each project involves unique tasks, contributors may:

  • Create structured test cases that simulate complex human workflows
  • Define gold-standard behavior and scoring logic to evaluate agent actions
  • Analyze agent logs, failure modes, and decision paths
  • Work with code repositories and test frameworks to validate your scenarios
  • Iterate on prompts, instructions, and test cases to improve clarity and difficulty
  • Ensure that scenarios are production-ready, easy to run, and reusable
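
To make the first two tasks concrete, here is a minimal sketch of a scenario described in JSON together with a simple scoring function. The scenario fields (`id`, `instruction`, `gold_actions`), the action names, and the `score_run` helper are all hypothetical and chosen for illustration; real projects define their own formats and scoring rules.

```python
import json

# Hypothetical scenario format: field and action names are illustrative only.
SCENARIO_JSON = """
{
  "id": "refund-request-001",
  "instruction": "Refund order 1042 and notify the customer.",
  "gold_actions": ["lookup_order", "issue_refund", "send_email"]
}
"""


def score_run(gold_actions, agent_actions):
    """Return the fraction of gold actions the agent performed in order.

    Uses a greedy left-to-right match: each gold action must appear in the
    agent log after the previously matched one. Extra agent steps are ignored.
    """
    if not gold_actions:
        return 1.0  # assumption: an empty gold list is trivially satisfied
    position = 0
    matched = 0
    for step in gold_actions:
        try:
            found = agent_actions.index(step, position)
        except ValueError:
            continue  # gold step missing from the rest of the log
        matched += 1
        position = found + 1
    return matched / len(gold_actions)


scenario = json.loads(SCENARIO_JSON)
# Hypothetical agent log: one extra step, and the final email is never sent.
agent_log = ["lookup_order", "check_policy", "issue_refund"]
score = score_run(scenario["gold_actions"], agent_log)
print(f"{scenario['id']}: {score:.2f}")  # prints "refund-request-001: 0.67"
```

An ordered match is only one possible scoring choice. A scenario where step order does not matter would more likely use a set comparison instead.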

Essential Skills for AI Agent Evaluation Scenario Writers

This opportunity is a good fit for software engineers open to part-time, non-permanent projects. Ideally, contributors will have:

  • 3+ years of software development experience with a strong Python focus
  • Experience with Git and code repositories
  • Comfort with structured formats like JSON/YAML for scenario description
  • Understanding of core LLM limitations (hallucinations, bias, context limits) and how these affect evaluation design
  • Familiarity with Docker
  • English proficiency at B2 level

How to Contribute to AI Agent Evaluation with Scenarios

Here’s how it works:

  1. Apply
  2. Pass qualification(s)
  3. Join a project
  4. Complete tasks
  5. Get paid

Tasks for this project are estimated to take 6-10 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Contributions are paid, with rates up to $40/hour*. Projects use either a fixed project rate or individual rates, and some include incentive payments.

*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.
