Freelance AI Evaluation Engineer (Python/Full-Stack)

Mindrift

Location

Kuwait, Kuwait

Job Type

Contract

Salary

$40 per hour

Posted

4/9/2026

Career Level

Mid-Senior Level

Qualification

Degree in Computer Science, Software Engineering or related fields

Remote

5+ years in software development

Job Description

What this opportunity involves

You’ll create challenging coding test cases that push AI coding systems to their limits:
Review and refine coding tasks based on provided production codebases, with realistic scope, requirements, and information sources
Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks
Craft “fair but hard” challenges where the AI has all the context it needs, but has to work for it (information scattered across files and external sources, complex reasoning required)
Analyze AI failures to understand what the model struggles with vs. what it masters
Iterate based on feedback from expert QA reviewers who score your work on 7 quality criteria

What we look for

Degree in Computer Science, Software Engineering or related fields
5+ years in software development, primarily Python (pytest, async/await, subprocess, file operations)
Background in Full-Stack development, with an equal focus on building React-based interfaces and robust Back-end systems
Experience writing tests (functional, integration), not just running them
Experience with Docker containers (running evaluations locally in containers)
CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results)
English proficiency - B2

How it works

Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid

Effort estimate

Tasks for this project are estimated to take 20 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Compensation

On this project, contributors can earn up to $40 per hour equivalent, depending on their level and pace of contribution. Compensation varies across projects depending on scope, complexity, and required expertise. Please note that other projects on the platform may offer different earning levels based on their requirements.

Related Jobs You Might Like

Senior Python Engineer - AI Coding Agent Evaluation (Freelance)

Mindrift

Kuwait, Remote

Contract

Up to $200/hr equivalent

About Mindrift

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.

What This Opportunity Involves

We're building a dataset to evaluate AI coding agents - how well a model handles real-world developer tasks. You'll create challenging tasks and evaluation criteria within realistic simulated environments.
Build realistic developer environments - a virtual company with codebase, infrastructure, and context (tickets, docs, conversations) that forms a believable development history
Design tasks from intermediate states of these environments - craft the prompt, define what 'solved' means, and ensure the task is solvable by an AI agent
Write tests that verify agent solutions - accept all valid approaches and reject incorrect ones, neither too strict nor too lenient
Iterate on tasks and tests based on QA feedback - review agent solutions, analyze failures, and refine until the evaluation is fair and robust

What This Is NOT

Not data labeling
Not prompt engineering
Not writing code from scratch - the agent writes most of the code; you guide and evaluate

What We Look For

8+ years in software development
Core stack: Python (FastAPI), JavaScript/TypeScript (React), Docker, Postgres, Kafka, Redis
Experience writing tests (functional, integration)
English proficiency - B2+

Why This Is Hard

Frontier models are already good at coding. Creating a task that genuinely challenges the best models is non-trivial. You need to deeply understand where models fail and what scenarios reveal the difference between a good and a bad solution.
Tasks have many valid solutions - writing tests that accept all correct solutions and reject incorrect ones is harder than it sounds.

How It Works

Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid

Effort Estimate

Tasks for this project are estimated to take 30 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Compensation

Up to $200/hr equivalent, depending on level and pace. Tasks are estimated at ~30 hours each; you set your own schedule.

Application Instructions

Please submit your CV in English and indicate your level of English proficiency.

Freelance Graphic Designer

Mindrift

Kuwait, Remote

Part-time

Not disclosed

About Mindrift

Mindrift is looking for a versatile, highly skilled Graphic Designer to join the Tendem project (https://tendem.ai/) and create high-quality visual assets for real-world use cases. In this role, you'll apply your expertise in visual communication, layout design, and branding to transform ideas into professional, polished creative assets. This part-time remote opportunity is ideal for creative professionals with hands-on experience in visual communications, layout design, and creating diverse marketing materials.

What We Do

The Mindrift platform connects specialists with innovative technology projects. Our mission is to help develop high-quality AI technologies by combining real-world expertise from professionals across the globe with advanced AI development efforts.

About the Role

This is a freelance role for a Tendem project. As a Graphic Designer, your focus will be on layouts, infographics, social media templates, and overall visual polish. We need a versatile, all-around graphic designer to handle diverse visual tasks, utilizing industry-standard tools to structure information cleanly and effectively.

Key Responsibilities

Own the creation of clean layouts, modern infographics, and establish a clear visual hierarchy to ensure readable and engaging content.
Design highly engaging, reusable templates for various social media channels, as well as impactful one-pagers.
Elevate everyday materials by working deeply with advanced typography, grid systems, and thoughtful composition.
Transform raw data and concepts into professional, polished visual assets tailored to specific marketing and communication goals.
Enforce design quality standards through systematic verification of visual alignment, color usage, and layout consistency prior to delivery.

Important Note

Please submit your CV in English. Your CV must include a link to your portfolio with examples of your work; applications without a portfolio link will not be considered.
This is project-based freelance work. Tasks are available only when projects are active. You may be invited to one or more projects depending on your profile and current opportunities.

How to Get Started

Simply apply to this post, complete the qualification process, and get the chance to contribute to projects aligned with your skills, on your own schedule.

Requirements

Educational Qualifications

At least 1-2 years of relevant experience in graphic design, digital design, or visual communications is desirable.
Bachelor's or Master's Degree in Graphic Design, Visual Arts, or related creative fields is a plus.

Academic and/or Professional Experience

Candidates must have a strong, diverse portfolio showcasing a wide variety of formats (such as social media graphics, marketing collateral, and one-pagers). We are looking for versatile specialists with an exceptional grasp of typography, visual composition, and grid systems. An adaptable, fast-paced, and detail-oriented approach is essential, along with the ability to work independently to solve visual challenges.

Technical Skills (Essential)

Layout & Typography: Exceptional grasp of typography, precise grid systems, and visual composition to elevate everyday materials.
Visual Asset Creation: Proven ability to structure information cleanly through modern infographics and compelling social media templates.
Design Tools: Deep proficiency with Adobe Creative Suite (Illustrator, Photoshop, InDesign) and Figma to handle a wide spectrum of visual tasks.
Format Versatility: Seamlessly switch between designing impactful one-pagers and highly engaging digital templates, adapting to the constraints of each medium.

Additional Requirements

An adaptable, fast-paced, and detail-oriented mindset.
Strong dedication to delivering visually polished, professional-grade assets.
Self-directed work ethic with the ability to manage diverse visual tasks simultaneously.
Your CV must include a link to your portfolio with graphic design examples.

Freelance Full-Stack Web App Developer

Mindrift

Kuwait, Remote

Part-time

Not specified

About Mindrift

Mindrift is a platform connecting specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role

This is a freelance role for the Tendem project. As a Full-Stack Web App Developer (AI Pilot), you will design, build, and refine browser-based applications with real logic, state, persistence, and user input, ranging from habit trackers and budgeting tools to internal dashboards, mini-SaaS tools, and AI-powered apps. You may also work on standalone Python applications and data-processing scripts.

Key Responsibilities

Build interactive web applications with frontend (React, Next.js, Vue, or similar) and a backend API (Python/FastAPI/Flask or Node/Express).
Design and implement data models, schemas, and persistence layers using SQL (PostgreSQL, SQLite) or NoSQL stores.
Implement authentication, sessions, and basic role-based access where needed.
Integrate third-party APIs and AI/LLM services (OpenAI, Anthropic, or similar) into product features.
Handle state management, user input validation, error states, and loading states cleanly.
Build standalone Python tools and scripts where required (data processing, API clients, lightweight backend utilities).
Evaluate AI-generated full-stack code and refactor it for correctness, security, performance, and maintainability.
Write clear, testable code and debug end-to-end issues across frontend, backend, and database.

Requirements

At least 3 years of relevant experience in full-stack web development or shipping interactive web applications (required).
Bachelor's or Master's Degree in Computer Science, Engineering, Information Technology, or related technical fields is a plus.
Strong command of JavaScript/TypeScript and at least one modern frontend framework (React, Next.js, Vue, Svelte, or similar).
Solid backend experience in Python (FastAPI, Flask, Django) and/or Node.js (Express, NestJS).
Hands-on experience with relational databases (PostgreSQL, MySQL, SQLite) and basic schema design.
Experience implementing REST APIs, request validation, error handling, and authentication flows.
Familiarity with deployment platforms (Vercel, Netlify, Render, Fly.io, Railway, or similar).
Experience integrating LLM APIs or other AI services into product features is a strong plus.
Comfortable with version control (Git) and basic testing practices.
Portfolio of shipped web applications (required).
Strong attention to detail and commitment to building working, robust products.
Self-directed work ethic with the ability to architect, build, and ship features independently.
Strong English communication skills.

Benefits

Part-time remote flexibility.
Work on cutting-edge AI projects with major tech innovators.
Global collaboration with specialists across the world.
Professional growth in AI-assisted software engineering.
