Freelance Agent Evaluation Engineer — Mindrift | joboostr



Freelance Agent Evaluation Engineer

Mindrift

Czechia · Remote · 1 day ago


About the position

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.

What you'll do

We're building a dataset to evaluate how well AI coding agents handle real-world developer tasks.
You'll create challenging tasks and evaluation criteria within realistic simulated environments:
Build realistic developer environments: a virtual company with a codebase, infrastructure, and context (tickets, docs, conversations) that together form a believable development history
Design tasks from intermediate states of these environments: craft the prompt, define what "solved" means, and ensure the task is solvable by an AI agent
Write tests that verify agent solutions: accept all valid approaches and reject incorrect ones, neither too strict nor too lenient
Iterate on tasks and tests based on QA feedback: review agent solutions, analyze failures, and refine until the evaluation is fair and robust
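
As a rough illustration of what "neither too strict nor too lenient" can mean in practice, the sketch below checks a solution's observable behavior rather than its implementation. The task (case-insensitive email deduplication), the function names, and the sample data are all hypothetical and are not taken from the actual project.

```python
from collections import Counter
from typing import Callable, Iterable

# Hypothetical task: "deduplicate user emails, ignoring case and whitespace".

def solution_a(emails: Iterable[str]) -> list[str]:
    """Valid approach 1: preserve first-seen order."""
    seen, out = set(), []
    for email in emails:
        key = email.strip().lower()
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out

def solution_b(emails: Iterable[str]) -> list[str]:
    """Valid approach 2: use a set, return sorted output."""
    return sorted({email.strip().lower() for email in emails})

def broken(emails: Iterable[str]) -> list[str]:
    """Incorrect: forgets to normalize case."""
    return list({email.strip() for email in emails})

def check_solution(dedupe: Callable[[Iterable[str]], list[str]]) -> bool:
    """Behavior-level check: output order is not part of the task, so
    results are compared as multisets instead of exact lists."""
    data = ["Ann@x.io", "ann@x.io ", "bob@x.io"]
    expected = Counter(["ann@x.io", "bob@x.io"])
    return Counter(dedupe(data)) == expected
```

Here `check_solution` accepts both `solution_a` and `solution_b` even though they return items in different ways, and it rejects `broken`. A test that compared against one exact list would be too strict if the task never specified ordering, while a test that only checked the result length against a case-sensitive count would be too lenient.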

Who we're looking for

5+ years in software development
Core stack: Python (FastAPI), JavaScript/TypeScript (React), Docker, Postgres, Kafka, Redis
Experience writing tests (functional, integration)
English proficiency - B2+

Benefits

Compensation: Up to $40/hr equivalent, depending on level and pace. Tasks are estimated at ~20 hours each; you set your own schedule.

Skills: Python, FastAPI, JavaScript, TypeScript, React, Docker, Postgres, Kafka

Part-time · Senior · Remote