Site Reliability Engineer - AI Agents — Kraken | joboostr

Site Reliability Engineer - AI Agents

Kraken

Remote

Today

About the role

Building the Future of Open Finance

Payward, the parent company behind Kraken, NinjaTrader, Breakout, xStocks, Payward Services and CF Benchmarks, has spent the last 15 years building one of the most modern and globally accessible financial infrastructure platforms in the industry, built to advance an open, global financial system.

Before you apply, we encourage you to explore our culture page to understand what drives us and how we work.

The team

Founded in 2011, Kraken is one of the world's longest-standing crypto platforms, trusted by over 10 million individuals and institutions across the globe. It offers spot trading, margin, futures, staking, and OTC services, with products built for both individual investors and institutional clients.

The AI Infrastructure team sits within the Data organization and is responsible for building, operating, and scaling the systems that power AI agents in production, both internal tools and external-facing products. Working closely with the AI and Agent Systems teams, this group ensures that the orchestration, execution, and model-serving layers underpinning agentic workflows are reliable, observable, and built to scale.

This team operates at the intersection of data infrastructure and applied AI, a space that moves fast and demands engineers who can bring production discipline to emerging technology. You'll partner across Data Engineering, ML, and product-facing teams to harden agent infrastructure and keep it running at the standards our users expect.

Importantly, this is a platform engineering team. Beyond operating infrastructure, the team is responsible for building the APIs, SDKs, and platform capabilities that enable AI, Data, and Engineering teams to safely and efficiently consume agent infrastructure as a service. Success in this role requires thinking beyond infrastructure operations and toward developer experience, platform adoption, and long-term scalability.

What you'll do

Design, build, and operate the infrastructure layer supporting AI agent workflows in production
Ensure reliability, scalability, and observability of agentic systems across internal and external products
Design and develop platform services, APIs, SDKs, and self-service capabilities that allow engineering teams to easily consume AI infrastructure and agent platform services
Manage and maintain the compute, orchestration, and serving infrastructure powering model inference and agent execution
Implement robust monitoring, alerting, and incident response procedures tailored to AI/ML workloads
Utilize Infrastructure as Code (IaC) tools such as Terraform to provision and manage cloud (AWS) infrastructure components
Build and maintain CI/CD pipelines that support rapid, reliable deployment of AI services and agent workflows
Define and implement guardrails, failure handling, and recovery patterns specific to agentic and LLM-powered systems
Collaborate with AI and Data Engineering teams to translate experimental agent prototypes into hardened production systems
Manage containerized workloads using Kubernetes, ensuring efficient deployment, scaling, and orchestration of AI services
Implement access controls and security best practices across AI infrastructure environments
Document architecture, runbooks, and best practices to support knowledge sharing across the team

Who we're looking for

5+ years of experience as a Site Reliability Engineer, Infrastructure Engineer, Platform Engineer, or similar role in a production environment
Hands-on experience supporting ML infrastructure, model serving, or MLOps workflows in production
Experience building developer platforms, internal tooling, APIs, or SDKs consumed by engineering teams at scale
Strong understanding of platform engineering principles, including developer experience, self-service infrastructure, and API-driven platform design
Proficiency with Infrastructure as Code tools, particularly Terraform
Experience with containerization and orchestration, particularly Kubernetes and Docker
Solid understanding of cloud infrastructure, preferably AWS
Strong scripting skills (bash/shell) and proficiency in at least one programming language (Python preferred)
Experience designing and operating observability, monitoring, and alerting systems
Experience implementing incident response procedures and participating in on-call rotations
Strong collaboration skills working across data, AI, and engineering teams
High ownership mindset in a fast-moving, high-stakes production e

Skills: Kubernetes, Docker, Terraform, AWS, Python, CI/CD, observability, Infrastructure as Code

Full-time, Senior, Remote