Ref: #69311

Senior Infrastructure Engineer

  • Practice: Cloud & Infrastructure

  • Technologies: Infrastructure & Cloud

  • Location: San Francisco, United States

  • Type: Permanent

We need a Senior Infrastructure Engineer with 6+ years of experience scaling large, reliable systems, particularly with Nomad and other HashiCorp tooling, or Kubernetes. You should be comfortable building auto-scaling infrastructure for RL training systems and have a deep understanding of, and appreciation for, AI research and low-level infrastructure. We are looking for high-agency, deeply technical candidates who work fast and can effectively delegate work to fleets of coding agents. Bonus points if you have experience with code sandboxes, running functions in the cloud, or provisioning compute for large-scale training runs.

 

What you'll do:

  • Build out and maintain the auto-scaling infrastructure that underpins our RL training systems for major hyperscalers.

  • Take ownership of critical infrastructure components and drive their scalability and reliability.

  • Work closely with AI researchers and other engineers to ensure the infrastructure meets their needs.

  • Design and implement robust systems that can handle large-scale training runs.

  • Guide and mentor junior engineers, drawing on your battle scars and experience to operationalize our infrastructure.
