Site Reliability Engineer - Inference

  • Jobright.ai
  • San Francisco, California
  • 04/02/2026
Full time Information Technology Telecommunications

Job Description

Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:

Lambda is the GPU Cloud for ML/AI teams, providing tools for building, testing, and deploying AI products at scale. The Site Reliability Engineer - Inference will work on developing a large-scale platform for running AI models and building a high-throughput, low-latency API for distributed systems.

Responsibilities:

• Work on our Inference service, helping us to develop our large-scale platform for running new, cutting-edge models across tens of thousands of GPUs

• Help build a high-throughput, low-latency API and routing system running at geographically-distributed scale

• Shape a highly reliable distributed system, with a focus on reducing operational overhead and on deep observability and capacity management

• Work with the team and our internal ML researchers to adopt and improve new inference engines, models, and architectures across a variety of media (such as text, image, video, and audio)

• Tackle global networking challenges to deliver the lowest possible latency to our users across all of Lambda's available capacity

• Help push Lambda forward into the state of the art, and be part of a team that is operating right at the edge of new developments in the industry.

Qualifications:

Required:

• 8 or more years of experience as a software reliability engineer or software engineer working on large-scale, internet-facing production services

• Highly skilled at writing Go and Python

• Experience with bare-metal system installation and administration

• Experience deploying applications and operators on Kubernetes

• Product-focused, balancing operational needs and keeping overheads down with the need to ship features at a rapid pace

• Proven track record of working in an environment with rapid deployment and the ability to stay on top of shifting priorities as the industry rapidly develops

• Willingness to take ownership of projects and help drive them forwards through design, implementation, launch, and maintenance.

Preferred:

• Experience working with machine learning models

• Experience operating large-scale, geographically distributed systems

• Experience developing Kubernetes operators and components

Company:

Lambda provides infrastructure, cloud services, and software for the training and inferencing of AI models. The company was founded in 2012, is headquartered in San Jose, California, USA, has 201-500 employees, and is currently late stage. Lambda has a track record of offering H1B sponsorships.

Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Industries
  • Software Development

Inferred from the description for this job

Medical insurance

Vision insurance

401(k)
