Site Reliability Engineer - Inference (San Francisco) Job at Jobright.ai, San Francisco, CA

  • Jobright.ai
  • San Francisco, CA

Job Description

Posted 2 days ago

Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:

Lambda is the #1 GPU Cloud for ML/AI teams, providing tools for building, testing, and deploying AI products at scale. The Site Reliability Engineer - Inference will work on developing a large-scale platform for running AI models and building a high-throughput, low-latency API for distributed systems.

Responsibilities:

Work on our Inference service, helping us to develop our large-scale platform for running new, cutting-edge models across tens of thousands of GPUs

Help build a high-throughput, low-latency API and routing system running at geographically-distributed scale

Shape a highly reliable distributed system, with a focus on reducing operational overhead and on deep observability and capacity management

Work with the team and our internal ML researchers to adopt and improve new inference engines, models, and architectures across a variety of media (such as text, image, video, and audio)

Tackle global networking challenges to deliver the lowest possible latency to our users across all of Lambda's available capacity

Help push Lambda forward into the state of the art, and be part of a team that is operating right at the edge of new developments in the industry.

Qualifications:

Required:

8 or more years of experience as a software reliability engineer or software engineer working on large-scale, internet-facing production services

Highly skilled at writing Go and Python

Experience with bare-metal system installation and administration

Experience deploying applications and operators on Kubernetes

Product-focused, balancing operational needs and keeping overheads down with the need to ship features at a rapid pace

Proven track record of working in an environment with rapid deployment and the ability to stay on top of shifting priorities as the industry rapidly develops

Willingness to take ownership of projects and help drive them forwards through design, implementation, launch, and maintenance.

Preferred:

Experience working with machine learning models

Experience operating large-scale, geographically distributed systems

Experience developing Kubernetes operators and components

Company:

Lambda provides infrastructure, cloud services, and software for the training and inferencing of AI models. Founded in 2012 and headquartered in San Jose, California, USA, the company has 201-500 employees and is currently late stage. Lambda has a track record of offering H1B sponsorships.

Seniority level: Mid-Senior level

Employment type: Full-time

Industries: Software Development

Inferred from the description for this job

Medical insurance

Vision insurance

401(k)


Job Tags

Full time, Summer work, H1B, Shift work
