Bayside Solutions

Inference Engineer (Ray)

in Cupertino, California

  • Req ID

    23949_1755893954

  • Job Category

    IT

  • Job Type

    Contract

  • Job Location

    Cupertino, California
    United States

Overview

W2 Contract

Salary Range: $135,200 - $156,000 per year

Location: Cupertino, CA - Remote Role

Job Summary:

You will use open-source Ray to provide a unified framework for processing and deploying complex data+ML pipelines. This framework enables the next generation of intelligent experiences for our products and services by combining the data and processing layers with a model inference platform in one end-to-end workflow. That workflow eliminates the complexity of running multiple independent jobs while significantly improving hardware resource efficiency and development speed. The tight integration of Ray with our data services makes it the go-to solution for serving complex, large-scale data and ML pipelines. The team enables future intelligent products by building a cutting-edge ecosystem of data+ML technologies that gives all data and ML engineers large-scale, efficient systems.

Duties and Responsibilities:

  • Design, implement, and maintain distributed systems to build world-class ML platforms/products at scale
  • Experiment with, deploy, and manage LLMs in a production context
  • Benchmark and optimize inference deployments for different workloads, e.g., online vs. batch vs. streaming workloads
  • Diagnose, fix, improve, and automate complex issues across the entire stack to ensure maximum uptime and performance
  • Design and extend services to improve the functionality and reliability of the platform
  • Monitor system performance, optimize for cost and efficiency, and resolve any issues that arise
  • Build relationships with stakeholders across the organization to better understand internal customer needs and improve our product for end users

Requirements and Qualifications:

  • 5+ years of experience in distributed systems, with deep knowledge of computer science fundamentals
  • Experience managing deployments of LLMs at scale
  • Experience with inference runtimes/engines, e.g., ONNX Runtime, TensorRT, vLLM, SGLang
  • Experience with ML training/inference profiling and optimization for different workloads and tasks, e.g., online inference, batch inference, streaming inference
  • Experience with profiling ML models for different end-use cases, e.g., RAG vs. code completion
  • Experience with containerization and orchestration technologies, such as Docker and Kubernetes
  • Experience in delivering data and machine learning infrastructure in production environments
  • Experience in configuring, deploying, and troubleshooting large-scale production environments
  • Experience in designing, building, and maintaining scalable, highly available systems that prioritize ease of use
  • Experience with alerting, monitoring, and remediation automation in a large-scale distributed environment
  • Extensive programming experience in Java, Python, or Go
  • Strong collaboration and communication (verbal and written) skills
  • B.S., M.S., or Ph.D. in Computer Science, Computer Engineering, or equivalent practical experience

Preferred Qualifications:

  • Understanding of the ML lifecycle and state-of-the-art ML Infrastructure technologies
  • Familiarity with CUDA and kernel implementation
  • Experience with inference optimization and fine-tuning techniques (e.g., pruning, distillation, quantization)
  • Experience with deploying and optimizing ML models on heterogeneous hardware, e.g., GPUs, TPUs, Inferentia
  • Experience with GPU and other types of HPC infrastructure
  • Experience with training frameworks like PyTorch, TensorFlow, and JAX
  • Deep understanding of Ray and KubeRay

Bayside Solutions, Inc. is not able to sponsor any candidates at this time. Additionally, candidates for this position must qualify as a W2 candidate.

Bayside Solutions, Inc. may collect your personal information during the position application process. Please reference Bayside Solutions, Inc.'s CCPA Privacy Policy at www.baysidesolutions.com.

    © 2025 Bayside Solutions. All Rights Reserved.