Browse IT Jobs | IT Job Board

Sr Staff Data Scientist, Simulation Capacity Optimization

Waymo Mountain View, California

Waymo Overview

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver, The World's Most Experienced Driver, to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo's fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions in simulation across 15+ U.S. states.

SCORPIO Team

We are establishing a new team called SCORPIO (SimEval Capacity Operations, Resource Planning, Infrastructure Optimization). This team will be at the forefront of ensuring the efficient and effective use of Waymo's large-scale simulation compute, storage, and network resources. SCORPIO will develop the data-driven models, metrics, and processes to forecast demand, plan capacity, and optimize resource allocation, ultimately improving developer experience and maximizing return on infrastructure investments.

What you'll do (Responsibilities)

Define the vision, strategy, and technical roadmap for data-driven capacity planning and resource optimization within Waymo's simulation environment.
Lead the development and implementation of sophisticated forecasting models to predict demand for heterogeneous TI resources (CPU, GPU, Storage, Bandwidth, RAM) across various time horizons and simulation workflows.
Design, build, and maintain robust capacity models, key metrics, and insightful dashboards to monitor resource utilization, identify current and future bottlenecks, and inform investment decisions.
Develop and propose actionable strategies for resource optimization, cost management, and risk mitigation to senior leadership, finance, and engineering teams.
Collaborate deeply with Simulation, Infrastructure, Finance, Product Management, and Engineering teams to understand demand drivers, usage patterns, system changes, and their impacts on resource needs.
Spearhead the design and development of automated systems for demand management, quota allocation, and resource reassignment to enhance efficiency and responsiveness.
Provide data-driven insights to influence the design of simulation products and user guidelines, promoting more efficient resource consumption patterns.
Build and mentor a high-performing team, potentially including data scientists, business analysts, and software engineers.

Minimum Qualifications

PhD or Master's degree in Data Science, Statistics, Operations Research, Computer Science, Industrial Engineering, or a related quantitative field.
10+ years of experience in data science or quantitative analysis, with a significant focus on capacity planning, resource optimization, demand forecasting, or a closely related area.
5+ years of experience in a technical leadership role, with a proven track record of defining strategy, setting technical direction, and leading complex projects.
Strong expertise in statistical modeling, time series analysis, and forecasting techniques (e.g., ARIMA, Exponential Smoothing, regression models).
Demonstrated ability to work with large-scale, complex datasets and experience with distributed computing environments.
Proficiency in Python or R, including common data science libraries (e.g., pandas, NumPy, SciPy, scikit-learn).
Expertise in SQL and experience with data warehousing solutions (e.g., BigQuery).
Exceptional communication and collaboration skills, with the ability to convey complex quantitative findings and recommendations clearly to diverse audiences, including executive leadership.

Preferred Qualifications

Direct experience in CapEx Engineering, Cloud Services Capacity Planning (e.g., AWS, GCP, Azure), or managing resources for large-scale compute/HPC infrastructure.
Familiarity with simulation workloads, performance analysis, and distributed systems.
Experience with financial modeling, cost-benefit analysis, and ROI calculations related to technical infrastructure.
Experience building and deploying data pipelines and automation tools in a production environment.
Experience hiring, growing, and nurturing a technical team.

Salary Range

The expected base salary range for this full-time position across US locations is listed below. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. Your recruiter can share more about the specific salary range for the role location or, if the role can be performed remotely, the specific salary range for your preferred location, during the hiring process. Waymo employees are also eligible to participate in Waymo's discretionary annual bonus program, equity incentive plan, and generous Company benefits program, subject to eligibility requirements.

$281,000-$356,000 USD

04/02/2026

Full time

Senior/Staff Security Engineer

Zettabyte Palo Alto, California

About Zettabyte

At Zettabyte, we're building the infrastructure layer for the AI-first world. Our mission is to make AI compute ubiquitous, seamless, and limitless by operating a cloud where AI workloads run securely at massive scale, anywhere, anytime. We run a multi-tenant GPU cloud for AI developers and enterprises. Security isn't a support function here; it's a core platform capability.

Why this role exists

Zettabyte is scaling a shared, high-performance AI compute platform in a space where traditional cloud security models break down. Multi-tenant GPUs, high-speed networking, and untrusted customer workloads introduce security challenges that don't have off-the-shelf answers. We're hiring a Staff Security Engineer to define and own the security architecture of our platform. You'll operate with wide latitude, shaping how isolation, detection, and trust are built into the system from day one. This role is ideal for someone who thrives in early-to-mid stage environments, enjoys working through ambiguity, and wants to build security systems that scale with the business, not slow it down.

What you'll do

Own the end-to-end security architecture for multi-tenant Kubernetes GPU clusters
Design tenant isolation, egress control, and network segmentation across compute, storage, and networking layers
Define and implement runtime security and intrusion detection for untrusted AI workloads
Build security primitives (identity, secrets, encryption, policy enforcement) that platform teams build on
Secure the software supply chain, from CI/CD pipelines to container admission
Lead threat modeling and security design reviews for new platform features
Drive compliance readiness (SOC 2, ISO 27001) without slowing engineering velocity
Act as a force multiplier: unblock teams, set standards, and raise the security bar across the org
Lead security incident response and turn incidents into systemic improvements

What we're looking for

7+ years of experience in security engineering for cloud-native, infrastructure, or distributed systems
Deep, hands-on expertise in Kubernetes security (RBAC, PSA, network policies, admission controllers)
Strong understanding of cloud security primitives in AWS, GCP, or Azure
Experience building or operating runtime security and policy enforcement (Falco, Cilium, OPA, Calico, eBPF-based tools)
Solid grounding in network security and zero-trust architectures
Practical experience with secrets management and key systems (Vault, cloud KMS)
Strong automation skills in Go, Python, or Bash
Proven ability to operate autonomously, make architectural decisions, and deliver in ambiguous environments
Experience partnering deeply with platform, infra, and SRE teams

Bonus experience (nice to have, not required)

GPU isolation and virtualization security (MIG, SR-IOV)
InfiniBand, RDMA, or high-performance networking
HPC or large-scale multi-tenant compute platforms
Security for AI/ML systems or data-intensive workloads
Incident response leadership or red team experience
Security certifications (CKS, OSCP, CISSP)
Open-source contributions in security or cloud infrastructure

What makes this role different

You'll define the security model, not just implement tickets
You'll work on real isolation problems in GPU and high-speed networking environments
You'll influence architecture across the platform, not sit in a silo
You'll help build a security culture at a company where speed and safety both matter

Compensation

Competitive salary - commensurate with your experience and aligned with industry standards
Meaningful equity - be part of the upside as we build a category-defining company. Your grant will align with your role and the experience you bring.

04/02/2026

Full time

