A technology startup in Palo Alto is seeking a Senior Software Engineer to design and build scalable backend systems for AI and GPU cluster observability. The ideal candidate will have over 7 years of industry experience, a strong foundation in data structures and algorithms, and proficiency in languages like C, C++, Go, Java, or Python. The role involves developing methods to detect complex infrastructure issues and collaborating across teams to ensure system reliability and performance. Competitive compensation and a great benefits package are offered.
04/02/2026
Full time
Senior Software Engineer - AI Infra Visibility

About Clockwork Systems

Clockwork.io - Software Driven Fabrics to increase GPU cluster utilization

Clockwork Systems was founded by Stanford researchers and veteran systems engineers who share a vision for redefining the foundations of distributed computing. As AI workloads grow increasingly complex, traditional infrastructure struggles to meet the demands of performance, reliability, and precise coordination. Clockwork is pioneering a software-driven approach to AI fabrics by delivering cross-stack observability to catch and quickly resolve problems, workload fault tolerance to keep jobs running through failures, and performance acceleration that dynamically routes and paces traffic to avoid congestion.

We are looking for a strong Senior Software Engineer to help design and build scalable backend systems for AI and GPU cluster observability. In this role, you will work on high-performance distributed systems that power telemetry ingestion, data processing, and APIs for monitoring large-scale GPU clusters and AI workloads.

What You'll Do
- Design and build scalable backend systems for metric collection, processing, and analysis.
- Develop robust methods to detect complex infrastructure issues that impact AI workloads.
- Build large distributed systems running in production environments.
- Collaborate across teams to deliver reliable, performant, and maintainable systems.

What We Are Looking For
- 7+ years of industry experience building and operating production software systems.
- Strong foundation in data structures, algorithms, and software design.
- Fluency in one or more programming languages: C, C++, Go, Java, or Python.
- Experience designing, building, and scaling large distributed systems.
- Hands-on experience with service-oriented architectures and cloud platforms (AWS, GCP, Azure).
- Solid understanding of operating systems fundamentals (threads, scheduling, synchronization; kernel programming is a plus).
- Experience with databases, including design, development, or scaling.
- Excellent debugging, problem-solving, and communication skills.

Nice to Have
- Knowledge of networking protocols; familiarity with NIC architecture and operation.
- Understanding of GPU or AI infrastructure (e.g., DCGM, PyTorch).
- Familiarity with observability systems (metrics, logs, traces); experience with OpenTelemetry, Prometheus, or distributed tracing.

A friendly and inclusive workplace culture. Competitive compensation. A great benefits package. Catered lunch.

Clockwork Systems is an equal opportunity employer. We are committed to building world-class teams by welcoming bright, passionate individuals from all backgrounds. All qualified applicants will receive consideration for employment without regard to race, color, ancestry, religion, age, sex, sexual orientation, gender identity or expression, national origin, disability, or protected veteran status. We believe diversity drives innovation, and we grow stronger together.
A leading software company in Palo Alto is seeking a Full Stack Engineer to join its engineering team. The role involves designing and developing modern web applications that simplify complex data for users. Ideal candidates should have proficiency in both front-end and back-end programming languages, including TypeScript, HTML, Go, and Python. Strong problem-solving skills and a degree in a relevant field are also essential. The company offers competitive compensation and a friendly, inclusive workplace culture.
04/02/2026
Full time
Clockwork.io - A Software-Driven Revolution in AI Networking

Clockwork Systems was founded by Stanford researchers and veteran systems engineers who share a vision for redefining the foundations of distributed computing. As AI workloads grow increasingly complex, traditional infrastructure struggles to meet the demands of performance, reliability, and precise coordination. Clockwork is pioneering a software-driven approach to AI networking, delivering deterministic time, ultra-low latency, and seamless scalability for modern distributed systems.

About the Role

We are looking for a passionate Full Stack Engineer to join our growing engineering team. You'll work across the entire technology stack - designing, building, and scaling modern web applications that bring complex data and infrastructure insights to life. If you're a full-stack programming ace who loves crafting powerful, intuitive user experiences backed by robust backend systems, we want to hear from you!

What You Will Do
- Work in a fast-paced environment that values innovation, diversity of talent, and technical excellence.
- Collaborate closely with Clockwork engineering teams to design, prototype, and implement end-to-end web applications and interactive visualizations that simplify complex data for network engineers and enterprise clients.
- Develop front-end interfaces using modern frameworks and back-end services that ensure performance, reliability, and scalability.
- Write clean, efficient, and maintainable code across multiple languages and layers of the stack.
- Continuously improve platform performance, usability, and design by staying current with modern technologies and development practices.

What We're Looking For
- A degree in Computer Science, Electrical Engineering, or a related field.
- Proficiency in both front-end and back-end programming languages such as TypeScript, HTML, Go, and Python.
- Strong experience building custom data visualizations and interactive user interfaces.
- Hands-on experience with front-end frameworks (e.g., Vue, React, or similar) and visualization libraries such as D3.js or SVG-based tools.
- Excellent debugging, problem-solving, and collaboration skills.
- Strong communication and attention to detail.
- Familiarity with high-performance computing (HPC), cloud, and container technologies (e.g., Kubernetes, Azure) is a plus.

A friendly and inclusive workplace culture. Competitive compensation. A great benefits package. Catered lunch.

Clockwork Systems is an equal opportunity employer. We are committed to building world-class teams by welcoming bright, passionate individuals from all backgrounds. All qualified applicants will receive consideration for employment without regard to race, color, ancestry, religion, age, sex, sexual orientation, gender identity or expression, national origin, disability, or protected veteran status. We believe diversity drives innovation, and we grow stronger together.