Browse IT Jobs | IT Job Board

SRE / Cloud Administrator - MOD DV or SC

Sanderson Tewkesbury, Gloucestershire

Site Reliability Engineer / Cloud Administrator - SC or MOD DV

Location: Tewkesbury
Salary: £55,000 - £75,000
Clearance: Active MOD DV preferable, alternatively an active SC.
Type: Full time, on-site

A leading provider of innovative research, data, machine learning and infrastructure solutions to secure UK Defence customers are looking to add to their team. They are global leaders in Internet-facing systems and the innovative application of machine intelligence to the complex problems facing their secure customers. They are looking to bring in a cloud-centric SME to add value to a project in the build phase and solve complex problems.

The role:
- Design, deploy, and manage resilient and scalable infrastructure solutions using cloud technologies.
- Automate manual tasks and workflows to enhance operational efficiency and agility in cloud environments.
- Establish and manage comprehensive monitoring and alerting mechanisms to uphold system reliability, performance, and security in the cloud.
- Perform assessments and root cause analyses to proactively prevent recurrence of cloud-related incidents.
- Collaborate closely with diverse teams across the organisation to fine-tune application performance and bolster reliability in cloud environments.
- Participate in rotational on-call duties, promptly addressing and resolving cloud-related incidents to ensure uninterrupted service delivery.

Technical Skills: AWS, Azure, Kubernetes, Linux, Ansible, Security

The role comes with an existing team of talented engineers to work alongside, with extensive scope to learn new technologies and develop. If you're interested in the above and would like to learn more, apply or reach out to

Apr 18, 2024

Full time

SRE / Infrastructure Administrator - SC OR MOD DV

Sanderson Tewkesbury, Gloucestershire

Site Reliability Engineer / Infrastructure Administrator - SC or MOD DV

Location: Tewkesbury
Salary: £55,000 - £75,000
Clearance: Active MOD DV preferable, alternatively an active SC.

A leading provider of innovative research, data, machine learning and infrastructure solutions to secure UK Defence customers are looking to add to their team. They are global leaders in Internet-facing systems and the innovative application of machine intelligence to the complex problems facing their secure customers. They are looking to bring in an infrastructure-centric SRE to maintain an existing system and add value across the project lifecycle.

The role:
- Design, implement, and maintain robust and scalable infrastructure solutions.
- Automate manual processes to streamline operations and improve efficiency.
- Develop and maintain monitoring and alerting systems to ensure system reliability and performance.
- Conduct post-incident reviews and root cause analysis to prevent future occurrences.
- Work closely with cross-functional teams to optimise application performance and reliability.
- Participate in on-call rotations and respond to incidents in a timely manner.

Technical Skills: RedHat Linux, OpenShift, Kubernetes, IaaC, Ansible, Terraform

The role comes with an existing team of talented engineers to work alongside, with extensive scope to learn new technologies and develop. If you're interested in the above and would like to learn more, apply or reach out to

Apr 18, 2024

Full time

Azure Cloud Engineer - SRE

Akkodis City, London

Azure Site Reliability Engineer

Akkodis are currently working in partnership with a leading service provider to recruit an experienced Azure Site Reliability Engineer to join a growing team of talented Cloud Engineers providing high-level support and project delivery for a large customer base. Please note this is a fully remote role and you must be eligible to gain security clearance (you do not need to hold it currently).

The Role
As an Azure Site Reliability Engineer you will support the cloud infrastructure used to deliver cloud-hosted managed services to customers. You will have a high customer focus, being actively involved in the support and development of the service, including the resolution of support cases, live service monitoring and maintenance, new service provision and continuous improvement projects. You will provide high-quality operational and technical support to customers and will be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.

The Responsibilities
- Provide L3/L4 analytical incident management and resolution alongside project-based deliverables
- Contribute to the planning of application / infrastructure releases and configuration changes
- Resolve support requests from customers by phone, email and online, making use of the call logging system
- Interact with key internal stakeholders and external third-party vendors to troubleshoot and resolve complex problems
- Provide input to administering and maintaining all production and development environments
- Create detailed technical and procedural documentation (e.g. architecture, configuration and setup)
- Design appropriate metrics for reporting on key performance and quality indicators, particularly in terms of in-depth trend analysis
- Service transition and complete Operational Acceptance (OA) of new customer services
- Implementation and delivery of Microsoft Azure projects

The Requirements
- Extensive experience of Microsoft Azure and its relevant build, deployment, automation, networking, and security technologies in cloud and hybrid environments.
- Microsoft Azure certifications: AZ-103/104 - Azure Administrator
- Good operational experience supporting Microsoft public cloud technologies and services at an enterprise level (multi-tenant) with in-depth knowledge of the following: Azure Active Directory (RBAC and IAM), Azure Networking, Azure Storage, Azure Monitor and Log Analytics, Azure Security Center
- Demonstrable career operational experience from one of the following areas: Server Infrastructure Engineering (Virtualisation / Windows / Linux); Office / Microsoft 365 Administration; Network Engineering; DevOps (CI/CD, pipelines and Infrastructure as Code)
- In-depth knowledge of a scripting language (PowerShell, Bash, Azure CLI)
- Bright attitude and a deep desire to learn
- Experience with helpdesk IT Service Management tools (e.g. BMC Remedy / ServiceNow)

If you are looking for an exciting new challenge to join a leading cloud team, please apply now.

Modis International Ltd acts as an employment agency for permanent recruitment and an employment business for the supply of temporary workers in the UK. Modis Europe Ltd provide a variety of international solutions that connect clients to the best talent in the world. For all positions based in Switzerland, Modis Europe Ltd works with its licensed Swiss partner Accurity GmbH to ensure that candidate applications are handled in accordance with Swiss law. Both Modis International Ltd and Modis Europe Ltd are Equal Opportunities Employers.

By applying for this role your details will be submitted to Modis International Ltd and/or Modis Europe Ltd. Our Candidate Privacy Information Statement, which explains how we will use your information, is available on the Modis website.

Apr 16, 2024

Full time

Principal DevOps Engineer (SRE)

TripAdvisor Oxford, Oxfordshire

We believe that we are better together, and at Tripadvisor we welcome you for who you are. Our workplace is for everyone, as is our people-powered platform. At Tripadvisor, we want you to bring your unique perspective and experiences, so we can collectively revolutionize travel and together find the good out there.

Tripadvisor captured the online travel market 20 years ago as a Boston-based startup before an online travel market existed. The fact that we still dominate the industry proves that we know how to operate a fast-moving technology company and hire the right people who allow us to maintain that lead throughout the many advancements in technology. As we enter the era of Large Language Models and mobile-based internet everywhere, we are poised to innovate again. As a Tripadvisor Engineer, you will work with some of the best and brightest minds that technology offers and learn best practices and engineering methodologies that will empower you for the rest of your career.

The Site Operations team at Tripadvisor maintains and enhances the core systems that power and support the website. This includes systems in private data centers and over a hundred accounts in AWS. Our scope of responsibilities is vast, and listing them here would take an entire page. Suffice it to say that we are the go-to team for questions about the interface boundaries between these two halves of the company and the deep inner workings of our infrastructure.

As a Site Operations Engineer on the SiteOps team, you will be a force multiplier for our engineering and operations teams, delivering tooling and infrastructure that not only has a direct impact on day-to-day operations but also contributes to the future evolution of infrastructure and engineering here at Tripadvisor. You'll be part of a dynamic team responsible for ensuring our services' high availability, reliability, and scalability.

We seek passionate engineers with experience in Python, Java, Ansible, PostgreSQL, CentOS, and Alma Linux to help us optimize and automate our infrastructure and deployment processes. We are currently involved in several types of systems migrations: on-prem to AWS/cloud-native migrations, and migrations from one on-prem data center to another. As a SiteOps Engineer, you will be involved in designing and implementing how we perform those migrations, testing them, and then performing them with a "no surprises in production" mindset.

What You'll Do:
- Infrastructure Automation: Design, implement, and maintain automated infrastructure provisioning and configuration management using tools like Ansible to ensure consistency and scalability.
- Monitoring and Alerting: Set up monitoring and logging systems to proactively detect and address potential issues, ensuring optimal performance and reliability in environments like on-prem Prometheus/Thanos, Grafana Cloud, and Grafana Cloud Loki.
- Database Management: Manage hundreds of on-prem PostgreSQL databases, including performance tuning, backups, and disaster recovery strategies.
- Collaboration: Work closely with cross-functional teams, including developers and system administrators, to improve the overall development and deployment processes.
- Troubleshooting and Incident Management: Assist in identifying and resolving operational issues and participate in on-call rotations.

Skills and Experience:
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
- Proven experience as a DevOps Engineer or similar role, focusing on building and maintaining scalable infrastructures.
- Strong proficiency in Python for scripting and automation tasks.
- Expertise in configuration management such as Ansible or Puppet.
- Solid understanding of PostgreSQL and experience in managing PostgreSQL databases.
- Hands-on experience with CI/CD tools like Jenkins, GitLab CI, and GitHub Actions.
- Knowledge of containerization technologies like Docker and container orchestration tools like Kubernetes is a plus.
- Understanding of networking concepts such as load balancing and DNS.
- Strong problem-solving skills and the ability to work in a fast-paced, agile environment.

If you need a reasonable accommodation or support during the application or the recruiting process due to a medical condition or disability, please reach out to your individual recruiter or send an email and let us know the nature of your request. Please include the job requisition number in your message.

Apr 15, 2024

Full time

Python Developer

Inspire People

Join a team at the heart of the global economy! The Department for Business and Trade ("DBT") and Inspire People are partnering together to bring you an exciting opportunity for an experienced Python Developer to support essential tooling and systems across DBT.

This role is ideal for a back-end Python developer looking for career growth and exposure to cloud-native systems with an SRE touch. You will join a team that ensures DBT's digital services work as users expect, working with development teams to give them the tools for their job, including application performance monitoring, exception, log and metrics aggregation, dashboards, and declarative CI/CD pipelines.

£55,400 to £74,600 (including allowances) plus excellent Civil Service benefits and pension. Salary is dependent on location and technical skills as assessed at interview. Flexible, hybrid working from London, Cardiff, Darlington, Edinburgh, Belfast, Birmingham or Salford.

DBT's Digital, Data and Technology (DDaT) team develops and operates tools, services, and platforms that enable the UK government to provide world-leading support to businesses in the UK and overseas. As a senior SRE developing in Python, you will work to give development teams the tools for their job, including application performance monitoring, exception, log and metrics aggregation, dashboards, and declarative CI/CD (continuous integration/continuous delivery) pipelines. You'll evangelise service-level indicators, objectives, and error budgets to product teams, and negotiate them. You'll help build and scale our global product platform and participate in an on-call rota.

The Tech Stack includes:
- Python and Django framework
- Serverless compute (Lambda)
- Amazon Web Services
- Azure
- Jenkins and AWS CodePipeline
- Terraform & CloudFormation
- Kubernetes
- Elastic Container Service (ECS)
- Elasticsearch
- PostgreSQL
- Sentry
- Redis

Essential Skills and Experience
You should be able to demonstrate:
- Experience and fluency in Python, writing clean and effective code.
- Cloud experience with either Amazon Web Services, Azure or Google Cloud.
- Ability to build code-defined, reliable, and well-tested infrastructure on top of cloud computing systems (e.g. Terraform, CloudFormation, Pulumi).
- Experience in designing, analysing, and troubleshooting distributed systems.
- Knowledge of Linux/Unix fundamentals and TCP/IP networking.
- Ability to see user impact in the infrastructure changes.

Desirable Skills and Experience
While not essential, it would be ideal if you have demonstrable skills and experience of:
- Coding infrastructure (e.g. Terraform, CloudFormation).
- Defining and measuring Service Level Objectives.
- Observability-driven development.
- Prototyping through reuse of existing open source components.

In return, you can expect a planned, transparent progression with learning and development tailored to your role, an environment with flexible working options and a culture encouraging inclusion and diversity, plus the following benefits:
- Salary of £54,400 to £74,600 (including allowances), including annual allowance depending on location and experience
- Flexible, hybrid working from London, Cardiff, Darlington, Edinburgh, Belfast, Birmingham, Salford
- Annual leave starting at 26 days per annum plus statutory bank holidays, rising to 33 days with service
- An excellent Civil Service pension scheme.

If you are a Python Developer, DevOps Engineer, Site Reliability Engineer or Systems Administrator looking to enhance your career and make a difference across an expanding function, then apply today or contact Alison Whitehead at Inspire People in complete confidence for further information.

Further Information: This role requires SC clearance, a condition of which is to have been present in the UK for 3 out of the past 5 years.

Aug 14, 2023

Full time

DevOps Engineer

NBCUniversal

Company Description

NBC Sports Next is where sports and technology intersect. We're a subdivision of NBC Sports and home to all NBCUniversal digital applications in sports and technology within our three groups: Youth & Recreational Sports; Golf; and Betting, Gaming & Emerging Media. At NBC Sports Next, we make playing sports better through innovative technology and immersive experiences for athletes, coaches, players and fans. We equip more than 30MM players, coaches, athletes, sports administrators and fans in 40 countries with more than 25 sports solution products, including SportsEngine, the largest youth sports club, league and team management platform; GolfNow, the leading online tee time marketplace and provider of golf course operations technology; GolfPass, the ultimate golf membership that connects golfers to exclusive content, tee time credits, coaching and tips; TeamUnify, swim team management services; GoMotion, sports and fitness business software solutions; and NBC Sports Edge, a leading platform for fantasy sports information and betting-focused tools.

At NBC Sports Next we're fueled by our mission to innovate, create larger-than-life events and connect with sports fans through technology that provides the ultimate in immersive experiences. Golf fuses the team behind products and services like GolfNow, TeeOff and GolfPass, which better connect golfers and golf facilities around the world through innovative solutions like cloud-based golf course management and SmartPlay contactless technology and services that create optimum golfing experiences. Come join us as we work together as one team to innovate and deliver what's Next.

Job Description

Role Purpose: GolfNow/NBC Sports Digital are seeking to hire a DevOps Engineer. You'll be joining a dedicated, ambitious and diverse team focused on delivering operational excellence inside the NBC Sports Next organisation. You will work collaboratively with Engineering, Quality, Product and Security teams to build, deploy and operate GolfNow products across domestic and international markets. You will be responsible for automating and improving our build and deploy processes; monitoring and operations; public and private cloud infrastructure; and troubleshooting and resolution across dev, test and production globally. This is a fantastic opportunity for an ambitious engineer to be involved in the world's largest golf technology company, backed by Comcast/NBCUniversal/Sky, with the opportunity to make a difference.

RESPONSIBILITIES

Job Duties: In delivering the key responsibilities of the role, the DevOps Engineer will:

Operational Support and Maintenance:
Use APM and other tools to monitor production systems, remediating production issues and implementing performance/cost improvements
Identify capacity and performance issues to ensure we meet our SLAs
Document services and processes
Participate in the on-call schedule

Infrastructure:
Design, implement and manage production-grade services in public clouds (AWS/GCP) using a variety of technologies, ensuring geographic redundancy, security and best practices
Build and manage a large Kubernetes footprint deployed on Google Cloud Platform
Install and manage web and backend services in a high-throughput, multi-technology e-commerce environment

Build/Deploy:
Design and implement CI/CD processes and tooling
Ensure "shift-left" is implemented in our build and deploy processes in collaboration with Security teams
Ensure pre-production environments are built and managed
Perform deployments of high-throughput, revenue-generating applications

Innovation:
Work with the DevOps team to champion new processes, tools and technologies in collaboration with Engineering
Constantly strive to find a better way
Undertake other duties within the scope of the role as assigned

QUALIFICATIONS

Basic Qualifications:
2+ years working as an SRE/DevOps/Operations Engineer
2+ years working with Kubernetes in a production environment
2+ years Linux system administration experience (Red Hat or Debian variants)
2+ years production experience configuring web servers, e.g. IIS, Nginx, Apache
2+ years production experience working with a public cloud provider (GCP)
Production experience with CI/CD pipelines, e.g. Jenkins, TeamCity, GitLab CI, Bamboo, GitHub Actions
Proficient in a scripting language such as Bash, Perl, Python or PowerShell
Proficient with source control technologies: Git, TFS, SVN
Strong problem-solving ability, attention to detail and ability to work from first principles
Hands-on experience with public cloud providers; GCP preferred
Experience deploying and operating enterprise-scale applications in high-throughput production environments
Hands-on experience provisioning Infrastructure as Code with Terraform or CloudFormation
Hands-on experience managing services with configuration management tools; Ansible preferred
Strong experience with Continuous Integration tools such as TeamCity, Jenkins, GitHub Actions or GitLab CI
Experience building production-grade services with fault tolerance for zonal and regional issues in public clouds
Experience capturing metrics and monitoring cloud infrastructure
A working understanding of code and scripting (Java, JavaScript, PHP, Node.js, Golang, .NET, Python etc.)
Experience in a collaborative, cross-functional team environment using source control tools like Git and git-flow branching strategies

Desired Qualifications:
Experience with Redis, Elasticsearch, RabbitMQ and MongoDB
Experience with APM and alerting tools (AppDynamics / Datadog / New Relic / OpsGenie / PagerDuty)
Proficient with configuration management tools such as Ansible, Chef, Puppet
Experience with software development and supporting developers
Build automation/CI tooling including one of the following: Jenkins, TeamCity, Bamboo, GitLab CI, GitHub Actions
Experience with Infrastructure as Code tools, e.g. Terraform / CloudFormation
Experience with WAF/CDN services such as Cloudflare/CloudFront/Akamai/Fastly
Knowledge of load balancing software and hardware (F5, HAProxy, Nginx, GCP GLB, AWS ELB/ALB)

Additional Job Requirements

Interested candidates must:
Submit a resume/CV through to be considered.
Participate in a rotational "on call" schedule (24 hours a day / 7 days a week)

This role is also suitable for remote working.

We are proud to be a disability confident employer and we'll do everything we can to support you during your application. If you need us to make any adjustments to our recruitment process, speak to our recruitment team who will be happy to support you.

Additional Information

NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law. NBCUniversal will consider for employment qualified applicants with criminal histories in a manner consistent with relevant legal requirements, including the City of Los Angeles Fair Chance Initiative For Hiring Ordinance, where applicable.

If you are a qualified individual with a disability or a disabled veteran, you have the right to request a reasonable accommodation if you are unable or limited in your ability to use or access as a result of your disability. You can request reasonable accommodations in the US by calling 1- and in the UK by calling .

Sep 24, 2022

Full time


