AI and Cloud Solutions Senior DevOps Engineer - MHCLG - G7
Government Digital & Data
Here at the Ministry of Housing, Communities & Local Government (MHCLG), we work on things that make a real difference to people’s lives.
Whether it's through the homes we live in, the work of our local councils, or the communities we’re all part of, our work is at the top of the political agenda. We have ambitious and far-reaching outcomes to achieve this year and, if you’re thinking of joining us, there’s never been a more exciting time.
We have over 3,500 staff who are based in 20 offices across the UK.
We are building an AI-first team to develop and support cutting-edge platforms that facilitate modern AI-driven solutions and applications. Our current infrastructure is built on Azure with Terraform, providing reusable, scalable modules for multiple projects.
This role offers the chance to work predominantly on AI-related projects, supporting innovative applications that leverage the latest advances in artificial intelligence. While your primary focus will be on building and maintaining infrastructure for AI systems, you’ll also have opportunities to contribute to other exciting initiatives as our needs evolve.
We value strong problem-solving skills and attention to detail, as well as excellent communication and teamwork abilities. A willingness to share knowledge and grow in a collaborative environment is essential.
If you’re passionate about DevOps and intrigued by the challenges of supporting AI solutions at scale, we’d love to hear from you.
We particularly welcome applications from candidates from ethnic minority backgrounds and other underrepresented groups, as we work to continually improve our ability to represent the places and communities we support through our work.
Find out more about what it's like to work in a digital, data and technology role at MHCLG including our culture, ways of working, career progression and staff benefits. You can also read the MHCLG Digital blog to learn about the work we're doing.
Job description
As a Senior DevOps Engineer, you'll:
- maintain and enhance Terraform modules: keep our reusable Terraform modules up to date and tailor them to meet project-specific needs
- build and expand infrastructure: collaborate with architects to design and deploy scalable infrastructure for AI-powered applications
- support multi-cloud readiness: explore and implement solutions for platforms like AWS when needed
- collaborate with AI teams: work closely with developers and others to support seamless deployment of AI models into production
- automate processes: create tools and scripts to improve deployment, monitoring, and scaling
- monitor and maintain infrastructure: use tools like Azure Monitor or similar to ensure uptime, performance, and security
- document processes and changes for easy reusability and team knowledge-sharing
- embed security best practices throughout the DevOps lifecycle, from infrastructure-as-code to deployment, ensure high availability and disaster recovery strategies are in place, and design and enforce security compliance across cloud infrastructure
- demonstrate advanced expertise in multi-cloud environments (Azure, AWS, etc.), take responsibility for architecting hybrid cloud solutions, and lead the deployment and scaling of AI/ML models, ensuring reliable, secure, and cost-effective integration of AI workflows across cloud platforms
- design and optimise CI/CD pipelines at an enterprise level, ensuring automation of build, test, and deployment processes across multiple projects
- troubleshoot and scale complex CI/CD environments, implementing robust testing frameworks to ensure continuous and reliable delivery
Person specification
We will use the essential criteria below to evaluate you during the recruitment process. Make sure your CV and cover letter detail how you meet the criteria.
As a Senior DevOps Engineer, you'll have:
- extensive experience with Terraform, ideally at an enterprise level for a minimum of two years
- experience of building and maintaining infrastructure-as-code (IaC) modules
- significant enterprise experience of building and running services in Azure and preferably AWS
- experience of using CI/CD (e.g., GitHub Actions, Jenkins, or Azure DevOps pipelines) to operate services within a complex multi-cloud environment
- scripting experience (e.g., Bash, PowerShell, or Python)
- knowledge of multi-cloud setups (AWS or similar, alongside Azure)
- familiarity with monitoring tools like Azure Monitor, Prometheus, or Datadog
- understanding of AI/ML deployment workflows, such as model hosting, pipeline orchestration, or data processing
- exposure to containerisation tools (e.g., Docker, Kubernetes)
- familiarity with operating data services on cloud platforms, preferably Databricks