We are working with a global leader in technology and financial services, operating across more than 40 locations worldwide. The company is headquartered in a major North American city, and their Technology division delivers cloud-based solutions to millions of customers. They're seeking a talented Site Reliability Engineer to join their London team, offering an opportunity to contribute to an organization focused on reliability and innovation.
The Role
Our client is recruiting a Site Reliability Engineer (SRE) for their Digital Technology team in London. This senior role is essential to maintaining the performance and resilience of applications hosted in a customized Microsoft Azure environment. Working in the European time zone, you'll support a 24/7 digital operation, collaborating with global teams to enhance SRE practices and optimize their cloud infrastructure.
Key Responsibilities
- Reliability & Automation: Design and manage deployment pipelines, support non-production environments, and oversee production releases to ensure scalability and efficiency.
- Incident Management: Serve as the L3 escalation point for complex incidents, conducting root cause analysis and partnering with production and Azure teams.
- Stakeholder Engagement: Act as a bridge between technical teams and business stakeholders, delivering clear updates during incidents and aligning with business needs.
- Cloud Optimization: Enhance a tailored Azure platform (using Databricks, Azure SQL, Data Factory) while adhering to strict security and compliance standards.
- Global Collaboration: Work with developers in London and the Americas as part of a high-performing, dispersed team.
What You'll Need
- Azure Expertise: Proven experience with Azure services (e.g., App Service Plan, Databricks, Data Factory, Azure SQL, Blob Storage) and with tooling such as Terraform, Git, and GitHub Actions, including YAML-based configuration.
- SRE Experience: A track record of improving system reliability, automating processes, and adapting to customized cloud environments with unique controls.
- Analytical Skills: Strong problem-solving and log interpretation skills to resolve issues under pressure.
- Communication: Ability to engage technical and non-technical audiences with clarity and professionalism.
- Adaptability: Comfort working independently across time zones, with flexibility to overlap occasionally for handovers.
Nice-to-Haves
- Familiarity with monitoring tools like Datadog or Dynatrace.
- Knowledge of cloud security standards (e.g., encryption) and audit processes.
- Experience in Agile Scrum environments with rapid delivery cycles.
What's On Offer
- Competitive Package: Excellent base salary and performance bonus, plus pension, healthcare, and a comprehensive benefits suite.
- Impactful Role: Shape the reliability of a global digital platform in a unique Azure ecosystem.
- Flexible Environment: Hybrid model with a collaborative, inclusive culture.