Site Reliability Engineer (RMM)
Milton Keynes - Hybrid (can be based in London)
£65,000 - £75,000 with excellent benefits and 10% bonus
Do you have a proven record of systems engineering, with a focus on monitoring and site reliability?
If you have experience with monitoring tools, PowerShell and API usage, we want to hear from you!
Our client is a multiple award-winning IT Cloud & Managed Services partner, run and privately owned by technologists at its very core. If you want to develop your career with a progressive company within the Private/Hybrid cloud space, this could be a great role for you!
As Site Reliability Engineer, you will be responsible for ensuring that the engineering teams can accurately and consistently monitor, manage and maintain every aspect of the platform. You will work with senior leaders to progress automation through scripting that leverages their monitoring tools.
- Global Multi-Tenanted Cloud Solutions
- MaaS
- IaaS platform
- DRaaS platform
- BaaS platform
- STaaS / FaaS platform
- Security
Key responsibilities:
- Responsible for the Tech Services stack, ensuring it is stable, up to date, available and ready for consumption.
- Act as a key stakeholder in site reliability, supporting the SMEs to ensure full stack visibility.
- Act as a key stakeholder for Business and Service continuity, ensuring that any risks are assessed, raised and addressed.
- Actively monitor the support queues, completing initial incident triage and troubleshooting.
- Ensure documentation is kept up to date and accurate.
- Work with the Security & SRE Team Lead and SMEs to ensure SLAs are not breached.
- Liaise with project management teams to ensure any project participation is on track and actions are accurate.
- Collaborate with customers and other teams via email, phone, and virtual and face-to-face meetings as required.
Skills and experience:
- A deep understanding of monitoring systems and associated mechanisms
  - e.g. SolarWinds, Datadog, LogicMonitor, vRealize/Aria suite, Cortex/Prometheus/Grafana, NinjaOne RMM
- Ability to read and write code as required
  - PowerShell - Highly Desirable
  - MSSQL, Python, Ansible - Desirable
- An understanding and appreciation of cloud platforms from a holistic view
- An understanding of all layers of the infrastructure stack
  - Compute, storage, networking, virtualisation
- Experience working for a CSP, MSP, or multi-tenanted enterprise is beneficial
- Experience working incidents, problems, requests and changes within ITSM
- An appreciation of DevOps methodologies
Please be aware that this advert will remain open until the vacancy has been filled. Interviews will take place throughout this period, so we encourage you to apply early to avoid disappointment.
Tate is acting as an Employment Business in relation to this vacancy.
Tate is committed to promoting equal opportunities. To ensure that every candidate has the best experience with us, we encourage you to let us know if there are any adjustments we can make during the application or interview process. Your comfort and accessibility are our priority, and we are here to support you every step of the way. Additionally, we value and respect your individuality, and we invite you to share your preferred pronouns in your application.