SITE RELIABILITY ENGINEER / 3-6 MONTH CONTRACT (OUTSIDE OF IR35) / BRISTOL (HYBRID) / UP TO £550 PER DAY
A fantastic new opportunity for an experienced Site Reliability Engineer to join a fast-growing business.
In 2019, our founders were working as engineers solving complex cross-domain problems in defence and security organisations. TwinStream was formed to consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. The business is headquartered in Cheltenham with teams working both on-site with clients and remotely from home.
We are looking for skilled engineers to join a new team that will deploy and maintain our established cross-domain system for a customer. The system uses an AMQP event-driven microservices architecture and makes extensive use of Docker container services. As a team member, you will maintain a continuous deployment pipeline, work with feature delivery teams to promote component releases into production, and apply configuration management tools to ensure all deployments are consistent and correctly configured.
This role is perfect for an experienced engineer who is comfortable working in a managed service environment and wants to gain more experience with best-of-breed DevOps tools and techniques.
This is a hybrid role, where you may be required to work in Bristol and/or Corsham. Applicants must be eligible for SC/DV clearance.
What’s on Offer?
- Highly competitive rates (£500 - £550 per day).
- 25 days' holiday plus bank holidays.
- A quarterly meeting for all TwinStream team members, where we meet up, chat about all things TwinStream, and enjoy team building and company updates.
- Christmas and summer parties to celebrate our successes.
- Opportunity to lease an electric vehicle via salary sacrifice.
- Health and well-being: access to a workplace Mental Health First Aider.
Key Responsibilities of the Site Reliability Engineer:
- Collaborate with Feature Development teams to promote new component versions into production as efficiently as possible.
- Maintain the system to agreed service-level and availability objectives using real-time monitoring tools and system-generated metrics.
- Instrument new system metrics and alerts to pre-empt issues and improve performance.
- Respond to monitoring alerts and customer incidents, taking preventative/remedial action to minimise customer impact.
- Liaise with key customer stakeholders to schedule capability changes and capture new service requirements as they arise.
- Apply automation techniques to reduce manual operations burden.
Skills & Experience Required:
- Experience with infrastructure automation tools (CloudFormation, Terraform or Ansible)
- Experience working with Docker containers and container orchestration tools (such as Kubernetes, OpenShift or Docker Swarm)
- Experience using and maintaining CI/CD tools (such as Jenkins or GitHub Actions)
- Good understanding of relational databases and SQL
- Linux command line, administration and shell scripting
- Solid understanding of monitoring, auto-scaling, performance tuning, troubleshooting and disaster recovery best practices
- Working knowledge of network security protocols
- Working knowledge of AWS
- Experience with monitoring tools such as InfluxDB, Prometheus or Grafana
What’s Next?
If you have the drive and experience to be successful in this Site Reliability Engineer position, we would love to hear from you. APPLY NOW for immediate consideration.