Posted 2 days ago
Site Reliability Engineer (SRE)
UA Consulting
📍 London
💷 £78,000 - 104,000
Information Technology
Job description
About Us<br><br>We are a leading provider of gaming and gambling software solutions, with a strong presence in the USA, UK, and Europe.<br>Through partnerships with global gaming companies, we build cutting-edge technical platforms across sportsbooks, lottery, casino, virtual gaming, and financial trading.<br>Our vision is to shape the future of gaming by transforming operations into intelligent, data-driven solutions that deliver exceptional customer experiences and create sustainable value for all stakeholders.<br>We believe in teamwork, knowledge sharing, and transparency with accountability.<br>The Role<br>We're looking for a Site Reliability Engineer (SRE) to help shape and drive how we build and operate reliable, observable, and cost-efficient systems.<br>You'll work closely with development, platform, and incident management teams to define what "reliable" means in measurable terms and build the tooling and processes to achieve it.<br>Your work will directly influence the speed, stability, and scalability of our platform.<br>Key Responsibilities<br>Partner with development teams to define and manage SLOs/SLIs, and use error budgets to guide engineering decisions.<br>Enhance observability, ensuring metrics, logs, and tracing are in place to detect and fix issues proactively.<br>Lead cost optimisation initiatives: monitor spend, rightsize workloads, tune autoscaling, and drive efficient infrastructure usage.<br>Strengthen production readiness with pre-deployment checks, post-release validation, and robust platform guardrails.<br>Introduce and run chaos engineering experiments to improve system resilience.<br>Automate operational processes to reduce manual intervention across the stack.<br>Contribute to major incident response, providing engineering expertise.<br>Collaborate cross-functionally to raise the bar on platform stability, security, and 
performance.<br>Required Skills & Experience<br>3+ years in SRE, Platform, or DevOps roles.<br>Strong operational experience with Kubernetes (on-prem and AWS EKS).<br>Proven track record defining and working with SLOs/SLIs in production environments.<br>Deep understanding of observability (metrics, logging, tracing, telemetry