About the role
Join Barclays as a Site Reliability Engineer and play a key role in building a new, high-impact SRE capability within Markets Post-Trade. As part of a cross-cutting team, you will expand application stability and reliability measurement by automating reliability tooling, closing telemetry gaps, and addressing reliability findings across multiple mission-critical systems. You will help extend and scale an SRE solution across Markets Post-Trade, driving full-stack observability for cash settlement, securities settlement, and liquidity management flows. Through centralised dashboards and end-to-end transaction tracing, you will deliver greater transparency and faster issue resolution, and enable the adoption of AI-driven observability, anomaly detection, and advanced analytics. The role focuses on pre-emptive monitoring, optimisation, and non-functional architecture design to keep systems resilient and high-performing in a fast-paced environment.
What you will do
- Ensure the availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
- Respond to, analyse, and resolve system outages and disruptions, and implement measures to prevent similar incidents from recurring.
- Develop tools and scripts that automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
- Monitor and optimise system performance and resource usage, identify and address bottlenecks, and apply best practices for performance tuning.
- Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure smooth and efficient operations.
- Stay informed of industry technology trends and innovations, and actively contribute to the organisation's technology communities to foster a culture of technical excellence and growth.
Who we are looking for
- Experience with observability and APM tools such as OpenTelemetry, Elastic, AppDynamics, or Prometheus.
- Experience designing and implementing resilience patterns, including Retry, Timeout, Circuit Breaker, Bulkhead, Throttling, and Saga.
- Proficiency with load-testing tools such as HP Performance Center, LoadRunner, k6, or JMeter.
- Solid knowledge of networking and security fundamentals, including VPC design, IAM, encryption, and secrets management.
- Operational experience with scripting and/or programming languages such as Java, Python, Ruby, or Bash.
Benefits
- The successful candidate will be based in Prague.
- Barclays employees are also eligible for a suite of competitive country-specific benefits.
- This position is eligible for an incentive award.
- Our Work Experience is the combination of everything that's unique about us: our culture, our core values, our company meetings, our commitment to sustainability, our recognition programs, and, most importantly, our people. Our employees are self-disciplined, hard-working, curious, trustworthy, humble, and truthful. They make choices according to what is best for the team, they live for opportunities to collaborate and make a difference, and they make us the #1 Top Workplace in the area.