[Remote] Site Reliability Engineer (Contract outside of IR35)
Note: This is a remote job, open to candidates in the USA.

TwinStream is a company formed to provide technical excellence and exceptional service to clients in government organizations. They are seeking an experienced Site Reliability Engineer to ensure the availability, performance, and cost-effectiveness of their services while collaborating with various teams to improve infrastructure and delivery pipelines.

Responsibilities
• Collaborate with Software Engineers to improve reliability and performance in their subsystems
• Partner with System Administrators in automating toil and eliminating alerts
• Evolve observability and monitoring capabilities to identify and solve problems before they impact the business
• Support development environments to help us achieve our delivery and quality goals
• Research and evaluate technologies, tools and services to influence buy-vs-build decisions
• Develop expertise in diverse technical and business domains
• Expand your knowledge of the technical stacks used

Skills
• Experience using Azure
• Experience using modern configuration management tools (such as Ansible, Chef or similar)
• Experience working with Terraform
• Experience working with Docker containers and container orchestration tools (such as Kubernetes, OpenShift or Docker Swarm)
• Experience both using and maintaining CI/CD tools (such as Jenkins or similar)
• Experience with monitoring tools such as InfluxDB, Prometheus or Grafana
• Experience of event-driven integration with MQ messaging (RabbitMQ or a similar AMQP solution)
• Good understanding of relational databases and SQL
• Linux command line, administration and shell scripting
• Working knowledge of network security protocols
• Experience using, developing with and maintaining cloud hosting services (ideally AWS EC2, RDS, S3, Lambda)
• Industry experience writing well-tested code in one of our platform languages (Java, Go, Python or similar)
• Knowledge of cross-domain principles and technologies
• Experience of working in a service management environment
• Practical application of observability patterns in previous systems
• Creating and monitoring system availability metrics and using those to drive work that reduces downtime

Company Overview
• A cyber security startup specialising in Cross Domain Solutions. It was founded in 2018 and is headquartered in Cheltenham, Gloucestershire, GBR, with a workforce of 51-200 employees.