Software Engineer/Site Reliability Engineer
Crypto.com
This job is no longer accepting applications
What you’ll be doing
- Ensure the entire stack (hardware, software, application, and network) is healthy and operating at optimal performance
- Perform deep dives into both systemic and latent reliability issues, partnering with other software and DevOps engineers across the organization to design, implement, and roll out fixes
- Run blameless root cause analyses (RCAs) on incidents and outages, aggressively looking for answers that will prevent them from happening again
- Continuously improve availability, reliability, and observability, and reduce human toil through tooling and automation
- Define SLAs and SLOs for different services in partnership with product engineers
- Represent the SRE team in system design reviews and operational readiness exercises for new and existing services
What you need
- Experience coding in Ruby and/or Go
- Familiarity with GitOps principles and tools (GitHub Actions, Docker, Kubernetes)
- Experience in designing, analyzing, and troubleshooting large-scale distributed systems
- Curiosity about finding root causes in incidents and outages
- Ability to build alignment, cultivate relationships, and drive impact
- A mindset for designing fault-tolerant system architecture
- Comfort with being uncomfortable in ambiguous situations
- Involvement with incident management and response
- Desire to grow expertise, inform, and educate others
- Ability to pick up various technologies quickly, with a “get things done” mentality
- Humility to embrace better ideas from others, eagerness to make things better, and openness to challenges and possibilities
Desirable
- Familiarity with cloud platforms and microservice-based architecture (AWS is a big plus)
- Familiarity with monitoring tools (e.g. New Relic, Datadog, and/or OpenTelemetry)
- Familiarity with IaC tools (e.g. Terraform, Spacelift)
- Experience in designing resilient system architecture
- Experience in optimizing the performance of large-scale production systems
- Experience in promoting site reliability engineering practices