Senior Site Reliability Engineer (Vancouver)
What we'll accomplish together
- Develop effective infrastructure (cloud platform services, networking, Kubernetes, etc.) for our projects to deploy onto, ensuring projects are scalable, resilient, and reliable in support of growing products.
- Build shared observability services, including metrics, logs, tracing, and dashboards, and act as a center of excellence that partners with other teams to define SLOs and actionable error budgets for everyone’s services.
- Respond to infrastructure incidents and support the larger Engineering team with their product incident response strategy.
- Perform post-mortems and in-depth root cause analysis to ensure we are always improving.
- Enhance tools and automation to fill gaps in our current systems, and build entirely new tools as we face bigger and more complex challenges.
- On-call rotation: 1 week every 5 weeks.
A little about you:
- You execute on defined projects to achieve team-level goals and independently define the right solutions or use existing approaches to solve defined problems.
- You understand operating systems, networking, Kubernetes, and other cloud-native services, and can debug system issues and identify system bottlenecks.
- You have experience working with Infrastructure as Code systems like Terraform, Pulumi, or CloudFormation.
- You have experience collecting and processing metrics from tools such as Prometheus, Datadog, or New Relic, and are familiar with the concepts of SLIs and SLO targets.
- You are comfortable responding to production incidents and can fight fires with a calm and level head, leveraging post-mortems to apply lessons learned.
- You have experience coding and developing applications. Bonus points for Go experience.
- You are comfortable diving into an unfamiliar system and finding your way around.
- While you believe in processes and the power of planning, you understand that you will often have to roll with the punches and prioritize the most impactful tasks on the fly.
- You have a strong ability to collaborate with cross-functional teams and build solid working relationships with everyone in the organization, from individual contributors to the CEO.
- You have experience building and working on deployment systems.
- You have self-awareness about your strengths and areas for development.
At Dapper Labs, we're looking for people who are passionate about what they do. You're encouraged to apply even if your experience doesn't precisely match the job description!
Compensation: $132,000 - $207,000 CAD base salary plus stock options