Senior Engineer, SRE/DevOps
IoTeX
Responsibilities:
- Infrastructure Automation: Design, implement, and maintain infrastructure as code (IaC) using tools such as Kubernetes, Terraform, or Ansible on GCP and AWS.
- Automate deployment, scaling, and management of blockchain nodes and related services.
- Continuous Integration/Continuous Deployment (CI/CD): Develop and maintain CI/CD pipelines for blockchain and backend applications to ensure efficient and reliable release processes.
- Implement automated testing strategies to validate code quality and deployment readiness.
- Monitoring and Incident Response: Establish and maintain monitoring solutions for the entire infrastructure.
- Collaborate with development teams to enhance application-level monitoring and alerting.
- Participate in on-call rotations to respond to incidents promptly.
- Security and Compliance: Implement and enforce security best practices across the infrastructure.
- Work closely with the security team to conduct regular audits and ensure compliance with industry standards.
- Capacity Planning and Performance Optimization: Analyze system performance and make recommendations for improvements.
- Collaborate with cross-functional teams to plan for future infrastructure needs.
- Documentation: Create and maintain comprehensive documentation for infrastructure configurations, processes, and procedures.
Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- Proven experience as a DevOps Engineer or Site Reliability Engineer in a blockchain/crypto environment.
- Strong proficiency in scripting languages (e.g., Python, Bash).
- Experience with containerization and orchestration tools (Docker, Kubernetes).
- Familiarity with blockchain technologies (e.g., Ethereum, Cosmos).
- Significant experience with AWS public cloud technologies, including implementing large-scale container clusters: EKS, infrastructure as code (Terraform), container technologies (Docker and Kubernetes), and IAM.
- Strong programming/scripting skills with one or more scripting languages (Python, Go, Ruby, Bash, etc.) and strong Linux OS and networking fundamentals.
- Experience building monitoring systems to ensure high availability, performance, and security integrity (e.g., ELK Stack, Pingdom, Opsgenie/PagerDuty, Kiali, Weave Scope, CloudWatch, CloudTrail).
- Hands-on experience operating microservices-based SaaS products, REST web services, SSO (Okta, Auth0), EC2-RDS, MySQL, and Elasticsearch.
- Solid understanding of network protocols, security, and system administration.
- Experience with cloud platforms (e.g., GCP, AWS, Azure) and related services.
- AWS System Architect certification is strongly preferred.
- Knowledge of CI/CD tools (e.g., Jenkins, GitLab CI).
- Strong problem-solving skills and the ability to work in a collaborative team environment.
- Self-motivated and excited about the ambiguity, opportunity, and self-direction required at an early-stage startup.