Staff Software Engineer, Quality Assurance
OKX
Who We Are
Job Summary
We are seeking a Quality Assurance Engineer to take technical ownership of system-level quality, resilience, and reliability across OKX’s core platforms. This is a Staff-level role with responsibility beyond individual test execution. You will define and drive quality engineering strategy, lead chaos engineering initiatives, and influence how reliability and testability are designed into distributed, cloud-native systems from the ground up.
You will work closely with backend engineers, SREs, and platform teams to ensure OKX’s systems remain robust, fault-tolerant, and production-ready in a 24×7, high-risk environment.
What You’ll Be Doing
- Own system-level quality and resilience across distributed, cloud-native services.
- Design and operate chaos engineering and reliability testing to proactively surface failure modes.
- Influence system architecture to improve testability, observability, and fault tolerance.
- Build and maintain scalable automated testing frameworks (API, integration, end-to-end) and embed them into CI/CD pipelines.
- Set and enforce quality gates and reliability standards across teams.
- Act as a quality authority during incidents and postmortems, and mentor engineers to raise the overall quality bar.
- Partner with Product, Engineering, and SRE teams to align quality practices with business risk and production realities.
What We Look For In You
- Experience in Quality Engineering, SDET, Reliability Engineering, or related roles.
- Strong experience with distributed systems, microservices, and cloud-native architectures.
- Hands-on experience with chaos engineering, resilience testing, or fault injection.
- Proficiency in Java or C++, and familiarity with common backend frameworks and middleware (e.g. Spring Boot, Kafka).
- Strong test automation expertise, including API, integration, and end-to-end testing, and experience building or extending testing frameworks.
- Experience operating in CI/CD and containerized environments (e.g. Jenkins, Docker).
- Solid system-level thinking and ability to influence engineering practices across teams.
- Strong communication skills and ability to collaborate effectively across functions.
- Proficiency in speaking, reading and writing in both English and Mandarin to collaborate effectively with global and cross-functional team members.
Nice to Have
- Experience with financial systems, trading platforms, or crypto exchanges.
- Background in SRE, platform engineering, or infrastructure reliability.
- Experience participating in or leading incident response and postmortems.
Perks & Benefits
- Competitive total compensation package
- Learning and development (L&D) programs and an education subsidy to support employees' growth
- Various team building programs and company events
- Wellness and meal allowances
- Comprehensive healthcare schemes for employees and dependants
- More that we would love to tell you about along the way!