Production Reliability Engineer

Jump Crypto

Chicago, IL, USA · Austin, TX, USA
Posted on Thursday, September 15, 2022

As part of our Trade Desk technical operations team, you will have primary responsibility for managing the real-time production trading environment for Jump Trading. It will require deep technical and operational knowledge across all areas of the trading platform in order to proactively monitor and troubleshoot our trading system, deploy changes to our production environment while minimizing operational risk, and implement tools and processes to drive continuous improvement. In this role, you will solve complex problems that require both technical and business understanding, working with traders, back-office teams, exchanges, and developers to optimize the trading environment and investigate and solve system issues.

What You'll Do:

  • Own the production environment, driving performance, reliability, and operability through continuous improvement
  • Proactively monitor and troubleshoot large-scale trading systems and exchange connectivity
  • Build and maintain a DevOps toolkit for the production trading system, including configuration management, process management, deployment, monitoring, data collection, and analysis
  • Leverage firm-wide metrics to improve scalability and system performance
  • Collaborate across the technology organization to analyze and troubleshoot complex system problems
  • Work closely with Risk Management and Operational Trading Support teams to coordinate changes and manage incidents
  • Interact directly with traders to communicate and drive technology changes, manage incidents, and troubleshoot problems
  • Work with the Clearing team to reconcile trades and position breaks
  • Assess and manage the operational risk of changes to the production environment
  • Define and document processes and procedures
  • Provide mentorship and cross-training to other technical operations SREs
  • Other duties as assigned or needed

Skills You'll Need:

  • At least 5 years of relevant work experience in an IT ops role, such as DevOps, SRE, Linux Systems Engineering, or Network Engineering
  • At least 3 years of experience in Python and shell scripting
  • Familiarity with C++ helpful but not required
  • A rigorous, detail-oriented approach to operations
  • Strong understanding of the Linux operating system, including network and system configuration, kernel internals, scheduling, and performance tuning
  • Strong understanding of networking concepts such as routing, multicast, LLDP, VLAN tagging, and Ethernet
  • A deep sense of ownership and urgency
  • Ability to handle shared operational and periodic on-call duties
  • Reliable and predictable availability
  • Degree in Computer Science, a related field, or equivalent professional experience

If you are currently a student or recent graduate, please see our Campus postings which offer both Summer and Full-Time opportunities.

  • Discretionary bonus eligibility
  • Medical, dental, and vision insurance
  • HSA, FSA, and Dependent Care options
  • Employer Paid Group Term Life and AD&D Insurance
  • Voluntary Life & AD&D insurance
  • Paid vacation plus paid holidays
  • Retirement plan with employer match
  • Paid parental leave
  • Wellness Programs

Annual Base Salary Range
$150,000 - $250,000 USD