Production Reliability Engineer, Trade Desk

Jump Crypto

Amsterdam, Netherlands
Posted on Tuesday, January 25, 2022

As part of our Trade Desk production reliability team, you will have primary responsibility for managing the real-time production trading environment for Jump Trading. The role requires deep technical and operational knowledge across all areas of the trading platform in order to proactively monitor and troubleshoot our trading system, deploy changes to our production environment while minimizing operational risk, and implement tools and processes to drive continuous improvement.

In this role, you will solve complex problems that require both technical and business understanding. You will work with traders, operations, exchanges, and developers to optimize the trading environment and to investigate and resolve system issues.

What You'll Do:

  • Proactively monitor and troubleshoot large-scale trading systems and exchanges/venues
  • Implement moves/adds/changes to the production trading environment
  • Build and maintain a DevOps toolkit for the production trading system, including configuration management, process management, deployment, monitoring, data collection, and analysis
  • Collaborate across the technology organization to analyze and troubleshoot complex system problems
  • Perform in-depth analysis of system performance
  • Work closely with Risk Management and Operational Trading Support teams to coordinate changes and manage incidents
  • Interact directly with traders to communicate technology changes, manage incidents, and troubleshoot problems
  • Work with Clearing Team to reconcile trades and position breaks
  • Assess and manage the operational risk of changes introduced into the production environment
  • Define and document processes and procedures
  • Provide second- and third-level support and mentor other trade desk analysts
  • Other duties as assigned

Skills You’ll Need:

  • Degree in Computer Science or a related field, or equivalent professional experience
  • At least 5 years of relevant work experience in an IT ops role, such as DevOps, Linux Systems Engineering, or Network Engineering
  • Fluency in Python and shell scripting
  • Familiarity with C++
  • A rigorous, detail-oriented approach to operations
  • Strong understanding of the Linux operating system, including network and system configuration, kernel internals, scheduling, and performance tuning
  • Strong understanding of networking concepts such as routing, multicast, LLDP, VLAN tagging, and Ethernet
  • A deep sense of ownership and urgency
  • Ability to handle shared operational and periodic on-call duties
  • Reliable and predictable availability