Linux Systems Engineer - Shared Engineering

The role

The Shared Engineering team at XTX Markets design, develop and maintain infrastructure used by XTX Markets. The variety of technology covered includes operating systems, connectivity, storage, monitoring tools, CI/CD, configuration management and the platform on which our in-house post-trade and low-latency OTC trading systems are run. We also provide technical consultancy services to all parts of the company, including the research technology and proprietary trading groups. This role is ideal for someone who is well organised, pragmatic and possesses expert skills in infrastructure systems engineering.
Ideally you would have spent over three years working as a Linux or Site Reliability engineer, be completely at home with open source technologies, and possess a good scientific degree from a respected institution or equivalent industry experience.

Responsibilities

The team is responsible for engineering and supporting existing infrastructure and tooling, with the overarching aim of transforming it into a modern infrastructure with self-service APIs and an ‘always-on’ approach. Activities include:

• Design, develop and continually improve the platforms that enable other teams to easily interface with and use our infrastructure
• Reduce manual activities by automating complex processes involving multiple systems, some without formal APIs
• Define standards to reduce duplication of effort across the company and consult with other technologists to promote re-use
• Mentor other team members, enabling them to support and maintain the systems you develop
• Evaluate new technology and development techniques, figuring out how they are applicable to the company
• Maintain the current infrastructure and tooling where it makes sense to do so
• Contribute to commercial discussions where third-party systems are involved
• Support internal and trading systems

Essential skills

• Strong Linux engineering skills including developing automated builds and patching cycles, rolling new kernels, and fixing open source tools.
• Excellent command-line skills; shell scripting should be second nature to you.
• Comfortable writing code in a language such as Python
• Configuration management and deployment using Puppet, Ansible or similar
• Hands-on expertise with one or more distributed systems such as Elasticsearch, Cortex or Cassandra.
• Use of SRE principles to produce class-leading alerting and monitoring.
• Demonstrable ability to take on complicated projects and deliver good quality work with a minimum of oversight.
• A good working knowledge of network and storage technologies.
• Building solid relationships with peers both inside a team and across an organisation
• Use of standard development tools such as git and test-driven development techniques

Other skills

We would not expect someone to have all of the following skills or experience, but some subset would be preferred. The list also gives an idea of the wide range of technologies we work with:
• Containerisation, Kubernetes, the HashiCorp stack or similar to deliver a containerised infrastructure.
• Monitoring applications such as Icinga, Nagios or Zabbix
• Observability infrastructure, e.g. KairosDB, Prometheus, InfluxDB, Grafana
• Logging tools, e.g. Splunk, ELK stack, Graylog
• System admin level knowledge of hardware, networking and/or file storage systems, e.g. Dell, HP, Cisco, NetApp, ZFS, NFS
• System admin level knowledge of object store technology, e.g. Ceph, MinIO, EMC ECS, Cloudian
• Production experience with cloud-based service providers such as AWS, GCP or Azure and the tools to manage them.
• Knowledge of trading systems, particularly with regard to exchange and counterparty connectivity, precise timing infrastructure and latency reduction techniques