Big Data Linux Engineer

Information Technology | London, United Kingdom


Description

The Role

We are looking for a Big Data Linux Engineer to join our growing Linux Platform Engineering team. You will be coming on board at a critical time to help drive the expansion of big data technologies and grow their reach within G-Research. Input from engineers is highly valued at G-Research: for the right individual, this is a great opportunity to help shape the infrastructure of a leading FinTech firm, drive the adoption of big data technologies, and build high-quality systems with a focus on scalability, resiliency, security and automation.

The team covers a breadth of project work and is tasked with developing elegant solutions to challenging problems. It plays a focal role within the company, and on a daily basis you would interact with systems and infrastructure teams, development teams, quantitative analysts, and external third parties such as exchanges and market data, software and hardware vendors.

The role will centre on projects to evaluate and implement new big data frameworks and tools, as well as to maintain and expand existing ones. You will also have some interaction with market data, such as feeds and applications from exchanges and data from third-party providers (e.g. Bloomberg, Thomson Reuters).

The Individual

The right person for the role will have the following:

  • Experience designing, running and troubleshooting Hadoop clusters
  • Experience with batch and streaming job frameworks (such as Spark or Storm)
  • Experience with NoSQL databases (e.g. HBase, Cassandra, MongoDB)
  • Experience with middleware and messaging systems (e.g. Kafka, RabbitMQ, FTL, Ultra Messaging)
  • A strong understanding of Linux OS core principles, performance and tuning
  • Scripting skills (e.g. bash, Python, Perl)
  • Experience automating with configuration management tools (such as Puppet or Chef) and orchestration tools (such as Ansible)

The following would be advantageous but not necessary:

  • Service discovery (e.g. Zookeeper, Consul, etcd)
  • Data collection and querying (such as Flume, Sqoop, Hive)
  • Time series databases (such as InfluxDB, OpenTSDB or Prometheus)
  • Kerberos, SSL certificates
  • Other scalable distributed systems, such as Splunk
  • Solid knowledge of basic network protocols (e.g. IP, UDP, TCP) and OS network stacks; multicast experience is a plus
  • Database administration and querying (SQL)
  • Containers and cluster managers (e.g. Docker, Apache Mesos, Kubernetes)

Candidates should first and foremost be great at what they know and do, and be willing to learn technologies they have not been exposed to before. They should have a track record of delivering high-quality work that is complete, tested and documented.

In addition to the technical skill set, we are looking for motivated team players who can spot inefficiencies in our systems and processes and suggest how things could be done better. We are always looking to improve, and it is important that you share this mindset.