Head of Site Reliability Engineering (m/f) | Hamburg

Created on 14-11-2017

CLOSED VACANCY Apply now

Description

Join the financial revolution!

Join Kreditech and our team of over 300 professionals and become part of an industry in transformation. From engineering to design to analytics and collections, we do things differently, and we reward great ideas, teamwork, and persistence. We have a lot of work to do, and we want you to join us! As part of our Infrastructure team, you will be responsible for making sure our mission-critical applications and systems are reliable and ready for Internet scale.

Your Role

Lead the Site Reliability Engineering and Networking teams;
Support and facilitate the work of a team of engineers who develop, scale and maintain critical systems in our infrastructure;
Develop and grow the people in your teams, lead by example, and drive team performance;
Innovate, design, and implement solutions to maintain availability, reliability, and efficiency of the services offered by Kreditech;
Keep our systems up and running and automate all handling of failure conditions;
Engage with external vendors to identify, negotiate and implement effective solutions for operating our systems;
Collaborate closely with the engineering teams to ensure fast, high-quality delivery;
Manage an international team of highly skilled and motivated professionals in a hands-on way across two sites.

Technologies & Challenges

300+ servers operated by your team (10% AWS, 90% VPS);
SaltStack, Debian, Ubuntu, ZFS, KVM, Open vSwitch;
Transition from VPS to AWS;
Grow internal adoption of container technologies (e.g. Docker).

Your Strengths

At least 5 years of experience managing IT operations of a similar size;
Very good understanding of Linux (Debian/Ubuntu), networking, and databases;
In-depth knowledge of state-of-the-art DevOps tooling, methodology, and current trends;
Experience with load balancing and high-availability tools and strategies;
Strong software engineering background with programming experience in at least one language (Java or another JVM-based language, Go, Node.js, C, C++, etc.);
Deep knowledge of database operations (PostgreSQL, MongoDB) is a distinct advantage;
Experience with container orchestration systems (e.g. Kubernetes) is a distinct advantage;
Able to participate in a 24x7 on-call rotation;
Hands-on management and leadership skills;
A university degree in Mathematics or Computer Science;
Proven track record in remote management is a plus;
Proficient in English;
Minimal travel is required.

How to apply

We accept applications exclusively through our recruiting platform, found under the ‘Apply Now’ link.
