Head of Site Reliability Engineering (m/f) | Hamburg
Description
Join the financial revolution!
Join Kreditech and our team of over 300 professionals and become part of a transforming industry. From engineering to design to analytics and collections, we do things differently, and we reward great ideas, teamwork, and persistence. We have a lot of work to do, and we want you to join us! As part of our Infrastructure team, you will be responsible for making sure our mission-critical applications and systems are reliable and ready for Internet scale.
Your Role
- Be responsible for leading the Site Reliability Engineering and Networking teams;
- Support and facilitate the work of a team of engineers who develop, scale and maintain critical systems in our infrastructure;
- Develop and grow the people in your teams, lead by example, and drive the teams’ performance;
- Innovate, design, and implement solutions to maintain availability, reliability, and efficiency of the services offered by Kreditech;
- Keep our systems up and running and automate all handling of failure conditions;
- Engage with external vendors to identify, negotiate and implement effective solutions for operating our systems;
- Collaborate closely with the engineering teams to ensure fast, high-quality delivery;
- Manage an international team of highly skilled and motivated professionals in a hands-on way across two sites.
Technologies & Challenges
- 300+ servers operated by your team (10% AWS, 90% VPS);
- SaltStack, Debian, Ubuntu, ZFS, KVM, Open vSwitch;
- Transition from VPS to AWS;
- Grow internal adoption of container technologies (e.g. Docker).
Your Strengths
- At least 5 years of experience managing IT operations of a similar size;
- Very good understanding of Linux (Debian/Ubuntu), networking, and databases;
- In-depth knowledge of state-of-the-art DevOps tooling, methodology, and current trends;
- Experience with load balancing and high-availability tools and strategies;
- Strong software engineering background with programming experience in at least one language (Java or another JVM-based language, Go, Node.js, C, C++, etc.);
- Deep knowledge of database operations (PostgreSQL, MongoDB) would be a distinct advantage;
- Experience with container orchestration systems (e.g. Kubernetes) is a distinct advantage;
- Able to participate in a 24x7 on-call rotation;
- Hands-on management and leadership skills;
- A university degree in Mathematics or Computer Science;
- Proven track record in remote management is a plus;
- Proficient in English;
- Minimal travel is required.
How to apply
We accept applications exclusively through our recruiting platform, found under the ‘Apply Now’ link.