Site Reliability Engineer - Visibility & Databases (w/m/d)
Location: Lausanne, Switzerland or remote in Europe
Exoscale is the leading Swiss/European cloud service provider.
With services covering the full cloud infrastructure spectrum, from fast-deploying virtual machines to S3-compatible object storage, Exoscale provides a simple and scalable experience so that its clients can focus on their core business.
As part of its ongoing efforts to grow its infrastructure footprint, Exoscale is hiring a Site Reliability Engineer.
The site reliability engineer plays a critical role in ensuring constant availability of the Exoscale platform. The engineering team at Exoscale works on all aspects of its products, from design and development to operation and support.
With an expanding customer base and new products to further advance Exoscale's product portfolio, site reliability engineers build and maintain a wide range of technologies. As users of Exoscale themselves, site reliability engineers also take an active part in improving products.
This position focuses on database persistence and visibility stacks. It covers a range of topics: platform development and maintenance, tooling development, automation, self-service infrastructure delivery and more.
Some of the challenges you will be working on:
- Design and maintain key platforms such as:
  - Our database systems, consisting of MySQL, FoundationDB and Apache Cassandra.
  - Our data stream processing platform, based on Apache Kafka.
  - Our visibility stack, based on Prometheus-compatible components.
  - Our logging platform, based on the Elastic ecosystem.
- Automate our database provisioning and maintenance operations.
- Help design our next tracing service.
- Help improve the developer experience (DX) through the delivery of self-service systems and pipelines.
- Contribute to the overall design and the architecture of the Exoscale platform systems.
- Contribute to internal tooling development.
- Improve our systems and processes to be scalable and highly available, helping achieve outstanding SLAs.
- Participate in code and change reviews.
- Take part in the on-call rotation after a training period.
Ideal candidates:
- Have solid experience dealing with Linux on a daily basis.
- Have a good knowledge of Apache Kafka.
- Are familiar with transactional database systems like MySQL and PostgreSQL.
- Are used to working with Prometheus monitoring and its ecosystem, such as Grafana and Mimir.
- Are familiar with log management platforms such as the Elastic ecosystem.
- Have experience with containerization; Kubernetes is a plus.
- Have good experience with Golang; Clojure and Python are a plus.
- Have experience with configuration management solutions and large-scale infrastructure.
- Love to automate anything that can be automated.
- Are curious and autonomous, and embrace learning new things every day.
- Are team players and are comfortable working in a distributed team.
- Have good English communication skills, written and spoken.
What we offer:
- Flexible working hours and working from home.
- Autonomous working conditions with a lot of freedom to create.
- Modern working atmosphere and a centrally located office with great public transport connections.
- Team events as well as training and further education.
Candidates who are not familiar with all the topics above but are willing to learn are encouraged to apply.
We look forward to receiving your application!