We're looking for dynamic individuals to join us as Site Reliability Engineers!
What are your main responsibilities as our Site Reliability Engineer?
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and blameless post-mortems.
You are welcome to apply if you have:
- Bachelor's degree in Computer Science or related technical field involving systems engineering (e.g., physics or mathematics), or equivalent practical experience.
- Experience in one or more of the following: C, C++, Java, Python, Go, Perl, Ruby or shell scripting.
- Experience with Unix/Linux operating systems internals and administration (e.g., filesystems, inodes, system calls) or networking (e.g., TCP/IP, routing, network topologies and hardware, SDN).
Wonderful if you have:
- Expertise in designing, analysing and troubleshooting large-scale distributed systems.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Ability to debug and optimize code and automate routine tasks.