Senior DevOps Engineer, Data Platform Reliability
Netflix, Los Gatos, California
Leading subscription service for watching TV episodes and movies
Specifically, you will:
- Develop effective tooling, alerts, and response processes to identify and address reliability risks.
- Build tools and automation to reduce operational tasks, improve automatic issue identification and routing, and predict platform performance against SLAs based on overall platform health and progress.
- Participate in the on-call rotation to manage incidents and handle unknown or new issues.
- Drive issue resolution and root cause identification with the various data infrastructure teams.
- Evangelize best practices around collaboration and reliability to all partner teams.
Qualifications:
- Effective root cause identification, triage, and mitigation
- Experience configuring and troubleshooting Linux, Java, Tomcat, and other middleware technologies
- Understanding of large-scale, complex systems from a reliability perspective
- Strong communication skills and the ability to engage partner teams effectively
- Strong automation mindset and a passion for identifying strategies to mitigate issues going forward
- Experience with cloud computing platforms (particularly AWS) a plus
- Strong Linux system-level analysis and network analysis experience a plus
Netflix is the world’s leading Internet television network with over 100 million members in over 190 countries enjoying more than 125 million hours of TV shows and movies per day, including original series, documentaries and feature films. Members can watch as much as they want, anytime, anywhere, on nearly any Internet-connected screen. Members can play, pause and resume watching, all without commercials or commitments.