Do you want to build and manage scalable, self-healing, globally distributed systems? Our remote Site Reliability Engineers keep Yelp fast, available, and growing, connecting users to great local businesses. No matter how many times we get searched, scraped, scanned, spammed, pinged, paged, or queried, we gotta keep our cool - and keep the site running smoothly.
We work in both the dev and systems worlds, implementing key parts of the core architecture and supporting devs as they try to do the same. We get to tackle interesting challenges that you can only find at the kind of scale that serves over 100 million users per month.
You'll work to empower product teams and developers. At Yelp, spinning up infrastructure should always be a git commit and a code review away; automation and self-service are at the core of what we do.
We’d love to have you apply, even if you don’t feel you meet every single requirement in this posting. At Yelp, we’re looking for great people, not just those who simply check off all the boxes.
Where You Come In:
Work closely with developers in supporting new features and services.
Build tools to monitor site stability and performance.
Help scale our AWS-based infrastructure (no racking servers or swapping hard drives here!)
Troubleshoot site issues using industry-leading tools like Splunk and SignalFx.
Automate everything with Puppet, Git, Jenkins, and Terraform.
Develop custom tools when off-the-shelf solutions don’t work at our scale and contribute upstream to open source projects.
Design new systems, tests, and procedures.
Participate in light on-call rotations - we have geographically distributed SRE teams for follow-the-sun support, which means no 2:00 AM pages!
What it Takes to Succeed:
Mastery of Linux (we use Ubuntu but any distro is fine)
Command of your favourite modern programming language: Python, Ruby, Go, Rust, Java, C++, etc.
A solid understanding of fundamental technologies like TCP/IP, HTTP, and DNS.
Knowledge of best practices related to security, performance, and disaster recovery.
Experience with web server configuration (Apache/Nginx/HAProxy), monitoring, trending, and high availability.
Strong scripting and automation skills.
Expertise in configuration management (Puppet/Ansible/Chef/etc.)
Experience with public cloud platforms (we use AWS, but Azure/GCP are fine) and related tooling (Terraform, etc.)
Experience with Docker or other container technologies.
Excellent communication and documentation skills.
At Yelp, we believe that diversity is an expression of all the unique characteristics that make us human: race, age, sexual orientation, gender identity, religion, disability, and education, and those are just a few. We recognize that diverse backgrounds and perspectives strengthen our teams and our product. The foundation of our diversity efforts is closely tied to our core values, which include “Playing Well With Others” and “Authenticity.”
We’re proud to be an equal opportunity employer and consider qualified applicants without regard to race, color, religion, sex, national origin, ancestry, age, genetic information, sexual orientation, gender identity, marital or family status, veteran status, medical condition or disability.
We are committed to providing reasonable accommodations for individuals with disabilities in our job application process. If you need assistance or an accommodation due to a disability, you may contact us at [email protected] or 415-969-8488.
Note: Yelp does not accept agency resumes. Please do not forward resumes to any recruiting alias or employee. Yelp is not responsible for any fees related to unsolicited resumes.
Yelp connects people with great local businesses. Our users have contributed approximately 127 million cumulative reviews of almost every type of local business, from restaurants, boutiques and salons to dentists, mechanics, plumbers and more. These reviews are written by people using Yelp to share their everyday local business experiences, giving voice to consumers and bringing “word of mouth” online.