Site Reliability Engineer - Databases (m/f/d) (Remote)

Yelp, Remote, Germany

Connecting people with great local businesses

Duration: Full-Time

Do you want to be a Site Reliability Engineer who builds and manages scalable, self-healing, globally distributed systems? Our Site Reliability Engineers make sure users are always connected to great local businesses by keeping Yelp fast and available as we continue to scale. No matter how many times we get searched, scraped, scanned, spammed, pinged, paged, or queried, we gotta keep our cool and keep the site and the apps running smoothly.
We work for both the Yelp end users and the Yelp developers, implementing critical parts of the core architecture and supporting developers as they do the same. We get to take on exciting challenges that you can only find at the kind of scale that serves over 100 million users per month. Spinning up infrastructure should always be a git commit and a code review away: automation and self-service are at the core of what we do.
We're looking for people with a passion for all things related to distributed systems, serving queries fast, uptime, scaling, and solving hard problems with the right tools. We have fun working on these challenges and are looking for others who do, too!

Where You Come In:

  • Work closely with developers in supporting new features and services
  • Analyze solutions and implement best practices for our database cluster and its components
  • Build cluster management tooling for our Cassandra Kubernetes Operator
  • Develop and maintain easy, intuitive API (REST/GraphQL) interfaces to our databases that keep developers moving fast
  • Work on observability of relevant database metrics and troubleshoot site issues using industry-leading tools like Splunk and Prometheus
  • Support and administer Cassandra clusters, as well as the stacks they run on, through automation
  • Design new systems, tests, and procedures
  • Participate in our daytime on-call rotation, acting as a point of contact for automated systems and highlighting availability issues when they can't be automatically resolved

What it Takes to Succeed:

  • Based in Germany or willing to relocate there
  • An experienced software engineer with a strong interest in distributed systems and database technologies (like Cassandra or any other NoSQL databases)
  • Fluency in Python, Java, Golang, or a similar language—familiarity with more than one is a plus
  • Knowledge of best practices related to security, performance, high availability and disaster recovery
  • Proficiency in Kubernetes
  • Mastery of Linux
  • Expertise in Configuration Management (e.g., Puppet, Ansible, Chef)
  • Experience with public cloud platforms and related tooling (e.g., Terraform, AWS CloudFormation)

What You'll Get:

  • Full responsibility for projects from day one, an awesome team, and a dynamic work environment
  • Competitive salary with equity in the company, a pension scheme, and an optional employee stock purchase program
  • 30 days paid vacation per year plus one floating holiday
  • Flexible working hours and meeting-free Wednesdays
  • Regular 3-day Hackathons and weekly learning groups, always with interesting topics
  • Opportunities to participate in digital events and conferences
  • Complimentary elective German lessons for international employees
  • €64 per month toward any wellness activity of your choice
  • Quarterly team offsites

About Yelp

Yelp connects people with great local businesses. Our users have contributed approximately 127 million cumulative reviews of almost every type of local business, from restaurants, boutiques and salons to dentists, mechanics, plumbers and more. These reviews are written by people using Yelp to share their everyday local business experiences, giving voice to consumers and bringing “word of mouth” online.

Want to learn more about Yelp? Visit Yelp's website.