Senior Site Reliability Engineer (SRE)

Collective Health, San Mateo, CA | Lehi, UT | Chicago, IL | Remote

Leading the healthcare evolution

We all depend on healthcare throughout our lifetimes, for ourselves and for our families and friends, but it is notoriously difficult to navigate and understand. Healthcare comprises 20% of the US economy, and we think it should work better for all of us. At Collective Health, we believe it’s time for a new day in healthcare, one where we as members are informed and empowered to make the right care choices when the decisions are urgent and critical.

Site Reliability Engineering at Collective Health is a discipline combining software and systems engineering skills. We apply modern infrastructure, systems, software, architecture, and development practices to give our customers a more reliable healthcare management experience.

Partnering with engineering teams, Site Reliability Engineers build on public cloud services to deliver a comprehensive platform that enables our developers to rapidly deliver high-quality, impactful, scalable, and reliable services. As a broader group of Site Reliability Engineers including those focused on infrastructure and those embedded in other engineering teams, we collaborate and identify themes and solutions to benefit Collective Health at large, engage in regular knowledge sharing activities and retrospectives, and relentlessly support one another in order to gain knowledge, remove barriers, and grow as individuals and a team.

Engineers specializing in Build & Release bring passion for reducing development toil, reducing friction to continuous integration and deployment, and improving developer tooling and experience. As owners of the Build & Release function, engineers will drive this critical part of our platform forward whilst having opportunities to explore and grow in other areas through rotations, knowledge sharing activities and cross-functional projects.

Together, we’re building the next-generation healthcare platform, and we’re proud to be on the leading edge of this important mission.

What you'll do: 

  • Collaborate on and/or lead engineering efforts from requirements to production, solving problems of developer productivity and presenting complex technical concepts to the team, engineering org, and leadership audiences.
  • Write code that is well-tested, easily understood, and maintainable by others.
  • Troubleshoot and fix complex production issues related to availability or performance, even if they are outside your comfort zone.
  • Apply software engineering principles to the operations of our systems in order to reduce toil.
  • Advise, critique, or comment on engineering designs.
  • Help our internal customers solve their problems in as efficient and future-proof a manner as possible.
  • Create and execute plans that ensure our existing infrastructure remains up-to-date, compliant, and secure.
  • Work independently and autonomously.

Imposter syndrome is real. If you are hesitant to apply because you don’t check all the boxes, or you’ve had a less-traditional pathway into Site Reliability Engineering, we encourage you to apply anyway and tell us why you believe you’d be a fit for the role.

To be successful in this role, you'll need:

  • 7+ years of work experience in DevOps, Site Reliability Engineering, or Software Engineering.
  • Ability to drive projects that involve multiple internal and external stakeholders to completion.
  • Experience creating and monitoring SLIs and SLOs in order to set and remain within error budgets.
  • Proven technical domain leadership: decomposing tasks, setting priorities, triaging incoming bugs and requests.
  • Experience supporting customer-facing production systems and responding to incidents as part of an on-call rotation.
  • Knowledge of data structures, algorithms, distributed systems, and information retrieval.
  • Experience developing in two or more general purpose programming or scripting languages, including but not limited to: Java, GoLang, JavaScript, Groovy, Python, Shell Scripting.
  • Expertise in the management and use of relational databases.
  • Experience diagnosing and resolving incidents that involve application, OS, network, infrastructure, partners, people, and process.
  • Experience in the following areas of software development: refactoring code, test-driven development, build infrastructure, debugging, building tools and testing frameworks.
  • Understanding of networking concepts such as routing, firewalls, load balancers, and secure communication -- especially in the context of cloud infrastructure.
  • Methodical problem-solving approach, coupled with strong communication skills and an ability to own and drive projects to completion.
  • Demonstrated technical mentorship and ability to increase the abilities of those on and outside the team.

Nice to have:


  • 10+ years of work experience in DevOps, Site Reliability Engineering, or Software Engineering.
  • Experience leading projects that demonstrate strong competencies in infrastructure, architecture, and software. Examples may include 3rd party API integrations, disaster recovery plans, cloud migrations, automation of manual or fragmented operations processes, containerization of services.
  • Experience with at least one of the following or similar technologies: Kubernetes, Postgres, etcd, Elasticsearch, or related scheduling and persistence services; Apache Kafka or related eventing systems.
  • Familiarity with Infrastructure as Code technologies (e.g., Terraform, Ansible).
  • Good understanding of private and public cloud design considerations and limitations in the areas of infrastructure, distributed systems, data storage, Linux-based operating systems, and security.

Colorado Pay Transparency Statement

In accordance with Colorado’s Equal Pay for Equal Work Act, the estimated salary range for this role if hired in Colorado is $111,000-$178,000. Compensation will depend on multiple factors, including geographic location, qualifications, skills, competencies and experience.

Please note that this information is provided for those hired in Colorado only, and this role is open to candidates outside of Colorado as well.

In addition to base salary, this position is eligible for stock options and benefits. Learn more about our benefits at https://jobs.collectivehealth.com/#benefits.

About Collective Health

Founded in 2013, Collective Health has created an ecosystem of innovative partners across care and benefits delivery and built a powerful and flexible infrastructure to better enable employees and their families to understand, navigate, and pay for healthcare. By reducing the administrative lift of delivering health benefits, providing an intuitive member experience, and improving health outcomes, the company guides employees toward healthier lives and companies toward healthier bottom lines. Collective Health is headquartered in San Mateo, CA, with locations in Chicago, IL, and Lehi, UT. For more information, please visit collectivehealth.com.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Collective Health is committed to providing support to candidates who require reasonable accommodation during the interview process. If you need assistance, please contact [email protected].

Collective Health requires that any employee who enters a physical workplace be verified as vaccinated or have received an exemption to vaccination. Candidates are not required to furnish such a verification during the application process but would be asked to do so prior to start date if they accept an offer for a role in which they would work at a Collective Health office or would meet regularly in person with clients.

About Collective Health

While medical technology continues to take giant steps forward, somehow the systems driving health coverage are still stuck in the past. The experience we have today is confusing. It’s painful. And we all deserve better. Collective Health was founded on the belief that better is possible. Driven by our mission to make understanding, navigating, and paying for care effortless, we’ve evolved the way health benefits work. More than 155 million Americans count on an employer for coverage. That's why, with the technology to create a more intelligent solution and the compassion to know that every person matters, we deliver a connected healthcare experience for companies who want the best for their employees.

Want to learn more about Collective Health? Visit Collective Health's website.