Data Scientist - Entity Resolution

Spokeo, Pasadena


Spokeo is seeking a Data Scientist - Entity Resolution to join us in Pasadena, CA. Spokeo builds innovative products that make your world more transparent. We help you know the people around you better so you can be more connected, more protected and trust a little easier.

Spokeo is a people search engine that both enlightens and empowers our customers. With over 12 billion records and 18 million visitors per month, we reconnect friends, reunite families, prevent fraud, and more. Every day our nimble team takes on enormous challenges in data science that push the limits of the cloud and search architecture.

We are looking for a Data Scientist to help us turn our vast repository of data into usable information, using matching techniques, machine learning and graph theory to resolve entities across disparate data sets and discover relationships among those entities.

Responsibilities and Deliverables:

  • Profiling, processing and evaluating large structured and unstructured data sets.
  • Developing data-driven models to quantify the value of a given data set.
  • Data mining using state-of-the-art methods.
  • Inventing and maintaining algorithms primarily focused on entity resolution, linking and scoring.
  • Creating automated anomaly detection systems and quality assurance metrics to constantly track performance.
  • Collaborating with product and engineering to understand the needs of the company, develop business logic and find innovative solutions.
  • Guiding and educating data team members with regard to best practices for data cleansing and processing.
  • Keeping up to date with the latest technology trends.
  • Performing ad hoc analysis and presenting results.


Skills and Competencies:

  • Advanced degree in Machine Learning, Computer Science, Engineering, Physics, Statistics, Applied Math or another quantitative field.
  • Minimum of 4 years of work experience in analytics, data mining, data visualization and/or predictive modeling.
  • Proficiency with statistical tools (such as R or SAS), database query languages (SQL) and scripting languages (such as Python).
  • Experience with Hadoop- and NoSQL-related technologies such as MapReduce, Spark, Hive, Pig, HBase, Cassandra, etc.
  • Comfortable training and communicating complex ideas to non-technical audiences.
  • Demonstrated ability to lead and execute projects from start to finish.


Recruiters or staffing agencies: Spokeo is not obligated to compensate any external recruiter or search firm who presents a candidate or their resume or profile to a Spokeo employee without 1) a current, fully-executed agreement on file and 2) being assigned to the open position (as a search) via our applicant tracking solution.

About Spokeo

What can you do with Spokeo?

Research: Whether you're looking up an unknown caller, looking to learn more about your date, or wanting to research a new neighborhood you are moving to, a quick search on Spokeo gives you access to billions of records in an instant.

Reconnect: Spokeo makes it easy to reunite with family, friends and love interests. Simply search a name, phone number, or other details and easily find the most recent information we have available.

What Spokeo doesn't do

Credit, Insurance or Employee Screening: Spokeo does not have access to secure or private financial information. You may not use Spokeo for credit, tenant or insurance screening purposes. Utilizing Spokeo's platform for purposes of employee screening is strictly prohibited.

Any Purpose Specified in the FCRA: Spokeo is not a credit reporting agency and does not offer consumer reports. We work hard to ensure that all of the data on our site is within established guidelines.
