
Data Optimization Analyst

Data | Remote | Full Time

Job Description

Spokeo is a people search engine that both enlightens and empowers our customers. With over 12 billion records and 18 million visitors per month, we reconnect friends, reunite families, prevent fraud, and more.

As a Data Optimization Analyst at Spokeo, you will combine programming, algorithms, and quantitative analysis to optimize data quality and key metrics for high-volume data pipelines that integrate multiple sources. You will be responsible for using data mining approaches to systematically recognize patterns and anomalies that inform data cleaning, searchability, and identity resolution. This is a hands-on development and analytics position focused on automating processes that ensure the integrity of the data we provide to our customers. You should be open to working with diverse types of big data and technologies such as AWS, MapReduce/Spark, Tableau, and Pentaho. The ideal candidate has expertise in mining large and complex datasets using Python and Spark.


Responsibilities

  • Research and develop analysis, predictive modeling, and optimization methods to improve our people data and identity resolution.
  • Extract insights from data sets for backend development, identify strategic opportunities to improve data quality and performance, and translate them into meaningful visuals and dashboards that inform business stakeholders.
  • Develop and automate regression test scenarios in support of ETL and data science development efforts.
  • Perform ad-hoc quality audits to investigate data anomalies and confirm that quality standards, procedures, and methodologies are being followed, then provide a risk assessment and set the correct severity and priority for resolution.
  • Define and track quality assurance metrics, acceptance thresholds, procedures, and documentation in accordance with standards and guidelines.


Requirements

  • A bachelor's degree in Machine Learning, Computer Science, Engineering, Physics, Statistics, Applied Math, Operations Research, or another quantitative field.
  • Minimum of two (2) years of full-time work experience in data analysis, data mining, and/or visualization.
  • Proficiency in working with Python-based statistical tools (pandas/numpy) and Spark/PySpark.
  • Experience with Hadoop and NoSQL-related technologies such as MapReduce, Spark, Hive, ElasticSearch, and DynamoDB.
  • Comfortable training and communicating complex ideas to non-technical audiences.
  • Demonstrated ability to lead and execute projects from start to finish.


Spokeo is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Spokeo fosters a business culture where ideas and decisions from all people help us grow, innovate, create the best products, and be relevant in a rapidly changing world.

Recruiters or staffing agencies: Spokeo is not obligated to compensate any external recruiter or search firm who presents a candidate or their resume or profile to a Spokeo employee without 1) a current, fully-executed agreement on file and 2) being assigned to the open position (as a search) via our applicant tracking solution.