Machine Learning Infrastructure Engineer
Machine Learning | San Francisco, CA | Full Time
Artificial Intelligence (AI) is transforming almost every industry. It is widely understood that only high-quality annotated training data can produce the most accurate machine learning solutions. However, creating high-quality training data at scale is very challenging, and very few companies can do it. Even fewer can do it well, which is why businesses across all industries trust Figure Eight.
In March 2019, Figure Eight was acquired by Appen. Together, Appen and Figure Eight combine the best of human and machine intelligence to provide high-quality annotated training data that powers the world’s most innovative machine learning (ML) and business solutions. The Figure Eight platform enables ML and data-driven business solutions to scale across a diverse set of industries including retail, automotive, finance, manufacturing, agriculture, life sciences, robotics, and more. The Figure Eight platform transforms audio, video, text, and images into high-quality annotated data to support a variety of use cases ranging from computer vision and search relevance to data categorization and natural language processing (NLP). Learn more at www.figure-eight.com.
About the Role
In this role, you will be part of a small team solving interesting technical problems at the intersection of Machine Learning, Big Data, Distributed Systems, Cloud Computing, and High-Performance Computing. Your work will have an enormous impact on Appen's long-term success.
This role supports the greater Appen team in the SF Bay Area. Our team is located in the Bay Area, but you will have the opportunity to work with other Appen team members in our Shanghai and Sydney offices. If you are looking to make a huge impact on the AI world and grow with a leading data company that has a start-up culture, Appen is the place for you.
Responsibilities
- Building core machine learning infrastructure including distributed systems, abstractions, development tools, data ETL, and model hosting/serving/inference pipelines.
- Building continuous integration, testing, and deployment pipelines in cloud computing environments.
- Building data logging, tracking, analysis, monitoring, and reporting pipelines in cloud computing environments.
Requirements
- Bachelor's degree (MS preferred) in Computer Science, Statistics, Mathematics, or an equivalent field is required for this position.
- Basic understanding of machine learning is expected.
- Working experience building data pipelines using big data tools (Hadoop/Spark/Airflow).
- Working experience building distributed systems, including real-time streaming and batch data processing.
- Working experience deploying machine-learned models into real-world settings.
- Working experience building scalable RESTful API services.
- Working experience deploying and operating high-availability production systems on cloud platforms such as AWS, GCP, or Azure.
- Strong communication skills.
- Strong overall programming ability.
- Working knowledge of SQL.