Senior DevOps Engineer

Engineering | San Francisco, CA | Full Time

Job Description

About Us

Figure Eight is the essential Human-in-the-Loop Machine Learning platform for data science and machine learning teams. The Figure Eight platform transforms unstructured text, image, audio, and video data into customized high-quality training data to make AI work in the real world. Figure Eight's technology and expertise support a wide range of use cases including autonomous vehicles, intelligent personal assistants, medical image labeling, consumer product identification, content categorization, customer support ticket classification, social data insight, CRM data enrichment, product categorization, and search relevance.

Headquartered in San Francisco and backed by Canvas Ventures, Trinity Ventures, Industry Ventures, Microsoft Ventures, and Salesforce Ventures, Figure Eight serves Fortune 500 and fast-growing data-driven organizations across a wide variety of industries. For more information, visit

About the Role

This role is ideal for an experienced DevOps Engineer looking to expand their knowledge of Machine Learning. Our SaaS platform consists of annotation software used by hundreds of thousands of people creating training data, Machine Learning models for real-time predictions at scale, and the integration of the two to combine Human and Machine Intelligence. You will support every part of this pipeline.

You don’t need a background in Machine Learning, but you do need an interest in expanding your knowledge of Machine Learning services. We train all our engineers on Machine Learning and on how to deploy models on services including AWS, Google Cloud, and Microsoft Azure. You will have the opportunity to learn these technologies alongside the other team members, and you will lead the DevOps strategies that support them.


  • Work closely with Application Developers and enjoy participating in the architectural discussions
  • Work closely with Machine Learning Scientists and enjoy thinking about making services scale
  • Support production infrastructure and services, including our:
    • AWS infrastructure such as EC2, ECS, S3, IAM, Route 53, ElastiCache, Load Balancers, CloudWatch, etc.
    • Java and Rails Applications
    • Docker and Kubernetes
    • PostgreSQL and Redis databases
  • Provide leadership to the team in mastering technologies, identifying and implementing worthwhile new technologies, and improving our processes
  • Continuous delivery (CI/CD) using Jenkins, Maven, Artifactory, Docker, Chef/Ansible
  • Site reliability and availability, including end-to-end performance, service monitoring, alerting, capacity sizing and planning
  • 24/7 on-call rotation for production support and for troubleshooting production and development issues. After-hours emergencies are rare, and you will help us make them even rarer!
  • Business continuity planning and testing

Skills & Experience

  • At least 5 years of DevOps and system administration experience, preferably in mid- or late-stage startups
  • At least 3 years in managing AWS or GCP cloud infrastructure
  • Experience in configuring and supporting SaaS environments, provisioning resources, monitoring utilization and making adjustments in accordance with SOPs
  • Expertise in Docker. Kubernetes would be an added advantage
  • Experience with monitoring/APM tools such as New Relic, CloudWatch, Papertrail, and Rollbar
  • Linux administration (Ubuntu, Amazon Linux, CentOS) and scripting (e.g. shell script, Python)
  • Soft skills, e.g. team player, clear and concise communication, problem solver, sense of humor

Good to have

  • Expertise in database scalability and availability, preferably with PostgreSQL and Redis
  • Building hybrid clouds using VMware and Cloud Foundry
  • Experience in building PaaS (e.g. Heroku, RedisGreen, Deis)
  • Data protection and secret handling technology such as Vault
  • Logging (e.g. Splunk, Logstash, Graylog) and Elasticsearch (ELK)
  • Managing micro-services and real-time event processing is a big plus