Senior DevOps Engineer
Engineering | San Francisco, CA | Full-Time and Temp-to-Perm
CrowdFlower is the essential human-in-the-loop AI platform for data science and machine learning teams. The CrowdFlower software platform trains, tests, and tunes machine learning models to make AI work in the real world. CrowdFlower’s technology and expertise support a wide range of use cases including autonomous vehicles, intelligent personal assistants, medical image labeling, consumer product identification, content categorization, customer support ticket classification, social data insight, CRM data enrichment, product categorization, and search relevance.
Headquartered in San Francisco and backed by Canvas Ventures, Trinity Ventures, Industry Ventures, Microsoft Ventures, and Salesforce Ventures, CrowdFlower serves Fortune 500 and fast-growing data-driven organizations across a wide variety of industries. For more information, visit www.crowdflower.com
The Ideal Candidate:
The ideal candidate has extensive experience with SaaS platforms, primarily hosted on AWS infrastructure. You are able to strike the right balance between appropriate levels of process and security while still supporting an agile, fast-paced development environment. You are used to working closely with application developers and enjoy participating in architectural discussions. While you don’t enjoy it, you know that you will occasionally be summoned to respond to an after-hours emergency. You are equally determined to ensure that it does not happen often.
Responsibilities:
- Support of production infrastructure and services, including our:
  - AWS infrastructure such as EC2, S3, IAM, Route 53, ElastiCache, ALBs, CloudWatch, CloudFormation, etc.
  - Java and Rails applications
  - Docker and Kubernetes
  - PostgreSQL and Redis databases
- Provide leadership to the team in mastering technologies, identifying and implementing worthwhile new technologies, and improving our processes.
- Continuous delivery (CI/CD) using Jenkins, Maven, Artifactory, Docker, Chef/Ansible, Git.
- Site reliability and availability, including end-to-end performance, service monitoring, alerting, capacity sizing and planning.
- 24/7 on-call rotation for production support, troubleshooting production and development issues.
- Business continuity planning & testing.
Skills and Experience:
- You must have:
  - Minimum 5 years of DevOps and system administration experience, preferably in mid- or late-stage startups.
  - At least 3 years managing AWS cloud infrastructure.
  - Experience configuring and supporting SaaS environments, provisioning resources, monitoring utilization and making adjustments in accordance with SOPs.
  - Expertise in Docker and Kubernetes.
  - Hands-on experience with monitoring/APM tools such as New Relic, CloudWatch, Papertrail and Rollbar.
  - Strong Linux administration (e.g. Ubuntu) and scripting (e.g. shell script, Python) skills.
  - Soft skills, e.g. team player, clear and concise communication, problem solving, sense of humor.
- Desired skills, but not mandatory:
  - Expertise in database scalability, availability and performance tuning, preferably with PostgreSQL and Redis.
  - Building hybrid cloud using VMware, Cloud Foundry.
  - Experience in building and supporting PaaS (e.g. Heroku, RedisGreen, Deis).
  - Data protection and secret handling technology such as Vault.
  - Logging (e.g. Splunk, Graylog) and Elasticsearch (ELK).
  - Experience building microservices and real-time event processing is a big plus.
CrowdFlower offers an attractive total compensation package including outstanding benefits and stock options. Learn more about our culture at http://www.crowdflower.com/careers/.