Site Reliability Engineer

Engineering – Infrastructure | San Francisco, CA | Full Time

Job Description

imgix is building the future of visual media on the Internet. imgix operates the premier solution to deliver impactful, engaging, highly responsive and super fast imagery to eyeballs around the world. The service consists of a top tier image delivery platform tightly coupled with imgix's proprietary, on-demand image processing pipeline. It provides customers with great design flexibility while reducing the engineering investment required to serve state-of-the-art visual media. imgix enables our customers to greatly increase the value of their imagery and get back to building awesome things.

We're looking for a Site Reliability Engineer to join our team. Your mission is to help scale production traffic of tens of thousands of requests per second while maintaining a success rate of four nines (99.99%) or better. This will entail setting up monitors for production systems and their performance, hardening distributed services, and gathering intelligence about traffic anomalies. You will also take active steps to shift and direct traffic.
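To give a sense of what that target means in practice, here is a rough sketch of the error-budget arithmetic. The figure of 20,000 requests per second is only an illustrative assumption standing in for "tens of thousands", and the function name `error_budget` is hypothetical, not part of any imgix tooling.

```python
def error_budget(requests_per_second: float, success_rate: float) -> dict:
    """Return the number of failed requests a success-rate target allows."""
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    # Fraction of requests that may fail while still meeting the target.
    failure_fraction = 1.0 - success_rate
    per_second = requests_per_second * failure_fraction
    seconds_per_day = 24 * 60 * 60
    return {
        "per_second": per_second,
        "per_day": per_second * seconds_per_day,
    }


# Assumed traffic level (illustrative only) at a four-nines target.
budget = error_budget(20_000, 0.9999)
print(f"allowed failures: {budget['per_second']:.2f}/s, {budget['per_day']:.0f}/day")
```

Under that assumed load, four nines leaves room for about 2 failed requests per second, or roughly 172,800 per day, which is why alert tuning and early anomaly detection feature so heavily in the role.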

What you'll be doing:

  • Oversee general health of the entire image rendering stack from CDN to customer-facing network proxies
  • Oversee the health and liveness of the logging pipeline
  • Author dashboards in Grafana
  • Monitor services for performance regressions, diagnose them, and work with the infrastructure team to apply necessary fixes
  • Tune alerts to predict looming failures in the stack
  • Develop and maintain workflows for repairing ailing or failed physical machinery
  • Maintain and improve the OS installer for OS X and Ubuntu
  • Manage external DNS across multiple third-party providers
  • Push proxy policy changes to OpenResty, HAProxy, and internally authored proxies
  • Identify missing performance measurements and either make code changes to add them or work with the team to do so
  • Continue efforts to automate both new and existing service deployments (mostly with Ansible)
  • Design entirely new services with the infrastructure team
  • Author packages (Ubuntu and Nix)
  • Identify and perform work to remove single points of failure

What we're looking for:

  • 3+ years of relevant work experience
  • Linux and OS X (macOS) systems administration
  • Scripting experience including Bash, Python, Lua
  • Familiarity with web environments including HTTP, SSL, and DNS
  • Solid grasp of network fundamentals: DHCP, ARP, subnetting, routing, firewall
  • A hunger to dive into network servers and services
  • Experience with the Linux kernel and packaging
  • Performance analysis and debugging with tools like perf, sar, strace, dtrace
  • Load balancing and reverse proxy technologies such as nginx
  • Three or more years of experience in Python
  • Time series databases (OpenTSDB, Graphite)
  • Internetworking and BGP
  • Experience with network programming in C, C++ or Go
  • Experience with rapid release engineering
  • Experience working in a 24/7/365 service environment
  • Networking or routing experience

Our Stack:

  • HAProxy
  • OpenResty (nginx)
  • Consul
  • Ansible
  • Grafana
  • Prometheus
  • BigQuery
  • Fastly
  • Python
  • Go
  • Lua
  • Levee
  • Nix packaging

imgix is located in downtown San Francisco, pretty close to BART, Caltrain and a really good sandwich place. Employee benefits are comprehensive (401(k), medical, dental and vision), perks are generous (catered lunches, paid rides home, the occasional team outing), vacation time is flexible and salaries are commensurate with experience. We also provide employees with anything (reasonably) necessary to be effective in their work: funky keyboards, standing desks, a desk cactus and maybe even a laptop. The troposphere is the limit.

If you'd like to help us build the future of image serving on the Internet, submit your resume/CV and cover letter for consideration. Principals only please. We look forward to hearing from you!