Lead Site Reliability Engineer

WithMe | Redwood City, CA | Full Time

Job Description

As a Lead Site Reliability Engineer for our Backend Services team, you will be a champion for service quality and uptime. You will support and mentor the software development, CI/CD, and SRE team members in safely deploying, maintaining, and monitoring services. You'll extend, support, and improve the operations infrastructure. In all you do, your goal will be a maintainable, automated, and highly available environment, impervious to bugs and human mistakes.
Your solutions must work in an AWS / Kubernetes / Istio environment and will cover every aspect of a system: provisioning, deployment, monitoring, failover, documentation, and recovery.
Whether you own a project from start to finish or work in concert with team members and other parts of engineering, you will lead and think holistically. You will decide how we move forward in a sustainable way, minimizing any adverse effects on production as a whole.

Responsibilities

  • ensuring the availability and service quality of all of our environments,
  • safeguarding the service from external threats and honest-to-goodness internal mistakes,
  • building and scaling infrastructure,
  • keeping abreast of industry standards and technology and figuring out if we can benefit,
  • developing and improving internal processes and leading by example,
  • mentoring less senior team members,
  • enabling product and service engineering teams to develop software designed to be operated smoothly and consistently at scale and to adhere to reliability and security standards,
  • making operations more efficient through seamless maintenance and automated responses to key performance metrics, logs, and alerts,
  • troubleshooting and resolving issues affecting production, such as performance bottlenecks, buggy software and error-prone processes,
  • serving periodic daytime (PST/PDT) on-call duty, including weekends, as the first responder for any production problem,
  • conducting post-mortems to analyze and prevent repeat failures.

Requirements

  • Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience,
  • experience with AWS,
  • experience with Kubernetes,
  • experience running production systems at scale,
  • experience with build automation and continuous integration/delivery ecosystem,
  • 10+ years of experience in computing, distributed systems, storage, or networking,
  • proficiency programming in at least one of Bash, Python, Go, Perl, Java, or JavaScript,
  • experience with Docker,
  • experience with infrastructure configuration management and automation processes and tools,
  • experience architecting, developing, and troubleshooting large scale systems,
  • experience with Unix/Linux systems internals (e.g., filesystems, system calls) and administration,
  • experience with performance management, logging, and monitoring tools,
  • experience with common open source technologies at scale, such as MongoDB, nginx, Redis, and Elasticsearch,
  • a systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive,
  • on-call mindset.

Preferred Qualifications

  • experience with the Istio service mesh and associated tools (Jaeger, Prometheus, Envoy), or just Envoy Proxy,
  • experience with SQL and NoSQL database administration, replication, backups, query tuning, and schema design for performance optimization,
  • in particular, experience with databases and services such as MongoDB, MongoDB Atlas, Redis, RDS, Elasticsearch, and other popular AWS services,
  • experience with intrusion detection, penetration testing, vulnerability scanning, and PCI compliance,
  • experience with capacity planning,
  • experience with disaster recovery,
  • experience with Atlassian Jira, Confluence, OpsGenie, Jenkins, HashiCorp Vault, and Bitbucket Pipelines.


IMVU is a global 3D avatar-based social community of users who come together to play, interact, and make friends. Going far beyond traditional social media, IMVU lets users customize an avatar to express themselves and interact with other avatars, creating a greater sense of social presence: the feeling of being with someone else as if you’re physically there.

Today, over 5 million IMVU users every month enjoy the freedom to live the life they create through highly stylized avatars, interacting with friends in immersive chat rooms, shopping for new looks, and sharing their experiences. IMVU is an expressive and collaborative world that includes over 50,000 creators making real money using design tools to create looks and rooms to sell in the IMVU Store.

To continue our success, we are building an even better way for friends to spend time together, in 3D, first on mobile and later in VR. The new platform will support advanced real-time play and interaction of all kinds, and a marketplace for creators (UGC) of realistic virtual goods, spaces, games and social experiences. It’s the most ambitious technical stack for a 3D virtual world ever conceived for Mobile, enabled by the latest advances in consumer mobile devices. Come help us build the future of Social.


  • Please try out our core product before you apply. We’d like you to understand our products and have some understanding of our customers.

  • Please include a cover letter. Make sure to discuss why you are interested in learning more about IMVU. Job applications without cover letters will not be considered.

IMVU is an equal opportunity employer; applicants are considered for all roles without regard to race, color, religious creed, sex, national origin, citizenship status, age, physical or mental disability, sexual orientation, marital, parental, veteran or military status, unfavorable military discharge, or any other status protected by applicable federal, state or local law.