Senior Site Reliability Engineer

WithMe | Redwood City, CA | Full Time

Job Description

As a Senior Site Reliability Engineer for our Backend Services team, you will champion service quality and uptime. You will support the software development, CI/CD, and SRE team members in safely deploying, maintaining, and monitoring services. You'll extend, support, and improve the operations infrastructure. In all you do, your goal will be a maintainable, automated, and highly available environment that is resilient to bugs and human mistakes.

Your solutions must work in an AWS / Kubernetes environment and will cover every aspect of a system: provisioning, deployment, monitoring, failover, documentation, and recovery.

Whether you own a project from start to finish or work in concert with team members and other parts of engineering, you will lead and think holistically. You will decide how we move forward in a sustainable way, minimizing any adverse effects on production as a whole.

RESPONSIBILITIES

  • ensuring the availability and service quality of all of our environments,

  • safeguarding the service from external threats and internal honest-to-goodness mistakes,

  • building and scaling infrastructure,

  • keeping abreast of industry standards and technology and figuring out if we can benefit,

  • developing and improving internal processes and leading by example,

  • mentoring less senior team members,

  • enabling product and service engineering teams to develop software designed to be operated smoothly and consistently at scale and to adhere to reliability and security standards,

  • making operations more efficient through seamless maintenance and automated responses to key performance metrics, logs, and alerts,

  • troubleshooting and resolving issues affecting production, such as performance bottlenecks, buggy software and error-prone processes,

  • periodic daytime (Pacific Time) on-call duty, including weekends, where you are the first responder for any production problem,

  • conducting post-mortems to analyze and prevent repeat failures.

QUALIFICATIONS

  • Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience,

  • Experience with AWS,

  • Experience with Kubernetes,

  • Experience running production systems at scale,

  • Experience with build automation and continuous integration/delivery ecosystem,

  • 5+ years of experience in computing, distributed systems, storage, or networking,

  • Proficiency in programming in at least one of Bash, Python, Go, Perl, Java, or JavaScript,

  • Experience with Docker

  • Experience with infrastructure configuration management and automation processes and tools,

  • Experience architecting, developing, and troubleshooting large scale systems,

  • Experience with Unix/Linux systems internals (e.g., filesystems, system calls) and administration,

  • Experience with performance management, logging, and monitoring tools,

  • Experience with common open source technologies at scale, such as MongoDB, nginx, Redis, and Elasticsearch,

  • A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive,

  • Service owner mindset.

BONUS POINTS

  • Experience with SQL and NoSQL database administration, replication, backups, query tuning, and schema design for performance optimization

  • In particular, experience with databases and services such as MongoDB, MongoDB Atlas, Redis, RDS, Elasticsearch, and other popular AWS services

  • Experience with intrusion, penetration, vulnerability scanning, and PCI compliance

  • Experience with capacity planning

  • Experience with disaster recovery

  • Experience with Atlassian Jira, Confluence, Opsgenie, Jenkins, HashiCorp Vault, and Bitbucket Pipelines

 

ABOUT US

WithMe is a new mobile platform that empowers social connection through shared experiences in virtual spaces. Just like in real life, friends are able to get together to play games, watch videos, hang out and create.


Based in Silicon Valley, Together Labs (formerly IMVU Inc.) is the parent company to WithMe and a leading technology company dedicated to empowering people to connect, create and earn in virtual worlds. Together Labs’ products represent the changing landscape of social interaction and redefine human connection. 


With IMVU, the world's largest avatar social platform and a top-5 grossing app in the App Store, millions of users can customize their characters through a growing catalog of 50 million community-created items and explore more than 400,000 destinations to connect with each other.


Setting the standard for the use of digital currencies in the metaverse, VCOIN unlocks the full potential of virtual economies and solves the unmet need for easy and secure global peer-to-peer payment. 


Currently in development, WithMe is a new mobile platform that empowers social connection through shared experiences in virtual spaces.


Founded in 2004 and based in the heart of Silicon Valley, Together Labs is led by a team that's dedicated to pioneering the virtual reality industry. Together Labs is backed by venture investors Menlo Ventures, Allegis Capital, Bridgescale Partners, and Best Buy Capital.


IMVU has been recognized frequently as Best Place to Work in Silicon Valley. 


Want to see more? Visit www.withme.com