Site Reliability Engineer
Technology | Santa Monica, CA | Full Time
At Edmunds, we’re driven to make car buying easier. Ever since we began publishing printed car guides in the 1960s, the company has been in the business of trust, innovating ways to empower and support car shoppers. When Edmunds launched the car industry’s first Internet site in 1994, we established a leadership position online and have never looked back. Now, as one of the most trusted review sites on the Internet, we help millions of visitors every month use our research, shopping, and buying tools to make an easy and informed decision on their next car. For consumers, we bring peace of mind. For dealers, we make tools to help them solve their problems and sell more cars. How do we do it, you ask? The key ingredients are our enthusiastic employees, progressive company culture, and cutting-edge technology. Want to join the team? Read on to find out how!
What You’re Applying For:
Build it. Test it. Measure it. Monitor it. Improve it. An Edmunds Site Reliability Engineer (SRE) is involved in all aspects of product development, bridging the gap between the operations team and the software development team by focusing on system performance, latency, efficiency, and capacity planning. We will spearhead building resiliency and observability into our own systems as well as interdependent systems. Ultimately, our goal is to help build systems that other people can understand and maintain long after we’re gone.
What You’ll Do:
- Ensure that the infrastructure our applications run on is highly available, performant, and error-free for our customers.
- Work with product managers and software engineers to build high-quality artifacts that emphasize code visibility (metric-driven data), performance, security, cloud-first architecture, and developer-owned infrastructure.
- Create, manage, monitor, and improve highly scalable distributed systems that deliver mission-critical services in Java and Node.js.
- Solve problems relating to our services and build automation to prevent their recurrence, with the goal of automating the response to all non-exceptional service conditions (measure and monitor).
- Participate in service capacity planning and demand forecasting, software performance analysis, and system tuning for developing scalable products.
- Develop effective documentation, tooling, and alerts to both identify and address reliability risks.
What You Need:
- BS degree in Computer Science or a related technical field, or equivalent practical experience.
- Understanding of web application development from front-end through to data storage using appropriate technologies already familiar to you.
- Software development experience using Java or another relevant programming language.
- Scripting skills in Python, Perl, Shell or another common language.
- Experience in architecting cloud-based solutions on AWS.
- Experience working with container technologies such as Docker.
- Experience with algorithms, data structures, complexity analysis and software design.
- Working knowledge of Linux/Unix operating systems and networking.
- Ninja of application performance tuning and troubleshooting.
- A passion to never stop learning - our stack is constantly evolving.
Working @ Edmunds.com:
Employees think it’s a pretty great place to work, and some pretty impressive publications think so too: we have been recognized as one of the best places to work by Fortune Magazine and Great Place to Work, LA Business Journal (for the last 6 years!), Computerworld, and Built in LA. We’ve also been identified as one of the best workplaces specifically in Technology and also for Diversity and Asian Americans. In fact, our CEO, Avi Steinlauf, was rated as one of Glassdoor’s Highest Rated CEOs! If you’re interested in learning more and joining our mission, we’d love to hear from you!
Edmunds will consider for employment qualified candidates with criminal histories in a manner consistent with the requirements of all applicable laws.