Site Reliability Engineer
Systems & Infrastructure | Burnaby, British Columbia, Canada | Full Time
In 2004, Teradici set out to create the best virtual desktop and workstation experience in the world, and along the way we've enabled the most demanding use cases with requirements like top secret security, complex IT infrastructures, and intensive graphics performance. Our PCoIP technology fundamentally simplifies how computing is provisioned, managed and used.
With over 15 million endpoints deployed around the globe, we're no startup. Top government agencies, media conglomerates, production studios, financial firms, and design houses trust Teradici to support their need for secure, high-performance virtual desktops and workstations delivered from private data centers, public clouds, or any combination of both.
You will be joining a security-minded team that supports our internal systems and virtual desktops, a critical component in our company's operational success. The team is responsible for ensuring the secure and reliable service delivery of all internal tools and the systems on which they run. Every day, our automation tools test all our products across multiple public and private cloud infrastructures, spinning up and spinning down thousands of instances per week. These tools must rely on a solid foundation.
As a Site Reliability Engineer who focuses on creating automation and using CI/CD to embrace a DevSecOps culture, you are part of a team building secure, quality systems and architecting automation that will become the foundation that many teams and services rely on. You also believe in taking advantage of existing tools wherever they improve our capabilities and ability to deliver. Therefore, a commitment to collaborative problem solving, elegant and extensible design, and a quality product is imperative. You believe in failing fast, detecting fast, fixing fast, and deploying quickly.
To be successful, candidates must:
- Demonstrate a passion for learning, testing, automation, security, and quality.
- Exhibit curiosity and excellent attention to detail.
- Evangelize best practices for building and operating highly secure and reliable systems.
- Serve as subject matter expert in security, observability, and monitoring.
- Consult in system design to meet security, reliability, and capacity requirements.
- Automate infrastructure and configuration management.
- Conduct timely retrospectives of production infrastructure incidents.
- Assist with all aspects of operational security and compliance.
- Seek out potential threats to security and reliability and advocate solutions.
- Participate in an on-call rotation to receive escalations.
- Implement metrics and log analytics including key indicators, dashboards, and alerts.
- Build and improve operational run books.
- Know when to triage and when to dive into a root-cause analysis.
- Experience developing and monitoring mission-critical systems
- Knowledge of key SRE principles, including monitoring and alerting systems such as Elastic Stack, Sumo Logic, and others
- Experience in defining infrastructure as code for deployment (Terraform, Cloud Native tools), and configuration (Chef, Puppet, or Ansible)
- First-class analytical, diagnostic, and problem-solving skills
- Excellent verbal and written communication skills
- Experience creating automation to cover installation and configuration, functional testing, security testing, and performance testing
- Experience with software security and secure development lifecycles
- Experience with Git
- Please advise in your application whether you are eligible to work in Canada
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Software Engineering, or equivalent
- Experience using public cloud services (Azure, AWS, or GCP; AWS preferred)
- Experience with or knowledge of IT technologies such as LDAP, Active Directory, IP networking, DNS, or virtualization of networks or workstations
- Experience with containers and container orchestration tools (e.g., Kubernetes, Docker, AWS ECS)
- Experience with or knowledge of cloud-native DevOps practices is highly desirable
- Experience working with and configuring all major operating systems (Windows, Linux, macOS)
- Experience with security testing and compliance products (e.g., Synopsys Black Duck, Qualys, Nessus, Veracode)
- A knack for operational efficiencies and cost savings
- Automate everything. You actively automate as many manual tasks as possible so that they can be repeated reliably and allow us to scale.
- Self-managed teams. You hold yourself accountable for the full end-to-end lifecycle of what you are working on, from ensuring you build something that will deliver customer value to getting it into customers' hands.
- Developer collaboration. You seek feedback from your internal customers to ensure your work provides value and to help you iterate on that work.
- Tech-debt reduction. Software can live longer than you expect; therefore, you need to ensure it stays healthy and manage your technical debt accordingly.
- Collective ownership. You value contribution, wherever it comes from, and believe in peer review, continuous integration, test coverage and customer validation.
- Solving complex problems as simply as possible
- Drive to improve, whether it relates to a process, a tool, infrastructure, or general team knowledge, and look to assist in making the impossible possible.
- Be a strong team player who is not afraid to speak up, ask questions, and be heard.
- We offer a competitive base salary and an Employee Bonus Plan (based on company performance). Our health benefits and retirement savings contributions start right away – no waiting period! We also offer three weeks of vacation for the first year (accrued and increased annually, up to 20 days per year).
- We develop and nurture our employees to be their best and bring their authentic selves to every team interaction. We strive for a dynamic team environment that is transparent and allows everyone to contribute and be heard.
- The health and safety of our employees is our top priority. As a result of COVID-19, we have implemented a Work From Home initiative for all employees and we encourage our teams to stay connected virtually during this time.
- Once our office is open again, we are excited to offer:
- Monthly social events & activities
- Luxury shuttle service to and from the nearest SkyTrain station
- Underground and secured bike "cage"
- Fully equipped onsite gym, basketball, "beach" volleyball court, and weekly yoga classes
- Teradici supports remote work flexibility using our own Cloud Access Software!