AI-Ops Engineer
Artificial Intelligence | Hybrid in Stanford, CA | Full Time, Contract, and Temporary | $60.00 per hour
Job Description
AI-Ops Engineer 1464463
- Hourly pay: $60/hr
- Worksite: Leading university (Stanford, CA 94305 - Hybrid, must be on campus 2–3 days)
- W2 Employment, Group Medical, Dental, Vision, Life, Retirement Savings Program, PSL
- 40 hours/week, 12 Month Assignment, Possible extension or conversion
A leading university seeks an AI-Ops Engineer. The successful candidate will be responsible for evolving traditional DevOps into AI-Ops at the Engineering Center. This role leverages AI and machine learning to automate and enhance IT operations. The company offers a family-oriented culture and environment!
AI-Ops Engineer Responsibilities:
- AI-Driven Operations & Automation:
- Implement AIOps solutions that use ML algorithms to automate performance monitoring, workload scheduling, and infrastructure management.
- Build anomaly detection systems that identify infrastructure issues before they impact users.
- Create predictive maintenance workflows that analyze historical patterns to proactively mitigate issues.
- Observability & Intelligent Monitoring:
- Architect comprehensive observability platforms that aggregate data from disparate sources into unified dashboards.
- Implement intelligent alerting systems using NLP and ML to reduce alert fatigue and surface actionable insights.
- Deploy application performance monitoring (APM) solutions integrated with AI-driven analytics. Ensure end-to-end visibility across cloud infrastructure, applications, and AI/ML workloads.
- Cloud Infrastructure & DevOps:
- Design, build, and maintain scalable, secure AWS infrastructure using Infrastructure as Code (CloudFormation, Terraform, or CDK).
- Implement and manage containerized environments using Docker, AWS ECS, Fargate, and Kubernetes (EKS).
- Build CI/CD pipelines for continuous delivery, integrating AI-powered code quality and deployment optimization.
- Collaboration & Continuous Improvement:
- Partner with cross-functional teams to implement domain-agnostic AIOps solutions across the organization.
- Use Git-based version control and code review best practices as part of a collaborative, agile workflow.
- Document operational procedures, runbooks, and AIOps workflows for team knowledge sharing.
- Occasional on-call responsibilities for critical infrastructure.
AI-Ops Engineer Qualifications:
- 3+ years of experience in DevOps, SRE, or Cloud Engineering roles.
- 2+ years of hands-on experience with AWS infrastructure (EC2, ECS, Lambda, S3, IAM, VPC).
- Experience implementing monitoring, observability, and alerting solutions at scale.
- Bachelor's degree in Computer Science, DevOps, Cloud Engineering, or a related field (Master's preferred).
- AWS certification preferred (Solutions Architect, SysOps Administrator, or DevOps Engineer); Professional-level certification a plus.
- Familiarity with ML/AI concepts and their application to operational automation.
- Languages: Python (required); Bash, Go, or TypeScript preferred.
- AIOps & Monitoring: CloudWatch, X-Ray, Prometheus, Grafana, Datadog, or Splunk with ML capabilities.
- Infrastructure as Code: AWS CloudFormation, Terraform, or AWS CDK.
- Containers & Orchestration: Docker, AWS ECS/Fargate, Kubernetes (EKS).
- AWS Services: Lambda, EC2, S3, API Gateway, EventBridge, CloudWatch, IAM, VPC, CodePipeline, SageMaker.
- CI/CD Tools: GitHub Actions, AWS CodePipeline, Jenkins, or GitLab CI.
- Data & Analytics: Experience with log aggregation, metrics analysis, and event correlation platforms.
- Strong understanding of AIOps principles (using AI to enhance, not just support, IT operations) is preferred.
- Passion for automation and eliminating manual, repetitive operational tasks is preferred.
- Excellent problem-solving, debugging, and root cause analysis skills are preferred.
- Demonstrated ability to learn rapidly, adapt to new technologies, and continuously improve is preferred.
- Strong communication skills with the ability to collaborate across technical and non-technical teams are preferred.
- Commitment to reliability, security, and operational excellence is preferred.
- Ability to thrive in a fast-paced, evolving environment and proactively seek opportunities to embed intelligence into systems and processes is preferred.
Shift:
- Monday to Friday, 9 am to 6 pm.
