CA locals--Cloud Infrastructure Engineer
- Kairos
- Newark, California
- Full Time

Job Description
Greetings,
Role: Cloud Infrastructure Engineer
Location: Newark, CA (3 days onsite at the client office)
Duration: Long Term
Experience: 11+ years.
Typical Day in the Role
Purpose of the Team: This is the Cloud Infrastructure Team, which manages the AWS Cloud, OCI Cloud, and all critical applications.
Key projects: This role will contribute to OpenVPN upgrades, AMQX upgrades, etc.
Typical task breakdown and operating rhythm: The role consists of the responsibilities listed in this job description.
Compelling Story & Candidate Value Proposition
What makes this role interesting? - This role offers the opportunity to move into an FTE position. It also offers the chance to learn more about the electric vehicle industry and the engineering process.
Candidate Requirements
Years of Experience Required: 7-8 overall years of experience in the field.
Degrees or certifications required: AWS Cloud Certification or OCI Certification
Disqualifiers: N/A
Best vs. Average: N/A
Performance Indicators: Performance will be assessed based on meeting deadlines and quality of work.
Top 3 Hard Skills Required + Years of Experience
1. Minimum 7 years of experience with Cloud architecture or engineering
2. Minimum 7 years of experience with DevOps
3. Minimum 7 years of experience with Kubernetes
Hard Skills Assessments
Expected Dates that Hard Skills Assessments will be scheduled: ASAP
Hard Skills Assessment Process: The assessment process will include 2 rounds with the sponsor.
Required Candidate Preparation: Be prepared to work through troubleshooting an issue.
You will:
Reliability Engineering: Own and enhance the reliability of services deployed across various cloud regions. You will proactively monitor, automate, and scale services to ensure seamless uptime and performance.
Containerization & Microservices Deployment: Lead the containerization and deployment of microservices and data pipelines on Kubernetes, using Helm charts, ensuring best practices for scalability and fault tolerance.
DevOps Advocacy: Foster and advocate for a DevOps culture that emphasizes automation, self-service, and engineering excellence. Enable development teams to manage and deploy applications seamlessly with minimal intervention.
Performance Monitoring & Autoscaling: Implement autoscaling strategies and monitor the performance of applications and infrastructure with tools like Prometheus, Grafana, and other observability platforms.
Site Reliability Engineering (SRE): Perform SRE tasks such as availability monitoring, incident response, post-mortem analysis, and preparing reliability reports for leadership and stakeholders.
Tool Deployment & Maintenance: Deploy, configure, and maintain essential cloud services and tools including Kafka, Spark, Presto, Airflow, MQTT, and other microservices platforms in a cloud-native environment.
Infrastructure as Code (IaC): Set up and manage cloud infrastructure using tools like Terraform, Cluster API, and other IaC frameworks, ensuring seamless provisioning, management, and scaling of resources.
Automated Alerts & Recovery: Continuously enhance and automate alerting, incident detection, and recovery mechanisms for critical applications and services to minimize downtime and improve system reliability.
On-Call Rotation: Participate in an on-call rotation to meet business SLAs, quickly troubleshoot and resolve issues, and document runbooks for consistent incident management processes.
Agile Collaboration: Work closely with Product Owners, Engineering Managers, and cross-functional teams in Agile Scrum and Kanban workflows to deliver iterative improvements and meet evolving business needs.
Impact Analysis & Incident Management: Perform impact analysis during incidents, collaborate with teams for root cause analysis, and implement preventive measures to avoid recurrence.
You Bring:
B.S. or M.S. degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
8+ years in Site Reliability Engineering (SRE), DevOps Engineering, or related fields.
4+ years of hands-on experience deploying, managing, and optimizing containerized applications using Docker and Kubernetes in both public and private cloud environments (AWS, Google Cloud Platform, Azure, etc.).
4+ years in Infrastructure-as-Code (IaC) using Terraform, Cluster API, or similar automation frameworks to manage cloud infrastructure.
Experience in scripting or programming with Python, Go, Bash/Shell, or similar languages.
Strong understanding of Prometheus, Grafana, and other monitoring and observability tools.
Ability to effectively diagnose and resolve performance bottlenecks within AWS at the infrastructure and application layers.
Configuration Management: Experience with configuration man
Thanks & Regards,
K Hemanth Kumar | Sr IT Technical Recruiter | Kairos Technologies Inc