Site Reliability Engineer (SRE)

McKinsol Consulting Inc
Jersey City, New Jersey
Full Time

Job Title: Site Reliability Engineer (SRE)

Experience: 12 - 15+ Years

Strong hands-on experience with Grafana (Must Have)
Experience in Incident Management (Must Have)

Job Summary:

We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in monitoring, incident management, and cloud-native environments. The ideal candidate should have hands-on expertise in Grafana, Kubernetes, and AppDynamics, along with a solid understanding of production support and basic Java knowledge.

Key Responsibilities:

Monitor system performance, availability, and reliability using tools like Grafana and AppDynamics
Manage and respond to production incidents, ensuring quick resolution and minimal downtime
Implement and improve incident management processes, including Root Cause Analysis (RCA)
Work with development and DevOps teams to ensure system reliability and scalability
Deploy, manage, and troubleshoot containerized applications in Kubernetes environments
Automate operational tasks and improve system efficiency
Analyze logs, metrics, and traces to identify and resolve performance bottlenecks
Participate in on-call rotations and support critical production systems

Required Skills: