at Cognizant in Boise, Idaho, United States
Job Description
Role: Production Support Engineer
Location: Candidates must be based in the US and can work remotely from anywhere in the US.
Please note: this role is not able to offer visa transfer or sponsorship, now or in the future.
- Applications will be accepted until June 5, 2026
Role Overview:
Production Monitoring & Incident Management Engineer with 8-10 years of overall experience
Key Responsibilities
1. Production Monitoring & Incident Management
+ Monitor applications, infrastructure, and batch processes in production environments
+ Handle P1/P2 incidents and ensure rapid resolution within SLA timelines
+ Perform initial troubleshooting, triage, and impact analysis
+ Coordinate with offshore L2/L3 teams for issue resolution
+ Ensure minimal business impact during incidents
Internal reference highlights:
+ Monitoring applications and troubleshooting production issues
+ SLA-driven incident handling and resolution
2. Stakeholder Communication (Critical Onshore Responsibility)
+ Act as the primary contact for client/business users onsite
+ Provide timely updates on incidents, outages, and resolutions
+ Participate in status calls, bridge calls, and escalation meetings
+ Manage expectations during critical outages
3. Incident, Problem & Change Management (ITIL)
+ Manage ServiceNow (or similar) tickets for incidents, service requests, and changes
+ Perform Root Cause Analysis (RCA) for recurring issues
+ Participate in Problem Management and Change Advisory Board (CAB) processes
Supported by internal content:
+ Experience with Incident, Change, and Problem Management processes
+ Handling ServiceNow tickets and ITIL-based support work
4. Application / System Support
+ Provide end-to-end support across environments (SIT/UAT/PROD)
+ Analyze logs, database queries, and system metrics for issue resolution
+ Support deployments and validate production releases
Example responsibilities:
+ Troubleshooting, defect fixes, and supporting multiple environments
+ Providing post-production support and incident resolution
5. Coordination with Offshore Teams
+ Drive daily syncs with offshore support teams
+ Assign and track tickets across L1/L2/L3
+ Ensure the efficiency of the follow-the-sun or 24/7 support model
6. Preventive & Proactive Activities
+ Identify recurring issues and propose permanent fixes
+ Automate monitoring and alerts where possible
+ Improve system reliability and reduce i