11/25/2021
Vish Consulting Services is looking for a technical ITSM manager/SRE. Please review the details below to apply.
Position: Site Reliability Engineer
Location: 100% Remote
Duration: 12+ month contract
Location: Remote (Chicago, IL)
Key Areas of Responsibility:
This is a strategic and hands-on position where you will work closely with cross-functional teams to identify potential issues and provide innovative insights to optimize system performance, stability, and availability.
Guide cross-functional teams in managing and supporting their PagerDuty alerts, teams, schedules, escalation policies, and automations.
Automate alerting and remediation processes to reduce mean time to resolution (MTTR) and improve system uptime.
Monitor server, network infrastructure, and application performance metrics, and identify patterns and trends to improve system performance and reliability.
Troubleshoot issues and outages, working closely with development and operations teams to identify root causes and develop solutions.
Collaborate with cross-functional teams to support incident management, change management, and problem management processes.
Proactively detect and prevent future problems/incidents and initiate the Problem Management process to allow quicker diagnosis and resolution.
Develop trend analysis and prepare service improvement plans to address identified gaps.
Build strong relationships with key stakeholders, including senior management, department heads, and external partners, to ensure their support and engagement in incident management initiatives.
Foster a culture of continuous improvement, staying abreast of industry trends, emerging technologies, and best practices to enhance incident management capabilities.
Create dashboards and reports to provide insights into operational performance and health.
Build automation to optimize processes and workflows within our on-call systems and monitoring platforms.
Complete any assigned project work or tasks with a view to improving existing processes and capabilities, and seek out automation opportunities.
Support the on-call rotation and provide off-hours support as required.
Qualifications:
Minimum Qualifications:
Bachelor’s Degree in IT, Business Management or a related discipline preferred.
5+ years of direct experience working in the observability, operations, or DevOps domains.
Proficient in observability, monitoring, and logging tools such as PagerDuty, Datadog, Dynatrace, and Power BI.
3+ years of technical experience in systems engineering, SRE, DevOps, or software engineering.
Other Required Qualifications:
Excellent written and verbal communication skills with the ability to communicate effectively with all stakeholders including senior leadership.
Strong ability to understand, accurately translate and produce technical information for a general and business audience.
Strong experience with change, incident, and problem management principles, methodologies, and tools.
Experience using configuration and change tools such as ServiceNow Change and CMDB, or related tools.
Experience with project delivery methodologies (Agile, Scrum).
Hands-on experience with monitoring and performance monitoring tools such as Datadog, Dynatrace, and Splunk.
Preferred Qualifications:
ITIL v3 Foundation certification.
Certification in project management.
Experience implementing continuous process improvements within a configuration, change, release, or asset management program.
Cloud certifications (Azure, AWS, GCP).
Direct experience scripting in two of the following languages: Python, PowerShell, Bash.
Proficient in technical and business writing.