CENTRALIZED EVENT MANAGEMENT, SELF-HEALING AND AUTO-REMEDIATION OF INCIDENTS

 


 


Enterprise monitoring and managed services for a large healthcare delivery organization

One of the three largest integrated managed care consortiums in the US, with over 7,000 beds, 209,000 employees, 21,250 physicians, and 39 medical centers across the country. The enterprise currently runs 30,00 servers with an IT staff of 700 FTEs.

FUTURE STATE VISION

A unified infrastructure monitoring environment that integrates multiple sources of event information and provides self-healing of systems and auto-remediation of incidents. Managed services include monitoring tools, analysis and identification of the root cause of incidents, and implementation of security recommendations.

THE BUSINESS CHALLENGE

  • Manual processes and the growing scale of the environment were overloading IT staff, resulting in very long resolution times for critical incidents (approximately 3 days) and a growing backlog (~1000) of work requests

  • Poor integration between multiple tools and siloed repositories of information (CMDB, incident logs, ITOM) made it difficult to correctly diagnose and resolve incidents the first time, resulting in significant rework effort and longer times to close incidents

  • IT staff were consumed by reactive tasks, leaving no time for process improvement

BUSINESS OBJECTIVES

  • Increase customer satisfaction by reducing manual operation errors and ensuring that incidents are resolved within the committed SLA

  • Increase visibility into and reporting on SLA metrics, and eliminate the aged backlog of work

  • Reduce human error related to manual tasks and improve efficiency through targeted automation and integrations.

KEY DELIVERABLES

  • Integrated existing monitoring and incident management tools (IBM and Netcool monitoring, BMC Remedy/ServiceNow) into a centralized event management system

  • Implemented auto-remediation of events using IPSoft, and self-healing using Netcool and IP-Center AI integrated with ServiceNow.

  • Developed correlation rules within Netcool to de-duplicate events

  • Validated automation logic and created SOP documents for IPSoft to support continuous development of automation scripts.
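
The de-duplication deliverable above can be illustrated with a minimal Python sketch. It is not the client's Netcool rule set; it only shows the general idea of collapsing repeated events that share a key into one record with an occurrence count. The `Event` fields, the `identifier` key format, and the sample node names are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical event fields, chosen for illustration only.
    node: str          # host that raised the event
    alert_key: str     # what went wrong, e.g. "disk_full"
    severity: int      # higher means more severe
    timestamp: float   # seconds since some epoch
    tally: int = 1     # how many raw events this record represents


def identifier(event: Event) -> str:
    # Assumption: two events are duplicates when node and alert key match.
    return f"{event.node}:{event.alert_key}"


def deduplicate(events: list[Event]) -> list[Event]:
    """Collapse events with the same identifier into a single record."""
    active: dict[str, Event] = {}
    for ev in events:
        key = identifier(ev)
        existing = active.get(key)
        if existing is None:
            # First occurrence: store a fresh copy with a tally of 1.
            active[key] = Event(ev.node, ev.alert_key, ev.severity, ev.timestamp)
        else:
            # Repeat occurrence: count it, keep the latest time and worst severity.
            existing.tally += 1
            existing.timestamp = max(existing.timestamp, ev.timestamp)
            existing.severity = max(existing.severity, ev.severity)
    return list(active.values())


raw = [
    Event("db01", "disk_full", 3, 100.0),
    Event("db01", "disk_full", 5, 160.0),
    Event("web02", "cpu_high", 4, 120.0),
]
deduped = deduplicate(raw)
for ev in deduped:
    print(ev.node, ev.alert_key, ev.severity, ev.tally)
```

Collapsing repeats into one record with a tally keeps the operator view short while preserving how often the condition fired, which is the information a downstream auto-remediation step would typically need.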

THE APPROACH

  • Conducted workshops with business and IT stakeholders to identify people, process, and tool issues and the desired outcomes

  • Created a future state architecture and end-state visualization for an integrated solution

  • Staffed a blended team of client and Trianz onshore/offshore resources to reduce backlogged requests

  • Managed monitoring infrastructure as a service

BUSINESS EFFECTS REALIZED

  • Decreased incident remediation time and ensured adherence to all committed business SLAs

  • Reduced the backlog of work requests by 91% over a period of 5 months.

  • Estimated improvement of 20% in staff efficiency