The client is a reinsurance firm that acquires existing books of policies and manages claims associated with them. The data science team builds predictive analytics applications to:
Enable their legal teams to detect fraudulent claims
Predict the claim settlement amount
Receive proactive alerts for new high-risk claims and large financial increases to existing claims
Business Challenge
With no single source of truth, the input data was difficult to work with: it consisted of large documents containing images, in non-standard formats. The organization lacked a formal data warehouse and faced data credibility and availability issues. Data extraction was a further challenge, because large amounts of information, in the form of thousands of documents, were either sent by email or downloaded from an external website. As a result, the data science team spent many hours manually opening documents in specialized legal document viewers to extract text for their predictive analytics applications.
Technology Components
Azure Data Lake, Data Factory, Databricks, Data Catalog, DevOps, Python, R-Shiny, Docker, Spark NLP, DataRobot Prediction server, Word2vec neural network and Jira
Approach
As a strategic partner, Trianz provided the client with specialized analytics program management and project teams comprising cloud architects, data scientists, data architects and engineers, both onsite and offshore.
Strategy/Roadmap
An Azure foundation and platform for all data science applications and data
Data science application enhancements to include entity recognition, natural language processing, customized models and robotics automation
Development of a cloud-based operational data mart, with production-strength data models and processes to support executive reporting and data monetization
Execution
An efficient agile development process customized to support a pipeline from analytics innovation to regulated production deployment
Transformational Effects
Robotics automation processes will save the data science team many hours, which they can instead spend on advanced predictive analytics
Enhancements to scoring models will improve forecasting accuracy
Entity explorer graphs will help identify and relate people, places and symptoms, supporting fraud detection
Natural language processing with part-of-speech tagging is being implemented to proactively alert the legal department to large increases in claim financials
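
To make the alerting idea concrete, here is a minimal Python sketch of the final step only: flagging claims whose financials have risen sharply. It does not reproduce the client's natural language processing pipeline. It assumes the previous and current amounts have already been extracted, and the names `ClaimUpdate` and `flag_large_increases`, along with both thresholds, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClaimUpdate:
    """Hypothetical record of one claim's financials before and after an update."""
    claim_id: str
    previous_amount: float
    current_amount: float


def flag_large_increases(updates, min_ratio=0.5, min_delta=10_000.0):
    """Return IDs of claims that grew by at least min_delta and min_ratio.

    Both thresholds are illustrative assumptions, not values from the project.
    """
    flagged = []
    for u in updates:
        delta = u.current_amount - u.previous_amount
        if delta < min_delta:
            continue  # absolute increase too small to matter
        # With no previous amount, any qualifying delta counts as a large increase.
        if u.previous_amount <= 0 or delta / u.previous_amount >= min_ratio:
            flagged.append(u.claim_id)
    return flagged


updates = [
    ClaimUpdate("C-001", 20_000.0, 22_000.0),  # small increase
    ClaimUpdate("C-002", 40_000.0, 90_000.0),  # more than doubles
    ClaimUpdate("C-003", 0.0, 15_000.0),       # new financials on an existing claim
]
print(flag_large_increases(updates))
```

Requiring both an absolute and a relative threshold keeps small claims with large percentage swings from generating alerts; in practice the values would be tuned with the legal department.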