
Welcome to the AI-Actuarial-Use-Cases repository! This is a collaborative space that focuses on the exploration and documentation of use cases for Artificial Intelligence (AI) within the field of actuarial science.



IAA Taskforce Artificial Intelligence

AI Case Studies in Actuarial Science


| # | Date Added | Author | Title | Resource(s) | Type | Level | Primary Topics | Secondary Topics | Language(s) | Programming Language(s) | Methods and/or Models | AI Control Cycle | Notes | Abstract/Summary |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 2024-05-08 | DAV 🇩🇪 | Binary Classification: Credit Scoring | Description, Notebook | Case Study | 🟨🟨⬜ Advanced | Machine Learning, Classification | Explainable AI, Hyperparameter Tuning, GPU Usage | English | Python | CatBoost, XGBoost, LightGBM, Deep Learning, Logistic Regression, SHAP | (?) | Data derived from a Kaggle competition's real-world dataset; see the illustrative sketch below | This Jupyter Notebook offers a hands-on tutorial on binary classification using the Home Credit Default Risk dataset from Kaggle. Our focus is on predicting loan repayment difficulties, equipping actuaries with skills applicable to common insurance scenarios like churn prediction and fraud detection. Structured in three parts, the notebook progresses from simple to advanced modeling techniques: Part A sets a performance benchmark with an initial CatBoost model, a gradient boosting algorithm that requires minimal data preprocessing. Part B explores logistic regression, then delves into a brief exploratory data analysis, feature engineering, and model interpretability – all essential for making informed decisions. We cover data preprocessing, including encoding, scaling, and subsampling for imbalanced data, and investigate the impact on modeling. Part C is devoted to the optimization and practical application of machine learning models. It first addresses overfitting using the example of regularized logistic regression, as well as hyperparameter tuning in artificial neural networks and the gradient boosting methods CatBoost, LightGBM, and XGBoost. After a comprehensive model evaluation using validation and test data, we discuss application aspects in high-risk areas and conclude by summarizing the key insights we have learned. The appendix provides further information on CatBoost and GPU-accelerated training. |
| 2 | 2024-05-08 | SAV 🇨🇭 | SHAP for Actuaries: Explain Any Model | Article, Notebook | Educational | 🟨🟨⬜ Advanced | Explainable AI, Interpretable ML | Regression, Synthetic Data, Claims Prediction | English | Python, R | GLM, LightGBM, Deep Learning, SHAP | (?) | Data generation process and ground truth given; see the illustrative sketch below | This tutorial gives an overview of SHAP (SHapley Additive exPlanations), one of the most commonly used techniques for examining a black-box machine learning (ML) model. Besides providing the necessary game-theoretic background, we show how typical SHAP analyses are performed and used to gain insights about the model. The methods are illustrated on a simulated insurance data set of car claim frequencies using different ML models and different SHAP algorithms. |
| 3 | 2024-05-11 | Caesar Balona 🇿🇦 | Case Study 1: Parsing Claims Descriptions | Article, Code | Case Study | 🟨🟨⬜ Advanced | Large Language Models | Information Extraction, Parsing | English | Python | ChatGPT with GPT-4 | (?) | See the illustrative sketch below | In this case study, GPT-4 was employed to parse interactions with policyholders during the claims process to assess the sentiment of the engagement, the emotional state of the claimant, and inconsistencies in the claims information to aid downstream fraud investigations. It is important to emphasise that the LLM functions as an automation tool in this context and is not intended to supplant human claims handlers or serve as the ultimate arbiter in fraud detection or further engagements. Instead, it aims to support claims handlers by analysing the information provided by the claimant, summarising the engagement, and offering a set of indicators to inform subsequent work. |
| 4 | 2024-05-11 | Caesar Balona 🇿🇦 | Case Study 2: Identifying Emerging Risks | Article, Code | Case Study | 🟩⬜⬜ Beginner | Large Language Models | Text Generation | English | Python | ChatGPT with GPT-4 | (?) | | In this case study, GPT-4 is tasked with summarising a collection of news snippets to identify emerging cyber risks. The script conducts an automated custom Google Search for recent articles using a list of search terms. It extracts the metadata of the search results and employs GPT-4 to generate a detailed summary of the notable emerging cyber risks, themes, and trends identified. Subsequently, GPT-4 is requested to produce a list of action points based on the summary. Each action point is then input into GPT-4 again to generate project plans for fulfilling the action points. This case study and its associated code demonstrate, at a basic level, the ease with which LLMs can be integrated directly into actuarial and insurance work, including additional prompting against its own output to accomplish further tasks. |
| 5 | 2024-06-13 | Simon Hatzesberger 🇩🇪 | Model-Agnostic Explainability Methods for Regression Problems: A Case Study on Medical Costs Data | See folder 'Case Study #5' in this repository | Educational | 🟨🟨⬜ Advanced | Explainable AI | Machine Learning, Regression | English | Python | CatBoost, PDP, ALE, PFI, SHAP, LIME | (?) | See the illustrative sketch below | In this Jupyter notebook, we offer a comprehensive walkthrough for actuaries and data scientists on applying model-agnostic explainability methods to regression tasks, using a medical costs dataset as our case study. With the growing prevalence of modern black box machine learning models, which often lack the interpretability of classical statistical models, these explainability methods become increasingly important to ensure transparency and trust in predictive modeling. We illuminate both global methods – such as global surrogate models, PDPs, ALE plots, and permutation feature importances – for a thorough understanding of model behavior, and local methods – like SHAP, LIME, and ICE plots – for detailed insights into individual predictions. In addition to concise overviews of these methods, the notebook provides practical code examples that readers can easily adopt, offering a user-friendly introduction to explainable artificial intelligence. |
| 6 | 2024-06-13 | Simon Hatzesberger 🇩🇪 | Model-Agnostic Explainability Methods for Binary Classification Problems: A Case Study on Car Insurance Data | Notebook | Educational | 🟨🟨⬜ Advanced | Explainable AI | Machine Learning, Classification | English | Python | CatBoost, PDP, ALE, PFI, SHAP, LIME, Counterfactual Explanations, Anchors | (?) | | In this Jupyter notebook, we offer a comprehensive walkthrough for actuaries and data scientists on applying model-agnostic explainability methods to binary classification tasks, using a car insurance dataset as our case study. With the growing prevalence of modern black box machine learning models, which often lack the interpretability of classical statistical models, these explainability methods become increasingly important to ensure transparency and trust in predictive modeling. We illuminate both global methods – such as global surrogate models, PDPs, ALE plots, and permutation feature importances – for a thorough understanding of model behavior, and local methods – like SHAP, LIME, ICE plots, counterfactual explanations, and anchors – for detailed insights on individual predictions. In addition to concise overviews of these methods, the notebook provides practical code examples that readers can easily adopt, offering a user-friendly introduction to explainable artificial intelligence. |
| 7 | 2024-07-16 | MAS (Monetary Authority of Singapore) 🇸🇬 | FEAT Principles Assessment Case Studies | Website, White Paper | Case Study | 🟩⬜⬜ Beginner | Fairness, Ethics, Accountability, Transparency | Life Insurance Underwriting, Fraud Detection, Retail Marketing, Credit Decisioning, Customer Marketing | English | | Gradient Boosting Model, PDP, SHAP, PFI | (?) | Data derived from a Kaggle competition's real-world dataset | This document is one of a suite of documents published as an output of the Monetary Authority of Singapore (MAS) Veritas Phase 2 project. Its purpose is to illustrate the implementation of the Fairness, Ethics, Accountability and Transparency (FEAT) Principles Assessment Methodology for Financial Institutions on selected use cases, and it fits alongside the other documents published as part of that suite. |
| 8 | 2024-07-16 | Personal Data Protection Commission 🇸🇬 | Compendium of Use Cases: Practical Illustrations of the Model AI Governance Framework | Website, White Paper (Volume 1), White Paper (Volume 2) | Case Study | 🟩⬜⬜ Beginner | Governance | TODO | English | | Gradient Boosting Model, PDP, SHAP, PFI | (?) | Data derived from a Kaggle competition's real-world dataset | AI will transform businesses and power the next bound of economic growth. Businesses and society can enjoy the full benefits of AI if the deployment of AI products and services is founded upon trustworthy AI governance practices. As part of advancing Singapore’s thought leadership in AI governance, Singapore has released the Model AI Governance Framework (Model Framework) to guide organisations on how to deploy AI in a responsible manner. This Compendium of Use Cases demonstrates how various organisations across different sectors – big and small, local and international – have either implemented or aligned their AI governance practices with all sections of the Model Framework. The Compendium also illustrates how the organisations have effectively put in place accountable AI governance practices and benefit from the use of AI in their line of business. By implementing responsible AI governance practices, organisations can distinguish themselves from others and show that they care about building trust with consumers and other stakeholders. This will create a virtuous cycle of trust, allowing organisations to continue to innovate for their stakeholders. We thank the World Economic Forum Centre for the Fourth Industrial Revolution for partnering us on this journey. We hope that this Compendium will inspire more organisations to embark on a similar journey. |
| 9 | 2024-07-16 | SAV 🇨🇭 (Andreas Troxler, Jürg Schelldorfer) | Actuarial Applications of Natural Language Processing Using Transformers | Article, Notebook | Educational | 🟥🟥🟥 Expert | Natural Language Processing, Transformers | Property Insurance, Claims Descriptions, Recurrent Neural Networks | English | Python | Transformers, Recurrent Neural Networks, Integrated Gradients | (?) | See the illustrative sketch below | This tutorial demonstrates workflows to incorporate text data into actuarial classification and regression tasks. The main focus is on methods employing transformer-based models. A dataset of car accident descriptions with an average length of 400 words, available in English and German, and a dataset with short property insurance claims descriptions are used to demonstrate these techniques. The case studies tackle challenges related to a multi-lingual setting and long input sequences. They also show ways to interpret model output and to assess and improve model performance by fine-tuning the models to the domain of application or to a specific prediction task. Finally, the tutorial provides practical approaches to handle classification tasks in situations with no or only few labeled data, including but not limited to ChatGPT. The results achieved by using the language-understanding skills of off-the-shelf natural language processing (NLP) models with only minimal pre-processing and fine-tuning clearly demonstrate the power of transfer learning for practical applications. |
| 10 | 2024-09-12 | SOA 🇺🇸 (Logan T. Smith, Emma Pirchalski, Ilana Golbin) | Avoiding Unfair Bias in Insurance Applications of AI Models | Website, White Paper (English), White Paper (Simplified Chinese) | TODO | 🟩⬜⬜ Beginner | Bias, Fairness, Ethics | | English, (Simplified) Chinese | | | | | Artificial intelligence (“AI”) adoption in the insurance industry is increasing. One known risk as adoption of AI increases is the potential for unfair bias. Central to understanding where and how unfair bias may occur in AI systems is defining what unfair bias means and what constitutes fairness. This research identifies methods to avoid or mitigate unfair bias unintentionally caused or exacerbated by the use of AI models and proposes a potential framework for insurance carriers to consider when looking to identify and reduce unfair bias in their AI models. The proposed approach includes five foundational principles as well as a four-part model development framework with five stage gates. |
| 11 | 2024-09-12 | SAV 🇨🇭 (Simon Rentzmann, Mario V. Wüthrich) | Unsupervised Learning: What is a Sports Car? | Article, Notebook | Educational | 🟥🟥🟥 Expert | Unsupervised Learning | Dimension Reduction, Clustering, Low Dimensional Visualization | English | R | Principal Component Analysis (PCA), Bottleneck Neural Network, k-Means, k-Medoids, Gaussian Mixture Models, t-SNE, UMAP, SOM | (?) | See the illustrative sketch below | This tutorial studies unsupervised learning methods, i.e. techniques that aim at reducing the dimension of data (covariables, features), clustering cases with similar features, and graphically illustrating high-dimensional data. These techniques do not consider response variables; they are based solely on the features themselves and the similarities incorporated in them, which is why they belong to the field of unsupervised learning. The methods studied in this tutorial comprise principal components analysis (PCA) and bottleneck neural networks (BNNs) for dimension reduction; K-means clustering, K-medoids clustering, the partitioning around medoids (PAM) algorithm, and clustering with Gaussian mixture models (GMMs) for clustering; and variational autoencoders (VAEs), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), self-organizing maps (SOMs) and Kohonen maps for visualizing high-dimensional data. |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

Notes:

  • Dates are formatted according to the ISO 8601 standard (YYYY-MM-DD).
  • The "Author" column lists the member association and, where applicable, the individual author(s).
  • The "Resource(s)" column provides direct links to articles and code repositories.
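
Illustrative code sketches:

The short sketches below illustrate, in minimal form, a few of the techniques referenced in the table. They are not taken from the listed case studies; dataset paths, column names, prompts, and parameter values are placeholders chosen for illustration only.

Entry 1 (Binary Classification: Credit Scoring) benchmarks gradient boosting on an imbalanced binary target with minimal preprocessing. A minimal sketch of such a CatBoost benchmark, assuming a pandas DataFrame with a binary `target` column and mixed numeric/categorical features:

```python
# Minimal CatBoost benchmark for an imbalanced binary classification task.
# File name, column names, and hyperparameters are placeholders.
import pandas as pd
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("credit_applications.csv")           # placeholder path
X = df.drop(columns=["target"])
y = df["target"]

cat_features = X.select_dtypes(include="object").columns.tolist()
X[cat_features] = X[cat_features].fillna("missing")   # CatBoost needs non-null categorical values

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = CatBoostClassifier(
    iterations=500,
    learning_rate=0.05,
    auto_class_weights="Balanced",    # one simple way to counteract class imbalance
    eval_metric="AUC",
    verbose=100,
)
model.fit(X_train, y_train, cat_features=cat_features, eval_set=(X_valid, y_valid))

print("Validation AUC:", roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1]))
```

CatBoost handles categorical features natively, which is why the sketch passes `cat_features` instead of one-hot encoding them; this matches the abstract's remark that the benchmark model requires minimal data preprocessing.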
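
Entry 2 (SHAP for Actuaries: Explain Any Model) performs typical SHAP analyses on tree-based and other ML models. A minimal sketch of a TreeSHAP analysis of a LightGBM frequency model fitted to synthetic data; the feature names and the data-generating process below are assumptions, not the tutorial's simulated portfolio:

```python
# Minimal TreeSHAP analysis of a LightGBM model; data and feature names are placeholders.
import numpy as np
import pandas as pd
import lightgbm as lgb
import shap

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "driver_age": rng.integers(18, 90, 5000),
    "vehicle_power": rng.integers(50, 300, 5000),
    "vehicle_age": rng.integers(0, 20, 5000),
})
# Synthetic claim counts with a simple dependence on the features (illustration only).
lam = np.exp(-2.0 + 0.01 * X["vehicle_power"] - 0.02 * X["driver_age"])
y = rng.poisson(lam)

model = lgb.LGBMRegressor(objective="poisson", n_estimators=300, learning_rate=0.05)
model.fit(X, y)

# TreeSHAP decomposes each prediction into additive feature contributions.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X)                   # global importance / beeswarm plot
shap.dependence_plot("driver_age", shap_values, X)  # effect of a single feature
```

The summary plot gives a global view of which features matter and in which direction, while the dependence plot drills into a single feature, the kind of insight about the model that the tutorial describes.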
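
Entry 3 (Parsing Claims Descriptions) uses GPT-4 to assess sentiment, the claimant's emotional state, and inconsistencies in a claims interaction. A minimal sketch of that idea with the OpenAI Python client; the prompt, JSON fields, example text, and model name are illustrative assumptions, not the case study's actual code:

```python
# Minimal sketch: ask an LLM to return structured indicators about a claims interaction.
# Prompt, JSON fields, and model name are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

claim_text = (
    "Policyholder called about water damage in the kitchen. "
    "He sounded agitated and gave two different dates for when the leak started."
)

prompt = (
    "You are assisting an insurance claims handler. For the interaction below, return only JSON "
    "with the keys: sentiment (positive/neutral/negative), claimant_emotion, "
    "inconsistencies (list of strings), summary.\n\nInteraction:\n" + claim_text
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)

# In practice the output should be validated; the model may wrap the JSON in extra prose.
result = json.loads(response.choices[0].message.content)
print(result["sentiment"], result["inconsistencies"])
```

The structured output is meant to support a human claims handler, mirroring the case study's emphasis that the LLM is an automation aid rather than the final arbiter.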
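
Entries 5 and 6 walk through model-agnostic explainability methods such as partial dependence plots (PDP) and permutation feature importance (PFI). A minimal sketch of these two global methods with scikit-learn on a placeholder medical-costs-style regression task; the file name, column names, and model are assumptions:

```python
# Minimal sketch of two global, model-agnostic explainability methods: PFI and PDP.
# File name, column names, and model settings are placeholders.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay, permutation_importance
from sklearn.model_selection import train_test_split

df = pd.read_csv("medical_costs.csv")                         # placeholder path
X = pd.get_dummies(df.drop(columns=["charges"]), drop_first=True)
y = df["charges"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Permutation feature importance: drop in test-set score when one column is shuffled.
pfi = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, imp in sorted(zip(X.columns, pfi.importances_mean), key=lambda t: -t[1])[:5]:
    print(f"{name}: {imp:.3f}")

# Partial dependence: average predicted cost as a function of selected features.
PartialDependenceDisplay.from_estimator(model, X_test, features=["age", "bmi"])
plt.show()
```

Both methods treat the fitted model as a black box, so the same approach applies to other regressors, such as the CatBoost model used in the case studies, provided they expose a scikit-learn-compatible interface.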
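
Entry 9 applies transformer-based NLP models to claims descriptions, including classification with little or no labeled data. A minimal sketch of a zero-shot classification approach using the Hugging Face transformers pipeline; the model choice and candidate labels are illustrative assumptions:

```python
# Minimal zero-shot classification of claims descriptions with a pre-trained transformer.
# Model name and candidate labels are illustrative assumptions.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

claims = [
    "Burst pipe in the basement caused water damage to the flooring.",
    "Hail storm dented the roof and broke two skylights.",
]
labels = ["water damage", "storm damage", "fire damage", "theft"]

for text in claims:
    result = classifier(text, candidate_labels=labels)
    print(f"{result['labels'][0]:>14} ({result['scores'][0]:.2f})  {text}")
```

No labeled training data is needed here; fine-tuning the underlying model on domain-specific claims text, as the tutorial does for its datasets, would typically improve performance further.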
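
Entry 11 combines dimension reduction, clustering, and low-dimensional visualization. The original tutorial is written in R; the sketch below runs an analogous sequence of steps (scaling, PCA, k-means, t-SNE) in Python with scikit-learn on placeholder vehicle features:

```python
# Minimal unsupervised-learning workflow: scale, reduce, cluster, visualize.
# Feature names, data, and parameter choices are placeholders; the original tutorial uses R.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
cars = pd.DataFrame({
    "weight_kg": rng.normal(1500, 300, 1000),
    "power_kw": rng.normal(110, 40, 1000),
    "max_speed_kmh": rng.normal(190, 30, 1000),
    "cubic_capacity": rng.normal(1800, 500, 1000),
})

X = StandardScaler().fit_transform(cars)                 # put features on comparable scales

pca = PCA(n_components=2)                                # dimension reduction
X_pca = pca.fit_transform(X)
print("Explained variance ratios:", pca.explained_variance_ratio_)

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)     # clustering

X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X)                 # visualization
plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=clusters, s=5)
plt.title("t-SNE embedding coloured by k-means cluster")
plt.show()
```

The same scaled feature matrix feeds all three steps; swapping in k-medoids, Gaussian mixture models, or UMAP from the tutorial's method list changes only the estimator, not the overall workflow.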
