DOI: 10.1145/3459637.3481900

Causal-Aware Generative Imputation for Automated Underwriting

Published: 30 October 2021

Abstract

Underwriting is an important process in insurance, concerned with accepting individuals into an insurance policy at a tolerable claim risk. It relies on underwriters' domain knowledge and experience, which makes it tedious, labor-intensive, and prone to error. Machine learning models have recently been applied to automate the underwriting process, easing the burden on underwriters and improving underwriting accuracy. However, the observational data used for underwriting modelling is high-dimensional, sparse, and incomplete, owing to the dynamically evolving nature (e.g., upgrades) of business information systems. Simply applying traditional supervised learning methods, e.g., logistic regression or gradient boosting, to such highly incomplete data usually leads to unsatisfactory underwriting results, so practical data imputation is required to improve training quality. In this paper, rather than choosing off-the-shelf solutions to the complex missing-data problem, we propose an innovative Generative Adversarial Nets (GAN) framework that captures the missing pattern from a causal perspective. Specifically, we design a structural causal model to learn the causal relations underlying the missing pattern of the data. Then, we devise a Causality-aware Generative network (CaGen) that uses the learned causal relations as a prior to generate missing values and corrects the imputed values via adversarial learning. We also show that CaGen significantly improves underwriting prediction in real-world insurance applications.
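
The abstract does not specify CaGen's architecture, so the following is only a minimal sketch of the generic step that mask-based generative imputers share: observed entries are kept, and only missing entries are filled with generated values. The function names are hypothetical, and the column-mean `propose_values` is a stand-in for a trained generator; it is not the paper's causality-aware network, and no adversarial training is shown.

```python
def build_mask(rows):
    """Return a 0/1 mask: 1 where a value is observed, 0 where it is missing (None)."""
    return [[0 if value is None else 1 for value in row] for row in rows]


def propose_values(rows, mask):
    """Hypothetical stand-in for a generator: propose the observed column mean for every cell."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [rows[i][j] for i in range(len(rows)) if mask[i][j] == 1]
        # Fall back to 0.0 for a column with no observed values at all.
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [list(col_means) for _ in rows]


def combine(rows, mask, generated):
    """Keep observed cells unchanged and fill only the missing cells with generated values."""
    return [
        [x if m == 1 else g for x, m, g in zip(row, mask_row, gen_row)]
        for row, mask_row, gen_row in zip(rows, mask, generated)
    ]


# Toy incomplete table (None marks a missing entry); the numbers are illustrative only.
rows = [
    [1.0, None, 3.0],
    [None, 4.0, 5.0],
    [3.0, 6.0, None],
]
mask = build_mask(rows)
imputed = combine(rows, mask, propose_values(rows, mask))
print(imputed)
```

In a GAN-based imputer, the proposal step would be a learned network and a discriminator would try to tell imputed cells from observed ones; the combination step stays the same, which is why observed data is never overwritten.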

    Published In

    CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
    October 2021
    4966 pages
    ISBN: 9781450384469
    DOI: 10.1145/3459637
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. automated underwriting
    2. causal-awareness
    3. data imputation
    4. gans

    Qualifiers

    • Research-article

    Conference

    CIKM '21
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
