Privacy-Preserving Knowledge Transfer with Bootstrap Aggregation of Teacher Ensembles

Yoon, Hong-Jun; Klasky, Hilda; Stanley, Christopher; Christian, Blair; Tourassi, Georgia; Durbin, Eric B.; Wu, Xiao-Cheng; Stroup, Antoinette; Doherty, Jennifer; Coyle, Linda; Penberthy, Lynne

doi:10.1007/978-3-030-71055-2_9

Conference · March 1, 2021

DOI: https://doi.org/10.1007/978-3-030-71055-2_9 · OSTI ID: 1771902

Durbin, Eric B. ^[2]; Wu, Xiao-Cheng ^[3]; Stroup, Antoinette ^[4]; Doherty, Jennifer ^[5]; Coyle, Linda ^[6]; Penberthy, Lynne ^[6]

^[1] ORNL
^[2] University of Kentucky
^[3] LSUHSC-Louisiana Tumor Registry
^[4] Rutgers Cancer Institute of New Jersey
^[5] University of Utah
^[6] National Cancer Institute, Bethesda, MD

There is a need to transfer knowledge among institutions and organizations to save annotation and labeling effort or to enhance task performance. However, knowledge transfer is difficult because of restrictions in place to ensure data security and privacy: institutions are not allowed to exchange data or perform any activity that may expose personal information. Leveraging a differential privacy algorithm in a high-performance computing environment, we propose a new training protocol, Bootstrap Aggregation of Teacher Ensembles (BATE), which is applicable to various types of machine learning models. The BATE algorithm builds on and enhances the Private Aggregation of Teacher Ensembles (PATE) algorithm, maintaining competitive task performance scores on complex datasets with underrepresented class labels. We conducted a proof-of-concept study of information extraction from cancer pathology report data from four cancer registries and compared four scenarios: no collaboration, collaboration without privacy preservation, the PATE algorithm, and the proposed BATE algorithm. The results showed that the BATE algorithm maintained competitive macro-averaged F1 scores, demonstrating that the proposed algorithm is an effective yet privacy-preserving method for machine learning and deep learning solutions.
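The abstract does not spell out how BATE works, so the following is only a minimal sketch of the two ingredients its name suggests: bootstrap (with-replacement) sampling of training records for each teacher, and PATE-style noisy-max aggregation, in which Laplace noise is added to the teachers' per-class vote counts before taking the winning label. The function names, the `noise_scale` parameter, and the example vote vector are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np

def bootstrap_indices(n_records, n_teachers, rng):
    """Draw one with-replacement sample of record indices per teacher (assumed setup)."""
    return [rng.integers(0, n_records, size=n_records) for _ in range(n_teachers)]

def noisy_vote_label(teacher_votes, n_classes, noise_scale, rng):
    """PATE-style noisy-max: perturb per-class vote counts with Laplace noise."""
    counts = np.bincount(teacher_votes, minlength=n_classes).astype(float)
    counts += rng.laplace(loc=0.0, scale=noise_scale, size=n_classes)
    return int(np.argmax(counts))

rng = np.random.default_rng(0)

# Each teacher would be trained on its own bootstrap sample of the private data.
samples = bootstrap_indices(n_records=1000, n_teachers=5, rng=rng)

# Hypothetical class predictions from five teachers for a single public query.
votes = np.array([2, 2, 2, 1, 2])
label = noisy_vote_label(votes, n_classes=3, noise_scale=2.0, rng=rng)
```

In a scheme of this kind, only the noisy aggregated labels would be used to train a student model, so no teacher's raw training records leave its institution. A larger `noise_scale` gives stronger privacy at the cost of noisier labels.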

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-00OR22725

OSTI ID: 1771902

Resource Relation: Journal Volume: 12633; Conference: International Conference on Very Large Data Bases (VLDB), Tokyo, Japan, 8/31/2020 to 9/4/2020

Country of Publication: United States

Language: English

Similar Records

Privacy-Preserving Deep Learning NLP Models for Cancer Registries

Journal Article · July 1, 2021 · IEEE Transactions on Emerging Topics in Computing · OSTI ID: 1771902

Alawad, Mohammed; Yoon, Hong-Jun; Gao, Shang; +9 more

Adversarial Training for Privacy-Preserving Deep Learning Model Distribution

Conference · December 1, 2019 · OSTI ID: 1771902

Alawad, Mohammed; Gao, Shang; Wu, Xiao-Cheng; +4 more

Considerations for using Privacy Preserving Machine Learning Techniques for Safeguards

Technical Report · December 1, 2020 · OSTI ID: 1771902

Martindale, Nathan; Stewart, Scott L.; Adams, Mark; +1 more
