Abstract:
Technology Assisted Review, the legal industry's term for a family of functions that includes document clustering and classification using machine learning, has traditionally focused on raw text and excluded image files, mainly because image analytics technology could not produce satisfactory results. The emergence of deep learning and convolutional neural networks, and the consequent advances in computer vision, have created the potential to incorporate image processing into legal industry workflows. We use the well-known VGG16 model, pretrained on ImageNet pictures, to encode images, which we then cluster using standard methods. Finally, we apply transfer learning to the pretrained network, and touch on fine-tuning it, to perform binary classification of a test dataset. We test our methodology on scenarios with similar-looking image classes and with both normal and very low responsiveness (positive class) rates. In parallel, we examine the process from a review cost savings perspective.
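The encode-then-cluster step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes TensorFlow/Keras is available for the VGG16 encoder, assumes global average pooling of the last convolutional block as the image encoding, and uses a plain k-means as one example of a "standard method". The function names `encode_images` and `kmeans` are hypothetical.

```python
import numpy as np


def encode_images(batch):
    """Encode RGB images, shape (n, 224, 224, 3), with ImageNet-pretrained VGG16.

    Assumption: TensorFlow/Keras is installed. It is imported lazily so the
    clustering function below can be used without it.
    """
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

    # include_top=False drops the ImageNet classifier head; pooling="avg"
    # averages the last convolutional feature map into one 512-d vector per image.
    encoder = VGG16(weights="imagenet", include_top=False, pooling="avg")
    return encoder.predict(preprocess_input(np.asarray(batch, dtype="float32")))


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    # Initialise centroids from k distinct data points.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids
```

In use, one would call `kmeans(encode_images(images), k)` on a batch of resized images; for the classification stage, the same frozen encoder could instead feed a small trainable binary head, which is the usual meaning of transfer learning in this setting.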
Date of Conference: 10-13 December 2018
Date Added to IEEE Xplore: 24 January 2019