Abstract
Pathology results play a critical role in medical decision making. A particular challenge is the large number of pathology results that doctors are presented with on a daily basis, so some form of pathology result prioritisation is a necessity. However, there is no readily available training data that would support a traditional supervised learning approach, and alternative solutions are therefore needed. Two approaches are presented in this paper: anomaly-based unsupervised pathology prioritisation and proxy ground truth-based supervised pathology prioritisation. Two variations of each were considered: for the first, point-based and time series-based unsupervised anomaly prioritisation; for the second, kNN and RNN proxy ground truth-based supervised prioritisation. Urea and Electrolytes pathology testing was used as the focus of the work. The reported evaluation indicated that the RNN proxy ground truth-based supervised pathology prioritisation method produced the best results.
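To make the idea of point-based anomaly prioritisation concrete, the following is a minimal sketch and not the method evaluated in the paper, whose algorithmic details are not given in the abstract. It assumes that each pathology result is a fixed-length vector of numeric test values, and it ranks results by the Euclidean norm of their per-feature z-scores. The function names, the choice of z-scores as the anomaly measure, and the example values are all assumptions made purely for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def anomaly_scores(records):
    """Score each record by the Euclidean norm of its per-feature z-scores."""
    columns = list(zip(*records))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 when a feature has zero spread, so it contributes nothing.
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [
        sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(rec, means, stdevs)))
        for rec in records
    ]


def prioritise(records):
    """Return record indices ordered from most to least anomalous."""
    scores = anomaly_scores(records)
    return sorted(range(len(records)), key=lambda i: scores[i], reverse=True)


# Hypothetical values, not taken from the paper:
# (sodium mmol/L, potassium mmol/L, urea mmol/L, creatinine umol/L)
results = [
    (140.0, 4.1, 5.0, 80.0),
    (139.0, 4.3, 5.5, 85.0),
    (141.0, 4.0, 4.8, 78.0),
    (128.0, 6.2, 18.0, 310.0),  # deliberately extreme record
    (138.0, 4.2, 5.2, 82.0),
]
ranking = prioritise(results)
```

Under these assumptions the deliberately extreme record is placed first in `ranking`, which is the behaviour a prioritisation scheme of this kind aims for. A time series variant would score a patient's sequence of results rather than a single vector.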
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Qi, J., Burnside, G., Coenen, F. (2021). Ranking Pathology Data in the Absence of a Ground Truth. In: Bramer, M., Ellis, R. (eds) Artificial Intelligence XXXVIII. SGAI-AI 2021. Lecture Notes in Computer Science, vol 13101. Springer, Cham. https://doi.org/10.1007/978-3-030-91100-3_18
DOI: https://doi.org/10.1007/978-3-030-91100-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91099-0
Online ISBN: 978-3-030-91100-3
eBook Packages: Computer Science, Computer Science (R0)