Ranking Pathology Data in the Absence of a Ground Truth

  • Conference paper
Artificial Intelligence XXXVIII (SGAI-AI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13101)

Abstract

Pathology results play a critical role in medical decision making. A particular challenge is the large number of pathology results that doctors are presented with on a daily basis, so some form of pathology result prioritisation is a necessity. However, there is no readily available training data that would support a traditional supervised learning approach, and alternative solutions are therefore needed. This paper presents two approaches: anomaly-based unsupervised pathology prioritisation and proxy ground truth-based supervised pathology prioritisation. Two variations of each were considered: point-based and time series-based unsupervised anomaly prioritisation for the first, and kNN and RNN proxy ground truth-based supervised prioritisation for the second. Urea and Electrolytes pathology testing was used as the focus of the work. The reported evaluation indicated that the RNN proxy ground truth-based supervised pathology prioritisation method produced the best results.
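
To make the point-based, anomaly-driven idea concrete, the following is a minimal sketch and not the authors' method: it assumes each pathology record is a dictionary of analyte values, scores a record by its largest absolute z-score across analytes, and ranks records from most to least anomalous. The function names, the z-score scoring rule, the analyte names and all values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def anomaly_scores(records):
    """Score each record by its largest absolute z-score over all analytes.

    Assumption: a simple z-score stands in for whatever anomaly measure
    is actually used; it is not taken from the paper.
    """
    analytes = list(records[0].keys())
    stats = {}
    for name in analytes:
        values = [r[name] for r in records]
        sd = pstdev(values)
        # Guard against a zero standard deviation (all values identical).
        stats[name] = (mean(values), sd if sd > 0 else 1.0)
    return [
        max(abs(r[name] - stats[name][0]) / stats[name][1] for name in analytes)
        for r in records
    ]


def prioritise(records):
    """Return record indices ordered from most to least anomalous."""
    scores = anomaly_scores(records)
    return sorted(range(len(records)), key=lambda i: scores[i], reverse=True)


# Hypothetical Urea and Electrolytes style records (made-up values).
example_records = [
    {"sodium": 140, "potassium": 4.0, "urea": 5.0, "creatinine": 80},
    {"sodium": 138, "potassium": 4.2, "urea": 5.5, "creatinine": 85},
    {"sodium": 141, "potassium": 3.9, "urea": 4.8, "creatinine": 78},
    {"sodium": 139, "potassium": 6.8, "urea": 21.0, "creatinine": 310},
    {"sodium": 140, "potassium": 4.1, "urea": 5.2, "creatinine": 82},
]

if __name__ == "__main__":
    print(prioritise(example_records))
```

In this toy data the record with raised potassium, urea and creatinine (index 3) is ranked first. A time series-based variant would score a patient's sequence of results rather than a single record.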

Author information

Corresponding author

Correspondence to Jing Qi.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Qi, J., Burnside, G., Coenen, F. (2021). Ranking Pathology Data in the Absence of a Ground Truth. In: Bramer, M., Ellis, R. (eds) Artificial Intelligence XXXVIII. SGAI-AI 2021. Lecture Notes in Computer Science, vol 13101. Springer, Cham. https://doi.org/10.1007/978-3-030-91100-3_18

  • DOI: https://doi.org/10.1007/978-3-030-91100-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91099-0

  • Online ISBN: 978-3-030-91100-3

  • eBook Packages: Computer Science, Computer Science (R0)
