Abstract
In recent years, interest in applying supervised machine learning techniques has grown across the most disparate fields. One winning factor of machine learning is its ability to create models easily, as it does not require prior knowledge of the application domain. Complementary to machine learning are formal methods, which intrinsically offer safety checks and mechanisms for reasoning about failures. Considering the weaknesses of machine learning, the use of formal methods could represent a new challenge. However, formal methods require domain expertise, knowledge of a modeling language and its semantics, and mathematical rigour to specify properties. In this article, we propose a novel learning technique that adopts formal methods for classification by automatically generating both the formula and the model. The proposed method therefore requires no human intervention and can also be applied to complex or large datasets. This reduces the effort of using formal methods and improves the explainability of, and reasoning about, the obtained results. Through a set of case studies from different real-world domains (i.e., driver detection, SCADA attack identification, arrhythmia characterization, mobile malware detection, and radiomics for lung cancer analysis), we demonstrate the usefulness of the proposed method by showing that it achieves better performance than widespread classification algorithms.
Index Terms
- A Novel Classification Technique based on Formal Methods