Abstract
In this article we consider the problem of defending against increasing data exfiltration threats in cybersecurity. We review existing work on exfiltration threats and the corresponding countermeasures, and we identify the problems and challenges that must be addressed to provide a qualitatively better level of protection against data exfiltration. After considering the magnitude of the data exfiltration threat, we outline the objectives of this article and the scope of the review. We then discuss present methods of defending against data exfiltration in detail. We note that current methodologies do not connect well with domain experts, either as sources of knowledge or as partners in decision-making. Yet human intervention continues to be required in cybersecurity. Cybersecurity applications are therefore necessarily socio-technical systems that cannot be operated safely and efficiently without considering the relevant human factors. We conclude with a call for approaches that more effectively integrate human expertise into defense against data exfiltration.
- [204] . 2015. Trusted tamper-evident data provenance. Proceedings - 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom’15). 646–653. Google Scholar
- [205] . 2014. Data leakage/loss prevention systems (DLP). In 2014 World Congress on Computer Applications and Information Systems (WCCAIS’14). Google ScholarCross Ref
- [206] . 2017. Reducing false positives of user-to-entity first-access alerts for user behavior analytics. In IEEE International Conference on Data Mining Workshops (ICDMW’17). 804–811. Google Scholar
- [207] . 2014. Development of a hybrid web application firewall to prevent web based attacks. In 8th IEEE International Conference on Application of Information and Communication Technologies (AICT’14) - Conference Proceedings.Google Scholar
- [208] . 2017. The analysis of firewall policy through machine learning and data mining. Wireless Personal Communications 96, 2 (
Sept. 2017), 2891–2909.Google ScholarDigital Library - [209] . 2018. Data exfiltration: A review of external attack vectors and countermeasures. Journal of Network and Computer Applications 101 (2018), 18–54.Google ScholarDigital Library
- [210] . 2014. An extensible pattern-based library and taxonomy of security threats for distributed systems. Computer Standards & Interfaces 36, 4 (2014), 734–747.Google Scholar
- [211] . 2016. SEcube™: Data at rest and data in motion protection. In International Conference Security and Management. 138–145. Google Scholar
- [212] . 2020. 2020 Data Breach Investigations Report. https://enterprise.verizon.com/resources/reports/dbir/.Google Scholar
- [213] . 2015. Security analytics: Essential data analytics knowledge for cybersecurity professionals and students. IEEE Security and Privacy 13, 6 (2015), 60–65. Google ScholarDigital Library
- [214] . 2020. Explainable security. In Proceedings - 5th IEEE European Symposium on Security and Privacy Workshops (Euro S and PW’20). 293–300.
arxiv:1807.04178 .Google ScholarCross Ref - [215] . 2004. Anomalous payload-based network intrusion detection. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3224 (2004), 203–222. Google Scholar
- [216] . 2020. You Are what you do: Hunting stealthy malware via data provenance analysis. In Network and Distributed Systems Security (NDSS’00) Symposium 2020. Google ScholarCross Ref
- [217] . 2008. The Honeynet Project: Data Collection Tools, Infrastructure, Archives and Analysis.
Technical Report . 24–30 pages. Google Scholar - [218] . 2015. Evaluating the effectiveness of microsoft threat modeling tool. In Proceedings of the 2015 Information Security Curriculum Development Conference.Google ScholarDigital Library
- [219] Martyn Williams. 2017. Inside the Russian hack of Yahoo: How they did it. https://www.csoonline.com/article/3180762/inside-the-russian-hack-of-yahoo-how-they-did-it.html.Google Scholar
- [220] . 2004. A quantitative study of firewall configuration errors. Computer 37, 6 (2004), 62–67. Google ScholarDigital Library
- [221] S. Wu and U. Manber. 1994. A Fast Algorithm for Multi-pattern Searching. Department of Computer Science, Tucson, AZ: University of Arizona. 1–11.Google Scholar
- [222] . 2012. Data loss prevention based on data-driven usage control. In Proceedings - International Symposium on Software Reliability Engineering (ISSRE’12). 151–160. Google Scholar
- [223] . 2022. Cyber security threat modeling based on the MITRE enterprise ATT&CK Matrix. Software and Systems Modeling 21, 1 (
Feb. 2022), 157–177.Google ScholarDigital Library - [224] . 2019. Threat modeling-A systematic literature review. Computers & Security 84 (2019), 53–69.Google Scholar
- [225] . 2018. Combining data owner-side and cloud-side access control for encrypted cloud storage. IEEE Transactions on Information Forensics and Security 13, 8 (
Aug. 2018), 2062–2074. Google ScholarCross Ref - [226] . 2015. Technical aspects of cyber kill chain. In International Symposium on Security in Computing and Communication. 438–452.Google ScholarCross Ref
- [227] R. Yahalom, E. Shmueli, and T. Zrihen. 2010. Constrained anonymization of production data: a constraint satisfaction problem approach. In Secure Data Management: 7th VLDB Workshop, (SDM’10, Singapore, September 17, 2010. Proceedings 7), Springer Berlin Heidelberg, 41–53.Google Scholar
- [228] . 2022. Threat classification model for security information event management focusing on model efficiency. Computers & Security 120 (
9 2022), 102789. Google ScholarDigital Library - [229] . 2017. Trustworthy data: A survey, taxonomy and future trends of secure provenance schemes. Journal of Network and Computer Applications 94 (
Sept. 2017), 50–68. Google ScholarDigital Library - [230] . 2018. Evaluation of machine learning techniques for network intrusion detection. In IEEE/IFIP Network Operations and Management Symposium: Cognitive Management in a Cyber World (NOMS’18). 1–5. Google Scholar
- [231] . 2022. Phishing Campaign Delivering Three Fileless Malware: AveMariaRAT / BitRAT / PandoraHVNC - Part I. FortiGuard Labs.Google Scholar
- [232] . 2004. Intrusion prevention system design. In Proceedings - The 4th International Conference on Computer and Information Technology (CIT’04). 386–390. Google Scholar
Index Terms
- Implementing Data Exfiltration Defense in Situ: A Survey of Countermeasures and Human Involvement