ABSTRACT
Any persistent data that is untagged, untapped and unclassified can be termed dark data. It has two common traits: first, its worth cannot be determined, and second, in most scenarios it is inadequately protected. Previous work and existing solutions are restricted to single-node systems. Moreover, they perform specialized processing of selected content only, for example, logs. Further, they neglect stakeholders entirely and pay minimal attention to the data generated within the enterprise. From the perspective of an enterprise, it is important to understand the distribution, nature and worth of dark data, as this helps in choosing the right security controls, insurance, or steps needed to pre-process a system before discarding it. In this paper we demonstrate a distributed system, called File WinOver, for File Lifecycle Management (FLM). The solution operates in a distributed environment, where it identifies the dormant and active files on a system, filters them as required and computes their fingerprints. The content fingerprints are also used to detect closed user groups. The system then classifies the content based on configured policies and maps it to the stakeholders. This mapping is further used to evaluate the risk exposure of each file. Thus, our system helps in identifying dark data and assigns it a quantitative risk value.
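The first stages described in the abstract (identifying dormant and active files, filtering them, and fingerprinting their content) can be illustrated with a minimal single-machine sketch. The abstract does not say how File WinOver defines dormancy or computes fingerprints, so everything here is an assumption made for illustration: the 180-day threshold, the use of access and modification times, the extension filter, the SHA-256 digest, and the names `scan` and `fingerprint` are all hypothetical.

```python
import hashlib
import os
import time
from pathlib import Path

# Hypothetical threshold: the abstract does not define when a file is dormant.
DORMANT_AFTER_DAYS = 180


def fingerprint(path, chunk_size=65536):
    """Return a SHA-256 hex digest of the file content (assumed fingerprint)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root, extensions=None, now=None, dormant_after_days=DORMANT_AFTER_DAYS):
    """Walk `root`, keep files matching `extensions`, and label each one.

    A file counts as dormant when both its last access and its last
    modification are older than the threshold (an assumed definition).
    """
    now = time.time() if now is None else now
    cutoff = now - dormant_after_days * 86400
    records = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if extensions is not None and path.suffix.lower() not in extensions:
            continue
        # Read timestamps before hashing, since reading may update atime.
        stat = path.stat()
        last_used = max(stat.st_atime, stat.st_mtime)
        records.append({
            "path": str(path),
            "state": "dormant" if last_used < cutoff else "active",
            "fingerprint": fingerprint(path),
        })
    return records
```

Calling `scan("/some/share", extensions={".txt", ".docx"})` would return one record per matching file. Files sharing a fingerprint have identical content, which is one plausible starting point for the grouping step the abstract mentions; the distributed coordination, policy-based classification, and risk valuation stages are not covered by this sketch.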
Index Terms
- POSTER: WinOver Enterprise Dark Data