ALETHEIA: Improving the Usability of Static Security Analysis

Authors: Omer Tripp, Salvatore Guarnieri, Marco Pistoia, Aleksandr Aravkin

CCS '14: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security

Pages 762 - 774

https://doi.org/10.1145/2660267.2660339

Published: 03 November 2014

Abstract

The scale and complexity of modern software systems complicate manual security auditing. Automated analysis tools are gradually becoming a necessity. Specifically, static security analyses carry the promise of efficiently verifying large code bases. Yet a critical usability barrier, which hinders the adoption of static security analysis by developers, is the excess of false reports. Current tools do not offer the user any direct means of customizing or cleansing the report. The user is thus left to review hundreds, if not thousands, of potential warnings and to classify each as either actionable or spurious. This is both burdensome and error-prone, leaving developers disenchanted with static security checkers.

We address this challenge by introducing a general technique to refine the output of static security checkers. The key idea is to apply statistical learning to the warnings output by the analysis based on user feedback on a small set of warnings. This leads to an interactive solution, whereby the user classifies a small fragment of the issues reported by the analysis, and the learning algorithm then classifies the remaining warnings automatically. An important aspect of our solution is that it is user centric. The user can express different classification policies, ranging from strong bias toward elimination of false warnings to strong bias toward preservation of true warnings, which our filtering system then executes.
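The interactive loop described above can be illustrated with a small sketch. It is not Aletheia's actual learning algorithm, which the abstract does not specify. It assumes a single hypothetical feature per warning (`trace_length`), a hypothetical threshold classifier, and a hypothetical `recall_bias` knob standing in for the user's policy: values near 1.0 favor preserving true warnings, and values near 0.0 favor eliminating false ones.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Report:
    """One static-analysis warning, reduced to a single hypothetical feature."""
    ident: str
    trace_length: int  # assumed feature: steps between source and sink


def precision_recall(predicted: Sequence[bool],
                     actual: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall of 'keep this warning' predictions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def learn_threshold(labeled: Sequence[Tuple[Report, bool]],
                    recall_bias: float) -> int:
    """Pick the trace-length cutoff that best fits the user's policy.

    The score is a weighted mix of recall and precision measured on the
    small set of warnings the user has classified by hand.
    """
    actual = [is_true for _, is_true in labeled]
    best_cutoff, best_score = 0, -1.0
    for cutoff in sorted({r.trace_length for r, _ in labeled}):
        predicted = [r.trace_length <= cutoff for r, _ in labeled]
        precision, recall = precision_recall(predicted, actual)
        score = recall_bias * recall + (1.0 - recall_bias) * precision
        if score > best_score:
            best_cutoff, best_score = cutoff, score
    return best_cutoff


def filter_reports(unlabeled: Sequence[Report], cutoff: int) -> List[Report]:
    """Classify the remaining warnings automatically: keep short traces."""
    return [r for r in unlabeled if r.trace_length <= cutoff]


# Made-up user feedback: (warning, is it a true positive?)
labeled = [
    (Report("w1", 2), True), (Report("w2", 3), True), (Report("w3", 5), False),
    (Report("w4", 8), True), (Report("w5", 12), False), (Report("w6", 15), False),
]
remaining = [Report("u1", 4), Report("u2", 10)]

keep_true = learn_threshold(labeled, recall_bias=0.9)   # preserve true warnings
drop_false = learn_threshold(labeled, recall_bias=0.1)  # eliminate false warnings
print(keep_true, [r.ident for r in filter_reports(remaining, keep_true)])
print(drop_false, [r.ident for r in filter_reports(remaining, drop_false)])
```

On this toy data the recall-biased policy settles on a looser cutoff and keeps more of the unclassified warnings, while the precision-biased policy settles on a stricter cutoff and discards them. The same labeled sample yields different filters depending on the policy the user expresses.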

We have implemented our approach as the Aletheia tool. Our evaluation of Aletheia on a diverse set of nearly 4,000 client-side JavaScript benchmarks, extracted from 675 popular Web sites, is highly encouraging. As an example, based on only 200 classified warnings, and with a policy biased toward preservation of true warnings, Aletheia is able to boost precision by a nearly threefold factor (×2.868) while reducing recall by a negligible factor (×1.006). Other policies are enforced with a similarly high level of efficacy.
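To make the two reported factors concrete, the following sketch applies them to a hypothetical baseline. The baseline values (precision 0.25, recall 1.0) are illustrative assumptions, not figures from the paper. The reading that "reducing recall by a factor of 1.006" means dividing by 1.006 is also an assumption.

```python
PRECISION_GAIN = 2.868  # factor quoted in the abstract
RECALL_LOSS = 1.006     # factor quoted in the abstract


def apply_reported_factors(baseline_precision: float,
                           baseline_recall: float) -> tuple:
    """Scale a baseline by the quoted factors (precision is capped at 1.0)."""
    precision = min(1.0, baseline_precision * PRECISION_GAIN)
    recall = baseline_recall / RECALL_LOSS  # assumed meaning of "reducing by a factor"
    return precision, recall


# Hypothetical baseline: one warning in four is real, and none are missed.
precision, recall = apply_reported_factors(0.25, 1.0)
relative_recall_loss = 1.0 - 1.0 / RECALL_LOSS
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"relative recall loss={relative_recall_loss:.2%}")
```

Under these assumptions, a recall factor of 1.006 amounts to losing roughly 0.6% of the true warnings in exchange for nearly tripling precision.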
