Bayesian-model averaging using MCMCBayes for web-browser vulnerability discovery
Introduction
The rise of electronic crime, the proliferation of networked computing devices and their extensive customer usage, and the increasing interaction of software with sensitive customer information pose significant information security and financial risks to consumers and software-makers alike [1]. Because of the high cost of quality and other factors, such as compressed release schedules or the emergence of new security risk categories, software ships with vulnerabilities, and external researchers discover them post-release when performing security assessments. Public disclosures of post-release vulnerabilities increased significantly between 1999 and 2018 [2], [3], eroding the reputation of software vendors and reducing customer confidence in security quality. Addressing these problems is crucial for companies that develop software and computer hardware [4], [5], [6], [7], [8], [9], because maintaining customer satisfaction with product security is essential to their financial success [1].
Fortunately, strategies exist to reduce risk and ensure customer satisfaction in security quality throughout the software security lifecycle (SSL). Software-makers can refine security processes and policies, reallocate critical resources, and alter release-cycle requirements or constraints, such as feature requirements or schedule and budget limitations. These adjustments fall into two areas: (1) activities that aim to improve security quality pre-release, such as refinement of processes and policies, reallocation of resources, and alteration of requirements and constraints [10]; and (2) activities that seek to control customer perception of security quality post-release by reducing threat exposure or inhibiting external discoveries [11]. Unfortunately, because achieving quality is costly, managers must often decide which alternatives will provide maximal impact. The security modeling techniques presented in this article and its augmented successor (see [12]) aid that decision.
One class of modeling techniques, vulnerability discovery modeling (VDM), helps managers decide how best to reduce uncertainty by forecasting the external security fault discoveries that might occur during future post-release security assessment cycles [13]. Vulnerability discovery models are an application of software reliability models [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27] that uses patterns found in historical discovery events following a software release (SR) to forecast discovery-event counts over time. VDM techniques fall into two types according to whether they use covariates that influence the phenomenon. The first, “Black-box” VDM methods [12], enable managers to allocate resources that help ensure post-release vulnerability handling and satisfactory response times for subsequent release cycles. “Black-box” VDM methods demonstrate varying levels of success and include: linear and polynomial regression models [13], [28], [29], [30], [31], growth-curve models [13], [28], [29], [30], [32], [33], models based on nonhomogeneous Poisson processes (NHPPs) [13], [29], [31], [34], [35], [36], effort-based models [32], [37], [38], [39], [40], time-series models [41], and various specialty models [36], [42], [43], [44], [45], [46]. The second, “Clear-box” VDM methods, are introduced in our successor article (see [12]) and support additional decisions on strategies to reduce risk.
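To make the idea of forecasting discovery-event counts over time concrete, the following minimal sketch computes expected interval-grouped counts and a grouped Poisson log-likelihood for an NHPP. It assumes the Goel-Okumoto mean value function m(t) = a(1 - exp(-bt)) purely for illustration; the function names, parameter values, and counts are hypothetical and are not taken from the article or from MCMCBayes.

```python
import math


def mean_value(t, a, b):
    """Assumed Goel-Okumoto mean value function m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))


def expected_interval_counts(edges, a, b):
    """Expected discoveries in each interval (edges[i], edges[i+1]].

    For an NHPP the count in an interval is Poisson with mean
    m(upper) - m(lower). `edges` must be strictly increasing.
    """
    return [mean_value(hi, a, b) - mean_value(lo, a, b)
            for lo, hi in zip(edges, edges[1:])]


def grouped_log_likelihood(counts, edges, a, b):
    """Poisson log-likelihood of interval-grouped discovery counts."""
    log_lik = 0.0
    for n, mu in zip(counts, expected_interval_counts(edges, a, b)):
        # log Poisson pmf: n*log(mu) - mu - log(n!)
        log_lik += n * math.log(mu) - mu - math.lgamma(n + 1)
    return log_lik


# Hypothetical data: three observation intervals with made-up counts.
edges = [0.0, 1.0, 2.0, 3.0]
counts = [39, 24, 14]
forecast = expected_interval_counts(edges, a=100.0, b=0.5)
```

A forecast for a future interval follows the same pattern: extend `edges` past the last observed time and read off the corresponding expected count.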
The forecasting performance of VDM methods can be improved by using a Bayesian model average (BMA), which is enabled through an approach that includes expert judgment methods. Because this approach differs significantly from the traditional approach taken by VDM applications in the literature to date, three comments on its justification and elaboration are warranted. First, Bayesian methods are more suitable when data are scarce because they enable the use of numerous and diverse information types, including expert judgment [47]. Second, in situations that warrant gathering data from subject matter experts, techniques should incorporate structured elicitation methods [48]. Third, Bayesian analyses provide the ability to average over all the models, resulting in better average predictive ability than any single best model [49], [50].
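The averaging step can be sketched as follows. Given each model's log marginal likelihood and a prior model probability, the posterior model probabilities act as weights on the per-model forecasts. This is a generic illustration of BMA under stated assumptions, not the MCMCBayes implementation; all names and numbers are hypothetical.

```python
import math


def posterior_model_probs(log_marginal_liks, priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Uses a max-shift (log-sum-exp) so very negative log values do not
    underflow. `priors` defaults to a uniform prior over the models.
    """
    k = len(log_marginal_liks)
    if priors is None:
        priors = [1.0 / k] * k
    log_post = [lm + math.log(p) for lm, p in zip(log_marginal_liks, priors)]
    shift = max(log_post)
    unnorm = [math.exp(v - shift) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bayes_factor(log_marginal_1, log_marginal_2):
    """Bayes factor of model 1 against model 2."""
    return math.exp(log_marginal_1 - log_marginal_2)


def bma_forecast(weights, forecasts):
    """Weighted average of per-model forecasts, interval by interval.

    `forecasts[m][i]` is model m's forecast for interval i.
    """
    n_intervals = len(forecasts[0])
    return [sum(w * f[i] for w, f in zip(weights, forecasts))
            for i in range(n_intervals)]


# Hypothetical log marginal likelihoods for three candidate models.
weights = posterior_model_probs([-120.4, -118.9, -125.0])
averaged = bma_forecast(weights, [[10.0, 8.0], [12.0, 9.0], [7.0, 6.0]])
```

Because the weights sum to one, the averaged forecast always lies between the smallest and largest single-model forecasts for each interval.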
This article presents the first expertise-enhanced Bayesian analysis for the vulnerability discovery phenomenon in software. Data-gathering included the elicitation and analysis of subjective information provided by a small set of experts during several workshops held in the greater Washington, D.C. area in fall 2014, coupled with the collection of empirical web-browser data from the National Vulnerability Database (NVD) [3], [29]. BMA, a new state-of-the-art technique for this modeling application, is used to predict interval-grouped discoveries over time by combining a new, non-parametric NHPP model with five NHPP model variants, two regression models, and two growth-curve models. For transparency, the source code is available for the complete modeling framework, named MCMCBayes, which notably supports individual model choice through Bayes factors (BF) and BMA via power-posteriors.
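One standard way to obtain the marginal likelihoods needed for Bayes factors and BMA from power posteriors is thermodynamic integration: the log marginal likelihood equals the integral over temperature t in [0, 1] of the expected log-likelihood under the power posterior at t. The sketch below shows only the numerical integration step with a trapezoidal rule. It assumes the per-temperature expectations have already been estimated by MCMC, and the temperature schedule, function names, and inputs are hypothetical rather than taken from MCMCBayes.

```python
def temperature_ladder(n_steps, power=5.0):
    """Temperatures t_i = (i / n_steps) ** power for i = 0..n_steps.

    A power greater than 1 concentrates temperatures near 0, where the
    expected log-likelihood usually changes fastest. The exponent here
    is an assumed, tunable choice.
    """
    return [(i / n_steps) ** power for i in range(n_steps + 1)]


def log_marginal_likelihood(temps, mean_log_liks):
    """Trapezoidal estimate of the thermodynamic integral.

    `mean_log_liks[i]` is the MCMC estimate of E[log p(y | theta)] under
    the power posterior at temperature `temps[i]`.
    """
    if len(temps) != len(mean_log_liks):
        raise ValueError("temps and mean_log_liks must have equal length")
    total = 0.0
    for i in range(1, len(temps)):
        width = temps[i] - temps[i - 1]
        total += width * 0.5 * (mean_log_liks[i] + mean_log_liks[i - 1])
    return total


# Hypothetical stand-in for MCMC output: a linear expectation curve.
temps = temperature_ladder(10)
mean_log_liks = [-10.0 + 4.0 * t for t in temps]
estimate = log_marginal_likelihood(temps, mean_log_liks)
```

In practice the expectation curve is not linear, so the estimate carries both discretization error from the ladder and Monte Carlo error from each temperature's chain.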
Background
The phenomenon of interest, post-release vulnerability discovery external to software-makers, is influenced by many factors but is only one of many states in the vulnerability lifecycle (VL; see Fig. 2.1, a refinement of [51]). To ease the introduction of this complex phenomenon, this section provides a brief background by: (1) describing a basic workflow for all states in the VL in the context of the software lifecycle (Section 2.1); and (2) presenting its influential variables (Section 2.2).
Methodology
When predicting rare events, it is appropriate to choose the subjectivist, or evidential, view of probability [47] and to perform data analysis using the Bayesian approach. This subsection outlines the steps in the Bayesian approach to elicitation and data analysis, using a general form, and then introduces specific notation for Bayesian analysis of the VDM techniques in this article.
The structured, expert-judgment elicitation process associated with Bayesian analyses for gathering data
Results and discussion
This section describes and reviews the outcomes. Section 4.1 discusses highlights of using Cooke's method for elicitation and data aggregation; Section 4.2 highlights the results from the analysis and includes forecasting demonstrations from both individual and averaged models.
Limitations and usage recommendations
This section starts with a discussion on the limitations of the methodology and then provides usage recommendations.
Variables describing two controversial areas of ROI were deliberately omitted: political and military entities; and peer recognition. However, by using academic experts as the data source and defining assessment resources in the elicitation scenarios, we ensured that these omissions would not affect results.
As pointed out by Roger Cooke [88], an issue with the workshop results is
Conclusion
Software vulnerabilities that enable well-known exploit techniques for committing computer crimes are preventable, but they continue to be present in software releases. In general, software security modeling techniques help managers make decisions on how best to reduce risk; in particular, this article presents a significant improvement to vulnerability discovery modeling by using expert-judgment and BMA to combine results from ten popular models in the literature. What's more, the
Acknowledgements
This article describes a portion of the primary author's dissertation research for The George Washington University, in partial fulfillment of the requirements for the Doctor of Philosophy degree. The authors sincerely thank the study participants for their efforts. Additionally, we thank Steve Lipner for the valuable suggestions and review of the software security material, Refik Soyer for the methodology advice, Roger Cooke for the comments on the CCM results, Stacy Hill for the methodology
References (88)
- Multivariate Models Using MCMCBayes for Web-Browser Vulnerability Discovery. Reliab Eng Syst Saf (2018)
- Software reliability research
- The Littlewood-Verrall model for software reliability compared with some rivals. J Syst Softw (1979)
- An approach to software reliability prediction based on time series modeling. J Syst Softw (2013)
- Measuring, analyzing and predicting security vulnerabilities in software systems. Comput Secur (2007)
- The sigmoidal growth of operating system security vulnerabilities: An empirical revisit. Comput Secur (2015)
- Modeling vulnerability discovery process in Apache and IIS HTTP servers. Comput Secur (2011)
- Software vulnerability: Definition, modelling, and practical evaluation for e-mail transfer software. Int J Pressure Vessels Pip (2006)
- Time series modeling of vulnerabilities. Comput Secur (2015)
- Time between vulnerability disclosures: a measure of software product vulnerability. Comput Secur (2016)
- Quantifying information security risks using expert judgment elicitation. Comput Oper Res Spec Issue Oper Res Risk Manag
- TU Delft expert judgment data base. Reliab Eng Syst Saf Spec Issue Expert Judgm
- Expert judgment-based fragility assessment of reinforced concrete buildings exposed to fire. Reliab Eng Syst Saf
- Economics of malware: security decisions, incentives and externalities. OECD Sci Technol Ind Work Pap
- Common Vulnerabilities and Exposures. Stand Inf Secur Vulnerab Names
- National Vulnerability Database, automating vulnerability management. Secur Measur Compl Check
- Software Techniques for Managing Speculation on AMD Processors
- About speculative execution vulnerabilities in ARM-based and Intel CPUs
- Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism
- Google's Mitigations Against CPU Speculative Execution Attack Methods
- Speculative Execution and Indirect Branch Prediction Side Channel Analysis Method
- Protect your Windows devices against Spectre and Meltdown
- Information systems success: the quest for the independent variables. J Manag Inf Syst
- Application of vulnerability discovery models to major operating systems. IEEE Trans Reliab
- A Bayesian reliability growth model for computer software
- A theory of software reliability and its application. IEEE Trans Softw Eng
- Time-dependent error-detection rate model for software reliability and other performance measures. IEEE Trans Reliab
- Analysis of discrete software reliability models. RADC TR-80-84
- S-Shaped Reliability Growth Modeling for Software Error Detection. IEEE Trans Reliab
- Software reliability models: assumptions, limitations, and applicability. IEEE Trans Softw Eng
- Murphy's law, the fitness of evolving species, and the limits of software reliability. UCAM-CL-TR-471
- Bayesian nonparametric inference for nonhomogeneous Poisson processes. UConnStatTR
- Software reliability modeling with software metrics data via gaussian processes. IEEE Trans Softw Eng
- An empirical methodology to evaluate vulnerability discovery models. IEEE Trans Softw Eng
- Is finding security holes a good idea? IEEE Secur Privacy
- Modeling Skewness in Vulnerability Discovery. Qual Reliab Eng Int
- Quantitative Security Evaluation for Software System from Vulnerability Database. J Softw Eng Appl Special Issue Softw Dependabil
- An open-source tool to support the quantitative assessment of cyber security for software intensive system acquisition. J Inf Warf
- Vulnerability scrying method for software vulnerability discovery prediction without a vulnerability database. IEEE Trans Reliab
- A Study on software vulnerability assessment modeling and its application to e-mail distribution software system. J Reliab Eng Assoc Jpn