Bayesian-model averaging using MCMCBayes for web-browser vulnerability discovery

doi:10.1016/j.ress.2018.11.030

Reliability Engineering & System Safety

Volume 183, March 2019, Pages 341-359

Highlights

• Describes vulnerability discovery phenomenon in the software security lifecycle.
• Presents 46 software release (SR) and security assessment profile (SAP) variables.
• Elicits dataset using Cooke's method; gathers empirical web-browser datasets.
• Details Bayesian analysis of vulnerability discovery modeling (VDM) techniques.
• Demonstrates new, non-parametric and Bayesian model average (BMA) VDM techniques.

ABSTRACT

Most software vulnerabilities are preventable, but they continue to be present in software releases. When Blackhats, or malicious researchers, discover vulnerabilities, they often release corresponding exploit software and malware. Therefore, customer confidence could be reduced if vulnerabilities, or discoveries of them, are not prevented, mitigated, or addressed. In addressing this, managers must choose which alternatives will provide maximal impact and could use vulnerability discovery modeling techniques to support their decision-making process. Applications of these techniques have used traditional approaches to analysis and, despite the dearth of data, have not included information from experts. This article takes an alternative approach, applying Bayesian methods to modeling the vulnerability-discovery phenomenon. Relevant data were obtained from security experts in structured workshops and from public databases. The open-source framework, MCMCBayes, was developed to automate Bayesian model averaging via power-posteriors. It combines predictions of interval-grouped discoveries by performance-weighting results from six variants of the non-homogeneous Poisson process (NHPP), two regression models, and two growth-curve models. The methodology is applicable to software-makers and persons interested in applications of expert-judgment elicitation or in using Bayesian analysis techniques with phenomena having non-decreasing counts over time.

Introduction

The rise of electronic crime, the proliferation of networked computing devices and their extensive customer usage, as well as the increasing interaction of software with various forms of sensitive customer information, pose significant information security and financial risks to both consumers and software-makers alike [1]. Because of the high cost of quality and other factors such as compressed-release schedules or the emergence of new security risk categories, vulnerabilities exist, and external researchers discover them post-release when performing security assessments. Public disclosures of post-release vulnerabilities increased significantly between 1999 and 2018 [2], [3], eroding the reputation of software vendors and reducing customer confidence in security quality. Addressing all of these problems is crucial for companies that develop software and computer hardware [4], [5], [6], [7], [8], [9], because maintaining customer satisfaction in product security is essential to their financial success [1].

Fortunately, strategies do exist to reduce risk and ensure customer satisfaction in security quality throughout the software security lifecycle (SSL). Software-makers can refine security processes and policies, reallocate critical resources, and alter release-cycle requirements or constraints, such as feature requirements or release-cycle schedule and budget limitations. These adjustments apply to one of two areas: (1) activities aiming to improve security quality pre-release, such as refinement of processes and policies, reallocation of resources, and alteration of requirements and constraints [10]; and (2) activities seeking to control customer perception of security quality post-release by reducing threat exposure or inhibiting external discoveries [11]. Unfortunately, due to the high costs of achieving quality, managers must often decide which alternatives will provide maximal impact—a decision that is aided by the security modeling techniques presented by this article and its augmented successor (see [12]).

One class of modeling techniques, vulnerability discovery modeling (VDM), helps managers make decisions on how best to reduce uncertainty by forecasting external security fault discoveries that might occur during future post-release security assessment cycles [13]. Vulnerability discovery models are an application of software reliability models [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27] that uses patterns found in historical discovery events following a software release (SR) to forecast discovery-event counts over time. Categorizing VDM techniques by their use of influential covariates on the phenomenon results in two types. The first, “Black-box” VDM methods [12], enable managers to allocate resources that help ensure post-release vulnerability handling and satisfactory response times for subsequent release cycles. “Black-box” VDM methods demonstrate varying levels of success and include: linear and polynomial regression models [13], [28], [29], [30], [31], growth-curve models [13], [28], [29], [30], [32], [33], models based on non-homogeneous Poisson processes (NHPPs) [13], [29], [31], [34], [35], [36], effort-based models [32], [37], [38], [39], [40], time-series models [41], and various specialty models [36], [42], [43], [44], [45], [46]. The second, “Clear-box” VDM methods, are introduced by our successor article (see [12]) and support additional decisions for strategies to reduce risk.
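As a rough illustration of how an NHPP-based model turns a fitted curve into a forecast of discovery counts, the Python sketch below computes the expected number of discoveries in each time interval. It assumes the Goel-Okumoto mean value function m(t) = a(1 - exp(-bt)) purely as an example; this excerpt does not state which NHPP variants the article uses, and the function names and parameter values are hypothetical.

```python
import math


def goel_okumoto_mean(t, a, b):
    """Expected cumulative discoveries by time t: m(t) = a * (1 - exp(-b * t)).

    Assumed illustrative form; `a` is the eventual total, `b` the discovery rate.
    """
    return a * (1.0 - math.exp(-b * t))


def expected_interval_counts(edges, a, b):
    """Expected discoveries in each interval (edges[i-1], edges[i]].

    Under an NHPP, the count in an interval is Poisson with mean
    m(edges[i]) - m(edges[i-1]).
    """
    cumulative = [goel_okumoto_mean(t, a, b) for t in edges]
    return [cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))]


def grouped_log_likelihood(counts, edges, a, b):
    """Poisson log-likelihood of interval-grouped counts under the NHPP."""
    means = expected_interval_counts(edges, a, b)
    return sum(
        k * math.log(mu) - mu - math.lgamma(k + 1)
        for k, mu in zip(counts, means)
    )


# Hypothetical usage: three unit-length intervals after release.
edges = [0.0, 1.0, 2.0, 3.0]
print(expected_interval_counts(edges, a=100.0, b=0.1))
print(grouped_log_likelihood([9, 8, 8], edges, a=100.0, b=0.1))
```

Working with interval differences of the mean value function, rather than event times, matches the interval-grouped form of the data described in the abstract.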

The forecasting performance of VDM methods can be improved by using a Bayesian model average (BMA), which is enabled by an approach that includes expert-judgment methods. Because this approach differs significantly from the traditional one taken by VDM applications in the literature to date, three comments on its justification are warranted. First, Bayesian methods are more suitable when the data are scarce as they enable the use of numerous and diverse information types, including expert judgment [47]. Second, in situations warranting data-gathering from subject matter experts, techniques should incorporate structured elicitation methods [48]. Third, Bayesian analyses provide the ability to average over all the models, resulting in better average predictive ability when compared to any single best model [49], [50].
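The averaging step in the third point can be sketched in a few lines of Python. The sketch assumes the textbook form of BMA, in which each model's weight is its posterior model probability computed from a log marginal likelihood and a prior model probability, and the averaged forecast is the weighted sum of the per-model forecasts. The function names and numbers are hypothetical and are not taken from MCMCBayes.

```python
import math


def bma_weights(log_marginals, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Uses equal prior model probabilities when `log_priors` is omitted.
    Subtracting the maximum before exponentiating avoids underflow.
    """
    n = len(log_marginals)
    if log_priors is None:
        log_priors = [-math.log(n)] * n
    scores = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(scores)
    unnormalized = [math.exp(s - top) for s in scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def bma_forecast(weights, forecasts):
    """Weighted average of per-model forecasts, interval by interval.

    `forecasts[m][i]` is model m's predicted count for interval i.
    """
    n_intervals = len(forecasts[0])
    return [
        sum(w * model[i] for w, model in zip(weights, forecasts))
        for i in range(n_intervals)
    ]


# Hypothetical usage: two models with made-up log marginal likelihoods.
weights = bma_weights([-120.4, -121.1])
print(weights)
print(bma_forecast(weights, [[9.5, 8.6, 7.8], [10.0, 10.0, 10.0]]))
```

Because the weights depend only on differences between log marginal likelihoods, a model whose evidence is much lower than the best model's contributes almost nothing to the averaged forecast.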

This article presents the first expertise-enhanced Bayesian analysis for the vulnerability discovery phenomenon in software. Data-gathering included the elicitation and analysis of subjective information provided by a small set of experts during several workshops held in the greater Washington, D.C. area in fall 2014, coupled with the collection of empirical web-browser data from the National Vulnerability Database (NVD) [3], [29]. BMA, a new state-of-the-art technique for this modeling application, is used to predict interval-grouped discoveries over time by combining a new, non-parametric NHPP model along with five NHPP model variants, two regression models, and two growth-curve models. For transparency, the source code is available for the complete modeling framework, named MCMCBayes, which notably supports individual model choice through Bayes factors (BF), and BMA via power-posteriors.
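For readers unfamiliar with power-posteriors, the Python sketch below shows the general idea under stated assumptions; it is not the MCMCBayes implementation. The log marginal likelihood of a model equals the integral, over a temperature t from 0 to 1, of the expected log-likelihood under the posterior whose likelihood is raised to the power t. In practice each expectation is estimated from an MCMC run at a fixed temperature; here those estimates are taken as given, the integral is approximated with the trapezoid rule, and the power-law temperature spacing and its exponent are hypothetical choices.

```python
def temperature_ladder(n_steps, exponent=5.0):
    """Temperatures t_i = (i / n_steps) ** exponent for i = 0..n_steps.

    An exponent above 1 concentrates temperatures near 0, where the
    expected log-likelihood typically changes fastest (assumed choice).
    """
    return [(i / n_steps) ** exponent for i in range(n_steps + 1)]


def log_marginal_likelihood(temperatures, mean_log_likelihoods):
    """Trapezoid-rule estimate of the power-posterior integral.

    `mean_log_likelihoods[i]` is the average log-likelihood of samples
    drawn from the power posterior at `temperatures[i]`.
    """
    total = 0.0
    for i in range(1, len(temperatures)):
        width = temperatures[i] - temperatures[i - 1]
        height = 0.5 * (mean_log_likelihoods[i] + mean_log_likelihoods[i - 1])
        total += width * height
    return total


def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor of model A over model B."""
    return log_ml_a - log_ml_b


# Hypothetical usage with made-up per-temperature averages for two models.
temps = temperature_ladder(4, exponent=1.0)
log_ml_a = log_marginal_likelihood(temps, [-14.0, -12.5, -11.8, -11.4, -11.2])
log_ml_b = log_marginal_likelihood(temps, [-15.0, -13.9, -13.1, -12.8, -12.7])
print(log_bayes_factor(log_ml_a, log_ml_b))
```

The same log marginal likelihoods serve both purposes named in the text: their differences give Bayes factors for individual model choice, and their normalized exponentials give the weights for BMA.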

Section snippets

Background

The phenomenon of interest, post-release vulnerability discovery external to software-makers, is influenced by many factors but is only one of many states in the vulnerability lifecycle (VL; see Fig. 2.1, a refinement of [51]). To ease the introduction of this complex phenomenon, this section provides a brief background by: (1) describing a basic workflow for all states in the VL in the context of the software lifecycle (Section 2.1); and (2) presenting its influential variables (Section 2.2).

Methodology

When predicting rare events, it is appropriate to choose the subjectivist, or evidential, view of probability [47] and to perform data analysis using the Bayesian approach. This subsection outlines the steps in the Bayesian approach to elicitation and data analysis, using a general form, and then introduces specific notation for Bayesian analysis of the VDM techniques in this article.

The structured, expert-judgment elicitation process associated with Bayesian analyses for gathering data

Results and discussion

This section describes and reviews the outcomes in two subsections: Section 4.1 discusses highlights of using Cooke's method for elicitation and data aggregation, and Section 4.2 highlights the results from the analysis and includes forecasting demonstrations from both individual and averaged models.

Limitations and usage recommendations

This section starts with a discussion on the limitations of the methodology and then provides usage recommendations.

Variables describing two controversial areas of ROI were deliberately omitted: political and military entities; and peer recognition. However, by using academic experts as the data source and defining assessment resources in the elicitation scenarios, we ensured that these omissions would not affect results.

As pointed out by Roger Cooke [88], an issue with the workshop results is

Conclusion

Software vulnerabilities that enable well-known exploit techniques for committing computer crimes are preventable, but they continue to be present in software releases. In general, software security modeling techniques help managers make decisions on how best to reduce risk; in particular, this article presents a significant improvement to vulnerability discovery modeling by using expert-judgment and BMA to combine results from ten popular models in the literature. What's more, the

Acknowledgements

This article describes a portion of the primary author's dissertation research for The George Washington University, in partial fulfillment of the requirements for the Doctor of Philosophy degree. The authors sincerely thank the study participants for their efforts. Additionally, we thank Steve Lipner for the valuable suggestions and review of the software security material, Refik Soyer for the methodology advice, Roger Cooke for the comments on the CCM results, Stacy Hill for the methodology

References (88)

R. Johnston et al. Multivariate Models Using MCMCBayes for Web-Browser Vulnerability Discovery. Reliab Eng Syst Saf (2018)
Z. Jelinski et al. Software reliability research
B. Littlewood. The Littlewood-Verrall model for software reliability compared with some rivals. J Syst Softw (1979)
A. Amin et al. An approach to software reliability prediction based on time series modeling. J Syst Softw (2013)
O.H. Alhazmi et al. Measuring, analyzing and predicting security vulnerabilities in software systems. Comput Secur (2007)
J. Ruohonen et al. The sigmoidal growth of operating system security vulnerabilities: An empirical revisit. Comput Secur (2015)
S.W. Woo et al. Modeling vulnerability discovery process in Apache and IIS HTTP servers. Comput Secur (2011)
M. Kimura. Software vulnerability: Definition, modelling, and practical evaluation for e-mail transfer software. Int J Pressure Vessels Pip (2006)
Y. Roumani et al. Time series modeling of vulnerabilities. Comput Secur (2015)
P. Johnson et al. Time between vulnerability disclosures: a measure of software product vulnerability. Comput Secur (2016)
J. Ryan et al. Quantifying information security risks using expert judgment elicitation. Comput Oper Res Spec Issue Oper Res Risk Manag (2012)
R.M. Cooke et al. TU Delft expert judgment data base. Reliab Eng Syst Saf Spec Issue Expert Judgm (2008)
I. Ioannou et al. Expert judgment-based fragility assessment of reinforced concrete buildings exposed to fire. Reliab Eng Syst Saf (2017)
M.J. van Eeten et al. Economics of malware: security decisions, incentives and externalities. OECD Sci Technol Ind Work Pap (2008)
Common Vulnerabilities and Exposures. Stand Inf Secur Vulnerab Names (2017)
National Vulnerability Database, automating vulnerability management. Secur Measur Compl Check (2017)
Software Techniques for Managing Speculation on AMD Processors
About speculative execution vulnerabilities in ARM-based and Intel CPUs
Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism
Google's Mitigations Against CPU Speculative Execution Attack Methods
Speculative Execution and Indirect Branch Prediction Side Channel Analysis Method
Protect your Windows devices against Spectre and Meltdown
Williams L., Gegick M., Vouk M., Predictive models for identifying software components prone to failure during security...
S. Petter et al. Information systems success: the quest for the independent variables. J Manag Inf Syst (2013)
O.H. Alhazmi et al. Application of vulnerability discovery models to major operating systems. IEEE Trans Reliab (2008)
B. Littlewood et al. A Bayesian reliability growth model for computer software
Moranda P.B., Prediction of software reliability during debugging, In: Proceedings of the Annual Reliability and...
J.D. Musa. A theory of software reliability and its application. IEEE Trans Softw Eng (1975)
A.L. Goel et al. Time-dependent error-detection rate model for software reliability and other performance measures. IEEE Trans Reliab (1979)
W. Brooks et al. Analysis of discrete software reliability models. RADC TR-80-84 (1980)
S. Yamada et al. S-Shaped Reliability Growth Modeling for Software Error Detection. IEEE Trans Reliab (1983)
Musa J.D., Okumoto K., A logarithmic poisson execution time model for software reliability measurement, In: Proceedings...
A.L. Goel. Software reliability models: assumptions, limitations, and applicability. IEEE Trans Softw Eng (1985)
R.M. Brady et al. Murphy's law, the fitness of evolving species, and the limits of software reliability. UCAM-CL-TR-471 (1999)
L. Kuo et al. Bayesian nonparametric inference for nonhomogeneous Poisson processes. UConnStatTR (1997)
N. Torrado et al. Software reliability modeling with software metrics data via gaussian processes. IEEE Trans Softw Eng (2013)
F. Massacci et al. An empirical methodology to evaluate vulnerability discovery models. IEEE Trans Softw Eng (2014)
E. Rescorla. Is finding security holes a good idea? IEEE Secur Privacy (2005)
H. Joh et al. Modeling Skewness in Vulnerability Discovery. Qual Reliab Eng Int (2014)
H. Okamura et al. Quantitative Security Evaluation for Software System from Vulnerability Database. J Softw Eng Appl Special Issue Softw Dependabil (2013)
V. Nagaraju et al. An open-source tool to support the quantitative assessment of cyber security for software intensive system acquisition. J Inf Warf (2017)
S. Rahimi et al. Vulnerability scrying method for software vulnerability discovery prediction without a vulnerability database. IEEE Trans Reliab (2013)
M. Kimura. A Study on software vulnerability assessment modeling and its application to e-mail distribution software system. J Reliab Eng Assoc Jpn (2003)
Alhazmi O.H., Malaiya Y.K., Quantitative vulnerability assessment of systems software, In: Proceedings of the Annual...
