Probabilistic Inference on Integrity for Access Behavior Based Malware Detection

Mao, Weixuan; Cai, Zhongmin; Towsley, Don; Guan, Xiaohong

doi:10.1007/978-3-319-26362-5_8

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 9404)

Included in the following conference series:

International Symposium on Recent Advances in Intrusion Detection

Abstract

Integrity protection has proven to be an effective approach to malware detection and defense. Determining the integrity of subjects (programs) and objects (files and registry entries) plays a fundamental role in integrity protection. However, the large number of subjects and objects, together with their intricate behaviors, makes it burdensome to determine their integrity levels either manually or with a set of rules. In this paper, we propose a probabilistic model of integrity in modern operating systems. Our model builds on two primary security policies, “no read down” and “no write up”, which connect observed access behaviors to the inherent integrity ordering between pairs of subjects and objects. We employ message-passing-based inference to determine the integrity of subjects and objects under a probabilistic graphical model. Furthermore, by leveraging a statistical classifier, we build an integrity-based access behavior model for malware detection. Extensive experimental results on a real-world dataset demonstrate that our model is capable of distinguishing 7,257 malware samples from 27,840 benign processes at a 99.88% true positive rate and a 0.1% false positive rate. These results indicate the feasibility of our probabilistic integrity model.
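To make the two policies concrete, the following is a minimal Python sketch of their strict (deterministic) form: a subject may read an object only if the object's integrity is at least its own, and may write only if the object's integrity is at most its own. The three-level `Integrity` scale and the function names are illustrative assumptions, not taken from the paper, whose model treats these constraints probabilistically rather than as hard rules.

```python
from enum import IntEnum


class Integrity(IntEnum):
    # Hypothetical three-level scale, used only for illustration.
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def may_read(subject: Integrity, obj: Integrity) -> bool:
    """No read down: reading is allowed only if I(s) <= I(o)."""
    return subject <= obj


def may_write(subject: Integrity, obj: Integrity) -> bool:
    """No write up: writing is allowed only if I(s) >= I(o)."""
    return subject >= obj


# One representative (subject, object) pair for each possible ordering.
ORDERINGS = {
    "<": (Integrity.LOW, Integrity.HIGH),
    "=": (Integrity.MEDIUM, Integrity.MEDIUM),
    ">": (Integrity.HIGH, Integrity.LOW),
}


def consistent_orderings(reads: bool, writes: bool) -> set:
    """Orderings between I(s) and I(o) that the strict policies permit,
    given whether the subject was observed reading and/or writing the object."""
    result = set()
    for symbol, (s, o) in ORDERINGS.items():
        read_ok = (not reads) or may_read(s, o)
        write_ok = (not writes) or may_write(s, o)
        if read_ok and write_ok:
            result.add(symbol)
    return result
```

Under these strict rules, an observed read leaves the orderings `<` and `=` possible, an observed write leaves `=` and `>`, and observing both leaves only `=`. This is the sense in which access behavior carries information about the integrity ordering of a subject-object pair.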

Notes

1. In fact, we find that $s$ is about 7 in our experiments.

Acknowledgments

We would like to thank our shepherd, Manos Antonakakis, and the anonymous reviewers for their insightful comments, which greatly helped improve the presentation of this paper. This work is supported by NSFC (61175039, 61221063, 61403301), 863 High Tech Development Plan (2012AA011003), Research Fund for Doctoral Program of Higher Education of China (20090201120032), International Research Collaboration Project of Shaanxi Province (2013KW11) and Fundamental Research Funds for Central Universities (2012jdhz08). Any opinions, findings, and conclusions or recommendations expressed in this material are the authors’ and do not necessarily reflect those of the sponsors.

Author information

Authors and Affiliations

MOE KLINNS Lab, Xi’an Jiaotong University, Xi’an, Shaanxi, China
Weixuan Mao, Zhongmin Cai & Xiaohong Guan
School of Computer Science, University of Massachusetts, Amherst, MA, USA
Don Towsley

Corresponding author

Correspondence to Zhongmin Cai.

Editor information

Editors and Affiliations

Vrije Universiteit Amsterdam, Amsterdam, Noord-Holland, The Netherlands
Herbert Bos
University of North Carolina at Chapel H, Chapel-Hill, USA
Fabian Monrose
Université Paris-Saclay, Evry, France
Gregory Blanc

Appendix- Derivation of Eq. (8)

$P(E_I|Acc)\propto \sum _{T}{P(Acc|T)P(T|E_I)\sum _{D}{P(E_I|D)P(D)}}$, where

$$\begin{aligned} \sum _{D}{P(E_I|D)P(D)}= {\left\{ \begin{array}{ll} \sum _{D}{d_1P(D)}=\mathbb {E}_D(d_1)=\frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}, &{} \text {if } I(s)<I(o), \\ \sum _{D}{d_2P(D)}=\mathbb {E}_D(d_2)=\frac{\alpha _2}{\alpha _1+\alpha _2+\alpha _3}, &{} \text {if } I(s)=I(o), \\ \sum _{D}{d_3P(D)}=\mathbb {E}_D(d_3)=\frac{\alpha _3}{\alpha _1+\alpha _2+\alpha _3}, &{} \text {if } I(s)>I(o). \end{array}\right. } \end{aligned}$$

(18)

And then,

(1.)
If $I(s)<I(o)$:
$$ \begin{aligned} P(<|Acc) &\propto \frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}\Delta \int _{T}{t_1^{N_r}t_2^{N_w}t_3^{N_{r \& w}} \frac{t_1^{\beta _1}t_2^{\beta _2-1}t_3^{\beta _3-1}}{B(1+\beta _1, \beta _2, \beta _3)}} \,\mathrm {d}T \\ &= \frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}\Delta \frac{B(N_r+\beta _1+1, N_w+\beta _2, N_{r \& w}+\beta _3)}{B(1+\beta _1, \beta _2, \beta _3)} \\ &= \frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}\Delta \frac{\beta _1+\beta _2+\beta _3}{\beta _1}\frac{N_r+\beta _1}{N+\beta _1+\beta _2+\beta _3}\Omega , \end{aligned}$$
(19)
where $ \Delta =\frac{\Gamma (N+1)}{\Gamma (N_r+1)\Gamma (N_w+1)\Gamma (N_{r \& w}+1)}$, $ \Omega =\frac{B(N_r+\beta _1, N_w+\beta _2, N_{r \& w}+\beta _3)}{B(\beta _1, \beta _2, \beta _3)}$, and $B(\beta _1, \beta _2, \beta _3)=\frac{\Gamma (\beta _1)\Gamma (\beta _2)\Gamma (\beta _3)}{\Gamma (\beta _1+\beta _2+\beta _3)}$.
(2.)
If $I(s)=I(o)$:
$$ \begin{aligned} P(=|Acc) \propto \frac{\alpha _2}{\alpha _1+\alpha _2+\alpha _3}\Delta \int _{T}{t_1^{N_r}t_2^{N_w}t_3^{N_{r \& w}} \frac{t_1^{\beta _1-1}t_2^{\beta _2-1}t_3^{\beta _3-1}}{B(\beta _1, \beta _2, \beta _3)}} \mathop {}\!\mathrm {d}T = \frac{\alpha _2}{\alpha _1+\alpha _2+\alpha _3}\Delta \Omega . \end{aligned}$$
(20)
(3.)
If $I(s)>I(o)$:
$$ \begin{aligned} P(>|Acc) &\propto \frac{\alpha _3}{\alpha _1+\alpha _2+\alpha _3}\Delta \int _{T}{t_1^{N_r}t_2^{N_w}t_3^{N_{r \& w}} \frac{t_1^{\beta _1-1}t_2^{\beta _2}t_3^{\beta _3-1}}{B(\beta _1, \beta _2+1, \beta _3)}} \,\mathrm {d}T \\ &= \frac{\alpha _3}{\alpha _1+\alpha _2+\alpha _3}\Delta \frac{\beta _1+\beta _2+\beta _3}{\beta _2}\frac{N_w+\beta _2}{N+\beta _1+\beta _2+\beta _3}\Omega . \end{aligned}$$
(21)
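
The final equalities in Eqs. (19) and (21) rest on the recurrence $\Gamma (x+1)=x\Gamma (x)$. As a brief sketch of that step, assume $N=N_r+N_w+N_{r \& w}$ (implied by the multinomial coefficient $\Delta $ but not stated explicitly in this appendix) and let $a_1, a_2, a_3$ denote generic positive parameters (notation introduced here for illustration only):

$$ \begin{aligned} B(a_1+1, a_2, a_3) = \frac{\Gamma (a_1+1)\Gamma (a_2)\Gamma (a_3)}{\Gamma (a_1+a_2+a_3+1)} = \frac{a_1}{a_1+a_2+a_3}B(a_1, a_2, a_3). \end{aligned}$$

Applying this identity with $(a_1, a_2, a_3)=(N_r+\beta _1, N_w+\beta _2, N_{r \& w}+\beta _3)$ in the numerator and with $(a_1, a_2, a_3)=(\beta _1, \beta _2, \beta _3)$ in the denominator gives

$$ \begin{aligned} \frac{B(N_r+\beta _1+1, N_w+\beta _2, N_{r \& w}+\beta _3)}{B(1+\beta _1, \beta _2, \beta _3)} = \frac{\beta _1+\beta _2+\beta _3}{\beta _1}\frac{N_r+\beta _1}{N+\beta _1+\beta _2+\beta _3}\Omega , \end{aligned}$$

and the same argument applied to the second argument of $B$ yields the corresponding factor in Eq. (21).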

Taken together, Eqs. (19)–(21) give the posterior distribution of $E_I$ given $Acc$, i.e., $P(E_I|Acc)$, as shown in Eqs. (9)–(11).
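
As a numerical illustration of Eqs. (19)–(21), the Python sketch below evaluates the three proportional expressions for given access counts and normalises them so that they sum to one. The function names and the hyperparameter values in the usage note are hypothetical choices, not taken from the paper, and the sketch assumes $N=N_r+N_w+N_{r \& w}$. Because $\Delta $, $\Omega $ and $1/(\alpha _1+\alpha _2+\alpha _3)$ are common to all three cases, they cancel under normalisation. A second function evaluates the Beta-function ratios directly as a cross-check.

```python
import math


def log_beta(*a):
    """Log of the multivariate Beta function B(a_1, ..., a_k)."""
    return sum(math.lgamma(x) for x in a) - math.lgamma(sum(a))


def integrity_posterior(n_r, n_w, n_rw, alpha, beta):
    """Normalised P(E_I | Acc) using the closed forms of Eqs. (19)-(21).

    n_r, n_w, n_rw: counts of read, write, and read-and-write accesses.
    alpha, beta: 3-tuples of positive hyperparameters (assumed values).
    """
    a1, a2, a3 = alpha
    b1, b2, b3 = beta
    n = n_r + n_w + n_rw  # assumption: N is the total number of accesses
    b_sum = b1 + b2 + b3
    # Delta, Omega and 1/(a1+a2+a3) are shared factors and are omitted.
    weights = {
        "<": a1 * (b_sum / b1) * (n_r + b1) / (n + b_sum),
        "=": a2,
        ">": a3 * (b_sum / b2) * (n_w + b2) / (n + b_sum),
    }
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}


def integrity_posterior_direct(n_r, n_w, n_rw, alpha, beta):
    """Same posterior, computed from the Beta-function ratios in log space."""
    b1, b2, b3 = beta
    c1, c2, c3 = n_r + b1, n_w + b2, n_rw + b3
    log_w = {
        "<": math.log(alpha[0]) + log_beta(c1 + 1, c2, c3) - log_beta(b1 + 1, b2, b3),
        "=": math.log(alpha[1]) + log_beta(c1, c2, c3) - log_beta(b1, b2, b3),
        ">": math.log(alpha[2]) + log_beta(c1, c2 + 1, c3) - log_beta(b1, b2 + 1, b3),
    }
    top = max(log_w.values())  # subtract the maximum for numerical stability
    unnorm = {k: math.exp(v - top) for k, v in log_w.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}
```

For example, with uniform hyperparameters `alpha = beta = (1, 1, 1)` and ten observed reads, `integrity_posterior(10, 0, 0, (1, 1, 1), (1, 1, 1))` assigns probabilities 33/49, 13/49 and 3/49 to $I(s)<I(o)$, $I(s)=I(o)$ and $I(s)>I(o)$, respectively. With no observed accesses, the posterior reduces to the prior $\alpha _i/(\alpha _1+\alpha _2+\alpha _3)$ of Eq. (18).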

Cite this paper

Mao, W., Cai, Z., Towsley, D., Guan, X. (2015). Probabilistic Inference on Integrity for Access Behavior Based Malware Detection. In: Bos, H., Monrose, F., Blanc, G. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2015. Lecture Notes in Computer Science, vol 9404. Springer, Cham. https://doi.org/10.1007/978-3-319-26362-5_8

DOI: https://doi.org/10.1007/978-3-319-26362-5_8
Published: 12 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26361-8
Online ISBN: 978-3-319-26362-5
eBook Packages: Computer Science, Computer Science (R0)
