
Training Set Camouflage

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11199)

Abstract

We introduce a novel form of steganography in the domain of machine learning which we call training set camouflage. Imagine Alice has a training set on an illicit machine learning classification task. Alice wants Bob (a machine learning system) to learn the task. However, sending either the training set or the trained model to Bob can raise suspicion if the communication is monitored. Training set camouflage allows Alice to compute a second training set on a completely different, and seemingly benign, classification task. By construction, sending the second training set will not raise suspicion. When Bob applies his standard (public) learning algorithm to the second training set, he approximately recovers the classifier on the original task. We formulate training set camouflage as a combinatorial bilevel optimization problem and propose solvers based on nonlinear programming and local search. Experiments on real classification tasks demonstrate the feasibility of such camouflage.


References

1. Alfeld, S., Zhu, X., Barford, P.: Explicit defense actions against test-set attacks. In: AAAI, pp. 1274–1280 (2017)
2. Balbach, F.J., Zeugmann, T.: Teaching randomized learners. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 229–243. Springer, Heidelberg (2006). https://doi.org/10.1007/11776420_19
3. Barreno, M., Nelson, B., Joseph, A.D., Tygar, J.: The security of machine learning. Mach. Learn. 81(2), 121–148 (2010)
4. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security (2006)
5. Biggio, B., Roli, F.: Wild patterns: ten years after the rise of adversarial machine learning. arXiv preprint arXiv:1712.03141 (2017)
6. Brakerski, Z.: Fully homomorphic encryption without modulus switching from classical GapSVP. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 868–886. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32009-5_50
7. Brakerski, Z., Gentry, C., Vaikuntanathan, V.: (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (TOCT) 6(3), 13 (2014)
8. Brückner, M., Kanzow, C., Scheffer, T.: Static prediction games for adversarial learning problems. J. Mach. Learn. Res. 13, 2617–2654 (2012)
9. Brückner, M., Scheffer, T.: Nash equilibria of static prediction games. In: Advances in Neural Information Processing Systems (2009)
10. Brückner, M., Scheffer, T.: Stackelberg games for adversarial prediction problems. In: ACM SIGKDD (2011)
11. Bulò, S.R., Biggio, B., Pillai, I., Pelillo, M., Roli, F.: Randomized prediction games for adversarial machine learning. IEEE Trans. Neural Netw. Learn. Syst. 28, 2466–2478 (2016)
12. Bussieck, M.R., Pruessner, A.: Mixed-integer nonlinear programming. SIAG/OPT Newsl. Views News 14(1), 19–22 (2003)
13. Cachin, C.: An information-theoretic model for steganography. In: Aucsmith, D. (ed.) IH 1998. LNCS, vol. 1525, pp. 306–318. Springer, Heidelberg (1998). https://doi.org/10.1007/3-540-49380-8_21
14. Chandramouli, R.: A mathematical approach to steganalysis. In: Proceedings SPIE, vol. 4675, pp. 4–25 (2002)
15. Cox, I.J., Kalker, T., Pakura, G., Scheel, M.: Information transmission and steganography. In: Barni, M., Cox, I., Kalker, T., Kim, H.-J. (eds.) IWDW 2005. LNCS, vol. 3710, pp. 15–29. Springer, Heidelberg (2005). https://doi.org/10.1007/11551492_2
16. Dalvi, N., Domingos, P., Sanghai, S., Verma, D., et al.: Adversarial classification. In: ACM SIGKDD (2004)
17. Dziugaite, G.K., Roy, D.M., Ghahramani, Z.: Training generative neural networks via maximum mean discrepancy optimization. arXiv preprint arXiv:1505.03906 (2015)
18. Fridrich, J.: Feature-based steganalysis for JPEG images and its implications for future design of steganographic schemes. In: Fridrich, J. (ed.) IH 2004. LNCS, vol. 3200, pp. 67–81. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30114-1_6
19. Gentry, C., Sahai, A., Waters, B.: Homomorphic encryption from learning with errors: conceptually-simpler, asymptotically-faster, attribute-based. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 75–92. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_5
20. Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.: A kernel two-sample test. J. Mach. Learn. Res. 13(Mar), 723–773 (2012)
21. Hardt, M., Megiddo, N., Papadimitriou, C., Wootters, M.: Strategic classification. In: ACM ITCS (2016)
22. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE CVPR, pp. 770–778 (2016)
23. Hearst, M.A., Dumais, S.T., Osuna, E., Platt, J., Scholkopf, B.: Support vector machines. IEEE Intell. Syst. Appl. 13(4), 18–28 (1998)
24. Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970)
25. Hopper, N.J., Langford, J., von Ahn, L.: Provably secure steganography. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 77–92. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45708-9_6
26. Hosmer Jr., D.W., Lemeshow, S., Sturdivant, R.X.: Applied Logistic Regression, vol. 398. Wiley, Hoboken (2013)
27. Huang, L., Joseph, A.D., Nelson, B., Rubinstein, B.I., Tygar, J.: Adversarial machine learning. In: AISEC (2011)
28. Joachims, T.: A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. Technical report, Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA (1996)
29. Johnson, N.F., Jajodia, S.: Exploring steganography: seeing the unseen. Computer 31(2), 26–34 (1998)
30. Juels, A., Ristenpart, T.: Honey encryption: security beyond the brute-force bound. In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 293–310. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-55220-5_17
31. Katz, J., Menezes, A.J., Van Oorschot, P.C., Vanstone, S.A.: Handbook of Applied Cryptography. CRC Press, Boca Raton (1996)
32. Ker, A.D.: Steganalysis of LSB matching in grayscale images. IEEE Signal Process. Lett. 12(6), 441–444 (2005)
33. Kerckhoffs, A.: La Cryptographie Militaire (Part I), vol. 9, pp. 5–38 (1883)
34. Kerckhoffs, A.: La Cryptographie Militaire (Part II), vol. 9, pp. 161–191 (1883)
35. Kloft, M., Laskov, P.: A poisoning attack against online anomaly detection. In: NIPS Workshop on Machine Learning in Adversarial Environments for Computer Security (2007)
36. Kloft, M., Laskov, P.: Online anomaly detection under adversarial impact. In: AISTATS, pp. 405–412 (2010)
37. Kloft, M., Laskov, P.: Online anomaly detection under adversarial impact (2011)
38. Kohavi, R., et al.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI, vol. 14(2), pp. 1137–1145. Montreal, Canada (1995)
39. Krasin, I., et al.: OpenImages: a public dataset for large-scale multi-label and multi-class image classification. Dataset (2017). https://github.com/openimages
40. Krenn, R.: Steganography and steganalysis (2004)
41. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images (2009)
42. Laskov, P., Kloft, M.: A framework for quantitative security analysis of machine learning. In: Proceedings of the 2nd ACM Workshop on Security and Artificial Intelligence (2009)
43. Letchford, J., Vorobeychik, Y.: Optimal interdiction of attack plans. In: AAMAS (2013)
44. Liu, J., Zhu, X.: The teaching dimension of linear learners. J. Mach. Learn. Res. 17(162), 1–25 (2016)
45. Liu, W., Chawla, S.: A game theoretical model for adversarial learning. In: IEEE International Conference on Data Mining Workshops (ICDMW) (2009)
46. López-Alt, A., Tromer, E., Vaikuntanathan, V.: On-the-fly multiparty computation on the cloud via multikey fully homomorphic encryption. In: Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, pp. 1219–1234. ACM (2012)
47. Lowd, D., Meek, C.: Adversarial learning. In: ACM SIGKDD, pp. 641–647. ACM (2005)
48. Maganbhai, P.A.K., Chouhan, K.: A study and literature review on image steganography. Int. J. Comput. Sci. Inf. Technol. 6, 685–688 (2015)
49. Mei, S., Zhu, X.: Using machine teaching to identify optimal training-set attacks on machine learners. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
50. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
51. Queirolo, F.: Steganography in images. Final Communications Report 3 (2011)
52. Reyzin, L., Russell, S.: More efficient provably secure steganography. Department of Computer Science, Boston University (2003)
53. Rich, E., Knight, K.: Artificial Intelligence. McGraw-Hill, New York (1991)
54. Rivest, R.L., Adleman, L., Dertouzos, M.L.: On data banks and privacy homomorphisms. Found. Secur. Comput. 4(11), 169–180 (1978)
55. Simmons, G.J.: The prisoners’ problem and the subliminal channel. In: Chaum, D. (ed.) Advances in Cryptology, pp. 51–67. Springer, Heidelberg (1984). https://doi.org/10.1007/978-1-4684-4730-9_5
56. Singh, K.U.: A survey on image steganography techniques. Int. J. Comput. Appl. 97(18) (2014)
57. Smart, N.P., Vercauteren, F.: Fully homomorphic encryption with relatively small key and ciphertext sizes. In: Nguyen, P.Q., Pointcheval, D. (eds.) PKC 2010. LNCS, vol. 6056, pp. 420–443. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13013-7_25
58. Steinwart, I.: On the influence of the kernel on the consistency of support vector machines. J. Mach. Learn. Res. 2(Nov), 67–93 (2001)
59. Tan, K.M.C., Killourhy, K.S., Maxion, R.A.: Undermining an anomaly-based intrusion detection system using common exploits. In: Wespi, A., Vigna, G., Deri, L. (eds.) RAID 2002. LNCS, vol. 2516, pp. 54–73. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-36084-0_4
60. Thompson, A.: All the news (2017). https://www.kaggle.com/snapcrack/all-the-news
61. Van Tilborg, H.C., Jajodia, S.: Encyclopedia of Cryptography and Security. Springer, Heidelberg (2014)
62. Vorobeychik, Y., Li, B.: Optimal randomized classification in adversarial settings. In: AAMAS (2014)
63. Wu, H.C.: The Karush–Kuhn–Tucker optimality conditions in an optimization problem with interval-valued objective function. Eur. J. Oper. Res. 176(1), 46–59 (2007)
64. Zhang, L., Wu, J., Zhou, N.: Image encryption with discrete fractional cosine transform and chaos. In: Fifth International Conference on Information Assurance and Security (IAS 2009), vol. 2, pp. 61–64. IEEE (2009)
65. Zhang, X., Zhu, X., Wright, S.: Training set debugging using trusted items. In: AAAI (2018)


Acknowledgment

This work is supported in part by NSF 1545481, 1704117, 1623605, 1561512, and the MADLab AF Center of Excellence FA9550-18-1-0166.

Author information


Correspondence to Ayon Sen.


Appendix A: MMD as Eve’s Detection Function

One critical component of our camouflage framework is Eve’s detection function \(\varPsi\): how she determines whether a training set is suspicious. Eve’s detection function is a two-sample test, since its goal is to discern whether the two sets \({\mathcal {C}}, D\) are drawn from the same distribution. In what follows we discuss using Maximum Mean Discrepancy (MMD) [20] as Eve’s detection function, as we do in our experiments. MMD is a widely used two-sample test [17], but of course other detection functions can be used in (1).

We first review basic \(\mathbf{MMD}\) following [20]. Let \(p\) and \(p'\) be two Borel probability measures defined on a topological space \(\mathcal {Z}\). Given a class of functions \(\mathcal {F}\) such that \(f:\mathcal {Z}\mapsto {\mathbb R}, f\in \mathcal {F}\), \(\mathbf{MMD}\) is defined as \( \mathbf{MMD}(p,p')=\sup_{f\in \mathcal {F}}\left(E_{{\mathbf z}}[f({{\mathbf z}})]-E_{{{\mathbf z}}'}[f({{\mathbf z}}')]\right) \). Any unit ball in a reproducing kernel Hilbert space (RKHS) can be used as the function class \(\mathcal {F}\) if the kernel is universal (e.g., Gaussian and Laplace kernels [58]). With this function class, \(\mathbf{MMD}\) is a metric; in particular, \(\mathbf{MMD}(p,p') = 0 \Leftrightarrow p = p'\). Computing \(\mathbf{MMD}\) requires the population expectations to be known, which is generally not the case in practice. We obtain an empirical estimate by replacing the population expectations with empirical means computed on i.i.d. samples \(Z=\{{{\mathbf z}}_1,\ldots ,{{\mathbf z}}_n\}\) and \(Z'=\{{{\mathbf z}}'_1,\ldots ,{{\mathbf z}}'_m\}\) from \(p\) and \(p'\), respectively. We define
\( \mathbf{MMD}(Z,Z') = \left( \frac{1}{n^2}\sum_{i,j=1}^{n} k({{\mathbf z}}_i,{{\mathbf z}}_j) \;-\; \frac{2}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} k({{\mathbf z}}_i,{{\mathbf z}}'_j) \;+\; \frac{1}{m^2}\sum_{i,j=1}^{m} k({{\mathbf z}}'_i,{{\mathbf z}}'_j) \right)^{1/2}, \)
where \(k\) is the kernel of the RKHS. Let \(d=\vert \mathbf{MMD}(Z,Z')-\mathbf{MMD}(p,p')\vert\). Gretton et al. show that \( P\left( d > 2 \left( \sqrt{\frac{K}{n}} + \sqrt{\frac{K}{m}}\right) + \epsilon \right) \le 2 e^{-\frac{\epsilon^2 nm}{2K(n+m)}} \), where \(K\) is an upper bound on the kernel values. We convert this bound into a one-sided hypothesis testing procedure. Under the null hypothesis \(p=p'\) we have \(\mathbf{MMD}(p,p')=0\), and we consider only positive deviations of \(\mathbf{MMD}(Z,Z')\) from \(\mathbf{MMD}(p,p')\); this one-sidedness drops the factor of 2 from the bound. Equating the right-hand side with \(\alpha\) (the probability of incorrectly declaring \(p\ne p'\), also known as the type I error) gives a hypothesis test of level \(\alpha\), where solving for \(\epsilon\) as a function of \(\alpha\) gives \( \alpha = e^{-\frac{\epsilon^2 nm}{2K(n+m)}} \Rightarrow \epsilon = \sqrt{\frac{2K(n+m)}{nm}\log \frac{1}{\alpha}} \). We retain the null hypothesis if \( \mathbf{MMD}(Z,Z') - T < 0 \), where the threshold is \(T = 2 \left( \sqrt{\frac{K}{n}} + \sqrt{\frac{K}{m}}\right) + \sqrt{\frac{2K(n+m)}{nm}\log \frac{1}{\alpha}}\). This also defines Eve’s detection function (\(\varPsi({\mathcal {C}},D)\)) at level \(\alpha\): \( \varPsi({\mathcal {C}},D)\equiv \mathbf{MMD}({\mathcal {C}},D) - T \). If \(\varPsi({\mathcal {C}},D) \ge 0\) then Eve concludes that \(D\) is not drawn i.i.d. from \(\mathbb {Q}_{({{\mathbf x}}, y)}\) and flags it as suspicious.
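As a minimal sketch of this test (our own illustration, not the authors’ code; numpy is assumed and the function names are hypothetical):

```python
# Sketch of the empirical MMD estimate and Eve's level-alpha detection
# function Psi(C, D) = MMD(C, D) - T. Our own illustration, not the
# authors' released code.
import numpy as np

def empirical_mmd(Z, Zp, k):
    """Biased empirical MMD between samples Z (n x d) and Zp (m x d),
    given a kernel function k that returns a Gram matrix."""
    mmd_sq = k(Z, Z).mean() - 2.0 * k(Z, Zp).mean() + k(Zp, Zp).mean()
    return np.sqrt(max(mmd_sq, 0.0))  # clip tiny negatives from rounding

def psi(C, D, k, K=1.0, alpha=0.05):
    """Eve's detection function at level alpha. K upper-bounds the kernel
    values (K = 1 for the RBF kernel). D is flagged when psi(...) >= 0."""
    n, m = len(C), len(D)
    # T = 2(sqrt(K/n) + sqrt(K/m)) + sqrt(2K(n+m)/(nm) * log(1/alpha))
    T = (2.0 * (np.sqrt(K / n) + np.sqrt(K / m))
         + np.sqrt(2.0 * K * (n + m) / (n * m) * np.log(1.0 / alpha)))
    return empirical_mmd(C, D, k) - T
```

With the RBF kernel used below, \(K=1\), so `psi(C, D, k, K=1.0)` flags \(D\) exactly when the empirical \(\mathbf{MMD}\) exceeds the threshold \(T\).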

For all our experiments, Eve used the RBF kernel \(k({{\mathbf z}}_i, {{\mathbf z}}_j) = \exp\left( -\frac{\Vert {{\mathbf z}}_i - {{\mathbf z}}_j \Vert^2}{2\sigma^2}\right)\). Eve set \(\sigma\) to the median distance between points in the camouflage pool, as proposed in [20]. Eve also included the scaled class label as a feature dimension: \([{{\mathbf x}}_i, c\, \mathbbm {1}\{y_i=1\}]\), where \(c=\max_{k,l\text{ such that } y_k = y_l} \Vert {{\mathbf x}}_{k} - {{\mathbf x}}_{l}\Vert\) and \(\mathbbm {1}\{\cdot\}\) is the indicator function. This augmented feature enables Eve to monitor both features and labels. When using the NLP solver, Alice only has to consider instances from the camouflage pool. She calculated \(\mathbf{MMD}\) in the following manner:

(6)
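As a minimal sketch of this pool-only computation (our own illustration, with numpy/scipy assumed, all function names hypothetical, and the assumption that \(D\) is selected from the camouflage pool \({\mathcal {C}}\) itself, so every kernel value lives in one precomputed Gram matrix):

```python
# Sketch of Eve's kernel setup and Alice's pool-restricted MMD. Our own
# illustration of the idea behind Eq. (6), not its exact published form.
import numpy as np
from scipy.spatial.distance import cdist, pdist

def augment_labels(X, y):
    """Append the scaled label feature c * 1{y_i = 1}, where c is the
    largest within-class pairwise distance, so labels and features are
    on comparable scales."""
    dists = cdist(X, X)
    c = dists[y[:, None] == y[None, :]].max()
    return np.hstack([X, c * (y == 1).astype(float)[:, None]])

def make_rbf_kernel(pool):
    """RBF kernel with sigma set to the median pairwise distance in the
    (augmented) camouflage pool, per the median heuristic of [20]."""
    sigma = np.median(pdist(pool))
    def k(A, B):
        return np.exp(-cdist(A, B, 'sqeuclidean') / (2.0 * sigma ** 2))
    return k

def mmd_pool_subset(K_pool, idx):
    """Empirical MMD(C, D) for D = C[idx], assuming D is a subset of the
    pool C, using only lookups into the precomputed K_pool = k(C, C)."""
    mmd_sq = (K_pool.mean()
              - 2.0 * K_pool[:, idx].mean()
              + K_pool[np.ix_(idx, idx)].mean())
    return np.sqrt(max(mmd_sq, 0.0))
```

Under these assumptions, scoring any candidate \(D\) reduces to table lookups into the precomputed Gram matrix, and a local-search move that swaps a single pool instance in or out of `idx` touches only one row and column of `K_pool`, which keeps each \(\mathbf{MMD}\) update cheap.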

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Sen, A., Alfeld, S., Zhang, X., Vartanian, A., Ma, Y., Zhu, X. (2018). Training Set Camouflage. In: Bushnell, L., Poovendran, R., Başar, T. (eds) Decision and Game Theory for Security. GameSec 2018. Lecture Notes in Computer Science, vol. 11199. Springer, Cham. https://doi.org/10.1007/978-3-030-01554-1_4


  • DOI: https://doi.org/10.1007/978-3-030-01554-1_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01553-4

  • Online ISBN: 978-3-030-01554-1

