Abstract
In many fields, ensemble models built by supervised learning are effective predictors, but they cannot tell us how to modify an input vector so that the objective variable rises above (or falls below) a given threshold. In this paper, we propose TRANS-AM, a method that discovers an input vector satisfying such a condition on the objective variable in regression problems by exploiting a property of regression trees. A regression tree splits the input space into subspaces, some of which have corresponding objective values that satisfy the condition. By transforming the original input vector into vectors belonging to one of these subspaces, we can discover a new input vector that satisfies the condition while minimizing its distance from the original. Minimality matters because of cost: the farther the new input vector is from the original, the more it costs to modify the original into it. We evaluated the proposed method through numerical simulations and confirmed that it works well; the ratio of discovered input vectors that satisfy the condition to all discovered input vectors is \(60\%\) for datasets generated through a logistic function.
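The core idea in the abstract, for a single regression tree, can be sketched as follows. This is our own minimal illustration, not the authors' TRANS-AM implementation: the function names (`leaf_boxes`, `tweak`) and the small boundary nudge `eps` are ours, and scikit-learn's `DecisionTreeRegressor` stands in for whatever tree learner the paper uses. Each leaf of a fitted tree corresponds to an axis-aligned hyperrectangle of input space with a constant predicted objective value; among the leaves whose value satisfies the condition (here, exceeds a threshold `y_min`), we take the point inside each box closest to the original input and keep the overall nearest one.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def leaf_boxes(tree, n_features):
    """Enumerate (low, high, value) for each leaf of a fitted regression tree.

    (low, high) bound the leaf's hyperrectangle in input space; value is the
    constant prediction of that leaf.
    """
    t = tree.tree_
    boxes = []

    def recurse(node, low, high):
        if t.children_left[node] == -1:  # leaf node
            boxes.append((low, high, t.value[node][0, 0]))
            return
        f, thr = t.feature[node], t.threshold[node]
        hi = high.copy(); hi[f] = min(hi[f], thr)   # left branch: x[f] <= thr
        recurse(t.children_left[node], low.copy(), hi)
        lo = low.copy(); lo[f] = max(lo[f], thr)    # right branch: x[f] > thr
        recurse(t.children_right[node], lo, high.copy())

    recurse(0, np.full(n_features, -np.inf), np.full(n_features, np.inf))
    return boxes

def tweak(x, tree, y_min, eps=1e-4):
    """Return the input closest to x (L2 norm) whose leaf prediction >= y_min."""
    best, best_d = None, np.inf
    for low, high, val in leaf_boxes(tree, len(x)):
        if val < y_min:
            continue
        # Nearest point inside the (open) box: clip x into the box,
        # nudged off the boundary by eps so the point lies strictly inside.
        cand = np.clip(x, low + eps, high - eps)
        d = np.linalg.norm(cand - x)
        if d < best_d:
            best, best_d = cand, d
    return best
```

For an ensemble (as in the paper's setting), this per-tree search is only a building block; candidate vectors from individual trees would still have to be checked against the full ensemble's prediction.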
H. Tanaka—Presently with Research Laboratory, NTT DOCOMO Inc.
Acknowledgement
This research was partially supported by NAIST Big Data Project.
© 2018 Springer Nature Switzerland AG
Tanaka, H., Suzuki, Y., Yoshino, K., Nakamura, S. (2018). TRANS-AM: Discovery Method of Optimal Input Vectors Corresponding to Objective Variables. In: Ordonez, C., Bellatreche, L. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2018. Lecture Notes in Computer Science(), vol 11031. Springer, Cham. https://doi.org/10.1007/978-3-319-98539-8_17
Print ISBN: 978-3-319-98538-1
Online ISBN: 978-3-319-98539-8