
A Novel Bagging Ensemble Approach for Variable Ranking and Selection for Linear Regression Models

  • Conference paper
Multiple Classifier Systems (MCS 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9132)


Abstract

This paper develops a novel bagging ensemble method for variable selection in linear regression models, based on a ranked list of variables. Specifically, each variable is assigned a mixed importance measure that reflects both the order in which a stepwise search algorithm selects it into the final model and the improvement resulting from its inclusion. Because small perturbations in the training data may change the order in which variables enter the final model, this process is repeated multiple times, each time on a bootstrap sample. Finally, the importance measure of each variable is averaged across the bootstrap trials. Experiments on simulated data show that the novel method compares favorably with several other variable selection techniques.
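The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the mixed importance measure or the stopping rule of the stepwise search, so the sketch assumes plain forward selection run for a fixed number of steps, and a hypothetical measure equal to an entry-order weight multiplied by the relative drop in residual sum of squares. The function names `rss`, `forward_path`, and `bagged_importance` are illustrative only.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def forward_path(X, y, max_steps):
    """Greedy forward selection.

    Returns [(variable index, relative RSS drop), ...] in order of entry.
    Assumption: forward search with no stopping rule stands in for the
    stepwise search algorithm mentioned in the abstract.
    """
    selected, path = [], []
    total = rss(X, y, [])      # RSS of the intercept-only model
    current = total
    for _ in range(max_steps):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = {j: rss(X, y, selected + [j]) for j in candidates}
        best = min(scores, key=scores.get)
        path.append((best, (current - scores[best]) / total))
        selected.append(best)
        current = scores[best]
    return path


def bagged_importance(X, y, n_boot=50, max_steps=None, seed=0):
    """Average a mixed importance measure over bootstrap samples."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    max_steps = p if max_steps is None else min(max_steps, p)
    importance = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # bootstrap sample, drawn with replacement
        for step, (j, gain) in enumerate(forward_path(X[idx], y[idx], max_steps)):
            # Hypothetical mixed measure: earlier entry gets a larger weight,
            # scaled by the relative improvement the variable brings.
            importance[j] += (max_steps - step) / max_steps * gain
    return importance / n_boot
```

Sorting the returned vector in decreasing order gives a ranked list of variables; how to cut that list into a selected subset is a separate choice that the sketch leaves open.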



Acknowledgements

This research was supported by the National Basic Research Program of China (973 Program, No. 2013CB329406), the National Natural Science Foundation of China (Nos. 11201367 and 91230101), and the Science Plan Foundation of the Education Bureau of Shaanxi Province of China (No. 14JK1672).


Corresponding author

Correspondence to Chun-Xia Zhang.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, CX., Zhang, JS., Wang, GW. (2015). A Novel Bagging Ensemble Approach for Variable Ranking and Selection for Linear Regression Models. In: Schwenker, F., Roli, F., Kittler, J. (eds) Multiple Classifier Systems. MCS 2015. Lecture Notes in Computer Science, vol 9132. Springer, Cham. https://doi.org/10.1007/978-3-319-20248-8_1


  • DOI: https://doi.org/10.1007/978-3-319-20248-8_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20247-1

  • Online ISBN: 978-3-319-20248-8

  • eBook Packages: Computer Science, Computer Science (R0)
