Optimal row and column ordering to improve table interpretation using estimation of distribution algorithms

Journal of Heuristics

Abstract

A common information representation task in research, education, and statistical practice is to present data comprehensively and intuitively in two-dimensional tables. Examples include tables in scientific papers, reports, and the popular press.

Data is often simple enough for users to reorder. In many other cases, though, complex data patterns make finding the best rearrangement of rows and columns for optimum readability a hard problem.

We propose that row and column ordering should be regarded as a combinatorial optimization problem and solved using evolutionary computation techniques. The use of genetic algorithms has already been proposed in the literature. This paper proposes for the first time the use of estimation of distribution algorithms for table ordering. We also propose alternative ways of representing the problem in order to reduce its dimensionality. By learning a selective naive Bayes classifier, we can find out how to combine the parameters of these algorithms to get good table orderings. Experimental examples in this paper are on 2D tables.
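To make the idea of table ordering as a combinatorial optimization problem concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a hypothetical objective (`stress`, the sum of absolute differences between neighbouring cells), a random-key encoding in which sorting a real-valued vector yields the row and column permutations, and a simple univariate Gaussian estimation of distribution algorithm (here called `umda_c`). All names and parameter defaults are illustrative assumptions.

```python
import random
import statistics


def stress(table, row_order, col_order):
    """Assumed objective: sum of absolute differences between adjacent cells."""
    t = [[table[r][c] for c in col_order] for r in row_order]
    total = 0.0
    for i in range(len(t)):
        for j in range(len(t[0])):
            if i + 1 < len(t):
                total += abs(t[i][j] - t[i + 1][j])
            if j + 1 < len(t[0]):
                total += abs(t[i][j] - t[i][j + 1])
    return total


def decode(keys, n_rows):
    """Random-key decoding: the argsort of the keys gives the two permutations."""
    row_keys, col_keys = keys[:n_rows], keys[n_rows:]
    rows = sorted(range(len(row_keys)), key=row_keys.__getitem__)
    cols = sorted(range(len(col_keys)), key=col_keys.__getitem__)
    return rows, cols


def umda_c(table, pop_size=60, n_select=20, generations=40, seed=0):
    """Univariate Gaussian EDA over random keys (illustrative sketch only)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    dim = n_rows + n_cols

    def cost(keys):
        return stress(table, *decode(keys, n_rows))

    # Random initial population, plus the original ordering so that the
    # result is never worse than the table as given.
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size - 1)]
    pop.append([i / n_rows for i in range(n_rows)]
               + [j / n_cols for j in range(n_cols)])
    best = min(pop, key=cost)

    for _ in range(generations):
        # Select the fittest individuals and estimate one Gaussian per key.
        selected = sorted(pop, key=cost)[:n_select]
        mus = [statistics.fmean(ind[d] for ind in selected) for d in range(dim)]
        sigmas = [max(statistics.pstdev([ind[d] for ind in selected]), 1e-3)
                  for d in range(dim)]
        # Sample a new population from the estimated distribution.
        pop = [[rng.gauss(mus[d], sigmas[d]) for d in range(dim)]
               for _ in range(pop_size - 1)]
        pop.append(best)  # elitism: keep the best individual found so far
        best = min(pop, key=cost)

    rows, cols = decode(best, n_rows)
    return rows, cols, cost(best)
```

Calling `umda_c(table)` on a list of equal-length numeric rows returns a row permutation, a column permutation, and the stress of the reordered table. The random-key encoding is only one possible representation; it is used here because it lets a continuous model generate valid permutations without a repair step.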

Author information

Corresponding author

Correspondence to E. Bengoetxea.

About this article

Cite this article

Bengoetxea, E., Larrañaga, P., Bielza, C. et al. Optimal row and column ordering to improve table interpretation using estimation of distribution algorithms. J Heuristics 17, 567–588 (2011). https://doi.org/10.1007/s10732-010-9145-z

  • DOI: https://doi.org/10.1007/s10732-010-9145-z
