Efficient Clustering for Orders

Part of the book series: Studies in Computational Intelligence (SCI, volume 165)

Abstract

Lists of ordered objects are widely used as representational forms; examples include Web search results and best-seller lists. Clustering is a useful data analysis technique for grouping mutually similar objects. To cluster orders, hierarchical clustering methods have been used together with dissimilarities defined between pairs of orders. However, hierarchical clustering methods cannot be applied to large-scale data because of their computational cost with respect to the number of orders. To avoid this problem, we developed a k-o’means algorithm. This algorithm successfully extracted grouping structures in orders and was computationally efficient with respect to the number of orders. However, it was still inefficient when the number of possible objects is very large. We therefore propose a new method (k-o’means-EBC), grounded in a theory of order statistics. We further propose several techniques for analyzing the acquired clusters of orders.
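
As an illustration of a "dissimilarity defined between pairs of orders", the following minimal Python sketch counts the object pairs that two orders rank in opposite directions (the Kendall distance). It is an assumption made for illustration only: the abstract does not state which dissimilarity the chapter uses, and this is not the k-o’means or k-o’means-EBC algorithm. The function names `kendall_distance` and `pairwise_dissimilarities` are hypothetical, and the sketch assumes both orders rank exactly the same set of objects.

```python
from itertools import combinations


def kendall_distance(order_a, order_b):
    """Count object pairs that the two orders rank in opposite directions.

    Illustrative assumption: both orders are complete rankings of the
    same set of distinct objects, listed from most to least preferred.
    """
    if len(set(order_a)) != len(order_a) or len(set(order_b)) != len(order_b):
        raise ValueError("each order must list distinct objects")
    if set(order_a) != set(order_b):
        raise ValueError("both orders must rank the same objects")
    # Position of every object in the second order.
    rank_b = {obj: pos for pos, obj in enumerate(order_b)}
    # combinations() yields (x, y) with x before y in order_a, so the pair
    # is discordant exactly when x comes after y in order_b.
    return sum(1 for x, y in combinations(order_a, 2) if rank_b[x] > rank_b[y])


def pairwise_dissimilarities(orders):
    """Build the full symmetric dissimilarity matrix for a list of orders.

    This needs N * (N - 1) / 2 comparisons for N orders, which is the kind
    of input a hierarchical clustering method typically starts from.
    """
    n = len(orders)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = kendall_distance(orders[i], orders[j])
        matrix[i][j] = d
        matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    orders = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
    for row in pairwise_dissimilarities(orders):
        print(row)
```

Because every pair of orders must be compared, the matrix alone grows quadratically with the number of orders, which is consistent with the abstract's remark that hierarchical methods become costly on large-scale data.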

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kamishima, T., Akaho, S. (2009). Efficient Clustering for Orders. In: Zighed, D.A., Tsumoto, S., Ras, Z.W., Hacid, H. (eds) Mining Complex Data. Studies in Computational Intelligence, vol 165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88067-7_15

  • DOI: https://doi.org/10.1007/978-3-540-88067-7_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88066-0

  • Online ISBN: 978-3-540-88067-7

  • eBook Packages: Engineering, Engineering (R0)
