A three-step unsupervised neural model for visualizing high complex dimensional spectroscopic data sets

  • Industrial and Commercial Application
Pattern Analysis and Applications

Abstract

The interdisciplinary research presented in this study is based on a novel approach to clustering and to visualizing the internal structure of high-dimensional data sets. Following normalization, a pre-processing step performs dimensionality reduction on the data set using an unsupervised neural architecture known as cooperative maximum likelihood Hebbian learning (CMLHL), which is characterized by its capability to preserve a degree of global ordering in the data. Subsequently, the self-organising map (SOM), a topology-preserving architecture, is applied for two-dimensional visualization of the internal structure of such data sets. This research studies the joint performance of these two neural models and their capability to preserve some global ordering. Their effectiveness is demonstrated through a case study on a real-life, highly complex, high-dimensional spectroscopic data set characterized by its lack of reproducibility. The data under analysis are taken from an X-ray spectroscopic analysis of a rose window in a famous ancient Gothic cathedral in Spain. The main aim of this study is to classify each sample by its date and place of origin, so as to facilitate the restoration of these and other historical stained glass windows. Thus, having ascertained the sample's chemical composition and degree of conservation, this technique contributes to identifying the different areas and periods in which the stained glass panels were produced. The combined method proposed in this study is compared with a classical statistical model that uses principal component analysis (PCA) as a pre-processing step, and with other unsupervised models such as maximum likelihood Hebbian learning (MLHL) and the SOM applied without a pre-processing step. In that last case, the convergence processes were compared to examine the efficacy of the combined CMLHL/SOM model.
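The three-step structure described in the abstract (normalize, reduce dimensionality, map with a SOM) can be illustrated with a minimal sketch. Note the substitution: the reduction step below uses PCA, the classical baseline the abstract names for comparison, and not CMLHL, whose update rules are not given in this text. All function names, the grid size, the learning-rate and neighbourhood schedules, and the synthetic data are assumptions made for illustration only; none of them are taken from the article.

```python
import numpy as np

def zscore(X):
    """Step 1: scale each feature to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant features
    return (X - mu) / sigma

def pca_project(X, n_components):
    """Step 2 (PCA baseline, NOT CMLHL): project onto leading components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def train_som(X, rows, cols, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Step 3: fit a rows x cols SOM using a Gaussian neighbourhood."""
    rng = np.random.default_rng(seed)
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Grid coordinates of every unit, shape (rows * cols, 2).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)],
                    dtype=float)
    # Initialise the unit weights from randomly chosen samples.
    W = X[rng.integers(0, len(X), size=rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # linearly decaying step size
        sigma = sigma0 * (1.0 - frac) + 1e-2    # shrinking neighbourhood
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)   # grid distance to BMU
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    return W, grid

def map_samples(X, W, grid):
    """Return the (row, col) grid position of each sample's BMU."""
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return grid[np.argmin(d, axis=1)].astype(int)

# Synthetic stand-in for spectra: two groups of 20-dimensional samples.
rng = np.random.default_rng(42)
spectra = np.vstack([rng.normal(0.0, 1.0, (30, 20)),
                     rng.normal(10.0, 1.0, (30, 20))])

reduced = pca_project(zscore(spectra), n_components=3)
weights, grid = train_som(reduced, rows=4, cols=4)
positions = map_samples(reduced, weights, grid)
print(positions.shape)
```

Plotting `positions` on the 4 x 4 grid, labelled by sample group, gives the kind of two-dimensional view the abstract describes. Replacing `pca_project` with a different projection method would leave the other two steps unchanged, which is how the comparison between pre-processing choices is framed in the abstract.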

Acknowledgments

This research was supported by projects TIN2010-21272-C02-01 from the Spanish Ministry of Science and Innovation and BU006A08 of the JCyL. The authors would also like to thank the manufacturer of components for vehicle interiors, Grupo Antolin Ingeniería, S.A., which provided support through MAGNO 2008 – 1028 – CENIT, funded by the Spanish Ministry of Science and Innovation.

Author information

Corresponding author

Correspondence to Emilio Corchado.

About this article

Cite this article

Corchado, E., Perez, J.C. A three-step unsupervised neural model for visualizing high complex dimensional spectroscopic data sets. Pattern Anal Applic 14, 207–218 (2011). https://doi.org/10.1007/s10044-010-0187-5

  • DOI: https://doi.org/10.1007/s10044-010-0187-5
