Abstract
Data science has been empowered by the emerging concept of big data, which enables data scalability in many ways. Effective prediction systems for complex analytical problems dealing with big data can be created using evolutionary computing together with associated feature selection and reduction techniques. In the current work, we put forward a big data analytical scheme to analyze solar energy receptors based on a set of features. Correct estimation of pressure loss coefficients (PLC) greatly improves the design of a solar collector. Evaluating PLC is time- and resource-consuming because the flow rate and Reynolds number change at every junction. Moreover, no suitable algebraic expression has yet been defined for the laminar flow region to approximate the complex relationship among the different geometrical features and flow variables. The overall heat gain of the solar receptor depends on the flow rates and the flow distribution in the risers. In addition, local disturbances during flow division and combination between the manifold and the risers affect the performance of the solar collector. Because of this complexity, PLC values are mostly determined experimentally. The proposed big data framework involves acquiring large feature sets at each point along the flow of the thermal fluid. The data are acquired experimentally as a set of around forty features for a large number of Reynolds number and discharge ratio variations. The Reynolds number varies from 200 to 15,000, while the discharge ratio varies in the range 0–1. Feature reduction in the big data set is done by calculating a relevancy score with the ReliefF algorithm, which extracts the most relevant features. The framework then employs an artificial neural network (ANN) with a suitably selected architecture of layers, neurons and activation functions. The selected topology is trained on the reduced feature sets using the Levenberg–Marquardt backpropagation algorithm. Test and validation results demonstrate the efficacy of the proposed strategy and indicate that future PLC values can be forecast close to the experimental data. The relative error is around 10% with respect to the experimental data set, and the approach is found to be better than computational fluid dynamics based approaches in terms of memory and processing time.
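The abstract does not describe how the relevancy scores are computed. As an illustration only, the following sketch scores features with a simplified RReliefF-style procedure, the regression variant of ReliefF, which suits a continuous target such as PLC. The function name `rrelieff_scores`, the neighbour count `k`, the uniform neighbour weighting, the use of every instance rather than a random sample, and the synthetic data are all assumptions made for this example and are not taken from the authors' implementation.

```python
import random


def rrelieff_scores(X, y, k=5):
    """Simplified RReliefF-style relevance score per feature (continuous target)."""
    n, p = len(X), len(X[0])
    if n < 2:
        raise ValueError("need at least two instances")
    k = min(k, n - 1)

    # Value ranges scale every difference into [0, 1];
    # "or 1.0" guards against constant columns.
    col_range = [(max(r[a] for r in X) - min(r[a] for r in X)) or 1.0
                 for a in range(p)]
    y_range = (max(y) - min(y)) or 1.0

    def diff(i, j, a):
        return abs(X[i][a] - X[j][a]) / col_range[a]

    def distance(i, j):
        # Manhattan distance over the scaled features.
        return sum(diff(i, j, a) for a in range(p))

    n_dc = 0.0             # accumulated difference in the target
    n_da = [0.0] * p       # accumulated difference in each feature
    n_dc_da = [0.0] * p    # accumulated joint difference (target and feature)
    weight = 1.0 / k       # assumption: every neighbour counts equally

    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: distance(i, j))[:k]
        for j in neighbours:
            d_y = abs(y[i] - y[j]) / y_range
            n_dc += d_y * weight
            for a in range(p):
                d_a = diff(i, j, a)
                n_da[a] += d_a * weight
                n_dc_da[a] += d_y * d_a * weight

    if n_dc <= 0.0 or n_dc >= n:
        # Degenerate target: no usable contrast between neighbours.
        return [0.0] * p
    return [n_dc_da[a] / n_dc - (n_da[a] - n_dc_da[a]) / (n - n_dc)
            for a in range(p)]


# Synthetic check: the target depends only on feature 0; feature 1 is noise.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [row[0] for row in X]
scores = rrelieff_scores(X, y)
print(scores)
```

Features with higher scores would be kept and the rest discarded before training the network. In this toy case the informative feature should score above the noise feature; the cutoff used to select features in the actual study is not stated in the abstract.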
Acknowledgements
This study was also supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (NRF-2017R1C1B5017464).
Cite this article
Yousaf, S., Shafi, I., Din, S. et al. A big data analytical framework for analyzing solar energy receptors using evolutionary computing approach. J Ambient Intell Human Comput 10, 4071–4083 (2019). https://doi.org/10.1007/s12652-019-01443-7
DOI: https://doi.org/10.1007/s12652-019-01443-7