Abstract
The 2021 Smoky Mountains Computational Sciences and Engineering Conference enlists scientists from across Oak Ridge National Laboratory (ORNL) and industry to be data sponsors and help create data analytics and edge computing challenges for eminent datasets in a variety of scientific domains. This work describes the significance of each of the eight datasets and their associated challenge questions. The challenge questions for each dataset were required to cover multiple difficulty levels. An international call for participation was sent to students, asking them to form teams of up to six people and apply novel data analytics and edge computing methods to solve these challenges.
This manuscript has been co-authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Acknowledgment
Dataset generation for Challenge 1 was supported by the Center for Nanophase Materials Sciences, which is a DOE Office of Science User Facility. Through the ASCR Leadership Computing Challenge (ALCC) program, this research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Dataset generation for Challenge 2 was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Robinson Pino, program manager, under contract number DE-AC05-00OR22725. Dataset generation for Challenge 3 used resources from General Motors.
Dataset generation for Challenge 4 used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Dataset generation for Challenge 5 was completed by researchers at Oak Ridge National Laboratory sponsored by the DOE Office of Science as a part of the research in Multi-Sector Dynamics within the Earth and Environmental System Modeling Program as part of the Integrated Multiscale Multisector Modeling (IM3) Scientific Focus Area led by Pacific Northwest National Laboratory. The dataset for Challenge 7 was acquired at the Spallation Neutron Source, which is sponsored by the User Facilities Division of the Department of Energy. The research for generating datasets for Challenges 6 and 8 was conducted at and partially supported by the Center for Nanophase Materials Sciences, a US DOE Office of Science User Facility.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Devineni, P. et al. (2022). Smoky Mountain Data Challenge 2021: An Open Call to Solve Scientific Data Challenges Using Advanced Data Analytics and Edge Computing. In: Nichols, J., et al. Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation. SMC 2021. Communications in Computer and Information Science, vol 1512. Springer, Cham. https://doi.org/10.1007/978-3-030-96498-6_21
DOI: https://doi.org/10.1007/978-3-030-96498-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96497-9
Online ISBN: 978-3-030-96498-6
eBook Packages: Computer Science, Computer Science (R0)