Abstract
The prevailing trend towards large models that demand extensive computational resources threatens to marginalize smaller research labs, constraining innovation and diversity in the field. This position paper advocates that small institutions pivot strategically towards computationally economical research directions, specifically a modular approach inspired by neurobiological mechanisms. We argue for a balanced approach that draws inspiration from the brain’s energy-efficient processing and specialized structures, yet is liberated from the evolutionary constraints of biological growth. By focusing on modular architectures that mimic the brain’s specialization and adaptability, we can strive to keep energy consumption within reasonable bounds. Recent research into forward-only training algorithms has opened up concrete avenues for integrating such modules into existing networks. This approach not only aligns with the imperative to make AI research more sustainable and inclusive, but also leverages the brain’s proven strategies for efficient computation. We posit that there exists a middle ground between the brain and datacenter-scale models, one that eschews the need for excessive computational power and fosters an environment where innovation is driven by ingenuity rather than computational capacity.
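To make the appeal to forward-only training concrete, the sketch below illustrates a layer-local, forward-forward-style update in the spirit of Hinton’s forward-forward algorithm: each layer maximises a “goodness” score on positive samples and minimises it on negative samples, so a new module can be trained without backpropagating through the rest of the network. This is a minimal illustration only; the layer sizes, threshold, and training loop are assumptions for the sketch, not the authors’ implementation.

```python
# Minimal forward-forward-style sketch (illustrative; not the paper's implementation).
# Each layer is trained with a local "goodness" objective, so no end-to-end
# backward pass through the whole network is required.
import torch
import torch.nn.functional as F

class FFLayer(torch.nn.Module):
    def __init__(self, d_in, d_out, threshold=2.0, lr=1e-3):
        super().__init__()
        self.linear = torch.nn.Linear(d_in, d_out)
        self.threshold = threshold  # goodness threshold (assumed value)
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalise the input so only the direction of the activity is passed on.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = mean squared activation; push it above the threshold for
        # positive data and below it for negative data.
        g_pos = self.forward(x_pos).pow(2).mean(dim=1)
        g_neg = self.forward(x_neg).pow(2).mean(dim=1)
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()  # gradients stay local to this layer's parameters
        self.opt.step()
        # Detach so the next layer trains on this layer's output without
        # gradients flowing backwards between modules.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()

# Module-by-module training loop on placeholder data.
layers = [FFLayer(784, 256), FFLayer(256, 256)]
x_pos, x_neg = torch.rand(32, 784), torch.rand(32, 784)  # placeholder positive/negative batches
for _ in range(10):
    h_pos, h_neg = x_pos, x_neg
    for layer in layers:
        h_pos, h_neg = layer.train_step(h_pos, h_neg)
```

Because each layer optimises only its own parameters, such a module can in principle be bolted onto an existing frozen network and trained in isolation, which is the kind of computationally economical extension the abstract alludes to.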
Acknowledgments
This research was supported by Flanders Make, the strategic research centre for the manufacturing industry, in the NORM.AI project, and by the Research Foundation - Flanders (FWO) (1SHDZ24N).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Put, J., Michiels, N., Vanherle, B., Zoomers, B. (2024). Brains Over Brawn: Small AI Labs in the Age of Datacenter-Scale Compute. In: Fred, A., Hadjali, A., Gusikhin, O., Sansone, C. (eds) Deep Learning Theory and Applications. DeLTA 2024. Communications in Computer and Information Science, vol 2172. Springer, Cham. https://doi.org/10.1007/978-3-031-66705-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-66704-6
Online ISBN: 978-3-031-66705-3
eBook Packages: Computer Science, Computer Science (R0)