DOI: 10.1145/3477495.3531697
SIGIR '22 Conference Proceedings · Short paper

SpaceQA: Answering Questions about the Design of Space Missions and Space Craft Concepts

Published: 07 July 2022

Abstract

We present SpaceQA, to the best of our knowledge the first open-domain QA system for Space mission design. SpaceQA is part of an initiative by the European Space Agency (ESA) to facilitate access to, sharing, and reuse of information about Space mission design within the agency and with the public. We adopt a state-of-the-art architecture consisting of a dense retriever and a neural reader, and opt for an approach based on transfer learning rather than fine-tuning due to the lack of domain-specific annotated data. Our evaluation on a test set produced by ESA is largely consistent with the results originally reported for the evaluated retrievers and confirms the need for fine-tuning in reading comprehension. As of this writing, ESA is piloting SpaceQA internally.
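The retrieve-then-read architecture the abstract describes can be sketched in miniature. The following is an illustrative toy, not the SpaceQA implementation: the hand-made 3-d vectors stand in for real encoder outputs (e.g. from a Sentence-BERT model), and the brute-force cosine ranking stands in for an approximate-nearest-neighbour index such as FAISS.

```python
import numpy as np

# Toy illustration of the retrieve-then-read pattern: a dense retriever
# ranks passages by embedding similarity with the question, then a reader
# model would extract an answer span from the top-ranked passage(s).
# The vectors below are hypothetical stand-ins for real encoder outputs.

passages = [
    "The spacecraft thermal subsystem uses radiators to reject heat.",
    "Concurrent engineering sessions run at ESA's Concurrent Design Facility.",
    "Dense retrievers encode questions and passages into the same vector space.",
]

# Pretend passage embeddings (one row per passage), computed offline.
passage_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.2],
    [0.0, 0.2, 0.9],
])

def retrieve(question_vec, k=1):
    """Return indices of the k passages most similar to the question
    under cosine similarity (normalised dot product)."""
    q = question_vec / np.linalg.norm(question_vec)
    p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    scores = p @ q
    return np.argsort(-scores)[:k]

# A question embedding close to the thermal-design passage.
question_vec = np.array([0.85, 0.15, 0.05])
top = retrieve(question_vec, k=1)[0]
print(passages[top])  # the thermal-subsystem passage ranks first
```

In a full system the top-k passages would then be fed, together with the question, to a machine-reading-comprehension model that returns an answer span and a confidence score.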


Cited By

  • (2025) "Unveiling the power of language models in chemical research question answering". Communications Chemistry 8:1. https://doi.org/10.1038/s42004-024-01394-x. Online publication date: 5-Jan-2025.

      Published In

      SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
      July 2022
      3569 pages
      ISBN:9781450387323
      DOI:10.1145/3477495

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. dense retrievers
      2. language models
      3. neural networks
      4. open-domain question answering
      5. reading comprehension
      6. space mission design

      Qualifiers

      • Short-paper

      Conference

      SIGIR '22

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%
