Scientific Formula Retrieval via Tree Embeddings | IEEE Conference Publication | IEEE Xplore

Scientific Formula Retrieval via Tree Embeddings


Abstract:

Exploiting the ever-growing corpus of scientific content calls for new ways to effectively organize, search, and retrieve scientific formulae. We propose a new data-driven framework for retrieving similar scientific formulae via learned formula representations based on tree embeddings. FORTE (for FOrmula Representation learning via Tree Embeddings) leverages operator tree representations of symbolic scientific formulae (such as math equations) to explicitly capture their inherent structural and semantic properties. FORTE employs (i) a tree encoder that encodes a formula’s operator tree into an embedding vector and (ii) a tree decoder that directly generates a formula’s operator tree from the embedding vector. We also develop a novel tree beam search algorithm that improves the quality of the decoded operator trees. We demonstrate that FORTE outperforms various baseline methods, sometimes significantly, on formula reconstruction and retrieval using a real-world dataset comprising 770k scientific formulae collected online.
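
To make the idea of an operator tree and embedding-based retrieval concrete, here is a minimal sketch in Python. It is not FORTE's method: the learned tree encoder is replaced by a hand-built count vector over node labels and parent-child edges, and the names `Node`, `features`, `cosine`, and `retrieve` are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One operator-tree node: an operator or a leaf symbol (illustrative)."""
    label: str
    children: list = field(default_factory=list)


def features(node, counts=None):
    """Count node labels and parent>child edges; a stand-in for a learned encoder."""
    if counts is None:
        counts = Counter()
    counts[node.label] += 1
    for child in node.children:
        counts[f"{node.label}>{child.label}"] += 1
        features(child, counts)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=1):
    """Return the names of the k corpus formulae most similar to the query."""
    q = features(query)
    ranked = sorted(corpus, key=lambda name: cosine(q, features(corpus[name])),
                    reverse=True)
    return ranked[:k]


def square(symbol):
    """Operator tree for symbol^2: a '^' node with the symbol and 2 as children."""
    return Node("^", [Node(symbol), Node("2")])


# Toy corpus of two formulae, each stored as an operator tree.
corpus = {
    "a^2 + b^2": Node("+", [square("a"), square("b")]),
    "a * b": Node("*", [Node("a"), Node("b")]),
}

# Query: x^2 + y^2, which shares structure (but no variable names) with a^2 + b^2.
query = Node("+", [square("x"), square("y")])
best = retrieve(query, corpus, k=1)
```

In this toy setup the query is matched to the structurally similar sum of squares even though the variable names differ, which is the kind of structural signal an operator tree exposes. A learned encoder, as the abstract describes, would replace `features` with a trained embedding.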
Date of Conference: 15-18 December 2021
Date Added to IEEE Xplore: 13 January 2022
Conference Location: Orlando, FL, USA
