ABSTRACT
Many research implementations of search engines are written in C, C++, or Java. They are difficult to understand and modify because they run to at least a few thousand lines of code and contain many low-level details. In this paper, we show how to achieve a much shorter and higher-level implementation: one of only a few hundred lines. We accomplish this through the use of a high-level functional programming language, F#, and some of its features, such as sequences, pipes, and structured input and output. Using a search engine implementation as a case study, we argue that functional programming fits the domain of Information Retrieval problems much better than imperative and object-oriented languages such as C++ and Java.
Functional programming languages are ideal for rapid algorithm prototyping and data exploration in the field of Information Retrieval (IR).
Additionally, our implementation can be used as a case study in an IR course, since it is a very high-level yet executable specification of a search engine.