Abstract
One year ago, in the SIGIR Forum issue of December 2018, I ranted about the "neural hype" [9]. One year later, I write again to publicly recant my heretical beliefs. What a difference a year makes! In accelerated "deep learning" time, a year seems like an eternity, and so much exciting progress has been made in the intervening months!
References
[1] A. Arampatzis, T. Tsoris, C. H. A. Koster, and T. P. van der Weide. Phrase-based information retrieval. Information Processing and Management, 34(6):693--707, December 1998.
[2] T. G. Armstrong, A. Moffat, W. Webber, and J. Zobel. Improvements that don't add up: Ad-hoc retrieval results since 1998. In Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM 2009), pages 601--610, Hong Kong, China, 2009.
[3] P. Bajaj, D. Campos, N. Craswell, L. Deng, J. Gao, X. Liu, R. Majumder, A. McNamara, B. Mitra, T. Nguyen, M. Rosenberg, X. Song, A. Stoica, S. Tiwary, and T. Wang. MS MARCO: A human generated MAchine Reading COmprehension dataset. arXiv:1611.09268v3, 2018.
[4] M. F. Dacrema, P. Cremonesi, and D. Jannach. Are we really making much progress? A worrying analysis of recent neural recommendation approaches. In Proceedings of the 13th ACM Conference on Recommender Systems (RecSys '19), pages 101--109, Copenhagen, Denmark, 2019.
[5] Z. Dai and J. Callan. Deeper text understanding for IR with contextual neural language modeling. In Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 985--988, Paris, France, 2019.
[6] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171--4186, Minneapolis, Minnesota, June 2019.
[7] J. L. Fagan. Experiments in automatic phrase indexing for document retrieval: A comparison of syntactic and non-syntactic methods. Technical Report TR87-868, Cornell University, Department of Computer Science, September 1987.
[8] S. Hofstätter and A. Hanbury. Let's measure run time! Extending the IR replicability infrastructure to include performance aspects. In Proceedings of the Open-Source IR Replicability Challenge (OSIRRC 2019): CEUR Workshop Proceedings Vol-2409, pages 12--16, Paris, France, 2019.
[9] J. Lin. The neural hype and comparisons against weak baselines. SIGIR Forum, 52(2):40--51, 2018.
[10] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach. arXiv:1907.11692, 2019.
[11] S. MacAvaney, A. Yates, A. Cohan, and N. Goharian. CEDR: Contextualized embeddings for document ranking. In Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1101--1104, Paris, France, 2019.
[12] R. Nogueira and K. Cho. Passage re-ranking with BERT. arXiv:1901.04085, 2019.
[13] R. Nogueira, W. Yang, J. Lin, and K. Cho. Document expansion by query prediction. arXiv:1904.08375, 2019.
[14] H. Padigela, H. Zamani, and W. B. Croft. Investigating the successes and failures of BERT for passage re-ranking. arXiv:1905.01758, 2019.
[15] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227--2237, New Orleans, Louisiana, June 2018.
[16] Y. Qiao, C. Xiong, Z. Liu, and Z. Liu. Understanding the behaviors of BERT in ranking. arXiv:1904.07531, 2019.
[17] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever. Improving language understanding by generative pre-training, 2018.
[18] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv:1910.10683, 2019.
[19] M. Sanderson. Word-sense disambiguation and information retrieval. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1994), pages 142--151, Dublin, Ireland, 1994.
[20] A. F. Smeaton, R. O'Donnell, and F. Kelledy. Indexing structures derived from syntax in TREC-3: System description. In Proceedings of the Third Text REtrieval Conference (TREC-3), Gaithersburg, Maryland, 1994.
[21] E. M. Voorhees. Using WordNet to disambiguate word senses for text retrieval. In Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1993), pages 171--180, Pittsburgh, Pennsylvania, 1993.
[22] W. Yang, K. Lu, P. Yang, and J. Lin. Critically examining the "neural hype": Weak baselines and the additivity of effectiveness gains from neural ranking models. In Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019), pages 1129--1132, Paris, France, 2019.
[23] W. Yang, H. Zhang, and J. Lin. Simple applications of BERT for ad hoc document retrieval. arXiv:1903.10972, 2019.
[24] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le. XLNet: Generalized autoregressive pretraining for language understanding. arXiv:1906.08237, 2019.
[25] Z. A. Yilmaz, W. Yang, H. Zhang, and J. Lin. Cross-domain modeling of sentence-level evidence for document retrieval. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3481--3487, Hong Kong, China, 2019.