ABSTRACT
We report on a four-year academic research project to build a natural language processing platform in support of a large media company. The Computable News platform processes news stories, producing a layer of structured data that can be used to build rich applications. We describe the underlying platform and the research tasks that we explored while building it. The platform supports a wide range of prototype applications designed for different newsroom functions. We hope that this qualitative review provides some insight into the challenges involved in this type of project.
The Computable News project: Research in the Newsroom