A hidden Markov model‐based approach for extracting information from web news

Brandt Tso (Department of Information Management, Management College, NDU, Taipei, Taiwan)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 28 September 2007

Abstract

Purpose

This paper aims to present a method based on hidden Markov models (HMM) for extracting information from web news.

Design/methodology/approach

The samples under study are drawn from PROC “People's Daily Online,” a web‐based news publication with non‐structured archives. The study develops HMM‐based news‐filtering tools to retrieve terms of interest, such as “Geo‐location,” “System,” and “Personas.” The experiments are performed in two stages. In the first stage, each HMM is built to extract a single target term, so that the basic information extraction (IE) capability can be evaluated. In the second stage, the experiment is extended to the more complex problem of multi‐term extraction.
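As an illustration of single‐term extraction with an HMM, the following minimal sketch labels each token of a sentence as either background or target and returns the target tokens. It is not the paper's implementation: the state names (`BG`, `GEO`), the probabilities, the example sentence, and the `unk` smoothing constant are all assumptions chosen for illustration, and decoding uses the standard Viterbi algorithm.

```python
import math

def viterbi(tokens, states, start_p, trans_p, emit_p, unk=1e-6):
    """Most likely state sequence for `tokens` under a discrete HMM.

    `unk` is an assumed floor probability for tokens a state never emitted.
    """
    def log_emit(state, tok):
        return math.log(emit_p[state].get(tok, unk))

    # Log-probabilities avoid numeric underflow on long sentences.
    best = [{s: math.log(start_p[s]) + log_emit(s, tokens[0]) for s in states}]
    back = []
    for tok in tokens[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor state for landing in state s at this token.
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = best[-1][prev] + math.log(trans_p[prev][s]) + log_emit(s, tok)
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def extract_terms(tokens, path, target):
    """Keep only the tokens that were decoded as the target state."""
    return [tok for tok, state in zip(tokens, path) if state == target]

# Hypothetical toy model: one background state and one "Geo-location" state.
states = ("BG", "GEO")
start_p = {"BG": 0.9, "GEO": 0.1}
trans_p = {"BG": {"BG": 0.8, "GEO": 0.2}, "GEO": {"BG": 0.7, "GEO": 0.3}}
emit_p = {
    "BG": {"the": 0.3, "minister": 0.2, "visited": 0.2, "on": 0.2, "monday": 0.1},
    "GEO": {"taipei": 0.5, "beijing": 0.5},
}

tokens = ["the", "minister", "visited", "taipei", "on", "monday"]
path = viterbi(tokens, states, start_p, trans_p, emit_p)
geo_terms = extract_terms(tokens, path, target="GEO")
print(path)       # ['BG', 'BG', 'BG', 'GEO', 'BG', 'BG']
print(geo_terms)  # ['taipei']
```

In a real setting the start, transition, and emission probabilities would be estimated from labelled news text rather than written by hand. A multi‐term extractor could, under the same assumptions, add one state per term type.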

Findings

The results show that, with HMMs as a basis, the accuracy (F‐measure) of the single‐term IE tasks exceeds 70 per cent on average, while the accuracy of multi‐term extraction is no lower than 66 per cent.
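The abstract does not state how the F‐measure was computed. A common choice, assumed here, is the balanced F‐measure (the harmonic mean of precision and recall) over the set of extracted terms. The function name `f_measure` and the example term sets below are hypothetical and are not the paper's data.

```python
def f_measure(extracted, gold, beta=1.0):
    """F-measure of an extracted term set against a gold-standard term set.

    beta=1.0 gives the balanced F1 score (an assumption; the source does
    not specify the weighting).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)  # share of extracted terms that are correct
    recall = true_positives / len(gold)          # share of gold terms that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical example: 2 of 3 extracted terms are correct, 2 of 4 gold terms are found.
extracted = {"taipei", "beijing", "monday"}
gold = {"taipei", "beijing", "shanghai", "hong kong"}
score = f_measure(extracted, gold)
print(round(score, 4))  # 0.5714
```

Here precision is 2/3 and recall is 1/2, so the balanced F‐measure is 4/7, or about 57 per cent.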

Originality/value

The study shows the promise of using HMMs to develop automatic tools for filtering free‐structured data.

Keywords

Citation

Tso, B. (2007), "A hidden Markov model‐based approach for extracting information from web news", International Journal of Web Information Systems, Vol. 3 No. 1/2, pp. 104-115. https://doi.org/10.1108/17440080710829243

Publisher

Emerald Group Publishing Limited
