Facilitating wrapper generation with page analysis | IEEE Conference Publication | IEEE Xplore

Abstract:

Current approaches to generating wrappers for web page extraction require a large number of labeled training pages to achieve satisfactory results. Fully automatic methods, on the other hand, extract data of unreliable quality. In this paper, we propose a novel method that facilitates wrapper generation by combining wrapper induction with page analysis. In addition to manually labeled data, we take advantage of a set of unlabeled pages to improve the quality of the induced wrappers. Our experiments demonstrate that the system achieves satisfactory results with fewer manually labeled training pages.
Date of Conference: 08-11 June 2009
Date Added to IEEE Xplore: 26 June 2009
Conference Location: Dallas, TX
