Extraction of relevant components using shallow structure of HTML documents | IEEE Conference Publication | IEEE Xplore