ABSTRACT
We describe a method for extracting style and branding elements from multiple web pages of a given site for content repurposing. Style and branding elements convey the values of the site owner and help connect with target prospects. They are manifested through logos, graphical elements, background colors, font styles, font colors, and other illustrations. Our method automatically extracts color and image elements that appear frequently and prominently on multiple pages throughout the site. We rely on a DOM tree matching method to obtain the frequency of recurring elements, and we use the relative sizes and positions of elements to determine their type. The approximate locations of these elements also give the content repurposing engine a clue as to where to place them in the repurposed document. Our results show that the proposed method extracts style and branding elements efficiently and with high accuracy.
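To make the frequency idea concrete, the following is a minimal sketch, not the authors' DOM tree matching method. It assumes that an image counts as "recurring" when the same `src` appears at the same tag path on most pages of the site. The names `ImagePathCollector`, `recurring_images`, and `min_fraction` are hypothetical, and the size and position cues mentioned in the abstract are omitted because they would need rendered layout information.

```python
from collections import Counter
from html.parser import HTMLParser


class ImagePathCollector(HTMLParser):
    """Collects (tag path, src) pairs for every <img> on one page."""

    # Elements that never have a closing tag and so must not be pushed.
    VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open ancestor tags
        self.found = set()  # (path, src) pairs seen on this page

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.found.add(("/".join(self.stack + ["img"]), src))
        if tag not in self.VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def recurring_images(pages, min_fraction=0.8):
    """Return (path, src) pairs present on at least min_fraction of pages.

    Assumption: exact path-and-src equality stands in for tree matching.
    """
    counts = Counter()
    for html in pages:
        collector = ImagePathCollector()
        collector.feed(html)
        counts.update(collector.found)  # each pair counted once per page
    threshold = min_fraction * len(pages)
    return sorted(key for key, n in counts.items() if n >= threshold)
```

Given a list of HTML strings for pages of one site, `recurring_images(pages)` returns the image elements that recur at the same structural location, which is a rough proxy for logo and branding candidates. A fuller treatment would compare whole DOM trees rather than exact paths, so that small template variations between pages do not hide a recurring element.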
Index Terms
Style and branding elements extraction from business web sites