Grammars have exceptions☆
☆ Recommended by Gottfried Vossen.