S. Abiteboul, Issues in Monitoring Web Data, Proc. DEXA, 2002.
DOI : 10.1007/3-540-46146-9_1

E. Adar, J. Teevan, S. T. Dumais, and J. L. Elsas, The web changes everything, Proceedings of the Second ACM International Conference on Web Search and Data Mining, WSDM '09, 2009.
DOI : 10.1145/1498759.1498837

M. Alvarez, A. Pan, J. Raposo, F. Bellas, and F. Cacheda, Extracting lists of data records from semi-structured web pages, Data & Knowledge Engineering, vol.64, issue.2, 2008.
DOI : 10.1016/j.datak.2007.10.002

Y. J. An, J. Geller, Y.-T. Wu, and S. A. Chun, Semantic deep Web: automatic attribute extraction from the deep Web data sources, Proc. SAC, 2007.

Y. J. An, S. A. Chun, K.-C. Huang, and J. Geller, Enriching ontology for deep Web search, Proc. DEXA, 2008.

A. Arasu and H. Garcia-Molina, Extracting structured data from Web pages, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SIGMOD '03, 2003.
DOI : 10.1145/872757.872799

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

H. Artail and K. Fawaz, A fast HTML web page change detection approach based on hashing and reducing the number of similarity computations, Data & Knowledge Engineering, vol.66, issue.2, 2008.
DOI : 10.1016/j.datak.2008.04.003

N. Augsten, M. Böhlen, and J. Gamper, Approximate matching of hierarchical data using pq-grams, Proc. VLDB, 2005.

R. Baeza-Yates, C. Castillo, and F. Saint-Jean, Web dynamics, structure, and page quality, 2004.
DOI : 10.1007/978-3-662-10874-1_5

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

R. Balakrishnan and S. Kambhampati, SourceRank: Relevance and trust assessment for deep Web sources based on inter-source agreement, Proc. WWW, 2011.

Z. Bar-Yossef and S. Rajagopalan, Template detection via data mining and its applications, Proceedings of the eleventh international conference on World Wide Web, WWW '02, 2002.
DOI : 10.1145/511446.511522

L. Barbosa and J. Freire, Siphoning hidden-Web data through keyword-based interfaces, J. Information and Data Management, vol.1, issue.1, 2004.

D. T. Barnard, G. Clarke, and N. Duncan, Tree-to-tree correction for document trees, 1995.

E. Beisswanger, Exploiting Relation Extraction for Ontology Alignment, Proc. ISWC, 2010.
DOI : 10.1007/11574620_52

D. P. Bertsekas and D. A. Castañón, Parallel asynchronous Hungarian methods for the assignment problem, INFORMS J. Computing, vol.5, issue.3, 1993.

C. Bizer, J. Lehmann, G. Kobilarov, S. Auer, C. Becker et al., DBpedia - A crystallization point for the Web of Data, Web Semantics: Science, Services and Agents on the World Wide Web, vol.7, issue.3, 2009.
DOI : 10.1016/j.websem.2009.07.002

P. Bohunsky and W. Gatterbauer, Visual structure-based web page clustering and retrieval, Proceedings of the 19th international conference on World wide web, WWW '10, 2010.
DOI : 10.1145/1772690.1772807

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

A. Z. Broder, On the resemblance and containment of documents, Proc. Compression and Complexity of SEQUENCES 1997, 1997.
DOI : 10.1109/SEQUEN.1997.666900

A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig, Syntactic clustering of the Web, Computer Networks and ISDN Systems, vol.29, issue.8-13, 1997.
DOI : 10.1016/S0169-7552(97)00031-7

E. Bruno, N. Faessel, H. Glotin, J. Le Maitre, and M. Scholl, Indexing and querying segmented web pages: the block Web model, World Wide Web, vol.14, issue.5-6, 2011.

D. Buttler, A short survey of document structure similarity algorithms, Proc. International Conference on Internet Computing, 2004.

D. Buttler, L. Liu, and C. Pu, A fully automated object extraction system for the World Wide Web, Proceedings 21st International Conference on Distributed Computing Systems, 2001.
DOI : 10.1109/ICDSC.2001.918966

D. Cai, S. Yu, J. Wen, and W. Ma, VIPS: a vision-based page segmentation algorithm, 2003.

E. T. Cardoso, I. V. Jabour, E. S. Laber, R. Rodrigues, and P. Cardoso, An efficient language-independent method to extract content from news Webpages, Proc. DocEng, 2011.

J. Caverlee, L. Liu, and D. Buttler, Probe, cluster, and discover: focused extraction of QA-Pagelets from the deep Web, Proceedings. 20th International Conference on Data Engineering, 2004.
DOI : 10.1109/ICDE.2004.1319988

D. Chakrabarti and R. R. Mehta, The paths more taken, Proceedings of the 19th international conference on World wide web, WWW '10, 2010.
DOI : 10.1145/1772690.1772713

C. Chang, M. Kayed, M. R. Girgis, and K. F. Shaalan, A survey of Web information extraction systems, IEEE Trans. on Knowl. and Data Eng., vol.18, issue.10, 2006.

S. Chawathe and H. Garcia-Molina, Meaningful change detection in structured data, Proc. SIGMOD, 1997.

S. S. Chawathe, A. Rajaraman, H. Garcia-Molina, and J. Widom, Change detection in hierarchically structured information, Proc. SIGMOD, 1996.

M. Chen, X. Liu, and J. Qin, Semantic relation extraction from socially-generated tags: a methodology for metadata generation, Proc. DC, 2008.

J. Cho and H. Garcia-Molina, The evolution of the Web and implications for an incremental crawler, Proc. VLDB, 2000.

J. Cho and H. Garcia-Molina, Estimating frequency of change, ACM Transactions on Internet Technology, vol.3, issue.3, 2003.
DOI : 10.1145/857166.857170

P. Cimiano, G. Ladwig, and S. Staab, Gimme' the context, Proceedings of the 14th international conference on World Wide Web, WWW '05, 2005.
DOI : 10.1145/1060745.1060796

L. R. Clausen, Concerning Etags and datestamps, Proc. IWAW, 2004.

G. Cobéna and T. Abdessalem, A Comparative Study of XML Change Detection Algorithms, Service and Business Computing Solutions with XML. IGI Global, 2009.
DOI : 10.4018/978-1-60566-330-2.ch002

G. Cobéna, S. Abiteboul, and A. Marian, Detecting changes in XML documents, Proceedings 18th International Conference on Data Engineering, 2002.
DOI : 10.1109/ICDE.2002.994696

V. Crescenzi, G. Mecca, and P. Merialdo, RoadRunner: Towards automatic data extraction from large Web sites, Proc. VLDB, 2001.

V. Crescenzi, P. Merialdo, and P. Missier, Clustering Web pages based on their structure, Data and Knowledge Engineering, vol.54, issue.3, 2005.

H. Davulcu, S. Vadrevu, and S. Nagarajan, OntoMiner, Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, WWW Alt. '04, 2004.
DOI : 10.1145/1013367.1013545

D. de Castro Reis, P. B. Golgher, A. S. da Silva, and A. H. F. Laender, Automatic Web news extraction using tree edit distance, Proc. WWW, 2004.

A. Dimulescu and J. Dessalles, Understanding narrative interest: Some evidence on the role of unexpectedness, Proc. CogSci, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00479573

Y. Ding and Q. Li, Building the schema of a Web entity dynamically, Journal of Computational Information Systems, 2011.

F. Douglis, A. Feldmann, B. Krishnamurthy, and J. Mogul, Rate of change and other metrics: a live study of the World Wide Web, Proc. USITS, 1997.

F. Douglis, T. Ball, Y.-F. Chen, and E. Koutsofios, The AT&T Internet difference engine: Tracking and viewing changes on the Web, World Wide Web, vol.1, issue.1, 1998.

W. Fang, Z. Cui, and P. Zhao, Ontology-Based Focused Crawling of Deep Web Sources, Proc. KSEM, 2007.
DOI : 10.1007/978-3-540-76719-0_51

D. Fetterly, M. Manasse, M. Najork, and J. Wiener, A large-scale study of the evolution of Web pages, Proc. WWW, 2003.

E. Finkelstein, Syndicating Web Sites with RSS Feeds for Dummies, 2005.

S. Flesca and E. Masciari, Efficient and effective Web page change detection, Data and Knowledge Engineering, vol.46, issue.2, 2007.
DOI : 10.1016/s0169-023x(02)00210-0

T. Furche, G. Gottlob, G. Grasso, X. Guo, G. Orsi et al., Real understanding of real estate forms, Proceedings of the International Conference on Web Intelligence, Mining and Semantics, WIMS '11, 2011.
DOI : 10.1145/1988688.1988704

T. Furche, G. Gottlob, X. Guo, C. Schallhart, A. Sellers et al., How the Minotaur Turned into Ariadne: Ontologies in Web Data Extraction, Proc. ICWE, 2011.
DOI : 10.1007/978-3-642-22233-7_2

T. Furche, G. Gottlob, X. Guo, G. Orsi, and C. Schallhart, Forms form patterns: reusable form understanding, Proc. WWW, 2012.

T. Furche, G. Grasso, G. Orsi, C. Schallhart, and C. Wang, Automatically learning gazetteers from the deep web, Proceedings of the 21st international conference companion on World Wide Web, WWW '12 Companion, 2012.
DOI : 10.1145/2187980.2188044

D. Gibson, K. Punera, and A. Tomkins, The volume and evolution of web page templates, Special interest tracks and posters of the 14th international conference on World Wide Web, WWW '05, 2005.
DOI : 10.1145/1062745.1062763

S. Grumbach and G. Mecca, In Search of the Lost Schema, Proc. ICDT, 1999.
DOI : 10.1007/3-540-49257-7_20

L. Guo, F. Shao, C. Botev, and J. Shanmugasundaram, XRANK, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SIGMOD '03, 2003.
DOI : 10.1145/872757.872762

B. He, K. C.-C. Chang, and J. Han, Discovering complex matchings across Web query interfaces: A correlation mining approach, Proc. KDD, 2004.

Z. He, J. Hong, and D. Bell, Schema Matching across Query Interfaces on the Deep Web, Proc. BNCOD, 2008.
DOI : 10.1007/978-3-540-70504-8_6

Y. Hedley, M. Younas, A. James, and M. Sanderson, Sampling, information extraction and summarisation of Hidden Web databases, Data & Knowledge Engineering, vol.59, issue.2, 2006.
DOI : 10.1016/j.datak.2006.01.009

D. M. Herzig and T. Tran, Heterogeneous Web data search using relevance-based on the fly data integration, Proc. WWW, 2012.

J. Hunter and S. Choudhury, Implementing Preservation Strategies for Complex Multimedia Objects, Proc. ECDL, 2003.
DOI : 10.1007/978-3-540-45175-4_43

P. L. Bogen II, J. Johnson, U. P. Karadkar, R. Furuta et al., Application of Kalman filters to identify unexpected change in blogs, Proc. JCDL, 2008.

P. G. Ipeirotis and L. Gravano, Distributed search over the hidden Web: hierarchical database sampling and selection, Proc. VLDB, 2002.

J. Jacob, A. Sanka, N. Pandrangi, and S. Chakravarthy, WebVigiL: An Approach to Just-In-Time Information Propagation in Large Network-Centric Environments, 2004.
DOI : 10.1007/978-3-662-10874-1_13

J. Jacob, A. Sachde, and S. Chakravarthy, CX-DIFF: a change detection algorithm for XML content and change visualization for WebVigiL, Data & Knowledge Engineering, vol.52, issue.2, 2005.
DOI : 10.1016/S0169-023X(04)00102-8

A. Jatowt, Y. Kawai, and K. Tanaka, Detecting age of page content, Proceedings of the 9th annual ACM international workshop on Web information and data management, WIDM '07, 2007.
DOI : 10.1145/1316902.1316925

H. Kao, J. Ho, and M. Chen, WISDOM: Web intrapage informative structure mining based on document object model, IEEE Transactions on Knowledge and Data Engineering, vol.17, issue.5, pp.614-627, 2005.
DOI : 10.1109/TKDE.2005.84

H. P. Khandagale and P. P. Halkarnikar, A Novel Approach for Web Page Change Detection System, International Journal of Computer Theory and Engineering, vol.2, issue.3, 2010.
DOI : 10.7763/IJCTE.2010.V2.168

R. Khare, Y. An, and I. Song, Understanding deep web search interfaces, Proc. SIGMOD, p.39, 2010.
DOI : 10.1145/1860702.1860708

I. Khoury, R. M. El-Mawas, O. El-Rawas, E. F. Mounayar, and H. Artail, An Efficient Web Page Change Detection System Based on an Optimized Hungarian Algorithm, IEEE Transactions on Knowledge and Data Engineering, vol.19, issue.5, 2007.
DOI : 10.1109/TKDE.2007.1014

C. Kohlschütter, P. Fankhauser, and W. Nejdl, Boilerplate detection using shallow text features, Proceedings of the third ACM international conference on Web search and data mining, WSDM '10, 2010.
DOI : 10.1145/1718487.1718542

R. E. Krichevsky and V. K. Trofimov, The performance of universal encoding, IEEE Transactions on Information Theory, vol.27, issue.2, pp.199-206, 1981.

S. Kumar, A. Kumar-yadav, R. Bharti, and R. Choudhary, Accurate and efficient crawling the deep Web: Surfacing hidden value, International J. Computer Science and Information Security, vol.9, issue.5, 2011.

N. Kushmerick, D. S. Weld, and R. Doorenbos, Wrapper induction for information extraction, Proc. IJCAI, 1997.

K. Lee, Y. Choy, and S. Cho, An efficient algorithm to compute differences between structured documents, IEEE Trans. on Knowl. and Data Eng., vol.16, issue.8, 2004.

H. Liang, F. Ren, W. Zuo, and F. He, Ontology based automatic attributes extracting and queries translating for deep Web, J. Software, vol.5, 2008.

S.-J. Lim and Y.-K. Ng, An automated change-detection algorithm for HTML documents based on semantic hierarchies, Proc. ICDE, 2001.

G. Limaye, S. Sarawagi, and S. Chakrabarti, Annotating and searching web tables using entities, types and relationships, Proc. VLDB, 2010.
DOI : 10.14778/1920841.1921005

C. Lin, ROUGE: a package for automatic evaluation of summaries, Proc. Workshop on Text Summarization Branches Out (WAS), 2004.

B. Liu and Y. Zhai, NET - A System for Extracting Web Data from Flat and Nested Data Records, Proc. WISE, 2005.
DOI : 10.1007/11581062_39

B. Liu, R. Grossman, and Y. Zhai, Mining data records in Web pages, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, 2003.
DOI : 10.1145/956750.956826

L. Liu, C. Pu, and W. Tang, WebCQ-detecting and delivering information changes on the web, Proceedings of the ninth international conference on Information and knowledge management, CIKM '00, 2000.
DOI : 10.1145/354756.354860

URL : https://hal.archives-ouvertes.fr/tel-00259428

G. Marşic, Temporal Processing of News: Annotation of Temporal Expressions, Verbal Events and Temporal Relations.

L. K. McDowell and M. Cafarella, Ontology-driven information extraction with OntoSyphon, Proc. ISWC, 2006.

R. R. Mehta, P. Mitra, and H. Karnick, Extracting semantic structure of Web documents using content and visual information, Proc. WWW, 2005.

G. Miao, J. Tatemura, W. Hsiung, A. Sawires, and L. E. Moser, Extracting data records from the web using tag path clustering, Proceedings of the 18th international conference on World wide web, WWW '09, 2009.
DOI : 10.1145/1526709.1526841

G. Navarro, A guided tour to approximate string matching, ACM Computing Surveys, vol.33, issue.1, 2001.
DOI : 10.1145/375360.375365

S. Nestorov, S. Abiteboul, and R. Motwani, Extracting schema from semistructured data, Proc. SIGMOD, 1998.

A. Ntoulas, J. Cho, and C. Olston, What's new on the Web? the evolution of the Web from a search engine perspective, Proc. WWW, 2004.

S. Nunes, C. Ribeiro, and G. David, Using neighbors to date web documents, Proceedings of the 9th annual ACM international workshop on Web information and data management, WIDM '07, 2007.
DOI : 10.1145/1316902.1316924

URL : http://hdl.handle.net/10216/5255

M. Oita and P. Senellart, Archivage du contenu éphémère du Web à l'aide des flux Web, Proc. BDA Conference without formal proceedings. (Demonstration), 2010.

M. Oita and P. Senellart, Archiving data objects using Web feeds, Proc. IWAW, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00537962

M. Oita and P. Senellart, Deriving dynamics of Web pages: A survey, Proc. TWAW, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00588715

M. Oita and P. Senellart, FOREST, Proceedings of the 18th International Workshop on Web and Databases, WebDB'15, 2012.
DOI : 10.1145/2767109.2767112

URL : https://hal.archives-ouvertes.fr/hal-01178402

M. Oita, A. Amarilli, and P. Senellart, Cross-fertilizing deep Web analysis and ontology enrichment, Proc. VLDS, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00737941

J. Pasternack and D. Roth, Extracting article text from the web with maximum subsequence segmentation, Proceedings of the 18th international conference on World wide web, WWW '09, 2009.
DOI : 10.1145/1526709.1526840

Z. Pehlivan, M. B. Saad, and S. Gançarski, A novel Web archiving approach based on visual pages analysis, Proc. IWAW, 2009.

M. Pennock and R. Davis, ArchivePress: A really simple solution to archiving blog content, Proc. iPRES, 2009.

A. Quattoni, S. Wang, L. Morency, M. Collins, and T. Darrell, Hidden-state Conditional Random Fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
DOI : 10.1109/tpami.2007.1124

S. Raghavan and H. Garcia-Molina, Crawling the hidden Web, Proc. VLDB, 2001.

L. Ramaswamy, A. Iyengar, L. Liu, and F. Douglis, Automatic detection of fragments in dynamically generated web pages, Proceedings of the 13th conference on World Wide Web, WWW '04, 2004.
DOI : 10.1145/988672.988732

C. Reynaud and B. Safar, Exploiting WordNet as background knowledge, Proc. ISWC Ontology Matching (OM-07) Workshop, 2007.

D. Rocco, D. Buttler, and L. Liu, Page Digest for large-scale Web services, IEEE International Conference on E-Commerce, 2003. CEC 2003., 2003.
DOI : 10.1109/COEC.2003.1210274

S. C. Sahinalp and A. Utis, Hardness of string similarity search and other indexing problems, Proc. ICALP, 2004.

P. Senellart, A. Mittal, D. Muschick, R. Gilleron, and M. Tommasi, Automatic wrapper induction from hidden-web sources with domain knowledge, Proceeding of the 10th ACM workshop on Web information and data management, WIDM '08, 2008.
DOI : 10.1145/1458502.1458505

URL : https://hal.archives-ouvertes.fr/inria-00337098

K. C. Sia, J. Cho, and H.-K. Cho, Efficient Monitoring Algorithm for Fast News Alerts, IEEE Transactions on Knowledge and Data Engineering, vol.19, issue.7, 2007.
DOI : 10.1109/TKDE.2007.1041

K. Sigurðsson, Incremental crawling with Heritrix, Proc. IWAW, 2005.

sitemaps.org, Sitemaps XML format, 2008.

R. Song, H. Liu, J. Wen, and W. Ma, Learning block importance models for web pages, Proceedings of the 13th conference on World Wide Web, WWW '04, 2004.
DOI : 10.1145/988672.988700

M. Spaniol, D. Denev, A. Mazeika, and G. Weikum, Catch me if you can. Temporal coherence of Web archives, Proc. IWAW, 2008.

G. Stoilos, G. B. Stamou, and S. D. Kollias, A String Metric for Ontology Alignment, Proc. ISWC, 2005.
DOI : 10.1007/11574620_45

S. Strodl, C. Becker, R. Neumayer, and A. Rauber, How to choose a digital preservation strategy, Proceedings of the 2007 conference on Digital libraries, JCDL '07, 2007.
DOI : 10.1145/1255175.1255181

W. Su, J. Wang, and F. H. Lochovsky, ODE, ACM Transactions on Database Systems, vol.34, issue.2, 2009.
DOI : 10.1145/1538909.1538914

F. M. Suchanek, G. Ifrim, and G. Weikum, Combining linguistic and statistical analysis to extract relations from Web documents, Proc. KDD, 2006.

F. M. Suchanek, G. Kasneci, and G. Weikum, YAGO: A core of semantic knowledge unifying WordNet and Wikipedia, Proc. WWW, 2007.

F. M. Suchanek, S. Abiteboul, and P. Senellart, PARIS: Probabilistic alignment of relations, instances, and schema, Proc. VLDB Endow., 2011.

C. Sun, C. Chan, and A. K. Goenka, Multiway SLCA-based keyword search in XML data, Proceedings of the 16th international conference on World Wide Web, WWW '07, 2007.
DOI : 10.1145/1242572.1242713

D. S. Swaney, F. McCown, and M. L. Nelson, Dynamic Web file format transformations with Grace, Proc. IWAW, 2005.

M. Thiam, N. Pernelle, and N. Bennacer, Contextual and metadata-based approach for the semantic annotation of heterogeneous documents, Proc. SeMMA, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00293255

M. Thiam, N. Bennacer, N. Pernelle, and M. Lo, Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents, Proc. DEXA, 2009.
DOI : 10.1075/term.9.1.06dro

URL : https://hal.archives-ouvertes.fr/hal-00423575

N. Tiezheng, Y. Ge, S. Derong, K. Yue, and L. Wei, Extracting result schema based on query instances in the deep Web, Wuhan University J. Natural Sciences, vol.12, issue.5, 2007.

K. Vieira, A. S. da Silva, and N. Pinto, A fast and robust method for web page template detection and removal, Proceedings of the 15th ACM international conference on Information and knowledge management, CIKM '06, 2006.
DOI : 10.1145/1183614.1183654

J. Wang and F. H. Lochovsky, Data extraction and label assignment for web databases, Proceedings of the twelfth international conference on World Wide Web, WWW '03, 2003.
DOI : 10.1145/775152.775179

J. Wang, J. Wen, F. Lochovsky, and W. Ma, Instance-based Schema Matching for Web Databases by Domain-specific Query Probing, Proc. VLDB, 2004.
DOI : 10.1016/B978-012088469-8.50038-3

R. C. Wang and W. W. Cohen, Language-independent set expansion of named entities using the Web, Proc. ICDM, 2007.

Y. Wang, D. J. DeWitt, and J. Cai, X-Diff: an effective change detection algorithm for XML documents, Proceedings 19th International Conference on Data Engineering, 2003.
DOI : 10.1109/ICDE.2003.1260818

T. Weninger, W. H. Hsu, and J. Han, Content extraction via tag ratios, Proc. WWW, 2010.

W. Wu, A. Doan, and C. Yu, Merging interface schemas on the deep Web via clustering aggregation, Proc. Data Mining, 2005.

W. Wu, A. Doan, C. Yu, and W. Meng, Bootstrapping Domain Ontology for Semantic Web Services from Source Web Sites, Proc. VLDB Workshop on Technologies for E-Services, 2005.
DOI : 10.1007/11607380_2

D. Yadav, A. K. Sharma, and J. P. Gupta, Change Detection in Web Pages, 10th International Conference on Information Technology (ICIT 2007), 2007.
DOI : 10.1109/ICIT.2007.37

D. Yadav, A. K. Sharma, and J. P. Gupta, Parallel crawler architecture and Web page change detection, WSEAS Transactions on Computers, vol.7, issue.7, 2008.

S. Yu, D. Cai, J. Wen, and W. Ma, Improving pseudo-relevance feedback in web information retrieval using web page segmentation, Proceedings of the twelfth international conference on World Wide Web, WWW '03, 2003.
DOI : 10.1145/775152.775155

X. Yuan, H. Zhang, Z. Yang, and Y. Wen, Understanding the Search Interfaces of the Deep Web Based on Domain Model, 2009 Eighth IEEE/ACIS International Conference on Computer and Information Science, 2009.
DOI : 10.1109/ICIS.2009.32

Y. Zhai and B. Liu, Web data extraction based on partial tree alignment, Proceedings of the 14th international conference on World Wide Web, WWW '05, 2005.
DOI : 10.1145/1060745.1060761

Z. Zhang, B. He, and K. Chang, Understanding Web query interfaces, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, SIGMOD '04, 2004.
DOI : 10.1145/1007568.1007583

J. Zhu, Z. Nie, J. Wen, B. Zhang, and W. Ma, Simultaneous record detection and attribute labeling in web data extraction, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '06, 2006.
DOI : 10.1145/1150402.1150457