Understanding Query Interfaces by Statistical Parsing

Tuesday, 21st January 2014, 10:11 am (PDCC Discussion Room)
Speaker: Dr. Weifeng Su (BNU-HKBU United International College (UIC), China)
Title: Understanding Query Interfaces by Statistical Parsing

Users submit queries to an online database through its query interface. Query interface parsing, which is important for many applications, is the task of understanding the query capabilities of a query interface. Since most query interfaces are organized hierarchically, we present a novel query interface parsing method, StatParser (Statistical Parser), which automatically extracts the hierarchical query capabilities of query interfaces. StatParser learns from a set of parsed query interfaces and then parses new ones. It starts from a small grammar and augments that grammar with a set of probabilities learned from the parsed query interfaces under the maximum-entropy principle. Given a new query interface, the probability-enhanced grammar selects the parse tree with the largest global probability as the query capabilities of that interface. Experimental results show that StatParser extracts the query capabilities with high accuracy and effectively overcomes the problems of existing query interface parsers.
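
To make the idea of "the parse tree with the largest global probability" concrete, here is a minimal sketch in Python. It is not StatParser itself: the abstract does not give the grammar, the features, or the learned probabilities, so the nonterminals (Interface, Attribute, RangeTail, Label, Connector, Field), the element types (text, textbox, select), and every probability below are hypothetical, hand-set values. The maximum-entropy learning step is not shown; the sketch only illustrates how a probability-enhanced grammar can choose between competing parses, using a standard Viterbi-style CKY search.

```python
# Hypothetical toy grammar; all symbols and probabilities are illustrative
# assumptions, not values from the talk.
BINARY_RULES = {
    ("Interface", ("Attribute", "Attribute")): 0.6,  # two separate attributes
    ("Interface", ("Attribute", "RangeTail")): 0.4,  # one range attribute
    ("Attribute", ("Label", "Field")): 1.0,
    ("RangeTail", ("Connector", "Field")): 1.0,
}
LEXICAL_RULES = {
    ("Label", "text"): 1.0,
    ("Connector", "text"): 1.0,  # e.g. the word "to" between two boxes
    ("Field", "textbox"): 0.7,
    ("Field", "select"): 0.3,
}


def best_parse(tokens, root="Interface"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(tokens)
    if n == 0:
        return None
    # chart[(i, j)][symbol] = (probability, tree) for the best parse of tokens[i:j]
    chart = {}
    for i, tok in enumerate(tokens):
        cell = {}
        for (sym, word), p in LEXICAL_RULES.items():
            if word == tok:
                cell[sym] = (p, (sym, tok))
        chart[(i, i + 1)] = cell
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = {}
            for k in range(i + 1, j):  # try every split point
                left, right = chart[(i, k)], chart[(k, j)]
                for (parent, (b, c)), p in BINARY_RULES.items():
                    if b in left and c in right:
                        prob = p * left[b][0] * right[c][0]
                        # keep only the highest-probability subtree per symbol
                        if prob > cell.get(parent, (0.0, None))[0]:
                            cell[parent] = (prob, (parent, left[b][1], right[c][1]))
            chart[(i, j)] = cell
    return chart[(0, n)].get(root)


# A flattened interface: label, box, label-or-connector, box.
result = best_parse(["text", "textbox", "text", "textbox"])
```

With these made-up numbers, the sequence has two candidate trees: two independent attributes, or a single range attribute whose second text element acts as a connector. The search returns whichever tree has the larger product of rule probabilities, which is the role the learned probabilities play in the approach described above.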


Weifeng Su is an associate professor and head of the Computer Science & Technology Programme at BNU-HKBU United International College (UIC), Zhuhai, China. He received his PhD from the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology in 2007. His research interests include the Deep Web, data mining, machine learning, word sense disambiguation, and natural language processing. He has published papers in top database journals and conferences, including TODS, TKDE, TWEB, ICDE, and EDBT.