This document summarizes research on implementing search-as-you-type functionality in relational database forms. It motivates this approach by noting limitations of existing search paradigms like SQL and keyword search. Key challenges include enabling fast prefix matching, synchronizing local and global search results, handling errors and misspellings, and improving scalability for large databases. Initial achievements include a prototype called Seaform-DBLP that supports basic prefix search of a single database table, but has limitations around error tolerance, returning all results rather than top-k, and being memory-resident rather than native to a database system. Overall, search-as-you-type in database forms shows promise for balancing usability and functionality, but addressing
1 of 27
More Related Content
Seaform ºÝºÝߣs in VLDB 2010 PhD Workshop
1. DatabaseResearchGroupSearch-As-You-Type in Forms:Leveraging the Usability and the Functionalityof Search Paradigm in Relational DatabasesHao WuSupervised by Prof. Lizhu ZhouDatabase Research Group, Tsinghua UniversityVLDB PhD Workshop ¨C Sept. 13, Singapore
4. MotivationRelational databases are widely used.There are many search paradigms:Structured Query Language (SQL)Keyword Search (KS)Query-By-Example (QBE)Different search paradigms are needed by different users.10/8/20104Hao Wu, DB Group, Tsinghua University
5. Motivation#1: SQL is complex.SELECT*FROMAuthor A, Autor_Paper AP, Paper PWHERE title LIKE'keyword' AND title LIKE'search' AND authors LIKE'g%' ANDA.id = AP.aidAND P.id = AP.pid10/8/20105Hao Wu, DB Group, Tsinghua University
6. Motivation#2: Traditional keyword search is imprecise.Title? Conf. name? Author name?keyword search g10/8/20106Hao Wu, DB Group, Tsinghua University
7. Motivation#3: Form is awkward.UCI Directory: http://directory.uci.edu/index.php?form_type=advanced_search10/8/20107Hao Wu, DB Group, Tsinghua University
13. Problem StatementQuery:A set of keywords (prefixes) split by fields.A focus indicator.10/8/2010Hao Wu, DB Group, Tsinghua University13Title:Author:alFocus = Authorxml
17. Challenges: Search-As-You-TypePrefix matching:E.g.al? albert, alice, ¡Trie structure w/ cache.Fast response:Synchronization of local resultsand global results yields heavycomputational cost.On-demand synchronization and dual-list trie.10/8/2010Hao Wu, DB Group, Tsinghua University17
18. Challenges: Error ToleranceMisplacing of keywords:E.g. input "albert"into the Title input box.Automatic query refinement (given a query, how can we modify it to obtain more results?)Large search space; rely on precise estimation and probabilistic model.Fuzzy matching:E.g. input "albrt" instead of "albert".Edit-distance computation on trie structure.Ranking issue of local results: should local results be sorted by edit-distance, or by aggregation values?10/8/2010Hao Wu, DB Group, Tsinghua University18
19. Challenges: ScalabilityHandle large-scale databases:There are large number of tuples.1) Top-k algorithmPrecise aggregation is impossible in this case.2) Using RDBMS itselfIndex structure should be redesigned for DBMS; performance issues.Handle multiple tables:Data are regularized to several tables.Generalize the single-table local-global computation and reduce on-the-fly joins using pre-joined tables.It is hard to determine which tables are the most necessary to pre-join; extra storage cost.10/8/2010Hao Wu, DB Group, Tsinghua University19
30. ConclusionsSearch-as-you-type with form is a good choice to balance the usability and functionality.There are still many problems to solve:More effective index other than trie + inverted lists.Support error tolerance.Native DBMS support.Top-k algorithms.Pre-join (materialize) tables....10/8/2010Hao Wu, DB Group, Tsinghua University26