Search-As-You-Type in Forms: Leveraging the Usability and the Functionality of Search Paradigm in Relational Databases
Hao Wu
Supervised by Prof. Lizhu Zhou
Database Research Group, Tsinghua University
VLDB PhD Workshop - Sept. 13, Singapore
Outline:
  Motivation
  Problem Statement
  Challenges
  Initial Achievements
  Conclusions
Motivation
  Relational databases are widely used.
  There are many search paradigms:
    Structured Query Language (SQL)
    Keyword Search (KS)
    Query-By-Example (QBE)
  Different search paradigms are needed by different users.
10/8/2010, Hao Wu, DB Group, Tsinghua University
Motivation #1: SQL is complex.
  SELECT *
  FROM Author A, Author_Paper AP, Paper P
  WHERE title LIKE '%keyword%'
    AND title LIKE '%search%'
    AND authors LIKE 'g%'
    AND A.id = AP.aid
    AND P.id = AP.pid
Motivation #2: Traditional keyword search is imprecise.
  Example query: keyword search g
  Title? Conf. name? Author name?
Motivation #3: Forms are awkward.
  UCI Directory: http://directory.uci.edu/index.php?form_type=advanced_search
Motivation #4: The "Search" button is not convenient.
Motivation
  Seaform = Keyword Search + Form-Style Interface + Search-as-you-type
Problem Statement
  Data:
    Single relational table.
    Several searchable attributes.
Problem Statement
  Query:
    A set of keywords (prefixes) split by fields.
    A focus indicator.
  Example: Title: xml; Author: al; Focus = Author
Problem Statement
  Results:
    Global results: corresponding tuples.
    Local results: corresponding attribute values.
    Aggregations.
  Example (Title: xml; Author: al):
    Local results: albert (2), alice (1)
    Global results: xml database (albert), xml search (albert), xml security (alice)
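
The query and result definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the Seaform implementation: the table contents are just the three tuples from the slide example, the names `PAPERS`, `matches`, and `search` are hypothetical, and the aggregation is assumed to be a simple tuple count.

```python
from collections import Counter

# Hypothetical sample table holding only the tuples shown in the example.
PAPERS = [
    {"title": "xml database", "author": "albert"},
    {"title": "xml search", "author": "albert"},
    {"title": "xml security", "author": "alice"},
]


def matches(value, prefixes):
    """True if every typed prefix is a prefix of some word in the value."""
    words = value.lower().split()
    return all(any(w.startswith(p.lower()) for w in words) for p in prefixes)


def search(table, query, focus):
    """Evaluate a form query.

    query maps each field to the text typed into it; focus names the field
    currently being edited. Returns (global_results, local_results).
    """
    # Global results: tuples that satisfy the prefixes in every field.
    global_results = [
        row for row in table
        if all(matches(row[field], text.split()) for field, text in query.items())
    ]
    # Local results: values of the focused attribute among the global
    # results, each with an aggregation (assumed here to be a count).
    local_results = Counter(row[focus] for row in global_results).most_common()
    return global_results, local_results


rows, local = search(PAPERS, {"title": "xml", "author": "al"}, focus="author")
print(local)  # [('albert', 2), ('alice', 1)]
```

Scanning the whole table on every keystroke is only workable for tiny inputs; it is used here purely to make the definitions concrete.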
Challenges: Search-As-You-Type
  Prefix matching:
    E.g. al → albert, alice, …
    Trie structure with cache.
  Fast response:
    Synchronization of local results and global results yields heavy computational cost.
    On-demand synchronization and dual-list trie.
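
As an illustration of prefix matching on a trie with a cache, the sketch below remembers the trie node reached for each prefix typed so far, so that extending a prefix by one character resumes from the cached node instead of restarting at the root. The class names and the unbounded dictionary cache are assumptions made for this example; the slides do not describe how Seaform's cache or its dual-list trie are organized.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False


class PrefixIndex:
    """Trie over attribute values with a prefix -> node cache (hypothetical)."""

    def __init__(self, words):
        self.root = TrieNode()
        self.cache = {"": self.root}
        for word in words:
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.is_word = True

    def _locate(self, prefix):
        """Return the node for prefix, resuming from the longest cached prefix."""
        start = len(prefix)
        while prefix[:start] not in self.cache:
            start -= 1  # terminates because "" is always cached
        node = self.cache[prefix[:start]]
        for end in range(start + 1, len(prefix) + 1):
            # A cached None marks a prefix with no completions.
            node = node.children.get(prefix[end - 1]) if node is not None else None
            self.cache[prefix[:end]] = node
        return node

    def complete(self, prefix):
        """Return all indexed words that start with prefix, sorted."""
        node = self._locate(prefix)
        if node is None:
            return []
        results, stack = [], [(node, prefix)]
        while stack:
            current, word = stack.pop()
            if current.is_word:
                results.append(word)
            for ch, child in current.children.items():
                stack.append((child, word + ch))
        return sorted(results)


index = PrefixIndex(["albert", "alice", "bob"])
index.complete("a")          # walks from the root and caches "a"
print(index.complete("al"))  # resumes from the cached node for "a"
```

A real deployment would need to bound the cache, for example by evicting old prefixes.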
Challenges: Error Tolerance
  Misplaced keywords:
    E.g. input "albert" into the Title input box.
    Automatic query refinement (given a query, how can we modify it to obtain more results?)
    Large search space; relies on precise estimation and a probabilistic model.
  Fuzzy matching:
    E.g. input "albrt" instead of "albert".
    Edit-distance computation on the trie structure.
    Ranking issue for local results: should local results be sorted by edit distance, or by aggregation values?
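
One common way to compute edit distance on a trie is to carry a dynamic-programming row down each branch and prune branches whose row can no longer fall within the threshold. The sketch below uses that textbook technique for whole-word matching; it is not taken from the Seaform system, and `build_trie`, `fuzzy_search`, and the `"$"` end-of-word marker are names chosen for this example (indexed words are assumed not to contain `"$"`).

```python
def build_trie(words):
    """Build a nested-dict trie; the "$" key stores the complete word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = word
    return root


def fuzzy_search(root, query, max_dist):
    """Return {word: edit_distance} for indexed words within max_dist of query."""
    results = {}
    # Row for the empty trie prefix: distance to each query prefix.
    first_row = list(range(len(query) + 1))

    def visit(node, ch, prev_row):
        row = [prev_row[0] + 1]
        for i in range(1, len(query) + 1):
            row.append(min(
                row[i - 1] + 1,                          # skip a query character
                prev_row[i] + 1,                         # skip a trie character
                prev_row[i - 1] + (query[i - 1] != ch),  # match or substitute
            ))
        if "$" in node and row[-1] <= max_dist:
            results[node["$"]] = row[-1]
        # Prune: if every entry exceeds max_dist, no descendant can match.
        if min(row) <= max_dist:
            for next_ch, child in node.items():
                if next_ch != "$":
                    visit(child, next_ch, row)

    for ch, child in root.items():
        if ch != "$":
            visit(child, ch, first_row)
    return results


trie = build_trie(["albert", "alice", "bob"])
print(fuzzy_search(trie, "albrt", max_dist=1))  # {'albert': 1}
```

Search-as-you-type would need the prefix variant of this check (distance between the query and some prefix of a word), and the ranking question raised above is left open here.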
Challenges: Scalability
  Handle large-scale databases:
    There are a large number of tuples.
    1) Top-k algorithm
       Precise aggregation is impossible in this case.
    2) Using the RDBMS itself
       The index structure should be redesigned for the DBMS; performance issues.
  Handle multiple tables:
    Data are normalized into several tables.
    Generalize the single-table local-global computation and reduce on-the-fly joins using pre-joined tables.
    It is hard to determine which tables are the most necessary to pre-join; extra storage cost.
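
The remark that precise aggregation is impossible under top-k can be made concrete with a small sketch. It assumes, purely for illustration, that the table is already stored in rank order and that a top-k search stops after the first k matching tuples; the function name `top_k` and the sample rows are hypothetical.

```python
from collections import Counter
from itertools import islice

# Hypothetical table, assumed to be stored in rank order.
PAPERS = [
    {"title": "xml database", "author": "albert"},
    {"title": "xml search", "author": "albert"},
    {"title": "xml security", "author": "alice"},
]


def top_k(table, predicate, focus, k):
    """Stop after the first k matching rows; counts cover only those rows."""
    hits = list(islice((row for row in table if predicate(row)), k))
    partial_counts = Counter(row[focus] for row in hits)
    return hits, partial_counts


hits, counts = top_k(PAPERS, lambda r: "xml" in r["title"], "author", k=2)
print(counts)  # Counter({'albert': 2}); alice is never seen
```

Because the scan stops early, the counts are only lower bounds on the true aggregates, and values such as "alice" that appear only beyond the cut-off are missed entirely.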
Initial Achievements
  Seaform-DBLP
  Features:
    Single table.
    Prefix matching.
    Average response time is less than 30 ms.
  Limitations:
    Does not tolerate errors.
    Non-top-k, i.e. it returns all matching results.
    Memory-resident.
Demonstrations:
  Sept. 14, Tuesday      2   14:00 to 15:30
  Sept. 15, Wednesday    5   14:00 to 15:30

Seaform slides in VLDB 2010 PhD Workshop