This document proposes a method to improve query facet mining by leveraging knowledge bases in addition to search results. Existing methods mine facets from frequent lists in top search results, which has limited coverage. The proposed method first generates facets based on entity properties in knowledge bases like Freebase that correspond to the query, and then expands initial facets mined from search results by finding similar entities in the knowledge base. An evaluation shows this approach significantly improves the coverage of facet items over state-of-the-art algorithms.
GENERATING QUERY FACETS USING KNOWLEDGE BASES
ABSTRACT
A query facet is a significant list of information nuggets that explains an underlying aspect of a
query. Existing algorithms mine facets of a query by extracting frequent lists contained in top
search results. The coverage of facets and facet items mined by methods of this kind might be
limited, because only a small number of search results are used. To solve this problem,
we propose mining query facets by using knowledge bases which contain high-quality structured
data. Specifically, we first generate facets based on the properties of the entities which are
contained in Freebase and correspond to the query. Second, we mine initial query facets from
search results and then expand them by finding similar entities in Freebase. Experimental
results show that our proposed method can significantly improve the coverage of facet items over
the state-of-the-art algorithms.
EXISTING SYSTEM
Existing query facet mining algorithms mainly rely on the top search results from search engines.
Dou et al. first introduced the concept of query dimensions [4], which is the same concept as
query facet discussed in this paper. They proposed QDMiner, a system that can automatically
mine query facets by aggregating frequent lists contained in the results. The lists are extracted by
HTML tags (like <select> and <table>), text patterns, and repeat content blocks contained in web
pages. Kong et al. proposed two supervised methods, namely QF-I and QF-J, to mine query facets
from the results. In all these existing solutions, facet items are extracted from the top search
results from a search engine (e.g., top 100 search results from Bing.com). More specifically,
facet items are extracted from the lists contained in the results. The problem is that the coverage
of facets mined using methods of this kind might be limited, because some useful words or
phrases might not appear in any list within the search results used and therefore have no
opportunity to be mined.
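The coverage limitation can be illustrated with a deliberately simplified Java sketch of frequent-list aggregation. It is not QDMiner's actual pipeline: the class name `FrequentListSketch`, the method `frequentItems`, the `minSupport` threshold, and the sample lists are all assumptions made for illustration. The sketch counts how many extracted lists contain each item and keeps the items that reach the threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of QDMiner.
class FrequentListSketch {

    // Returns the items that occur in at least minSupport of the given lists.
    static Set<String> frequentItems(List<List<String>> lists, int minSupport) {
        Map<String, Integer> support = new HashMap<>();
        for (List<String> list : lists) {
            // Normalize first, so an item repeated within one list counts once.
            Set<String> distinct = new HashSet<>();
            for (String item : list) {
                distinct.add(item.trim().toLowerCase(Locale.ROOT));
            }
            for (String item : distinct) {
                support.merge(item, 1, Integer::sum);
            }
        }
        // TreeSet gives a stable, alphabetical result order.
        Set<String> frequent = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : support.entrySet()) {
            if (entry.getValue() >= minSupport) {
                frequent.add(entry.getKey());
            }
        }
        return frequent;
    }

    public static void main(String[] args) {
        // Made-up lists standing in for lists extracted from result pages.
        List<List<String>> lists = new ArrayList<>();
        lists.add(Arrays.asList("gold", "silver", "steel"));
        lists.add(Arrays.asList("Gold", "silver"));
        lists.add(Arrays.asList("gold", "leather"));
        System.out.println(frequentItems(lists, 2));
    }
}
```

In this toy input, "steel" and "leather" each occur in only one list and are dropped, and an item that occurs in no list at all can never be returned, which is the coverage problem described above.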
DISADVANTAGES
• Previous studies show that many users are not satisfied with conventional
search result pages.
• Users often have to click and view many documents to summarize the information they
are seeking, especially when they want to learn about a topic that covers different aspects.
• This usually takes a lot of time and is burdensome for users.
• Mining query facets (or query dimensions) is an emerging approach to solve the problem.
• Existing query facet mining algorithms mainly rely on the top search results from search
engines.
PROPOSED SYSTEM
In order to solve this problem, we propose leveraging a knowledge base as a complementary data
source to improve the quality of query facets. Knowledge bases contain high-quality structured
information such as entities and their properties and are especially useful when the query is
related to an entity. We propose using both knowledge bases and search results to mine query
facets in this paper. We do not abandon search results because they reflect
user intent and provide abundant context for facet generation and expansion. Our target is to
improve the recall of facets and facet items by utilizing entities and their properties contained in
knowledge bases while ensuring that the accuracy of facet items is not
harmed too much. Our approach consists of two methods: facet generation and facet
expansion.
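As a rough illustration of facet generation, the Java sketch below treats each property of the entity that matches the query as a candidate facet and the property's values as facet items. It is a minimal sketch under stated assumptions, not the proposed system's implementation: the in-memory map stands in for a knowledge base such as Freebase, and the class `FacetGenerationSketch`, the method `generateFacets`, the `minItems` cutoff, and the sample entity are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for a knowledge base lookup; all names and data are illustrative.
class FacetGenerationSketch {

    // kb maps an entity name to its properties; each property maps to its values.
    // Every property with at least minItems values becomes one candidate facet.
    static Map<String, List<String>> generateFacets(
            Map<String, Map<String, List<String>>> kb, String query, int minItems) {
        Map<String, List<String>> facets = new LinkedHashMap<>();
        // Assumption: the query matches an entity by its normalized name.
        Map<String, List<String>> properties =
                kb.get(query.trim().toLowerCase(Locale.ROOT));
        if (properties == null) {
            return facets; // the query does not correspond to a known entity
        }
        for (Map.Entry<String, List<String>> property : properties.entrySet()) {
            if (property.getValue().size() >= minItems) {
                facets.put(property.getKey(), property.getValue());
            }
        }
        return facets;
    }

    public static void main(String[] args) {
        // Made-up entity with two properties.
        Map<String, List<String>> properties = new LinkedHashMap<>();
        properties.put("colors", Arrays.asList("red", "blue", "black"));
        properties.put("maker", Arrays.asList("acme"));
        Map<String, Map<String, List<String>>> kb = new HashMap<>();
        kb.put("example phone", properties);

        System.out.println(generateFacets(kb, "Example Phone", 2));
    }
}
```

The `minItems` cutoff reflects the idea that a facet is a list of items, so a single-valued property such as "maker" in this toy data is skipped. A real system would also need entity linking between the query and the knowledge base, which this sketch reduces to an exact name lookup.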
Advantages:
By leveraging both knowledge bases and search results, QDMKB overcomes the limitation of
using only search results to generate query facets and can thus improve the quality of
facets, especially recall.
Objectives:
• Existing query facet mining algorithms, including QDMiner, QF-I, and QF-J, mainly rely
on the top search results from search engines.
• The coverage of facets mined using methods of this kind might be limited, because
usually only a small number of results are used.
• We propose leveraging knowledge bases as complementary data sources.
• We use two methods, namely facet generation and facet expansion, to mine query facets.
Facet generation directly uses properties in Freebase as candidates, while facet expansion
expands the initial facets mined by QDMiner in property-based and type-based
manners.
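The text does not spell out how type-based expansion works, so the Java sketch below shows only one plausible reading: if enough items of an initial facet share a knowledge-base type, other entities of that type are appended to the facet. Everything in it is an assumption made for illustration, including the class `FacetExpansionSketch`, the method `expandByType`, the `minShare` threshold, the single type per entity, and the sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// One possible reading of type-based expansion; not the proposed system's algorithm.
class FacetExpansionSketch {

    // typeOf maps an entity to a single type (a simplification: entities in a
    // real knowledge base usually carry several types).
    static List<String> expandByType(
            List<String> facet, Map<String, String> typeOf, double minShare) {
        // Count how often each type occurs among the facet items known to the KB.
        Map<String, Integer> counts = new HashMap<>();
        for (String item : facet) {
            String type = typeOf.get(item);
            if (type != null) {
                counts.merge(type, 1, Integer::sum);
            }
        }
        // Pick the most frequent type; ties are broken arbitrarily in this sketch.
        String dominant = null;
        int dominantCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > dominantCount) {
                dominant = entry.getKey();
                dominantCount = entry.getValue();
            }
        }
        List<String> expanded = new ArrayList<>(facet);
        // Expand only when the dominant type covers a large enough share of items.
        if (dominant == null || dominantCount < minShare * facet.size()) {
            return expanded;
        }
        // Iterate in sorted order so the appended items have a stable order.
        for (String entity : new TreeSet<>(typeOf.keySet())) {
            if (dominant.equals(typeOf.get(entity)) && !expanded.contains(entity)) {
                expanded.add(entity);
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        // Made-up type assignments.
        Map<String, String> typeOf = new HashMap<>();
        typeOf.put("red", "color");
        typeOf.put("blue", "color");
        typeOf.put("green", "color");
        typeOf.put("apple", "fruit");
        typeOf.put("pear", "fruit");

        List<String> initialFacet = Arrays.asList("red", "blue", "apple");
        System.out.println(expandByType(initialFacet, typeOf, 0.5));
    }
}
```

The `minShare` threshold is a guard against harming accuracy: a noisy initial facet whose items do not agree on a type is returned unchanged instead of being expanded in the wrong direction.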
SYSTEM CONFIGURATION:
HARDWARE REQUIREMENTS:
Hardware - Pentium
Speed - 1.1 GHz
RAM - 1GB
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
SOFTWARE REQUIREMENTS:
Operating System : Windows
Technology : Java and J2EE
Web Technologies : HTML, JavaScript, CSS
IDE : MyEclipse
Web Server : Tomcat
Tool kit : Android Phone
Database : MySQL
Java Version : J2SDK 1.5