Presentation given at TPDL'11: the International Conference on Theory and Practice of Digital Libraries
TPDL'11: Query Operators Shown Beneficial for Improving Search Results
1. TPDL'11: International Conference on Theory and Practice of Digital Libraries
September 25-29, Berlin, Germany
Query Operators Shown Beneficial for Improving Search Results
Gilles Hubert, Guillaume Cabanac, Christian Sallaberry, Damien Palacio
2. Query Operators Shown Beneficial for Improving Search Results G. Hubert et al.
Outline
1. Context: Operators in Search Queries
2. Methodology: Assessing the effects of query operators
3. Experiments and Results: Potential of effectiveness yielded by operators
4. Conclusion and Future Work
4. 1. Context Operators in Search Queries G. Hubert et al.
Search Engines Offer Query Operators
Information need
I'm looking for research projects funded in the DL domain
Regular query Query with operators
Various Operators
Quotation marks, Must appear (+), boosting operator (^),
Boolean operators, proximity operators
5. 1. Context Operators in Search Queries G. Hubert et al.
Search Engines Offer Query Operators
Information need
I'm looking for research projects funded in the DL domain
Regular query Query with operators
Case 1: What designers of search engines may expect
6. 1. Context Operators in Search Queries G. Hubert et al.
Search Engines Offer Query Operators
Information need
I'm looking for research projects funded in the DL domain
Regular query Query with operators
Case 2: What users of search engines may believe
7. 1. Context Operators in Search Queries G. Hubert et al.
Search Engines Offer Query Operators
Information need
I'm looking for research projects funded in the DL domain
Regular query Query with operators
Case 3: What designers of search engines may fear
8. 1. Context Operators in Search Queries G. Hubert et al.
Usage of Query Operators
Quantitative Studies
[Figure: percentage of queries with operators (y-axis, 0% to 25%) by year (x-axis, 1999 to 2007), as reported for Altavista [Silverstein et al., 1999], Excite [Jansen et al., 2000], Excite [Spink et al., 2001], and Google+MSN Search+Yahoo! [White and Morris, 2007]]
Possible Explanations
Unknown features?
No improvement observed?
9. 1. Context Operators in Search Queries G. Hubert et al.
Usage of Query Operators
Qualitative Studies
Users
Average users not comfortable with advanced means of searching
[Jansen et al., 2000]
Expert users resort to query operators more frequently
[Hölscher and Strube, 2000; Lucas and Topi, 2002; White and Morris, 2007]
Information Needs
More used in dedicated search
[Jansen and Pooch, 2001]
Difficulty in finding information (e.g., complex information needs)
[Aula et al., 2010]
Appropriateness
Operators used in a semantically appropriate manner
[Eastman and Jansen, 2004]
10. 1. Context Operators in Search Queries G. Hubert et al.
Usage of Query Operators
Effects of Query Operators on Effectiveness
Eastman and Jansen studied queries with operators
Real users: AOL, Google and MSN Search
Operators: AND, OR, MUST APPEAR and PHRASE
No statistically significant improvement in P@10
[Eastman and Jansen, 2003]
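P@10 (precision at rank 10) is the fraction of the ten top-ranked results that are relevant. A minimal sketch of the measure follows; the function and parameter names are illustrative and do not come from the cited study.

```python
def precision_at_k(ranked_doc_ids, relevant_doc_ids, k=10):
    """Fraction of the top-k retrieved documents that are relevant.

    ranked_doc_ids: document ids in the order returned by the engine.
    relevant_doc_ids: set of ids judged relevant for the information need.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    # Dividing by k (not by len(top_k)) penalizes result lists shorter than k.
    return hits / k
```

With three relevant documents among the first ten results, the function returns 0.3.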
11. 1. Context Operators in Search Queries G. Hubert et al.
Usage of Query Operators
Effects of Query Operators on Effectiveness
Study on 20% of all queries
Expert users
Complex needs (Queries with operators)
[Eastman and Jansen, 2003]
12. 1. Context Operators in Search Queries G. Hubert et al.
Usage of Query Operators
Effects of Query Operators on Effectiveness
What about the other 80% of all queries?!
Average users
Regular queries (no operators)
[Eastman and Jansen, 2003]
14. 2. Methodology Assessing the effects of query operators G. Hubert et al.
Our Research Questions
Q = Do query operators lead to improved search results?
Q1 = Maximum gain in effectiveness when enriching a query with operators?
Q2 = Do users succeed in formulating better queries involving operators?
15. 2. Methodology Assessing the effects of query operators G. Hubert et al.
Our Methodology in a Nutshell
[Figure: a regular query is expanded into query variants with operators V1, V2, V3, V4, ..., VN]
16. 2. Methodology Assessing the effects of query operators G. Hubert et al.
Overview of the Methodology
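[Figure: evaluation pipeline]
Query Variant Generator: takes the query, preOps, and postOps; outputs the query variants {v1, ..., vi, ..., vn}
Search Engine: takes each variant vi, the corpus, and the IR model; returns l(vi)
Evaluation Procedure: takes l(vi), the qrels, and the metrics; outputs measures of effectiveness
Legend: usual evaluation framework in IR; components introduced for this study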
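The pipeline above can be sketched as a loop that runs every variant through a search function and scores each result list against the relevance judgments. This is a minimal sketch under assumptions: `search` is a hypothetical callable standing in for the search engine, l(vi) is assumed to be a ranked list of document ids, and average precision is used as the effectiveness measure. None of these names come from the slides.

```python
def average_precision(ranked_doc_ids, relevant_doc_ids):
    """Average of the precision values at each rank holding a relevant document."""
    if not relevant_doc_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_doc_ids:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant_doc_ids)


def evaluate_variants(variants, search, relevant_doc_ids):
    """Score every query variant; `search` maps a query string to a ranked id list."""
    return {v: average_precision(search(v), relevant_doc_ids) for v in variants}


def best_variant(ap_by_variant):
    """Return the (variant, AP) pair with the highest AP."""
    return max(ap_by_variant.items(), key=lambda item: item[1])
```

Taking the best variant per topic and averaging those AP values over all topics gives one way to estimate the maximum gain asked for in Q1.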
18. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Experiment Settings
Standard Test Collections
TREC-7
TREC-8
Query Operators
Must appear (+)
Term boosting (^N)
Query variants generated with preOps and postOps
Variant #   Query variant
1           encryption equipment export
2           encryption +equipment +export
...
124         encryption +equipment export^10
...
338         encryption^30 equipment^40 export^50
Variant Generation
Must appear + only
Boost ^ only with weights ^10, ^20, ^30, ^40, and ^50
Both + and ^
Search engine
Terrier with various models: BM25, DFR_BM25, InL2, PL2, TF_IDF
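Variant generation can be sketched as a Cartesian product over per-term decorations. The sketch below assumes that preOps are placed before a term and postOps after it, and that a term may carry both at once. The slides do not confirm either point, and the enumeration order does not reproduce the variant numbers shown above. All names are illustrative.

```python
from itertools import product


def generate_variants(query, pre_ops=("+",),
                      post_ops=("^10", "^20", "^30", "^40", "^50")):
    """Enumerate query variants by decorating each term with optional operators.

    Assumption: each term independently takes one pre-operator (or none)
    and one post-operator (or none).
    """
    terms = query.split()
    pre_choices = ("",) + tuple(pre_ops)
    post_choices = ("",) + tuple(post_ops)
    # All decorated forms of each term, undecorated form first.
    per_term = [
        [f"{pre}{term}{post}" for pre, post in product(pre_choices, post_choices)]
        for term in terms
    ]
    # The first combination is the regular query without operators.
    return [" ".join(combo) for combo in product(*per_term)]
```

Passing `post_ops=()` restricts the output to the "+ only" setting, and `pre_ops=()` to the "^ only" setting.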
19. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
TREC-7 per Topic Analysis: Boxplots
+ and ^
20. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
Per Topic Analysis: Boxplot
[Figure: boxplot of AP (Average Precision, y-axis) per topic (x-axis). Annotations for each topic: the query variant with the highest AP, the AP of TREC's regular query, and the query variant with the lowest AP.]
21. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
TREC-7 per Topic Analysis (+ and ^)
Baseline MAP = 0.1554
MAP with + and ^ = 0.2099 (+35.1%)
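The reported percentage is consistent with a relative gain over the baseline MAP. A minimal sketch of that arithmetic, using the two TREC-7 MAP values above (the function name is illustrative):

```python
def relative_gain(baseline_map, variant_map):
    """Relative improvement of variant_map over baseline_map, in percent."""
    if baseline_map <= 0:
        raise ValueError("baseline MAP must be positive")
    return 100.0 * (variant_map - baseline_map) / baseline_map


# TREC-7 values from the slide: (0.2099 - 0.1554) / 0.1554
print(f"{relative_gain(0.1554, 0.2099):.1f}%")  # prints 35.1%
```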
22. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
TREC-8 per Topic Analysis (+ and ^)
Baseline MAP = 0.1840
MAP with + and ^ = 0.2288 (+24.3%)
23. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
Global Analysis: MAP
+ only
           TREC-7 MAP                     TREC-8 MAP
Model      Baseline  VOP     Gain (%)    Baseline  VOP     Gain (%)
BM25       0.1677    0.1836   9.5**      0.1957    0.2154  10.2*
DFR_BM25   0.1683    0.1843   9.5**      0.1965    0.2162  10.0*
InL2       0.1710    0.1852   8.3**      0.1996    0.2172   8.8*
PL2        0.1554    0.1826  17.5**      0.1840    0.2106  14.5**
TF_IDF     0.1674    0.1833   9.5**      0.1964    0.2158   9.9**
Statistical significance is denoted by * for p < 0.05 (** for p < 0.01)
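The slides do not name the significance test behind the asterisks. As an illustration only, the sketch below uses a paired sign-flip randomization test over per-topic AP values, a common choice when comparing two IR runs on the same topics. It is a stand-in, not the authors' procedure, and all names are hypothetical.

```python
import random


def paired_randomization_test(baseline_ap, variant_ap, trials=10000, seed=0):
    """Two-sided p-value for the mean per-topic AP difference.

    Under the null hypothesis the two systems are exchangeable, so the sign
    of each per-topic difference is flipped at random.
    """
    if len(baseline_ap) != len(variant_ap):
        raise ValueError("both runs must cover the same topics")
    diffs = [v - b for b, v in zip(baseline_ap, variant_ap)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(trials):
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total / n) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (trials + 1)
```

A result below 0.05 would correspond to one asterisk in the table, and a result below 0.01 to two.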
24. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
Global Analysis: MAP
^ only
           TREC-7 MAP                     TREC-8 MAP
Model      Baseline  VOP     Gain (%)    Baseline  VOP     Gain (%)
BM25       0.1677    0.2027  20.9**      0.1957    0.2312  18.1**
DFR_BM25   0.1683    0.2034  20.9**      0.1965    0.2316  17.9**
InL2       0.1710    0.2059  20.4**      0.1996    0.2352  17.8**
PL2        0.1554    0.1926  23.9**      0.1840    0.2173  18.1**
TF_IDF     0.1674    0.2026  21.0**      0.1964    0.2312  17.7**
Statistical significance is denoted by * for p < 0.05 (** for p < 0.01)
25. 3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Results
Global Analysis: MAP
+ and ^
           TREC-7 MAP                     TREC-8 MAP
Model      Baseline  VOP     Gain (%)    Baseline  VOP     Gain (%)
BM25       0.1677    0.2132  27.1**      0.1957    0.2381  21.7**
DFR_BM25   0.1683    0.2133  26.7**      0.1965    0.2387  21.5**
InL2       0.1710    0.2144  25.4**      0.1996    0.2407  20.6**
PL2        0.1554    0.2099  35.1**      0.1840    0.2288  24.3**
TF_IDF     0.1674    0.2131  27.3**      0.1964    0.2383  21.3**
Statistical significance is denoted by * for p < 0.05 (** for p < 0.01)
27. 4. Conclusion and Future Work G. Hubert et al.
Conclusions
H: the Proper Use of Query Operators Improves Search Results
Methodology to Validate H
Standard IR Test Collections: TREC-7 and TREC-8
Must Appear (+) and Boosting Operators (^)
Findings
Observed gain up to 35.1%
Statistically significant
For all tested IR models and collections
Users Should Use Query Operators More Often
28. 4. Conclusion and Future Work G. Hubert et al.
Future Work
Short Term
Experimenting with our methodology in various contexts
Additional IR collections
Additional IR models
Additional query operators
Medium Term
Address Q2: Do users succeed in formulating queries with operators,
so that these lead to a significant gain in effectiveness?
Study other factors
Number of terms
Selection of terms
Long Term
Additional dimensions of information
Geographic IR