TPDL'11: International Conference on Theory
              and Practice of Digital Libraries
              September 25-29, Berlin, Germany




  Query Operators Shown Beneficial for
       Improving Search Results


    Gilles Hubert, Guillaume Cabanac,
    Christian Sallaberry, Damien Palacio
Query Operators Shown Beneficial for Improving Search Results      G. Hubert et al.



                                Outline

1. Context: Operators in Search Queries

2. Methodology: Assessing the effects of query operators

3. Experiments and Results: Potential of effectiveness yielded by operators

4. Conclusion and Future Work




1. Context  Operators in Search Queries                                    G. Hubert et al.



Search Engines Offer Query Operators

                                    Information need
               I'm looking for research projects funded in the DL domain



              Regular query                              Query with operators




    Various Operators
         Quotation marks, Must appear (+), boosting operator (^),
          Boolean operators, proximity operators
     Case 1: What designers of search engines may expect
       Case 2: What users of search engines may believe
       Case 3: What designers of search engines may fear


Usage of Query Operators
    Quantitative Studies
         [Figure: share of queries with operators (y-axis, 0% to 25%) by year (x-axis, 1999 to 2007),
          as reported for Altavista [Silverstein et al., 1999], Excite [Jansen et al., 2000],
          Excite [Spink et al., 2001], and Google+MSN Search+Yahoo! [White and Morris, 2007]]

    Possible Explanations
         Unknown features?
         No improvement observed?
    Qualitative Studies
         Users
              Average users not comfortable with advanced means of searching
               [Jansen et al., 2000]
              Expert users resort to query operators more frequently
               [Hölscher and Strube, 2000; Lucas and Topi, 2002; White and Morris, 2007]


         Information Needs
              More used in dedicated search
               [Jansen and Pooch, 2001]
              Difficulty in finding information (e.g., complex information needs)
               [Aula et al., 2010]


         Appropriateness
              Operators used in a semantically appropriate manner
               [Eastman and Jansen, 2004]

    Effects of Query Operators on Effectiveness



                      Eastman and Jansen studied queries with operators
                         Real users: AOL, Google and MSN Search
                         Operators: AND, OR, MUST APPEAR and PHRASE

                         No statistically significant improvement in P@10




                                   [Eastman and Jansen, 2003]
                      Study on 20% of all queries
                           Expert users
                           Complex needs (Queries with operators)




                                    [Eastman and Jansen, 2003]
                      What about the other 80% of all queries?!
                           Average users
                           Regular queries (no operators)




                                    [Eastman and Jansen, 2001]
2. Methodology  Assessing the effects of query operators            G. Hubert et al.


Our Research Questions



    Q = Do query operators lead to improved search results?




    Q1 = Maximum gain in effectiveness when enriching a query with operators?

    Q2 = Do users succeed in formulating better queries involving operators?





Our Methodology in a Nutshell
    [Figure: the regular query shown alongside its query variants with operators V1, V2, V3, V4, ..., VN]


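Overview of the Methodology
    [Figure: processing pipeline]
         1. Query Variant Generator
              Inputs: query, preOps, postOps
              Output: query variants {v1, ..., vi, ..., vn}
         2. Search Engine
              Inputs: each variant vi, corpus, IR model
              Output: l(vi)
         3. Evaluation Procedure
              Inputs: l(vi), qrels, metrics
              Output: measures of effectiveness

    Legend: the figure distinguishes the usual evaluation framework in IR
    from the components introduced for this study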
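The Query Variant Generator step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `generate_variants` and its parameters are hypothetical, and it assumes that every term may receive any prefix operator combined with any postfix operator, which the slides do not state explicitly. The example terms come from the experiment settings slide.

```python
from itertools import product


def generate_variants(terms, pre_ops, post_ops):
    """Return every query string obtained by decorating each term
    with one prefix operator and one postfix operator.

    Assumption (not confirmed by the slides): a term may carry both a
    prefix and a postfix operator at the same time. The empty string
    stands for "no operator".
    """
    # All (prefix, postfix) decorations available for a single term.
    decorations = list(product(pre_ops, post_ops))
    variants = []
    # Pick one decoration per term, over all combinations of terms.
    for choice in product(decorations, repeat=len(terms)):
        decorated = [pre + term + post
                     for term, (pre, post) in zip(terms, choice)]
        variants.append(" ".join(decorated))
    return variants


variants = generate_variants(
    ["encryption", "equipment", "export"],
    pre_ops=["", "+"],      # no operator, or must appear
    post_ops=["", "^10"],   # no operator, or one boosting weight
)
```

With two prefix choices and two postfix choices there are four decorations per term, so three terms give 4^3 = 64 variants. The undecorated regular query is the first one generated.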
3. Experiments and Results  Potential of effectiveness yielded by operators                      G. Hubert et al.


Experiment Settings
    Standard Test Collections
         TREC-7
         TREC-8

    Query Operators
         Must appear (+)
         Term boosting (^N)

    Variant Generation
         Must appear + only
         Boost ^ only with weights ^10, ^20, ^30, ^40, and ^50
         Both + and ^

         Variant #   Query variants generated with preOps and postOps
         1           encryption       equipment        export
         2           encryption       +equipment       +export
         ...
         124         encryption       +equipment       export^10
         ...
         338         encryption^30    equipment^40     export^50

    Search engine
         Terrier with various models: BM25, DFR_BM25, InL2, PL2, TF_IDF
Results
    TREC-7 per Topic Analysis: Boxplots
         + and ^
         [Figure: boxplots of AP per topic]
    Per Topic Analysis: Boxplot
         [Figure: how to read one boxplot, with example label 32 on the Topics axis]
              y-axis: AP (Average Precision), ticks from 0.1 to 0.4
              x-axis: Topics
              Top: query variant with the highest AP
              Marker: AP of TREC's regular query
              Bottom: query variant with the lowest AP
    TREC-7 per Topic Analysis
         + and ^
         [Figure: boxplots of AP per topic]
         Baseline MAP = 0.1554
         VOP MAP      = 0.2099 (+35.1%)
    TREC-8 per Topic Analysis
         + and ^
         [Figure: boxplots of AP per topic]
         Baseline MAP = 0.1840
         VOP MAP      = 0.2288 (+24.3%)
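The per-topic analysis compares the AP of the regular query with the AP of its best variant, then averages over topics. The sketch below shows one way to compute both figures. It is an illustration under assumptions: the function names, the `runs` and `qrels` data layout, and the reading of the higher MAP as "best variant per topic" are not taken from the slides, and the toy data is invented for the example.

```python
def average_precision(ranking, relevant):
    """AP of one ranked list of document ids against a set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank of each relevant document.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def baseline_and_best_map(runs, qrels):
    """Return (MAP of regular queries, MAP of the best variant per topic).

    runs:  {topic: {variant_name: ranking}}, where the regular query is
           stored under the hypothetical key "regular".
    qrels: {topic: set of relevant document ids}.
    """
    baseline, best = [], []
    for topic, by_variant in runs.items():
        aps = {name: average_precision(ranking, qrels[topic])
               for name, ranking in by_variant.items()}
        baseline.append(aps["regular"])
        # Upper bound: the highest AP reached by any formulation of the topic.
        best.append(max(aps.values()))
    n = len(runs)
    return sum(baseline) / n, sum(best) / n


# Toy data, invented for illustration only.
toy_runs = {
    "t1": {"regular": ["d1", "d2", "d3"], "v1": ["d2", "d1", "d3"]},
    "t2": {"regular": ["d4", "d5"], "v1": ["d5", "d4"]},
}
toy_qrels = {"t1": {"d2"}, "t2": {"d4"}}
toy_baseline_map, toy_best_map = baseline_and_best_map(toy_runs, toy_qrels)
```

Because the maximum is taken over all formulations, including the regular query, the second value can never fall below the first. It measures potential gain, not what a user would obtain in practice.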
    Global Analysis: MAP
          + only

                         TREC-7 MAP                           TREC-8 MAP
     Model         Baseline   VOP      Gain (%)         Baseline   VOP      Gain (%)
     BM25          0.1677     0.1836    9.5**           0.1957     0.2154   10.2*
     DFR_BM25      0.1683     0.1843    9.5**           0.1965     0.2162   10.0*
     InL2          0.1710     0.1852    8.3**           0.1996     0.2172    8.8*
     PL2           0.1554     0.1826   17.5**           0.1840     0.2106   14.5**
     TF_IDF        0.1674     0.1833    9.5**           0.1964     0.2158    9.9**
     Statistical significance is denoted by * for p < 0.05 (** for p < 0.01)
    Global Analysis: MAP
          ^ only

                         TREC-7 MAP                           TREC-8 MAP
     Model         Baseline   VOP      Gain (%)         Baseline   VOP      Gain (%)
     BM25          0.1677     0.2027   20.9**           0.1957     0.2312   18.1**
     DFR_BM25      0.1683     0.2034   20.9**           0.1965     0.2316   17.9**
     InL2          0.1710     0.2059   20.4**           0.1996     0.2352   17.8**
     PL2           0.1554     0.1926   23.9**           0.1840     0.2173   18.1**
     TF_IDF        0.1674     0.2026   21.0**           0.1964     0.2312   17.7**
     Statistical significance is denoted by * for p < 0.05 (** for p < 0.01)
    Global Analysis: MAP
          + and ^

                         TREC-7 MAP                           TREC-8 MAP
     Model         Baseline   VOP      Gain (%)         Baseline   VOP      Gain (%)
     BM25          0.1677     0.2132   27.1**           0.1957     0.2381   21.7**
     DFR_BM25      0.1683     0.2133   26.7**           0.1965     0.2387   21.5**
     InL2          0.1710     0.2144   25.4**           0.1996     0.2407   20.6**
     PL2           0.1554     0.2099   35.1**           0.1840     0.2288   24.3**
     TF_IDF        0.1674     0.2131   27.3**           0.1964     0.2383   21.3**
     Statistical significance is denoted by * for p < 0.05 (** for p < 0.01)
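The percentage column is consistent with a relative improvement of the VOP value over the baseline. The short check below recomputes it for the PL2 rows of the "+ and ^" table. The function name `relative_gain` is hypothetical, and the formula is inferred from the reported numbers rather than stated on the slides.

```python
def relative_gain(baseline_map, variant_map):
    """Relative improvement of variant_map over baseline_map, in percent."""
    return (variant_map - baseline_map) / baseline_map * 100.0


# PL2 rows of the "+ and ^" table.
trec7_gain = relative_gain(0.1554, 0.2099)
trec8_gain = relative_gain(0.1840, 0.2288)
print(f"{trec7_gain:.1f}% {trec8_gain:.1f}%")  # prints 35.1% 24.3%
```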
4. Conclusion and Future Work                           G. Hubert et al.


Conclusions
    H: the Proper Use of Query Operators Improves Search Results

    Methodology to Validate H

    Standard IR Test Collections: TREC-7 and TREC-8

    Must Appear (+) and Boosting Operators (^)

    Findings
       Observed gain up to 35.1%
       Statistically significant
       For all tested IR models and collections



 Users Should Use Query Operators More Often


Future Work
    Short Term
         Experimenting with our methodology in various contexts
             Additional IR collections

             Additional IR models

             Additional query operators




    Medium Term
         Address Q2: Do users succeed in formulating queries with operators,
          so that these lead to a significant gain in effectiveness?
         Study other factors
             Number of terms

             Selection of terms




    Long Term
         Additional dimensions of information
            Geographic IR
           Thank you
