Learning to Choose the Best System Configuration in Information Retrieval: the Case of Repeated Queries
Anthony Bigot (Université de Toulouse, France)
Sébastien Déjean (Université de Toulouse, France)
Josiane Mothe (Université de Toulouse, France)
Abstract: This paper presents a method that automatically decides which system configuration should be used to process a query. The method is developed for the case of repeated queries and implements a new kind of meta-system. It is based on a training process: the meta-system learns the best system configuration to use on a per-query basis. After training, the meta-system knows which configuration should process a given query.
The Learning to Choose method we developed selects the best configurations among many. This selection relies on data analysis applied to system parameter values and their link with system effectiveness. Moreover, we optimize the parameters on a per-query basis. The training phase uses a limited number of document relevance judgments. When the query is repeated, or when an equal query is submitted to the system, the meta-system automatically knows which parameters it should use to process the query. This method suits changing collections, since what is learned is the relationship between a query and the best parameters to use to process it, rather than the relationship between a query and the documents to retrieve.
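As a rough illustration of the idea only, and not the implementation evaluated in this paper, the training and lookup steps can be sketched as follows. The function names, the configuration labels, and the rule used to decide that two queries are equal (same terms, ignoring case and order) are all assumptions made for the example; the score stands for any effectiveness measure computed from the available relevance judgments.

```python
from typing import Dict, Iterable, Tuple


def normalize(query: str) -> str:
    """Hypothetical equality rule: same terms, ignoring case and word order."""
    return " ".join(sorted(query.lower().split()))


def learn_to_choose(
    training_runs: Iterable[Tuple[str, str, float]],
) -> Dict[str, str]:
    """Map each training query to the configuration with the highest score.

    training_runs holds (query, configuration label, effectiveness score)
    triples. On a tie, the configuration seen first is kept.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, config, score in training_runs:
        key = normalize(query)
        if key not in best or score > best[key][1]:
            best[key] = (config, score)
    return {key: config for key, (config, _) in best.items()}


def choose(learned: Dict[str, str], query: str, default_config: str) -> str:
    """Return the learned configuration for a repeated query, else a default."""
    return learned.get(normalize(query), default_config)


# Illustrative data only: labels and scores are made up for the example.
runs = [
    ("jaguar speed", "config_A", 0.30),
    ("jaguar speed", "config_B", 0.45),
    ("Speed jaguar", "config_C", 0.40),
    ("solar panels", "config_A", 0.60),
]
learned = learn_to_choose(runs)
```

What is stored is a query-to-configuration mapping, not a query-to-documents mapping, so under these assumptions the learned table remains usable when the collection changes.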
In this paper, we describe how data analysis can help select, among various configurations, the ones that will be useful. The Learning to Choose method is presented and evaluated using simulated data from TREC campaigns. We show that system performance increases substantially in terms of precision, particularly for queries that are difficult or moderately difficult to answer. The other parameters of the method are also studied.
Keywords: evaluation, information retrieval, learning in IR, meta search, repeated queries, system combination