TREC CAR

Data Releases (latest, v1.5)

We provide the following data sets under a Creative Commons Attribution-ShareAlike 3.0 Unported License. They are based on content extracted from Wikipedia, which is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.

Please cite this data set as

Laura Dietz, Ben Gamari. 
"TREC CAR: A Data Set for Complex Answer Retrieval". 
Version 1.5, 2017. 
http://trec-car.cs.unh.edu

Data sets

All archives use XZ compression; datasize refers to the uncompressed data.
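Because the archives are XZ-compressed, they can be unpacked with Python's standard lzma module. The following is a minimal sketch, not part of the official tooling; the function name decompress_xz and the file names in the usage note are hypothetical. It streams the data in chunks so that a large file does not have to fit in memory.

```python
import lzma
import shutil
from pathlib import Path


def decompress_xz(src: str, dst: str, chunk_size: int = 1024 * 1024) -> int:
    """Stream-decompress an .xz file to dst; return the uncompressed size in bytes."""
    # lzma.open yields decompressed bytes; copyfileobj moves them chunk by chunk.
    with lzma.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return Path(dst).stat().st_size
```

A call might look like decompress_xz("download.xz", "download"), with both names standing in for a real file. If a download turns out to be a .tar.xz bundle rather than a single compressed file, the standard tarfile module opened with mode "r:xz" would be the analogous choice.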

For discontinued data releases see:

Support Tools

Support tools for reading this data can be found here: https://github.com/TREMA-UNH/trec-car-tools.

Support tools to work with the v1.4 release will be maintained in the v1.5 branch.

Earlier versions are not supported.

Contents

The following kinds of data derivatives are provided:

Data:

Qrels (trec_eval-compatible qrels files) which are automatically derived from Articles (to be complemented by human judgments).
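For orientation, a trec_eval-compatible qrels file has one judgment per line with four whitespace-separated fields: query ID, an iteration field that trec_eval ignores, document ID, and relevance grade. The sketch below reads such lines into a nested dictionary. The function name parse_qrels and the IDs in the usage note are hypothetical and do not reflect the actual identifiers in this data set.

```python
from collections import defaultdict
from typing import Dict, Iterable


def parse_qrels(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Map each query ID to a {document ID: relevance grade} dictionary."""
    qrels: Dict[str, Dict[str, int]] = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Standard qrels columns: query, iteration (unused), document, relevance.
        query_id, _iteration, doc_id, relevance = line.split()
        qrels[query_id][doc_id] = int(relevance)
    return dict(qrels)
```

For example, parse_qrels(["q1 0 d1 1", "q1 0 d2 0"]) yields a single entry for q1 that maps d1 to 1 and d2 to 0. An open file handle can be passed in place of the list, since the function only iterates over lines.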

Test topics

For evaluation, only a *cbor.outlines file will be distributed, and a decision will have been made as to which of the three qrels files will be used.

Issues

Please submit any issues with trec-car-tools or the provided data sets using the issue tracker on GitHub.

Creative Commons License
TREC-CAR Dataset by Laura Dietz, Ben Gamari is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Based on a work at www.wikipedia.org.