Data Releases (latest, v2.3 – official Y3 benchmark)

We provide the following data sets under a Creative Commons Attribution-ShareAlike 3.0 Unported License. They are based on content extracted from Wikipedia, which is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License, and on the Textbook Question Answering corpus from AI2.

Please cite this data set as

Laura Dietz, Ben Gamari. 
"TREC CAR 2.3: A Data Set for Complex Answer Retrieval". 
Version 2.3, 2019.

This is the official dataset for the Y3 evaluation.

It contains new test queries (benchmarkY3test), as well as training data from automatic benchmarks and manual assessments from Y1 and Y2.

The paragraph IDs and entity IDs have not changed between this version and v2.0. The paragraphCorpus and allButBenchmark are the same as from v2.1.

For follow-up datasets and Wikipedia dumps, see v2.4.

Submission URL

Submission concluded.

Validation script and the required car.topics file: use these to validate that your run files are in the correct format.
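As a rough illustration of what such a format check involves, the sketch below tests each line of a run file against the standard six-column trec_eval run layout (query ID, the literal Q0, document ID, rank, score, run name). It is not the official validation script: the function name check_run_lines and its optional topics parameter are hypothetical, and how the official script uses car.topics is an assumption here.

```python
import math


def check_run_lines(lines, topics=None):
    """Return a list of (line_number, message) problems found in a run file.

    Assumes the standard trec_eval run layout:
        query_id Q0 doc_id rank score run_name
    `topics` is an optional set of allowed query IDs (hypothetical parameter,
    standing in for whatever car.topics provides).
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 6:
            problems.append((number, "expected 6 columns, found %d" % len(fields)))
            continue
        query_id, q0, doc_id, rank, score, run_name = fields
        if q0 != "Q0":
            problems.append((number, "second column should be Q0"))
        if not (rank.isascii() and rank.isdigit()):
            problems.append((number, "rank is not a non-negative integer"))
        try:
            if not math.isfinite(float(score)):
                problems.append((number, "score is not finite"))
        except ValueError:
            problems.append((number, "score is not a number"))
        if topics is not None and query_id not in topics:
            problems.append((number, "unknown query id " + query_id))
    return problems
```

Passing an open file object as lines works, since iterating over a text file yields one line at a time. An empty result means no problems were found by this sketch, not that the run would pass the official validation.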

Data sets

All archives use XZ or GZ compression; the stated data size refers to the uncompressed data.
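Both compression formats can be read with the Python standard library alone. The following is a minimal sketch that picks the decompressor from the file extension; the helper name open_archive is hypothetical, and the sketch assumes the file names end in .xz or .gz.

```python
import gzip
import lzma


def open_archive(path):
    """Open an .xz or .gz file for binary reading, chosen by file extension.

    Files with any other extension are opened as plain, uncompressed files.
    """
    if path.endswith(".xz"):
        return lzma.open(path, "rb")
    if path.endswith(".gz"):
        return gzip.open(path, "rb")
    return open(path, "rb")
```

If a download turns out to be a tar archive, tarfile.open(path, "r:*") from the standard library detects either compression automatically.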

Please use these files for your TREC CAR 2019 submissions!

Code and Data for Validation and Population

Y3 Results

Y3 Releases

The dataset for Y3 is based on the TQA dataset of AI2, further described here:

Kembhavi, Aniruddha, et al. “Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension.” CVPR. Vol. 2. 2017.

The following previous releases of v2.0 and v2.1 are still valid for Y3:

Corpus of paragraphs

Knowledge base

extra-large training set




obsolete datasets (which also work, but are not recommended)

Unless noted otherwise, files are based on a Wikipedia dump from December 20, 2016. BenchmarkY2test is based on files from AI2's TQA corpus and a Wikipedia dump from June 2018.

Explanation of Filenames

The following kinds of data derivatives are provided


Qrels (trec_eval-compatible qrels files) which are automatically derived from Articles (to be complemented by human judgments).
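For readers unfamiliar with the format, a trec_eval-compatible qrels file has four whitespace-separated columns per line: query ID, an iteration field (conventionally 0 and ignored), document ID, and a relevance grade. The sketch below loads such lines into a nested dictionary; the function name parse_qrels is hypothetical, and it assumes that IDs contain no whitespace.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Map each query ID to a dict of {document ID: relevance grade}.

    Expects trec_eval qrels lines: query_id iteration doc_id relevance.
    Lines that do not have exactly four columns are skipped.
    """
    qrels = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        query_id, _iteration, doc_id, relevance = fields
        qrels[query_id][doc_id] = int(relevance)
    return dict(qrels)
```

An open text file can be passed directly as lines, so a whole qrels file can be loaded with a single call inside a with open(...) block.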

Test topics

For evaluation, only a *cbor.outlines file will be distributed, and a decision will have been made as to which of the three qrels files will be used.


Please submit any issues with trec-car-tools or with the provided data sets using the issue tracker on GitHub.

Bindings for Java

Please use release version 17 of trec-car-tools-java.

Alternative 1: Use Maven (highly recommended!)



Alternative 2: Check out the source code

Git repository

Please check out tag “17”!


An example of how to load TREC CAR data from Java can be found on GitHub in trec-car-tools/trec-car-tools-example.

Bindings for Python 3

Alternative 1: Install via pypi and pip

pip install trec-car-tools

More info on

Versions 2.3 and 2.1 should both work for benchmarkY3.

Alternative 2: Check out from source code

Python support tools for reading this data can be found here: branch v2.1, directory python3. They are also available on PyPI and Anaconda Cloud as trec-car-tools v2.1.

  1. Clone the GitHub repository
  2. cd into python3
  3. Run python setup.py install

Look at for an example on how to access the data.

Old Releases

For discontinued data releases see:

Creative Commons License
TREC-CAR Dataset by Laura Dietz, Ben Gamari, Jeff Dalton is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Based on a work at