pytrec_eval is a Python interface to TREC's evaluation tool, trec_eval. It is an attempt to stop the cultivation of custom implementations of Information Retrieval evaluation measures for the Python programming language.
The module was developed using Python 3.5. You need a Python distribution that comes with development headers. In addition to the Python standard library, numpy and scipy are required.
Installation is simple and should be relatively painless if your Python environment is functioning correctly (see below for FAQs).
# Clone the source.
git clone https://github.com/cvangysel/pytrec_eval.git
cd pytrec_eval
# Pull in the trec_eval source.
git submodule init
git submodule update
# Install dependencies.
pip install -r requirements.txt
# Install pytrec_eval.
python setup.py install
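A quick way to check that the extension built correctly is simply to import the module. The snippet below is a minimal sketch: it also prints a few of the measure names exposed via pytrec_eval.supported_measures, assuming that attribute is present in your installed version.
import pytrec_eval

# A successful import means the C extension compiled and linked against the
# bundled trec_eval source. supported_measures (if present in your version)
# lists the measure names the wrapper knows about.
print(sorted(pytrec_eval.supported_measures)[:5])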
Check out the examples that simulate the standard trec_eval front-end and that compute statistical significance between two runs; a minimal significance-testing sketch also follows the usage example below.
To get a grasp of how simple the module is to use, check this out:
import pytrec_eval
import json

qrel = {
    'q1': {
        'd1': 0,
        'd2': 1,
        'd3': 0,
    },
    'q2': {
        'd2': 1,
        'd3': 1,
    },
}

run = {
    'q1': {
        'd1': 1.0,
        'd2': 0.0,
        'd3': 1.5,
    },
    'q2': {
        'd1': 1.5,
        'd2': 0.2,
        'd3': 0.5,
    }
}

evaluator = pytrec_eval.RelevanceEvaluator(
    qrel, {'map', 'ndcg'})

print(json.dumps(evaluator.evaluate(run), indent=1))
The above snippet will return a data structure that contains the requested evaluation measures for queries q1 and q2:
{
 "q1": {
  "ndcg": 0.5,
  "map": 0.3333333333333333
 },
 "q2": {
  "ndcg": 0.6934264036172708,
  "map": 0.5833333333333333
 }
}
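The statistical significance example referenced above boils down to comparing per-query scores from two runs. Below is a minimal sketch of that idea, not the bundled example itself: it reuses the qrel defined above, evaluates two hypothetical runs (run_a and run_b), and feeds the per-query MAP scores into scipy's paired t-test (scipy.stats.ttest_rel).
import pytrec_eval
from scipy import stats

# Two hypothetical runs over the same topics (same shape as `run` above).
run_a = {'q1': {'d1': 1.0, 'd2': 0.0, 'd3': 1.5},
         'q2': {'d1': 1.5, 'd2': 0.2, 'd3': 0.5}}
run_b = {'q1': {'d1': 0.1, 'd2': 2.0, 'd3': 0.5},
         'q2': {'d1': 0.2, 'd2': 1.5, 'd3': 1.0}}

# Reuse the qrel dictionary from the snippet above.
evaluator = pytrec_eval.RelevanceEvaluator(qrel, {'map'})

results_a = evaluator.evaluate(run_a)
results_b = evaluator.evaluate(run_b)

# Per-query MAP scores, aligned on the same query order.
queries = sorted(results_a)
map_a = [results_a[q]['map'] for q in queries]
map_b = [results_b[q]['map'] for q in queries]

# Paired two-sided t-test over the per-query scores. With only two toy
# queries the p-value is not meaningful; a real comparison needs many topics.
t_statistic, p_value = stats.ttest_rel(map_a, map_b)
print(t_statistic, p_value)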
For more like this, see the example that uses parametrized evaluation measures.
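As a rough illustration of parametrized measures (a minimal sketch, not the bundled example): requesting a base measure such as 'P' or 'ndcg_cut' is assumed to produce the usual family of cutoff values (P_5, P_10, ..., ndcg_cut_5, ...), keyed accordingly in the result, mirroring plain trec_eval.
import json
import pytrec_eval

# Reuses the qrel and run dictionaries from the snippet above.
# Passing a base measure such as 'P' or 'ndcg_cut' is assumed to compute the
# usual family of cutoffs (P_5, P_10, ..., ndcg_cut_5, ...), as plain
# trec_eval does.
evaluator = pytrec_eval.RelevanceEvaluator(qrel, {'P', 'ndcg_cut'})
results = evaluator.evaluate(run)

# Pull out a couple of the parametrized values for the first query.
print(json.dumps({measure: value
                  for measure, value in results['q1'].items()
                  if measure in ('P_5', 'ndcg_cut_5')}, indent=1))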
Since the module's first release, no questions have been asked frequently enough to deserve a spot in this section.
pytrec_eval is licensed under the MIT license. Please note that trec_eval is licensed separately. If you modify pytrec_eval in any way, please link back to this repository.