
Validator and Scorer for Data Science Evaluations DSE and D3M



NIST Data Science Validation and Scoring code

This repository contains the NIST validation and scoring code components for the DSE and D3M evaluations. The DSE evaluation can be found at

Running the tests requires Python 3.6.

Predictions file validation


Requires a local copy of the directory for the problem/dataset that contains:

  • the dataset schema at path_to_score_root/dataset_TEST/datasetDoc.json
  • the problem schema at path_to_score_root/problem_TEST/problemDoc.json
  • the test learningData.csv at path_to_score_root/dataset_TEST/tables/learningData.csv
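Before running the validator, it can help to confirm that a local SCORE directory has this layout. The following is a minimal sketch using only the standard library; the helper name `missing_score_files` is hypothetical and not part of dval.

```python
from pathlib import Path

# Relative paths taken from the requirements listed above.
REQUIRED_FILES = (
    "dataset_TEST/datasetDoc.json",
    "problem_TEST/problemDoc.json",
    "dataset_TEST/tables/learningData.csv",
)


def missing_score_files(path_to_score_root):
    """Return the required files that are absent under path_to_score_root.

    Hypothetical helper, not part of the dval package.
    """
    root = Path(path_to_score_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

An empty list means all three files are present, for example `missing_score_files('test/data/185_baseball_SCORE')`.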

Download the seed datasets at Each problem/dataset has a SCORE folder that contains this structure.


This package works with Python 3.6+ and requires the d3m core package.

To install latest released version:

$ pip install git+

To install a particular release of the package, e.g., v2018.4.28:

$ pip install git+

To install latest development (unreleased) version:

$ pip install git+

CLI Usage

Validate a pipeline log

dval valid_pipelines pipeline_log_file [pipeline_log_file ...]


  • pipeline_log_file: path to the pipeline log file to validate

For example: dval valid_pipelines mylog1.json mylog2.json

In shells like bash, you can also use a glob: dval valid_pipelines *.json

Validate a predictions file

dval valid_predictions -d score_dir predictions_file [predictions_file ...]


  • score_dir: path to the directory described in Section Requirements. Use the SCORE directory of the seed datasets.
  • predictions_file: path to the predictions file to validate

Score a predictions file

dval score -d score_dir [-g ground_truth_file] [--validation | --no-validation] predictions_file [predictions_file ...]


  • score_dir: path to the directory described in Section Requirements. Use the SCORE directory of the seed datasets.
  • ground_truth_file: path to the ground truth file. If absent, will default to score_dir/targets.csv
  • predictions_file: path to the predictions file to score
  • --validation | --no-validation: validation is on by default. Turn it off with --no-validation
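When scoring is driven from a script, the command line above can be assembled programmatically. This is a sketch under the assumption that `dval` is installed and on the PATH; `build_score_command` is a hypothetical helper that only uses the options documented above.

```python
def build_score_command(score_dir, predictions_files,
                        ground_truth_file=None, validation=True):
    """Build the argument list for `dval score`.

    Hypothetical helper, not part of dval; it only assembles the
    options documented in this README.
    """
    cmd = ["dval", "score", "-d", str(score_dir)]
    if ground_truth_file is not None:
        # Without -g, dval defaults to score_dir/targets.csv.
        cmd += ["-g", str(ground_truth_file)]
    if not validation:
        # Validation is on by default, so only the off switch is emitted.
        cmd.append("--no-validation")
    cmd += [str(p) for p in predictions_files]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the standard library.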

Validate a generated problems directory

dval valid_generated_problems ./test/generated_problems/correct_submission/


  • problems_directory: path to the directory containing the generated problems (./test/generated_problems/correct_submission/ in the command above).

Docker usage

Building the docker image

Build the Docker image from the Dockerfile:

git checkout v2018.4.20  # getting a specific version of the code
docker build -t dval .

Running the docker image

Usage is the same as the CLI, except that the command runs inside a Docker container. ⚠️ Remember to mount the data that you want to validate or score into the container.

For example, to validate a predictions.csv file:

docker run -v /hostpath/to/data:/tmp/data dval valid_predictions -d /tmp/data/SCORE /tmp/data/predictions.csv

Code Usage

path_to_score_root = 'test/data/185_baseball_SCORE'
groundtruth_path = 'test/data/185_baseball_SCORE/targets.csv'
result_file_path = 'test/data/185_baseball_SCORE/mitll_predictions.csv'

Option 1: Using the Predictions class

>>> from dval.predictions import Predictions
>>> p = Predictions(result_file_path, path_to_score_root)
>>> p.is_valid()
>>> scores = p.score(groundtruth_path)
>>> scores
[Score(target='Hall_Of_Fame', metric='f1', scorevalue=0.691369766848)]
>>> scores[0].scorevalue

Score is a named tuple defined as follows:

import collections
Score = collections.namedtuple('Score', ['target', 'metric', 'scorevalue'])

If a problem schema describes multiple targets and/or multiple metrics, the .score() function will return a list of Score objects, one for each combination of (target, metric).
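With several (target, metric) combinations, it can be convenient to index the returned list by key. The sketch below reuses the Score definition above; the target names, metric names, and score values are made up for illustration and are not real dval output.

```python
import collections

Score = collections.namedtuple('Score', ['target', 'metric', 'scorevalue'])

# Illustrative values only; these are not real dval results.
scores = [
    Score(target='target_a', metric='f1', scorevalue=0.5),
    Score(target='target_a', metric='accuracy', scorevalue=0.75),
    Score(target='target_b', metric='f1', scorevalue=0.25),
]

# Index the list by (target, metric) for direct lookup.
by_key = {(s.target, s.metric): s.scorevalue for s in scores}

print(by_key[('target_a', 'accuracy')])  # prints 0.75
```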

Option 2: Using the wrapper functions

>>> from dval.predictions import is_predictions_file_valid, score_predictions_file
>>> is_predictions_file_valid(result_file_path, path_to_score_root)
>>> scores = score_predictions_file(result_file_path, path_to_score_root, groundtruth_path)
>>> scores
[Score(target='Hall_Of_Fame', metric='f1', scorevalue=0.691369766848)]
>>> scores[0].scorevalue


The validation code performs the following checks on the predictions file:

  • Checks that the file exists and is readable
  • Checks the header (needs to be indexName, targetName1, [targetName2, ...])
  • Checks the target types (from the data field types in the dataset schema)
  • Checks the length of the index
  • Compares the index with the expected index
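To make these checks concrete, here is a simplified sketch of the readability, header, and index checks using only the standard library. It is not the dval implementation: `check_predictions` is a hypothetical function, the target type check is omitted, and comparing the index in order is an assumption.

```python
import csv


def check_predictions(predictions_path, expected_header, expected_index):
    """Return a list of problems found; an empty list means all checks passed.

    Simplified illustration, not the dval implementation.
    """
    try:
        with open(predictions_path, newline='') as f:
            rows = list(csv.reader(f))
    except OSError as exc:
        return ['cannot read file: {}'.format(exc)]

    if not rows:
        return ['file is empty']

    errors = []
    header, body = rows[0], rows[1:]
    if header != list(expected_header):
        errors.append('unexpected header: {}'.format(header))

    # The first column holds the index; blank lines are skipped.
    index = [row[0] for row in body if row]
    if len(index) != len(expected_index):
        errors.append('expected {} rows, found {}'.format(
            len(expected_index), len(index)))
    elif index != [str(i) for i in expected_index]:
        # Assumption: the index must match in the same order.
        errors.append('index values differ from the expected index')
    return errors
```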

Pipeline Validation


>>> from dval.pipeline_logs_validator import Pipeline
>>> Pipeline('path/to/my.json').is_valid()


The validation code performs the following checks on pipeline log files:

  • Checks that the file exists and is readable
  • Checks that the file is valid JSON
  • Checks that all required fields are present
  • Checks that primitives is a JSON list with no duplicates
  • Checks that pipeline_rank is an integer
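The checks above can be sketched as follows with the standard `json` module. This is an illustration, not the dval implementation: `check_pipeline_log` is hypothetical, the full list of required fields is not given here, and the sketch assumes that `primitives` and `pipeline_rank` are top-level keys of a JSON object.

```python
import json


def check_pipeline_log(path, required_fields=('primitives', 'pipeline_rank')):
    """Return a list of problems found; an empty list means all checks passed.

    Simplified illustration, not the dval implementation. The default
    required_fields only covers the two fields named in this README.
    """
    try:
        with open(path) as f:
            log = json.load(f)
    except OSError as exc:
        return ['cannot read file: {}'.format(exc)]
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        return ['invalid JSON: {}'.format(exc)]

    # Assumption: the log is a JSON object with top-level fields.
    if not isinstance(log, dict):
        return ['top-level JSON value must be an object']

    errors = ['missing field: {}'.format(k)
              for k in required_fields if k not in log]

    if 'primitives' in log:
        primitives = log['primitives']
        if not isinstance(primitives, list):
            errors.append('primitives must be a list')
        else:
            # Serialize each entry so unhashable values can be compared.
            unique = {json.dumps(p, sort_keys=True) for p in primitives}
            if len(unique) != len(primitives):
                errors.append('primitives contains duplicates')

    if 'pipeline_rank' in log:
        rank = log['pipeline_rank']
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(rank, bool) or not isinstance(rank, int):
            errors.append('pipeline_rank must be an integer')
    return errors
```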

Run Tests

To run all tests: pytest

The test suite uses the pytest package, and code coverage is measured with the coverage package. Both can be installed with pip.

The following commands run all of the unit tests and write the code coverage report to htmlcov/index.html:

coverage run --branch --source=./dval -m pytest -s test/ -v
coverage report -m
coverage html


Docs of the latest version of the master branch are available here (inside NIST only):

Docs were built using sphinx and autodoc with the following commands at the root directory:

sphinx-apidoc -o docs/api dval
sphinx-build -b html docs/ html_docs

The generated web docs can then be opened at html_docs/index.html



The license is documented in the LICENSE file and on the NIST website.

Versions and releases:



Please send any issues, questions, or comments to

