Merge branch 'master' into ad/table-position
po09i authored Apr 16, 2024
2 parents 4a171a0 + 09ba933 commit 1d4f0ca
Showing 9 changed files with 240 additions and 57 deletions.
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -3,7 +3,7 @@
exclude: 'tools/schemacode/bidsschematools/tests/data/broken_dataset_description.json'
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
rev: v4.6.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
@@ -13,7 +13,7 @@ repos:
- id: check-added-large-files
- id: check-case-conflict
- repo: https://github.com/psf/black
rev: 24.3.0
rev: 24.4.0
hooks:
- id: black
files: ^tools/(?!schemacode)
4 changes: 2 additions & 2 deletions mkdocs.yml
@@ -12,7 +12,7 @@ nav:
- Electroencephalography: modality-specific-files/electroencephalography.md
- Intracranial Electroencephalography: modality-specific-files/intracranial-electroencephalography.md
- Task events: modality-specific-files/task-events.md
- Physiological and other continuous recordings: modality-specific-files/physiological-and-other-continuous-recordings.md
- Physiological recordings: modality-specific-files/physiological-recordings.md
- Behavioral experiments (with no neural recordings): modality-specific-files/behavioral-experiments.md
- Genetic Descriptor: modality-specific-files/genetic-descriptor.md
- Positron Emission Tomography: modality-specific-files/positron-emission-tomography.md
@@ -122,7 +122,7 @@ plugins:
"04-modality-specific-files/03-electroencephalography.md": "modality-specific-files/electroencephalography.md"
"04-modality-specific-files/04-intracranial-electroencephalography.md": "modality-specific-files/intracranial-electroencephalography.md"
"04-modality-specific-files/05-task-events.md": "modality-specific-files/task-events.md"
"04-modality-specific-files/06-physiological-and-other-continuous-recordings.md": "modality-specific-files/physiological-and-other-continuous-recordings.md"
"04-modality-specific-files/06-physiological-and-other-continuous-recordings.md": "modality-specific-files/physiological-recordings.md"
"04-modality-specific-files/07-behavioral-experiments.md": "modality-specific-files/behavioral-experiments.md"
"04-modality-specific-files/08-genetic-descriptor.md": "modality-specific-files/genetic-descriptor.md"
"04-modality-specific-files/09-positron-emission-tomography.md": "modality-specific-files/positron-emission-tomography.md"
82 changes: 63 additions & 19 deletions src/common-principles.md
@@ -430,36 +430,54 @@ NIfTI header.
### Tabular files
Tabular data MUST be saved as tab delimited values (`.tsv`) files, that is, CSV
files where commas are replaced by tabs. Tabs MUST be true tab characters and
MUST NOT be a series of space characters. Each TSV file MUST start with a header
line listing the names of all columns (with the exception of
[physiological and other continuous recordings](modality-specific-files/physiological-and-other-continuous-recordings.md)
as well as [motion recording data](modality-specific-files/motion.md)).
Tabular data MUST be saved as plain-text, tab-delimited values (TSV) files
(with [extension `.tsv`](glossary.md#tsv-extensions)),
that is, [CSV files](https://en.wikipedia.org/wiki/Comma-separated_values) where commas are replaced by tab characters.
Tabs MUST be true tab characters and MUST NOT be a series of space characters.
Tabular data such as continuous physiology recordings typically containing
large numbers of rows MAY be saved as
[compressed tabular files (with extension `.tsv.gz`)](#compressed-tabular-files),
which are introduced below.
Plain-text TSV and compressed TSV are not interchangeable, that is, each section
of the specification prescribes which one MUST be used for the data type at
hand.
Each TSV file MUST start with a header line listing the names of all columns
with two exceptions:
1. [compressed tabular files](#compressed-tabular-files),
for which column names are defined in a sidecar metadata
[JSON object](https://www.json.org/json-en.html) described below; and
1. [motion recording data](modality-specific-files/motion.md),
which use plain-text TSV and columns are defined as described
in its corresponding section of the specifications.
It is RECOMMENDED that the column names in the header of the TSV file are
written in [`snake_case`](https://en.wikipedia.org/wiki/Snake_case) with the
first letter in lower case (for example, `variable_name`, not `Variable_name`).
As for all other data in the TSV files, column names MUST be separated with tabs.
Column names defined in the header MUST be separated with tabs as for the data contents.
Furthermore, column names MUST NOT be blank (that is, an empty string) and MUST NOT
be duplicated within a single TSV file.
String values containing tabs MUST be escaped using double
quotes. Missing and non-applicable values MUST be coded as `n/a`. Numerical
values MUST employ the dot (`.`) as decimal separator and MAY be specified
String values containing tabs MUST be escaped using double quotes.
Missing and non-applicable values MUST be coded as `n/a`.
Numerical values MUST employ the dot (`.`) as decimal separator and MAY be specified
in scientific notation, using `e` or `E` to separate the significand from the
exponent. TSV files MUST be in UTF-8 encoding.
exponent.
TSV files MUST be in UTF-8 encoding.
Example:
```Text
onset duration response_time correct stop_trial go_trial
200 200 0 n/a n/a n/a
onset duration response_time trial_type trial_extra
200 20.0 15.8 word 中国人
240 5.0 17.34e-1 visual n/a
```

**Note**: The TSV examples in this document (like the one above this note)
are occasionally formatted using space characters instead of tabs to improve
human readability.
Directly copying and then pasting these examples from the specification
for use in new BIDS datasets can lead to errors and is discouraged.
!!! warning "Attention"

The TSV examples in this document (like the one above this note) are occasionally
formatted using space characters instead of tabs to improve human readability.
Directly copying and then pasting these examples from the specification
for use in new BIDS datasets can lead to errors and is discouraged.
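
The header and value rules above can be sketched as a small check. This is only an illustrative sketch, not part of the commit or of any validator: the function names `check_tsv_header` and `parse_value` are hypothetical, and the snake_case pattern is one possible reading of the RECOMMENDED naming.

```python
import re

# Assumed pattern for snake_case names that start with a lower-case letter.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")
# Dot as decimal separator, optional exponent introduced by `e` or `E`.
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def check_tsv_header(text: str) -> list[str]:
    """Return a list of problems found in the header line of a plain-text TSV."""
    lines = text.splitlines()
    if not lines:
        return ["file is empty, so it has no header line"]
    names = lines[0].split("\t")
    problems = []
    if "" in names:
        problems.append("blank column name")
    if len(set(names)) != len(names):
        problems.append("duplicated column name")
    for name in names:
        if name and not SNAKE_CASE.fullmatch(name):
            problems.append(f"not snake_case (RECOMMENDED only): {name}")
    return problems


def parse_value(cell: str):
    """Map `n/a` to None, numeric cells to float, and leave other cells as strings."""
    if cell == "n/a":
        return None
    if NUMBER.fullmatch(cell):
        return float(cell)
    return cell
```

For instance, `parse_value("17.34e-1")` yields a float, while `parse_value("n/a")` yields `None`.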

Tabular files MAY be optionally accompanied by a simple data dictionary
in the form of a JSON [object](https://www.json.org/json-en.html)
@@ -536,12 +554,38 @@ like in the example below.
"F": {
"Description": "Female",
"TermURL": "https://www.ncbi.nlm.nih.gov/mesh/68005260"
},
}
}
}
}
```

### Compressed tabular files

Large tabular information, such as physiological recordings, MUST be stored with
[compressed tab-delineated (TSV.GZ) files](glossary.md#tsvgz-extensions) when
so established by the specifications.
Rules for formatting plain-text tabular files apply to TSVGZ files with three exceptions:

1. The contents of TSVGZ files MUST be compressed with
[gzip](https://datatracker.ietf.org/doc/html/rfc1952).
1. Compressed tabular files MUST NOT contain a header in the first row
indicating the column names.
1. TSVGZ files MUST have an associated JSON file that defines the columns in the tabular file.

!!! warning "Attention"

In contrast to plain-text TSV files,
compressed tabular files MUST NOT include a header line.
Column names MUST be provided in the JSON file with the
[`Columns`](glossary.md#columns-metadata) field.
Each column MAY additionally be described with a column description,
as described in [Tabular files](#tabular-files).

TSVGZ are header-less to improve compatibility with existing software
(for example, FSL, or PNM), and to facilitate the support for other file formats
in the future.
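
As an illustration of how a reader might consume such a file, the sketch below loads a headerless `.tsv.gz` and takes its column names from the `Columns` field of the sidecar. It is a minimal sketch using only the Python standard library; the function name `read_compressed_tsv`, the file names, and the column names `cardiac` and `respiratory` are assumptions made for the demo, and the demo sidecar omits every field other than `Columns`.

```python
import csv
import gzip
import json
import tempfile
from pathlib import Path


def read_compressed_tsv(tsv_gz_path, json_path):
    """Read a headerless .tsv.gz file, naming columns from the sidecar `Columns` field."""
    sidecar = json.loads(Path(json_path).read_text(encoding="utf-8"))
    columns = sidecar["Columns"]
    if "" in columns or len(set(columns)) != len(columns):
        raise ValueError("column names must be non-blank and unique")
    rows = []
    with gzip.open(tsv_gz_path, mode="rt", encoding="utf-8", newline="") as handle:
        # Every line is data: there is no header row to skip.
        for values in csv.reader(handle, delimiter="\t"):
            if len(values) != len(columns):
                raise ValueError("row length does not match the `Columns` field")
            rows.append(dict(zip(columns, values)))
    return columns, rows


# Demo with hypothetical file and column names, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tsv_gz = Path(tmp) / "sub-01_task-nback_physio.tsv.gz"
    sidecar_path = Path(tmp) / "sub-01_task-nback_physio.json"
    sidecar_path.write_text(
        json.dumps({"Columns": ["cardiac", "respiratory"]}), encoding="utf-8"
    )
    with gzip.open(tsv_gz, mode="wt", encoding="utf-8", newline="") as handle:
        handle.write("34\t110\n44\t112\n")
    columns, rows = read_compressed_tsv(tsv_gz, sidecar_path)
```

Values are returned as strings here; converting them to numbers is left to the caller.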

### Key-value files (dictionaries)

JavaScript Object Notation (JSON) files MUST be used for storing key-value
@@ -1,13 +1,9 @@
# Physiological and other continuous recordings
# Physiological recordings

Physiological recordings such as cardiac and respiratory signals and other
continuous measures (such as parameters of a film or audio stimuli) MAY be
specified using two files:

1. a [gzip](https://datatracker.ietf.org/doc/html/rfc1952)
compressed TSV file with data (without header line)

1. a JSON file for storing metadata fields (see below)
Physiological recordings such as cardiac and respiratory signals MAY be
specified using a [compressed tabular file](../common-principles.md#compressed-tabular-files)
([TSV.GZ file](../glossary.md#tsvgz-extensions)) and a corresponding
JSON file for storing metadata fields (see below).

!!! example "Example datasets"

@@ -25,8 +21,6 @@ sub-<label>/[ses-<label>/]
<datatype>/
<matches>[_recording-<label>]_physio.tsv.gz
<matches>[_recording-<label>]_physio.json
<matches>[_recording-<label>]_stim.tsv.gz
<matches>[_recording-<label>]_stim.json
```

For the template directory name, `<datatype>` can correspond to any data
@@ -38,8 +32,12 @@ before the suffix.
For example for the file `sub-control01_task-nback_run-1_bold.nii.gz`,
`<matches>` would correspond to `sub-control01_task-nback_run-1`.

Note that when supplying a `*_<physio|stim>.tsv.gz` file, an accompanying
`*_<physio|stim>.json` MUST be supplied as well.
!!! warning "Caution"

`<matches>_physio.tsv.gz` files MUST NOT include a header line, as established by the
[common-principles](../common-principles.md#compressed-tabular-files).
As a result, when supplying a `<matches>_physio.tsv.gz` file, an accompanying
`<matches>_physio.json` MUST be present to indicate the column names.

The [`recording-<label>`](../appendices/entities.md#recording)
entity MAY be used to distinguish between several recording files.
@@ -48,10 +46,21 @@ the eyetracking data in a certain sampling frequency, and
`sub-01_task-bart_recording-breathing_physio.tsv.gz` to contain respiratory
measurements in a different sampling frequency.

Physiological recordings (including eyetracking) SHOULD use the `_physio`
suffix, and signals related to the stimulus SHOULD use `_stim` suffix.
Physiological recordings (including eyetracking) MUST use the `_physio` suffix.

The following tables specify metadata fields for the `*_physio.json` file.

<!-- This block generates a metadata table.
These tables are defined in
src/schema/rules/sidecars
The definitions of the fields specified in these tables may be found in
src/schema/objects/metadata.yaml
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_sidecar_table(["continuous.Continuous"]) }}

The following table specifies metadata fields for the `*_<physio|stim>.json` file.
## Hardware information

<!-- This block generates a metadata table.
These tables are defined in
@@ -61,20 +70,11 @@ The definitions of the fields specified in these tables may be found in
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_sidecar_table(["continuous.Continuous", "continuous.Physio"]) }}
{{ MACROS___make_sidecar_table(["continuous.PhysioHardware"]) }}

Additional metadata may be included as in
[any TSV file](../common-principles.md#tabular-files) to specify, for
example, the units of the recorded time series.
Please note that, in contrast to other TSV files in BIDS, the TSV files specified
for physiological and other continuous recordings *do not* include a header
line.
Instead the name of columns are specified in the JSON file (see `Columns` field).
This is to improve compatibility with existing software (for example, FSL, PNM)
as well as to make support for other file formats possible in the future.
As in any TSV file, column names MUST NOT be blank (that is, an empty string),
and MUST NOT be duplicated within a single JSON file describing a headerless
TSV file.

Example `*_physio.tsv.gz`:

@@ -168,13 +168,8 @@ stored in separate files
(and the [`recording-<label>`](../appendices/entities.md#recording)
entity MAY be used to distinguish these files).

If the same continuous recording has been used for all subjects (for example in
the case where they all watched the same movie), one file MAY be used and
placed in the root directory.
For example, `task-movie_stim.tsv.gz`

For motion parameters acquired from MRI scanner side motion correction, the
`_physio` suffix SHOULD be used.
`_physio` suffix MUST be used.

For multi-echo data, a given `physio.tsv` file is applicable to all echos of
a particular run.
89 changes: 88 additions & 1 deletion src/modality-specific-files/task-events.md
@@ -42,6 +42,8 @@ and a guide for using macros can be found at
-->
{{ MACROS___make_columns_table("task.TaskEvents") }}

The content of `events.tsv` files SHOULD be sorted by values in the `onset` column.
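
One way to check this recommendation is sketched below. The helper name `onsets_are_sorted` is hypothetical, and the sketch assumes that every `onset` value is numeric.

```python
def onsets_are_sorted(tsv_text: str) -> bool:
    """Return True if rows of an events TSV are in non-decreasing `onset` order."""
    lines = [line for line in tsv_text.splitlines() if line]
    header = lines[0].split("\t")
    onset_index = header.index("onset")
    onsets = [float(line.split("\t")[onset_index]) for line in lines[1:]]
    # Compare each onset with the one that follows it.
    return all(earlier <= later for earlier, later in zip(onsets, onsets[1:]))
```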

Note for MRI data:
If any acquired scans have been discarded before forming the imaging data file,
ensure that an `onset` of 0 corresponds to the time the first image was stored.
@@ -124,7 +126,7 @@ Note that in the example above:

1. The `channel` column contains a list of values that are separated
by a delimiter (`|`), as is declared in the `Delimiter` metadata
field of the `events.json file.
field of the `events.json` file.
Thus, the channels related to the event in the third row of the example
are called `F,1`, `F,2`, and `Cz`.

@@ -309,3 +311,88 @@ in the accompanying JSON sidecar as follows (based on the example of the previou
"VisionCorrection": "lenses"
}
```

### Continuously-sampled, stimulus-related signals

!!! example "Example datasets"

The following [BIDS-Examples](https://bids-standard.github.io/bids-examples/#dataset-index)
showcase stimulus-related signals and may be used as a reference
when curating a new dataset:

- ["synthetic" example dataset](https://github.com/bids-standard/bids-examples/tree/master/synthetic).

Signals related to stimuli (such as parameters of a film or audio stimuli) that are
evenly recorded at a constant sampling frequency MUST be specified using a
[compressed tabular file](../common-principles.md#compressed-tabular-files)
([TSV.GZ file](../glossary.md#tsvgz-extensions)) and a corresponding
JSON file for storing metadata fields (see below).

Template:

```Text
sub-<label>/[ses-<label>/]
<datatype>/
<matches>_stim.tsv.gz
<matches>_stim.json
```

For the template directory name, `<datatype>` can correspond to any data
recording modality.

In the template filenames, the `<matches>` part corresponds to task filename
before the suffix.
For example for the file `sub-control01_task-nback_run-1_bold.nii.gz`,
`<matches>` would correspond to `sub-control01_task-nback_run-1`.

!!! warning "Caution"

`<matches>_stim.tsv.gz` files MUST NOT include a header line,
as established by the [common-principles](../common-principles.md#compressed-tabular-files).
As a result, when supplying a `<matches>_stim.tsv.gz` file, an accompanying
`<matches>_stim.json` MUST be present to indicate the column names.

If the same continuous recording has been used for all subjects (for example in
the case where they all watched the same movie), one file placed in the
root directory (for example, `<root>/task-movie_stim.<tsv.gz|json>`) MAY be used
and will apply to all `<matches>_task-movie_<matches>_<suffix>.<ext>` files.
In the following example, the two `task-nback_stim.<json|tsv.gz>` apply
to all the `task-nback` runs across the two available subjects:

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
"sub-01": {
"func": {
"sub-01_task-nback_run-1_bold.nii.gz": "",
"sub-01_task-nback_run-2_bold.nii.gz": "",
},
},
"sub-02": {
"func": {
"sub-02_task-nback_run-1_bold.nii.gz": "",
"sub-02_task-nback_run-2_bold.nii.gz": "",
},
},
"task-nback_stim.json": "",
"task-nback_stim.tsv.gz": "",
}) }}
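
The applicability rule in this example can be approximated by comparing filename entities. The sketch below is a deliberate simplification for illustration, not the full set of rules that governs root-level files: it treats a root-level file as applying to a data file when every `key-value` entity of the root-level file also appears, with the same value, in the data filename. The helper names `entities` and `applies_to` are hypothetical.

```python
def entities(filename: str) -> dict[str, str]:
    """Extract `key-value` entities from a filename, ignoring suffix and extension."""
    stem = filename.split(".", 1)[0]
    return dict(part.split("-", 1) for part in stem.split("_") if "-" in part)


def applies_to(root_file: str, data_file: str) -> bool:
    """Simplified check: all entities of the root-level file match the data file."""
    data = entities(data_file)
    return all(data.get(key) == value for key, value in entities(root_file).items())
```

Under this simplification, `task-nback_stim.tsv.gz` applies to `sub-01_task-nback_run-1_bold.nii.gz` but would not apply to a file with a different `task` label.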

The following table specifies metadata fields for the
`<matches>_stim.json` file.

<!-- This block generates a metadata table.
These tables are defined in
src/schema/rules/sidecars
The definitions of the fields specified in these tables may be found in
src/schema/objects/metadata.yaml
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_sidecar_table(["continuous.Continuous"]) }}

Additional metadata may be included as in
[any TSV file](../common-principles.md#tabular-files) to specify, for
example, the units of the recorded time series for each column.