Add PEtab 2.0 draft #554
For now, the changes to
Unless there are any objections, I will separate the v1 and v2 specs after merging #538 and then merge this PR to main. The main RTD page will then get a separate v2-draft section. This will also make v2 development more visible.
* extract all changes from previous
* fixup
* allow hyphens in extension names
* fixup hyphens
* only require one toolbox that implements extension
* specify how to work with multiple PEtab problems
* specify we do not require a quorum number of votes
* allow test cases to be provided by the extension library
* Apply suggestions from code review

Co-authored-by: Daniel Weindl <dweindl@users.noreply.github.com>
PEtab extensions were introduced in #537. We should be able to distinguish between required extensions, i.e. those that modify the parameter estimation problem as such, and optional extensions, i.e. those that only add optional information (e.g. annotations, info for visualization, ...). If a tool does not know about a certain optional extension, it can safely ignore that extension during import. If it does not know about a required extension, the import should fail.

This PR adds a `required` attribute to extensions in the yaml file to indicate whether they are required for the mathematical interpretation of the PEtab problem.

Resolves #544
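The import behaviour described above can be sketched in a few lines of Python. This is only an illustration, not code from PEtab or any PEtab library: the function and exception names, the extension names, and the dict layout (extension name mapping to `version` and `required`) are assumptions made for the example, as is treating a missing `required` key as `True`.

```python
class UnsupportedExtensionError(Exception):
    """Raised when a required extension is not supported by the importing tool."""


def split_extensions(declared, supported):
    """Sort declared extensions into (usable, ignored) lists of names.

    `declared` mirrors a hypothetical `extensions` mapping from the YAML file.
    Assumption: a missing `required` key is treated as required (fail safe).
    """
    usable, ignored = [], []
    for name, spec in declared.items():
        if name in supported:
            usable.append(name)
        elif spec.get("required", True):
            # Unknown extension that changes the math: importing would be wrong.
            raise UnsupportedExtensionError(
                f"Extension '{name}' is required but not supported by this tool."
            )
        else:
            # Unknown extension that only adds optional information.
            ignored.append(name)
    return usable, ignored


# Hypothetical extension names, for illustration only.
declared = {
    "some_annotation_ext": {"version": "0.1.0", "required": False},
    "some_math_ext": {"version": "1.0.0", "required": True},
}
usable, ignored = split_extensions(declared, supported={"some_math_ext"})
```

With these inputs the optional annotation extension is skipped, while a tool that did not list `some_math_ext` as supported would raise instead of importing a problem it cannot interpret correctly.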
Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
Previously, the math expression syntax wasn't specified. This was very problematic, because different libraries and programming languages have different names for the same functions, and more importantly, differ in operator precedence.

Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
Co-authored-by: dilpath <dilan.private+github@outlook.com>
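As a small illustration of the precedence issue, the snippet below shows how Python itself evaluates two expressions that are easy to read differently. It demonstrates Python's rules only and says nothing about which rules the PEtab math syntax adopts. Other environments can disagree: MATLAB, for example, evaluates a chained power such as `2^3^2` from left to right.

```python
# Python's own rules, shown only to illustrate why precedence must be specified.
power_chain = 2 ** 3 ** 2      # right-associative: parsed as 2 ** (3 ** 2)
left_grouped = (2 ** 3) ** 2   # the value a left-to-right reading would give
negated_power = -2 ** 2        # power binds tighter than unary minus: -(2 ** 2)

print(power_chain, left_grouped, negated_power)  # prints "512 64 -4"
```

The same expression string can therefore yield 512 or 64 depending on the evaluating tool, which is the kind of silent disagreement a specified syntax avoids.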
# Motivation

There are a number of formats for specifying models in systems biology, each with their specific strengths and weaknesses. PEtab version 1.0.0 only allows Systems Biology Markup Language (SBML) models. While SBML is supported by a large number of tools, there are good reasons to use other formats. For example, rule-based model formats (e.g., BioNetGenLanguage) permit more abstract and compact specification of models based on rules, which are generalisations of reactions. Therefore, and based on user request (#436), we propose to lift PEtab’s restriction to SBML models and allow arbitrary model formats.

# Proposed changes

* Changes to the PEtab YAML file:
  * Change `sbml_files` to `models`
  * `models` entries will be model IDs (following the existing conventions for PEtab IDs) mapping to:
    * `location`: path / URL to the model
    * `language`: model format

    Initial set of model format identifiers (to be extended as needed):
    * SBML: `sbml`
    * CellML: `cellml`
    * BNGL: `bngl`
    * PySB: `pysb`
  * An additional entry for mapping tables (see below) is added

  Example:

  **Before:**
  ```yaml
  format_version: 1
  parameter_file: parameters.tsv
  problems:
    - condition_files:
      - conditions.tsv
      measurement_files:
      - measurements.tsv
      observable_files:
      - observables.tsv
      sbml_files:
      - model1.xml
  ```

  **After:**
  ```yaml
  format_version: 2.0.0
  parameter_file: parameters.tsv
  problems:
    - condition_files:
      - conditions.tsv
      measurement_files:
      - measurements.tsv
      observable_files:
      - observables.tsv
      mapping_file: mappings.tsv # optional
      models:
        id_for_model1:
          location: model1.xml
          language: sbml
  ```

* Changes to the format of existing tables/files:
  * Condition/Observable/Parameter Table

    All symbols that previously referenced the ID of SBML entities, such as parameter IDs or compartment IDs, now refer to (globally unique) named entities in the model, such as parameters, observables, expressions. For example, condition table columns may correspond to parameters, states, species of the referenced model.

    For species, assignments in the condition table set the initial value at the beginning of the simulation for that condition, potentially replacing the initialization from preequilibration. For all other entities, values are statically replaced at all time points. For entities that assign values to other entities, such as SBML AssignmentRules, the value of the target of that rule is statically replaced at all time points.

* Additional files
  * Mapping Table: Mapping PEtab entity IDs to entity IDs in the model.

    This optional file may be used to reference model entities in PEtab files where the ID in the model would not be a valid identifier in PEtab (e.g., due to containing blanks, dots, or other special characters). The tsv file has two mandatory columns: `petabEntityId`, `modelEntityId`. Additional columns are allowed.

    `modelEntityId`s must be unique identifiers in the model. The mapping table must not map `modelEntityId`s to `petabEntityId`s that are also defined in any other part of the PEtab problem. A `modelEntityId` may not refer to other `petabEntityId`s, including those defined in the mapping table. `petabEntityId`s defined in the mapping table may be referenced in condition, measurement, parameter and observable tables, but cannot be referenced in the model itself.

    For example, in SBML, local parameters may be referenced as `$reactionId.$localParameterId`, which are not valid PEtab IDs as they contain a `.` character. Similarly, this table may be used to reference specific species in a BNGL model which may contain many unsupported characters such as `,`, `(` or `.`. However, please note that IDs must exactly match the species names in the BNGL generated network file and no pattern matching will be performed.
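The mapping table rules above could be checked roughly as follows. This is a minimal sketch using only the Python standard library, not the implementation of any PEtab tool. The function name, the example IDs, and the ID pattern (letters, digits and underscores, not starting with a digit) are assumptions made for illustration.

```python
import csv
import io
import re

# Assumed pattern for valid PEtab IDs; the spec is the authority here.
PETAB_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def load_mapping(tsv_text, other_petab_ids=frozenset()):
    """Parse a mapping table and return {petabEntityId: modelEntityId}.

    `other_petab_ids` holds IDs defined elsewhere in the PEtab problem.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        petab_id, model_id = row["petabEntityId"], row["modelEntityId"]
        if not PETAB_ID.fullmatch(petab_id):
            raise ValueError(f"'{petab_id}' is not a valid PEtab ID")
        if petab_id in mapping or petab_id in other_petab_ids:
            raise ValueError(f"'{petab_id}' is already defined")
        mapping[petab_id] = model_id
    # A modelEntityId may not refer to a petabEntityId, including mapped ones.
    for model_id in mapping.values():
        if model_id in mapping or model_id in other_petab_ids:
            raise ValueError(f"modelEntityId '{model_id}' is a petabEntityId")
    return mapping


# Hypothetical SBML local parameter, referenced as reactionId.localParameterId.
table = "petabEntityId\tmodelEntityId\nreaction1_k1\treaction1.k1\n"
mapping = load_mapping(table, other_petab_ids={"observable_a"})
```

Here `reaction1_k1` can be used in the condition, measurement, parameter and observable tables, while the model keeps its original `reaction1.k1` name. Lookups are exact string matches, in line with the note that no pattern matching is performed.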
# Implications

* Tools need to check the model format and provide an informative message if the given format cannot be handled
* Validators will skip model-dependent validation when encountering unknown model types - ideally there would be some plugin mechanisms to provide validation

---

Co-authored by @FFroehlich @fbergmann. Also thanks to everybody participating in these discussions during the last COMBINE meeting.

---------

Co-authored-by: FFroehlich <fabian@schaluck.com>
Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
Co-authored-by: Frank T. Bergmann <frank.thomas.bergmann@gmail.com>
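The first implication listed above (checking the model format and reporting an informative message) might look like this in a tool that only handles SBML. It is a sketch under stated assumptions: the names `SUPPORTED_LANGUAGES`, `UnsupportedModelFormatError` and `check_models` are hypothetical, and the `models` dict simply mirrors the proposed YAML entry after loading.

```python
# Hypothetical tool that can only import SBML models.
SUPPORTED_LANGUAGES = {"sbml"}


class UnsupportedModelFormatError(Exception):
    """Raised when a model uses a format this tool cannot handle."""


def check_models(models):
    """Raise an informative error for any model in an unsupported format."""
    for model_id, entry in models.items():
        language = entry["language"]
        if language not in SUPPORTED_LANGUAGES:
            raise UnsupportedModelFormatError(
                f"Model '{model_id}' ({entry['location']}) uses format "
                f"'{language}', but this tool only supports: "
                f"{', '.join(sorted(SUPPORTED_LANGUAGES))}."
            )


# Mirrors the `models` entry from the proposed YAML layout.
models = {"id_for_model1": {"location": "model1.xml", "language": "sbml"}}
check_models(models)  # passes silently for a supported format
```

A validator following the second implication would catch this error, or test the language up front, and skip only the model-dependent checks rather than rejecting the whole problem.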
Looks good 🚀
I did not look much at the parts of `doc/v2/documentation_data_format.rst` that were probably copied from v1.
Additional columns, such as ``Color``, etc. may be specified. Extensions
that define operations on multiple PEtab problems need to employ a single
PEtab YAML file as entrypoint to the analysis. This PEtab file may leave all
fields specifying files empty and reference the other PEtab problems in the
extension specific fields.
Is this useful? PEtab Select currently doesn't do this, and I don't see any benefits from doing this. It would just add an additional file to PEtab Select that duplicates information in the Model Space table.
Or, by "operations on multiple PEtab problems", do you rather mean estimating parameters across multiple PEtab problems? Is there a use case? If not, it could be removed from the spec...
It's rather about the petab-select case. But I don't think it makes sense to consider petab-select an extension in the sense used here. It just builds on top of PEtab problems, but doesn't really change the interpretation of any specific PEtab problem. I think it could be removed, but maybe that's a separate discussion.
I think the main point here was that for a PEtab problem, we always want to have a yaml file that lists the extensions.
I agree about putting extensions that modify a PEtab problem into the PEtab YAML. Since I don't yet see a use case for "Extensions that define operations on multiple PEtab problems", I don't understand this part of the spec and can't review it.
Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
Adds the PEtab v2 specification draft to the main documentation.
👀 https://petab--554.org.readthedocs.build/en/554/v2/documentation_data_format.html