Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[redcap] new module #9474

Open
wants to merge 97 commits into
base: main
Choose a base branch
from
Open

[redcap] new module #9474

wants to merge 97 commits into from

Conversation

regisoc
Copy link
Contributor

@regisoc regisoc commented Nov 16, 2024

Brief summary of changes

This PR adds the REDCap interoperability module.
This module opens an API endpoint to receive notifications from REDCap servers through the use of the Data Entry Trigger (DET). REDCap notifications are parsed and REDCap records are imported into LORIS.
If an error happens during this notification handling process, an error is generated in the issue tracker.
The module is imported from HBCD, and even if the core logic is the same, REDCap objects are now their own models.
This version also supports connection to multiple REDCap instances.

Features

The main goal was to replicate HBCD data import process based on REDCap DET.

  • Refactoring of the REDCap HTTP Client and linked classes/models.
  • Support for multiple REDCap instances.
  • Additional tools and features:
    • redcap2linst: script to import REDCap forms as LORIS LINST instruments.
    • [ ] fetch button: UI button on instrument panel to trigger data import will be done in a later PR

State

Testing instructions (if applicable)

Notification process:

  1. Define main assignee, REDCap instances and projects in the config.xml following the structure given in RB/config.xml file.
  2. Define which instruments could be imported by adding them to the configuration list in LORIS. See Front-end > Admin > Configuration > REDCap > redcap_importable_instrument.
  3. Define which user is the main issue tracker assignee for all redcap issues in the configuration. See Front-end > Admin > Configuration > REDCap > redcap_issue_assignee.
  4. Send a URL notification to LORIS (i.e. endpoint /redcap/notifications) to simulate the reception of a DET notification. The payload should be the following (see redcap/notifications/redcapnotification.class.inc):
{
    "instrument":"instrument_backend_name",
    "project_id":"REDCap project ID",
    "project_url":"REDCap project URL",
    "record":"REDCap record ID / PSCID",
    "redcap_event_name":"REDCap event name",
    "redcap_url":"REDCap instance URL",
    "username":"REDCap username",
    "${instrument_backend_name}_complete":"2"
}

REDCap LINST instrument process:

  • See tools/redcap2linst.php usage.

Link(s) to related issue(s)

Links to related PRs

@maximemulder maximemulder added Project: HBCD Issue or PR related to the HBCD project Project: C-BIG Issue or PR related to the C-BIG project Difficulty: Complex PR or issue that require a great effort to implementat, review, or test Area: Instruments PR or issue related instruments Release: SQL patch PR that contains an SQL patch to apply labels Nov 29, 2024
Copy link
Contributor

@maximemulder maximemulder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally finished reviewing the code, will test it now.

Okay, I finished reviewing the code, I am now in the process of testing it with MPN. Congratulations ! The module seems to work quite well and provides really convenient configuration options like multi-instance and multi-project integration. In addition you also use statically typed models instead of dictionaries directly, which I really like.

I wrote a bunch of small code comments, which you can also ignore if you do not agree with them, especially those marked with SUBJECTIVE.

On a broader level, I would have the following architectural comments:

  • I think we should use readonly properties for the configuration and model classes, which would help to remove a lot of boilerplate code.
  • I would prefer to avoid abbreviated variable names like pid, iname and rc, and instead use longer, more descriptive, names like project_id, instance_name and redcap_client. I know this does not play well with the 85-character line limit, but well, the problem is the limit here, not the longer names. I would also prefer more precise names like instrument_name, event_name and record_id instead of instrument, event and record when that makes sense.
  • In variable and type names, REDCap is sometimes written Redcap and sometimes REDCap, I think we should stick with one (IMO the simpler Redcap for code, and the "official" REDCap for documentation).
  • The configuration parser can be simplified. The current code builds the global configuration first, and then mutates it to add the instance and project configurations. It would be simpler to descend the XML tree, gather the instance configurations, and only then build the global configuration (same reasoning applies for instance and project configurations).
  • I need to have an additional REDCap project configuration parameter for MPN that changes the behavior of the pipeline. However, the configuration is currently not accessible from the REDCap notification handler. I think maybe there should be one configuration object accessible from there that contains all the configuration information for the specific project and instance of this call (instead of having a tree).
  • There are several singletons and factories, but I am not sure if these are really needed. If they do not persist across requests they should be removed IMO.
  • Same thing with the REDCap HTTP client cache. Is there a specific case that needs it ? If not, I think the abstraction should either be factorized so that it does not need to be manually read/written in each HTTP request call, or removed.
  • Same thing with the REDCap HTTP client handler. I am not sure managing different clients is necessary. Probably what should happen is: 1. Get the REDCap instance and project from the DET notification, 2. Get the REDCap configuration for that instance and project from the configuration file, 3. instantiate a REDCap HTTP client from that configuration and pass it as an argument to the REDCap notification handler, without necessarily bookkeeping instantiated clients like now (unless there is a good reason to do so).
  • The RedcapNotification class should be part of the REDCap HTTP client models, as it is structured JSON data obtained by HTTP from REDCap.

@maximemulder maximemulder force-pushed the 20241020_redcap_module branch from 61d5c00 to 28f3733 Compare December 18, 2024 10:08
@maximemulder
Copy link
Contributor

maximemulder commented Dec 19, 2024

UPDATE: Since Regis is taking (well-deserved) time off, I am working on this PR myself and adding the required configuration options for MPN (that are hopefully also useful for other projects !).

A few things I noticed during my tests:

  • In the REDCap get project API call, the creation time and production time fields are nullable, I changed the types in the model to reflect that.
  • In the REDCap get event API call, MPN also returns the day_offset, offset_min and offset_max fields. MPN uses REDCap v14.0.28. Maybe these fields are optional or depend on the REDCap version.
  • In the REDCap get event API call, the current code expects the REDCap "event name" and "unique event name" to be related. However, that assumption does not always hold when special characters are involved (parentheses, hyphens, underscores, spaces...). I removed the check here.

@maximemulder maximemulder force-pushed the 20241020_redcap_module branch 7 times, most recently from c72c69b to 3472533 Compare December 24, 2024 09:35
@maximemulder
Copy link
Contributor

maximemulder commented Dec 24, 2024

Okay, I made some changes to the module for it to be usable for MPN, and made a few refactors that make the code (IMO) cleaner. Still a few TODOs notably regarding error handling, but I think that's a nice progress. List of sifnificant changes:

  • Added new configuration options and a documentation for the configuration.
  • Reworked the REDCap configuration parsing.
  • Reworked the REDCap HTTP client construction (no more HTTP REDCap client handler).
  • Moved all the REDCap client -related classes and models in redcap/php/client (to be renamed redcap/php/redcap_client ?).
  • Use readonly instead of getters in the models to reduce boilerplate.
  • Added a RedcapProps class to validate REDCap response objects (replacement for RedcapProp).

With all these changes, I removed almost all of my previous code comments except for a few ones I am still interested about.

Notably TODOs:

  • Finish error handling.
  • Migrate RedcapProject to RedcapProps and readonly.
  • Is the cache useful in the REDCap HTTP client ? It should be either removed or factorized IMO.
  • IMO the REDCap HTTP client does too many things. Code that is not directly related to calling the REDCap API and formatting results should probably be extracted.

CI seems broken because of an APT dependency, this is unrelated to the changes made here.

@maximemulder maximemulder force-pushed the 20241020_redcap_module branch from 3472533 to 5223553 Compare January 2, 2025 10:11
@regisoc
Copy link
Contributor Author

regisoc commented Jan 2, 2025

@maximemulder happy new year, and thanks for adding more to this PR while I was away!
I did not look into the code yet, just based on your comments:

  • In the REDCap get project API call, the creation time and production time fields are nullable, I changed the types in the model to reflect that. -> types, good.
  • In the REDCap get event API call, MPN also returns the day_offset, offset_min and offset_max fields. MPN uses REDCap v14.0.28. Maybe these fields are optional or depend on the REDCap version. -> not sure as we do not have a clear vision on REDCap changelog. Was that added to the redcapevent class as optional args?
  • In the REDCap get event API call, the current code expects the REDCap "event name" and "unique event name" to be related. However, that assumption does not always hold when special characters are involved (parentheses, hyphens, underscores, spaces...). I removed the check here. -> the event_name is the redcap event name (e.g. 'aaa' or a LORIS visit/timepoint name in our case) and the unique_event_name concatenates the arm number with an underscore. From what we talked about with Rida based on HBCD way of naming, I thought we wanted them to be linked, thus the test. The only special characters should be an underscore between the base event name and the arm number. Is MPN using other chars?
  • Added new configuration options and a documentation for the configuration. -> there was a doc before and it is difficult to see what was there before with the force-push, thanks for updating the documentation.
  • Reworked the REDCap HTTP client construction (no more HTTP REDCap client handler). -> good if you have removed it, I did not like it but it was a first draft. I will look into this, because I just want to be sure we can access multiple REDCap instances.
  • Moved all the REDCap client -related classes and models in redcap/php/client (to be renamed redcap/php/redcap_client ?). -> I do not mind that. redcap/php/client looks good to me.
  • Use readonly instead of getters in the models to reduce boilerplate. -> great!
  • Added a RedcapProps class to validate REDCap response objects (replacement for RedcapProp). -> ok. Is the old RedcapProp still needed?

Todo:

  • Finish error handling. -> the issue creation in the issue tracker is still working, I wanted to add more specific LORIS-REDCap related exceptions.
  • Migrate RedcapProject to RedcapProps and readonly. -> not sure to follow here? Is it the only one missing in the transition from RedcapProp to RedcapProps?
  • Is the cache useful in the REDCap HTTP client ? It should be either removed or factorized IMO. -> yes it is needed and yes is will be factorized. DET triggers a looot of notifications and having big items cached (i.e. the data dictionary, events, arms, instruments list) per REDCap instance is a must IMO. It was not a priority but I put the base of it.
  • IMO the REDCap HTTP client does too many things. Code that is not directly related to calling the REDCap API and formatting results should probably be extracted. -> not sure to get this, which methods?

CI issue -> not related to phan type checks?

I will look into that for next week and I would like to have a quick meeting too so we can align on the next steps.

@regisoc regisoc marked this pull request as ready for review February 18, 2025 03:48
@driusan driusan added this to the 27.0.0 milestone Feb 24, 2025
@sruthymathew123
Copy link
Contributor

@regisoc Not able to test this PR on my VM as checking out to this branch with a latest raisinbread data set has candidate list broken.

@sruthymathew123
Copy link
Contributor

sruthymathew123 commented Feb 27, 2025

@regisoc So I tried to test a redcap data import with raisinbread data. I configured an HBCD/redcap compatible instrument and added to visit V2. Completed the same instrument on redcap server as well. But I am getting below error.
[redcap:event] property custom_event_label required but is null {"instrument":"sed_cg_foodins","project_id":"62","project_url":"test","received_dt":"2025-02-27 14:08:10","record":"DCC090","redcap_event_name":"v02_arm_1","redcap_url":"https:\/\/redcap-pii.ucsd.edu\/redcap\/","complete":"2","username":"admin"}

@regisoc Let me know if you need any additional info

@regisoc
Copy link
Contributor Author

regisoc commented Feb 27, 2025

@sruthymathew123 thanks for testing it, I will look at it tomorrow.

@sruthymathew123
Copy link
Contributor

Thanks @regisoc for the fixes.

Got it working with my redcap instance.

@regisoc
Copy link
Contributor Author

regisoc commented Mar 7, 2025

@sruthymathew123 I pushed some changes in configuration:

  • the issue assignee extracted from config.xml to the configuration module (REDCap tab).
  • psc-id and dcc-id changed to pscid and candid for loris consistency in config.xml file (optional candidate-id item).
  • record-id and survey-participant-id changed to snake case for REDCap naming consistency in config.xml file (optional redcap-participant-id item).
    It should not change the main logic.

@sruthymathew123
Copy link
Contributor

@regisoc Retested with the new changes and looks good.

@sruthymathew123 sruthymathew123 added the Passed manual tests PR has been successfully tested by at least one peer label Mar 7, 2025
Copy link
Contributor

@sruthymathew123 sruthymathew123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and code changes also looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Instruments PR or issue related instruments Difficulty: Complex PR or issue that require a great effort to implementat, review, or test Passed manual tests PR has been successfully tested by at least one peer Project: C-BIG Issue or PR related to the C-BIG project Project: HBCD Issue or PR related to the HBCD project Release: SQL patch PR that contains an SQL patch to apply
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants