ReqMgr2 MicroService Transferor
This document describes the architecture, behaviour and APIs of the ReqMgr2 MicroService Transferor module, which is responsible for looking up data at the global workqueue level and creating PhEDEx subscriptions.
For the record, since these abbreviations are used several times in this document, here is their meaning:
- GQE: global workqueue element
- LQE: local workqueue element
Here is how we envision it to work:
- Unified assigns a workflow (request transition from `assignment-approved` to `assigned`).
- Global Workqueue queries ReqMgr2 for workflows in the `assigned` status. It then parses the request spec and creates chunks of work (GQE) in the `New` status. Once ALL elements have been created, the request goes to the `acquired` status (right now this happens asynchronously in a ReqMgr2 CherryPy thread; whether it is safe to keep it that way is still to be investigated).
- The MicroService kicks in and queries ReqMgr2 for requests in `acquired` status:
  - IF there are no requests, it exits and waits until the next cycle.
  - Otherwise, it has to fetch the Campaign(s) configuration for every request (this can be cached!), which will be used to decide how many replicas have to be made and where data has to be subscribed to.
    - IF a Campaign configuration cannot be found, then we have to create an alert and skip that workflow (leaving it in `acquired` status).
- Otherwise, given the list of requests in `acquired`, the MicroService then talks to the Global WorkQueue and fetches all their GQE in the `New` status:
  - If there are none, then all subscriptions have been made and the request should soon move to the `staging` status.
  - Otherwise, run the Transferor algorithm - taking into account the Campaign configuration, from CouchDB - and create PhEDEx subscriptions.
    - PS.: Check whether parent blocks are available in the GQE.
- Set the GQE status to `Available` for all the PhEDEx subscriptions that were successfully made (or for GQE without any input data!).
- TBD: update the request status to `staging`. Or let the ReqMgr2 CherryPy thread take care of that...
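The note that the Campaign configuration "can be cached" could be realised with a small time-limited cache. The following is only a sketch: the class name `CampaignCache`, the `fetch` callable and the one-hour default lifetime are all assumptions for illustration, not part of WMCore.

```python
import time


class CampaignCache:
    """Cache campaign configurations for a limited time (hypothetical helper)."""

    def __init__(self, fetch, ttl=3600.0, clock=time.monotonic):
        self._fetch = fetch    # assumed callable: campaign name -> config dict or None
        self._ttl = ttl        # seconds an entry stays valid (assumed default)
        self._clock = clock    # injectable clock, which makes expiry easy to test
        self._entries = {}     # campaign name -> (expiry time, config)

    def get(self, name):
        """Return the configuration for a campaign, or None if it is unknown."""
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and hit[0] > now:
            return hit[1]
        config = self._fetch(name)
        if config is not None:
            # Only existing configurations are cached, so a campaign that is
            # created later is looked up again on the next cycle.
            self._entries[name] = (now + self._ttl, config)
        return config
```

A missing campaign is deliberately not cached, so the "create an alert and skip that workflow" branch is re-evaluated on every cycle.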
The process above has to be executed for every single workflow. By design, the agents won't pull any GQE work that is still without a PhEDEx subscription (in `New` status).
In case a PhEDEx subscription fails, the GQE will remain in the `New` status, and the request should remain in the `acquired` status as well.
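The per-workflow process described above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the actual Transferor code: the `Element` class, the request keys `"name"` and `"campaign"`, and the `subscribe` and `alert` callables are all hypothetical stand-ins for the real ReqMgr2, Global WorkQueue and PhEDEx interactions.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a global workqueue element (GQE)."""
    name: str
    status: str = "New"
    has_input: bool = True


def run_cycle(acquired, campaigns, elements, subscribe, alert):
    """Run one cycle and return the outcome per request name.

    acquired  : list of dicts with the assumed keys "name" and "campaign"
    campaigns : dict mapping campaign name -> configuration
    elements  : dict mapping request name -> list of Element
    subscribe : callable(element, config) -> True if the subscription was made
    alert     : callable(message), used when a campaign configuration is missing
    """
    outcome = {}
    for request in acquired:
        name = request["name"]
        config = campaigns.get(request["campaign"])
        if config is None:
            alert("missing campaign configuration for " + name)
            outcome[name] = "skipped"      # the request stays in acquired
            continue
        pending = [e for e in elements.get(name, []) if e.status == "New"]
        for elem in pending:
            # Elements without input data need no subscription at all.
            if not elem.has_input or subscribe(elem, config):
                elem.status = "Available"
        still_new = [e for e in pending if e.status == "New"]
        # With nothing left in New the request can move on to staging;
        # otherwise it stays in acquired and is seen again next cycle.
        outcome[name] = "acquired" if still_new else "staging"
    return outcome
```

A failed subscription leaves its element in `New`, which keeps the whole request in `acquired`, matching the failure behaviour described above.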
Open questions
- Do we want to keep track of the subscriptions (persisting data somewhere)?
- Do we want to monitor the subscriptions and act upon issues and/or stuck transfers? Or do we just assume transfers will eventually succeed? Alerts have to be created for bad input placement (bad transfers) as well.
The MicroService is a data-service which provides a set of APIs to perform certain actions. Its general architecture is shown below:
In particular, the WMCore MicroService provides an interface to perform Unified actions, such as fetching requests from the ReqMgr2 data-services, obtaining the necessary information for data placement and placing requests of assigned workflows into the PhEDEx data placement system.
- /microservice/data provides basic information about MicroService. It returns the following information:
{"result": [
{"microservice": "UnifiedTransferorManager", "request": {}, "results": {"status": {}}}
]}
- /microservice/data/status provides detailed information about requests in MicroService. It returns the following information:
curl --cert $X509_USER_CERT --key $X509_USER_KEY -X GET -H "Content-type: application/json" https://cmsweb-testbed.cern.ch/microservice/data/status
{"result": [
{"microservice": "UnifiedTransferorManager", "request": {}, "results": {"status": {}}}
]}
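For scripted access, the same status call can be made from Python with only the standard library. This is a sketch rather than an official client: the function names are made up for illustration, and it assumes the server certificate is trusted by the default CA store on the machine running it.

```python
import json
import ssl
import urllib.request


def fetch_status(base_url, cert, key):
    """GET <base_url>/microservice/data/status using an X509 client certificate."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert, keyfile=key)
    request = urllib.request.Request(
        base_url + "/microservice/data/status",
        headers={"Content-type": "application/json"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8")


def parse_status(body):
    """Map each microservice name in a response body to its status dict."""
    payload = json.loads(body)
    return {
        entry["microservice"]: entry["results"]["status"]
        for entry in payload["result"]
    }
```

Here `cert` and `key` play the role of `$X509_USER_CERT` and `$X509_USER_KEY` in the curl command, and `parse_status` relies only on the response layout shown above.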
- /microservice/data also accepts POST requests, which send a specific request to the MicroService.
POST request to process a given request state:
curl -X POST -H "Content-type: application/json" -d '{"request":{"process":"assignment-approved"}}' http://localhost:8822/microservice/data
Obtain the results for a specific workflow:
curl --cert $X509_USER_CERT --key $X509_USER_KEY -X POST -H "Content-type: application/json" -d '{"request":{"task":"amaltaro_StepChain_DupOutMod_Mar2019_Validation_190322_105219_7255"}}' https://cmsweb-testbed.cern.ch/microservice/data
{"result": [
{"amaltaro_StepChain_DupOutMod_Mar2019_Validation_190322_105219_7255": {"completed": 100}}
]}
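The two POST payloads above share the same envelope, so a small helper can build either one. The sketch below is illustrative only: `build_post` and `completion` are hypothetical names, and the request is constructed but not sent, so authentication is left to the caller.

```python
import json
import urllib.request


def build_post(base_url, request_body):
    """Build a POST to <base_url>/microservice/data wrapping request_body."""
    data = json.dumps({"request": request_body}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/microservice/data",
        data=data,
        headers={"Content-type": "application/json"},
        method="POST",
    )


def completion(body, workflow):
    """Return the "completed" value reported for a workflow, or None."""
    for entry in json.loads(body)["result"]:
        if workflow in entry:
            return entry[workflow].get("completed")
    return None
```

For example, `build_post("http://localhost:8822", {"process": "assignment-approved"})` corresponds to the first curl command, and passing `{"task": "<workflow name>"}` instead corresponds to the second.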