Meeting Notes April 2020
Attendees - Pat Grubel (pagrubel), Qiang Guan, Al McPherson (mcpherson), Tim Randles (trandles-lanl), Jake Tronge (jtronge)
Agenda
PR Review
- #150 (trandles-lanl) - Beeconfig 149
- pagrubel will test and comment
Issue Review
- #139 (trandles-lanl) - BEEStart: A script to start BEE components
Discussion (ToDo?)
- NONE
Around the room
- Pat
- back on BEE this week
- Al
- cwltool investigations - waiting on VASP container from trandles
- Qiang
- Jake
- talk to trandles-lanl about BEEStart
- Tim
NO MEETING
Attendees - Rusty Davis (rstyd), Pat Grubel (pagrubel), Qiang Guan, Al McPherson (mcpherson), Tim Randles (trandles-lanl), Jake Tronge (jtronge)
Agenda
PR Review
- NONE
Issue Review
- NONE
Discussion (ToDo?)
- FY21 ECP activities off to L3 for approval, can add more later
- See what OSC is using for a resource manager (qguan)
- Time to define logging standard for BEE (running of the system, not necessarily workflow-specific stuff)
- Start list of what goes into the graph database (e.g. job script as metadata on task node)
Around the room
- Pat
- Al
- working on database refactor
- waiting on trandles-lanl to get VASP container then will write CWL for Sven's workflow
- thinking about parser strategy (maybe a hack of cwltool) - will email someone about cwltool
- Qiang
- almost finished paper describing scheduling algorithms
- continue discussion of tasks for Jake - container integration (discuss Wednesday)
- Jake
- looking at issue 124 (task status reporting)
- job-building/script-building to test individual commands in job for success/failure
- Tim
- wrapping up basic BEEStart to push to repo
- planning activities with Qiang
- Rusty
- wrapping up pytest activities
- refine REST APIs
Attendees - Rusty Davis (rstyd), Pat Grubel (pagrubel), Qiang Guan, Al McPherson (mcpherson), Tim Randles (trandles-lanl), Jake Tronge (jtronge)
PR Review
- #143 (pagrubel) - Fix slurm unit tests
- rstyd and trandles-lanl will run tests to confirm, if pass then approve
Issue Review
- #144 (trandles-lanl, mcpherson) - Create VASP Charliecloud container
- mcpherson to review past emails with srudin and comment on issue
Discussion
- BEE docker image - jtronge
- README.md on mattermost chat describing use
- works at Kent
- fedora image
- trandles-lanl, pagrubel, mcpherson will test it, provide feedback to jtronge
- WoWoHa - pagrubel
- WoWoHa 2020 cancelled
- will be a weekly "summer seminar series" June - August 2020
- BEE will give a talk
- pushing to master and public BEE repo - pagrubel
- getting closer to public release
- need to define criteria for first release (documentation, workflow limitations/supported CWL, etc.)
- trandles-lanl will create milestone issue for first public release - target end of FY
- trandles-lanl will create issue for supporting MPI applications using Charliecloud and BEE
Around the room
- Rusty
- pytest vs. unittest
- can run unittest with pytest
- pytest has a flask plugin, good support
- pytest has better test output, works with doctest (see https://vincent.bernat.ch/en/blog/2019-sustainable-python-script)
- Pat
- jtronge test PR #143
- Issue #124 - jtronge discuss with pagrubel
- Al
- working on database refactor
- chasing down VASP stuff
- Qiang
- tasks for jtronge
- discuss FY activities with trandles-lanl
- Jake
- Tim
- push BEEStart ASAP and let others hack on it
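Rusty's pytest vs. unittest notes above say that existing unittest tests can be run with pytest. A minimal sketch of what that could look like, assuming a hypothetical helper `parse_port` that is not part of BEE and is used here only to have something to test:

```python
import unittest


def parse_port(value):
    """Hypothetical helper (not from BEE): convert a string to a TCP port."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class TestParsePort(unittest.TestCase):
    """Classic unittest style; pytest collects TestCase subclasses as well."""

    def test_valid_port(self):
        self.assertEqual(parse_port("5000"), 5000)

    def test_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_port("70000")


def test_plain_function_style():
    # pytest style: a plain function with a bare assert, no TestCase needed.
    assert parse_port("8080") == 8080
```

If this were saved in a file matching pytest's default `test_*.py` pattern, running `pytest` in that directory should collect both the `TestCase` methods and the plain function, while `python -m unittest` would run only the `TestCase` methods. That would allow existing unittest tests to stay as they are while new tests use the shorter style.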
Attendees - Pat Grubel (pagrubel), Qiang Guan, Al McPherson (mcpherson), Tim Randles (trandles-lanl), Jake Tronge (jtronge)
Agenda
PR Review
- NONE
Issue Review
- NONE
Discussion
- Thoughts on FY21 cloud milestone
- using ORNL or Chameleon cloud for target platform
Around the room
- Pat
- working to get pyslurm tests running
- using DockerRequirement from CWL
- Al
- getting on darwin and fog
- Qiang & Jake
- got examples running that were in milestone documentation
- Jake will document his scripts and dockerfiles for setting up their test environment
- Jake will get things running on group server
- Qiang to send thoughts on FY21 cloud milestone
- Tim
- continue working on BEEStart script
Attendees - Rusty Davis (rstyd), Pat Grubel (pagrubel), Al McPherson (mcpherson), Tim Randles (trandles-lanl), Jake Tronge (jtronge)
Agenda
- discuss CWL and container support
- TaskManager design for modular support of container runtimes and resource managers
- discuss proposed FY21 ECP P6 Activities
- BEE- FY21 P6-1 Develop the ability to archive, clone, and re-run workflows (start 10/01/20, due 3/31/21)
- BEE- FY21 P6-2 Run BEE jobs on private cloud infrastructure (due 9/30/21)
PR Review
- NONE
Issue Review
- NONE
Discussion
- APPROVED April 6, 2020 meeting notes
- TaskManager discussion mostly shelved for now, revisit next week
- CWL support for containers
- defined at https://www.commonwl.org/v1.1/CommandLineTool.html#DockerRequirement
- how to handle bind mounts, inputs/outputs, etc.?
- maybe a question for the CWL mailing list
- Rusty looking for existing containerized CWL workflows for examples
- "standard" container runtime options in the bee.conf file?
- FY21 ECP Activities are documented at
- Tim starting on design document for the activities
Around the room
- Jake
- neo4j issues (Task already exists)
- Rusty knows how to fix it
- close to being able to run test workflows
- Rusty
- starting test work
- looking at PyTest for integration testing
- maybe pexpect for client testing
- Flask has some testing framework (Jake)
- BEE should start a document of what CWL is supported by project
- Pat
- question for Rusty about passing Task object to worker from TaskManager
- will need to think about how to pass things around when there's more data (requirements and hints)
- Al
- refactoring database and building new API to it
- no way to version python APIs
- API changes only affect WorkflowManager
- next use case CWL example
- maybe BLAST workflow again
- keep scope of parsing to HPC use cases, not "generic everything CWL"
- Do srudin VASP workflow (parameter study) #66
Action Items
- Tim - get VASP containers that work with Charliecloud (Power9, x86_64)
Attendees - Rusty Davis (rstyd), Pat Grubel (pagrubel), Qiang Guan (guanxyz), Tim Randles (trandles-lanl), Jake Tronge (jtronge)
PR Review
- #138 APPROVED (trandles-lanl) - Use bee.conf to configure listen ports for BEEWorkflowManager and BEETaskManager
- Pat approves of merging this PR, but into master instead of develop. The rationale is that the functionality is simple and enables everyone to do development work at the same time on the same system.
Issue Review
- #137 (pagrubel) - Slurm worker to properly check DockerRequirement
- slurm_worker.py should use the DockerRequirement: dockerImageId specified in the CWL file
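Issue #137 above asks for the `dockerImageId` from the CWL file's `DockerRequirement` to be used. A minimal sketch of such a lookup, assuming the CWL document has already been parsed into a Python dict; the function name `find_docker_image_id`, the example image name, and the choice to check `requirements` before `hints` are assumptions made for illustration, not the actual slurm_worker.py code:

```python
def find_docker_image_id(tool):
    """Return dockerImageId from a parsed CWL document, or None if absent.

    Hypothetical helper: checks 'requirements' first, then 'hints'.
    """
    for section in ("requirements", "hints"):
        entries = tool.get(section) or []
        if isinstance(entries, dict):
            # CWL also allows a map form keyed by class name;
            # normalize it to the list-of-dicts form.
            entries = [{**(body or {}), "class": name}
                       for name, body in entries.items()]
        for entry in entries:
            if entry.get("class") == "DockerRequirement":
                image_id = entry.get("dockerImageId")
                if image_id:
                    return image_id
    return None


# Example document; the image name is made up for illustration.
example_tool = {
    "cwlVersion": "v1.1",
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "hints": [
        {"class": "DockerRequirement", "dockerImageId": "example/image:latest"},
    ],
}
image = find_docker_image_id(example_tool)
```

Returning `None` when no image is specified leaves it to the caller to decide whether to run the task without a container or to report an error.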
Discussion
- extending CWL for other container runtimes (rstyd)
- discuss on Wednesday
- guanxyz had some ideas
- next ECP milestones up on wiki
Around the room
- Jake
- got a test environment set up at KSU
- initial problems with PySlurm due to having a too-new Slurm installed
- Rusty
- working on unittest and CI tests for client/WorkflowManager
- not a lot of time for BEE this week (very understandable, everyone prioritized BEE the past 2 weeks (trandles-lanl))
- Pat
- unittest for TaskManager - issue #137 above
- not much time for BEE this week
- Tim
- issue #139 planning to discuss on Wednesday
- ECP milestone housekeeping