Bioassay data associative promiscuity pattern learning engine V2.
For smaller use cases one can use the Badapple2 web app: https://chiltepin.health.unm.edu/badapple2
Badapple is a method for detecting likely promiscuous compounds via their associated scaffolds, using public bioassay data from PubChem. For more information please see the About Page.
The code contained in this repo is for building and analyzing the Badapple databases. If you would like to view the code for the Badapple UI or API please visit the repos below:
If you want to set up the badapple_classic DB, follow the instructions here.
The steps below outline how to set up the badapple2 DB.
Use this option to install a Docker image with the DB.
See the Docker README file here.
Use this option to install the DB directly on your system using PostgreSQL.
- Follow the PostgreSQL setup instructions here
- Download badapple2.pgdump.
- Note: If your use case needs the "activity" table, then instead download badapple2_full.pgdump
- Create the DB:
createdb badapple2
- Load DB from dump file:
pg_restore -O -x -v -d badapple2 badapple2.pgdump
- Note: If you're including the "activity" table then use:
pg_restore -O -x -v -d badapple2 badapple2_full.pgdump
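The load steps above can be sketched as a small helper that picks the dump file and prints the two commands for review. This is only an illustration: the function names `badapple2_dump_file` and `badapple2_load_commands` and the `full` argument are hypothetical and not part of this repository.

```shell
# Hypothetical helper: print the dump file name for the chosen variant.
# "full" selects the dump that includes the "activity" table;
# anything else selects the default dump.
badapple2_dump_file() {
    if [ "${1:-}" = "full" ]; then
        echo "badapple2_full.pgdump"
    else
        echo "badapple2.pgdump"
    fi
}

# Hypothetical helper: print the commands from the steps above
# without running them, so they can be checked first.
badapple2_load_commands() {
    local dump
    dump="$(badapple2_dump_file "${1:-}")"
    echo "createdb badapple2"
    echo "pg_restore -O -x -v -d badapple2 ${dump}"
}
```

Calling `badapple2_load_commands full` prints the commands for the dump with the "activity" table. Once the output looks right, it can be run by hand or piped to `bash`.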
You can skip this section if you set up the DB using the steps above.
If you would like to run the entire workflow used to create the badapple2 DB, then please follow the instructions here.
If you'd like to run the scripts/code contained within this repository then you will need to follow the setup guidelines outlined below.
Code is expected to work on Linux systems.
macOS and Windows users will need to modify the conda environment.yml file. Make sure to follow appropriate installation guidelines for other dependencies (PostgreSQL, Docker). Please note that packages/dependencies may function differently across operating systems.
- Set up conda (see the Miniconda Site for more info)
- (Optional) I'd recommend using the libmamba solver for faster install times, see here
- Install the Badapple2 environment:
conda env create -f environment.yml
- This will create a new conda env with name badapple2. If you wish, you can change the first line of environment.yml prior to the command above to change the name.
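As a minimal sketch of the renaming step, the helper below rewrites the first line of an environment file in place. It assumes that line has the form `name: <env>` and that GNU `sed` is available (as on most Linux systems). The function name `rename_conda_env` and the example name `badapple2_dev` are hypothetical.

```shell
# Hypothetical helper: change the env name on the first line of a conda
# environment file. Assumes line 1 looks like "name: <env>" and that the
# new name contains no "/" characters (it is spliced into a sed expression).
rename_conda_env() {
    local env_file="$1" new_name="$2"
    # Only line 1 is touched; the rest of the file is left unchanged.
    sed -i "1s/^name:.*/name: ${new_name}/" "$env_file"
}

# Example usage (not executed here):
#   rename_conda_env environment.yml badapple2_dev
#   conda env create -f environment.yml
```

Editing the file by hand works just as well. The helper is only convenient when the setup is scripted.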
- Install PostgreSQL with the RDKit cartridge (requires sudo):
sudo apt install postgresql-14-rdkit
- (Option 1) Make your user a superuser prior to DB setup:
- Switch to postgres user:
sudo -i -u postgres
- Make yourself a superuser:
psql -c "CREATE ROLE <username> WITH LOGIN SUPERUSER PASSWORD '<password>'"
- (Option 2) If you don't want to make <username> a superuser, follow the steps below:
- When running DB setup commands, prepend sudo -u postgres to DB setup commands. For example, instead of createdb <DB_NAME> use sudo -u postgres createdb <DB_NAME>.
- After setting up the DB as postgres you can grant permissions to <username> to access the DB as <username> like so:
sudo -i -u postgres
psql -d <DB_NAME> -c "CREATE ROLE <username> WITH LOGIN PASSWORD '<password>'"
psql -d <DB_NAME> -c "GRANT SELECT ON ALL TABLES IN SCHEMA public TO <username>"
psql -d <DB_NAME> -c "GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO <username>"
psql -d <DB_NAME> -c "GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO <username>"
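The three GRANT statements from Option 2 can also be generated for a given role and sent to psql in one go. The sketch below only prints the SQL. The function name `readonly_grants_sql` is hypothetical, and it assumes the role name is a simple identifier that needs no quoting.

```shell
# Hypothetical helper: print the read-only GRANT statements from Option 2
# for one role. The role name is inserted unquoted, so it is assumed to be
# a plain identifier (letters, digits, underscores).
readonly_grants_sql() {
    local role="$1"
    printf 'GRANT SELECT ON ALL TABLES IN SCHEMA public TO %s;\n' "$role"
    printf 'GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO %s;\n' "$role"
    printf 'GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO %s;\n' "$role"
}

# Example usage (not executed here; psql reads SQL from standard input):
#   readonly_grants_sql <username> | sudo -u postgres psql -d <DB_NAME>
```

The role itself still has to exist first, so the CREATE ROLE step above needs to run before these grants are applied.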