This custom application looks for new files in particular folders of an S3 bucket and interacts with the API for shared print data based on the data in the delimited file.
- Obtain list of libraries holding a given OCLC Number
- Obtain list of libraries with retention by OCLC Number by group or state
- Obtain retained holdings for a given OCLC Symbol by OCLC Number
- Obtain all retained holdings for a given OCLC Symbol
- Retrieve all OCLC numbers (current and merged record numbers) associated with a retention
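To illustrate the "particular folders" idea, here is a minimal sketch of how a new object key might be matched to one of the functions named later in this README. The folder name, file names, and function names come from the setup steps; the mapping between them, and the names `ROUTES` and `route_for_key`, are assumptions made for illustration and may not match how the application is actually wired.

```python
from pathlib import PurePosixPath
from typing import Optional

# Folder named in the setup steps of this README.
WATCHED_FOLDER = "collection_analysis"

# Hypothetical mapping of sample file names to function names.
# Both sets of names appear in this README; pairing them is an assumption.
ROUTES = {
    "holdingsByOCLCNumber.csv": "checkHoldingsByOCLCNumber",
    "retainedholdingsByOCLCNumber.csv": "checkSPByOCLCNumber",
}


def route_for_key(key: str) -> Optional[str]:
    """Return the function name for an S3 object key, or None if unwatched."""
    path = PurePosixPath(key)
    # Only files directly inside the watched folder are considered.
    if path.parent.as_posix() != WATCHED_FOLDER:
        return None
    return ROUTES.get(path.name)


print(route_for_key("collection_analysis/holdingsByOCLCNumber.csv"))
print(route_for_key("other_folder/holdingsByOCLCNumber.csv"))
```

In a real deployment this kind of filtering is often expressed as prefix and suffix rules on the bucket notification rather than in code, so treat the sketch as a description of the behavior, not of the implementation.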
Clone this repository
$ git clone {url}
or download directly from GitHub.
Change into the application directory, then create and activate a virtual environment, install the dependencies, and run the tests
$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
$ python -m pytest
usage: getData.py [-h] --itemFile ITEMFILE --operation
{retrieveMergedOCLCNumbers,retrieveHoldingsByOCLCNumber,retrieveSPByOCLCNumber,retrieveInstitutionRetentionsbyOCLCNumber,retrieveAllInstitutionRetentions}
--outputDir OUTPUTDIR
optional arguments:
-h, --help show this help message and exit
--itemFile ITEMFILE File you want to process
--operation {retrieveMergedOCLCNumbers,retrieveHoldingsByOCLCNumber,retrieveSPByOCLCNumber,retrieveInstitutionRetentionsbyOCLCNumber,retrieveAllInstitutionRetentions}
Operation to run: retrieveMergedOCLCNumbers,
retrieveHoldingsByOCLCNumber, retrieveSPByOCLCNumber,
retrieveInstitutionRetentionsbyOCLCNumber,
retrieveAllInstitutionRetentions
--outputDir OUTPUTDIR
Directory to save output to
--heldBy HELDBY
OCLC Symbol of institution to limit to
--heldByGroup HELDBYGROUP
Symbol of group to limit to
--heldInState HELDINSTATE
Two letter state/province code to limit holdings to
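The help text above looks like standard `argparse` output. As a rough sketch of a parser that would accept the same options (not necessarily how `getData.py` defines them), assuming the three options in the usage line are required and the `--heldBy*` options are optional:

```python
import argparse

# Operation names copied from the help text above.
OPERATIONS = [
    "retrieveMergedOCLCNumbers",
    "retrieveHoldingsByOCLCNumber",
    "retrieveSPByOCLCNumber",
    "retrieveInstitutionRetentionsbyOCLCNumber",
    "retrieveAllInstitutionRetentions",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (an approximation)."""
    parser = argparse.ArgumentParser(prog="getData.py")
    # Assumed required because the usage line shows them without brackets.
    parser.add_argument("--itemFile", required=True,
                        help="File you want to process")
    parser.add_argument("--operation", required=True, choices=OPERATIONS,
                        help="Operation to run: " + ", ".join(OPERATIONS))
    parser.add_argument("--outputDir", required=True,
                        help="Directory to save output to")
    # Assumed optional filters; they default to None when omitted.
    parser.add_argument("--heldBy",
                        help="OCLC Symbol of institution to limit to")
    parser.add_argument("--heldByGroup",
                        help="Symbol of group to limit to")
    parser.add_argument("--heldInState",
                        help="Two letter state/province code to limit holdings to")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args([
    "--itemFile", "samples/oclc_numbers_holdings.csv",
    "--operation", "retrieveHoldingsByOCLCNumber",
    "--outputDir", "samples/holdings_results.csv",
    "--heldBy", "CCO",
])
print(args.operation, args.heldBy)
```

Using `choices` means an unknown operation name is rejected by the parser itself, before any API call is attempted.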
$ python getData.py --itemFile samples/oclc_numbers.csv --operation retrieveMergedOCLCNumbers --outputDir samples/mergedOCLCNumbers.csv
$ python getData.py --itemFile samples/oclc_numbers_holdings.csv --operation retrieveHoldingsByOCLCNumber --outputDir samples/holdings_results.csv --heldBy CCO
$ python getData.py --itemFile samples/sp_holdings.csv --operation retrieveSPByOCLCNumber --outputDir samples/sp_holdings_results.csv --heldInState CA
$ python getData.py --itemFile samples/my_retentions.csv --operation retrieveInstitutionRetentionsbyOCLCNumber --outputDir samples/my_retentions_CCO.csv --heldBy CCO
$ python getData.py --itemFile samples/symbol_retentions.csv --operation retrieveAllInstitutionRetentions --outputDir samples/my_retentions_CCO.csv
Install Node.js and npm, then use the install command to read the dependencies from the package JSON file (package.json)
$ npm install
- Install the AWS command line tools: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html (I recommend using pip.)
- Create an AWS user in IAM console. Give it appropriate permissions. Copy the key and secret for this user to use in the CLI.
- Configure the command line tools: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html
- Make sure you add your key/secret and region
- Use the AWS Console to create a bucket. Note your bucket name!!!
- Create folder collection_analysis/
- Add a sample csv file named holdingsByOCLCNumber.csv with data to check for holdings
- Add a sample csv file named retainedholdingsByOCLCNumber.csv with data to check for retained holdings
- Alter s3-getHoldings.json to point to your bucket and your sample csv file.
- Use serverless to test locally
$ serverless invoke local --function checkHoldingsByOCLCNumber --path s3-getHoldings.json
- Alter s3-getRetainedHoldings.json to point to your bucket and your sample csv file.
- Use serverless to test locally
$ serverless invoke local --function checkSPByOCLCNumber --path s3-getRetainedHoldings.json
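The two event files used above stand in for the notification S3 sends when an object is created. The exact contents of `s3-getHoldings.json` and `s3-getRetainedHoldings.json` are not shown here, so the following is only a sketch of the minimal part of an S3 event that names a bucket and key, plus how a handler could read it back. The helper names `build_s3_event` and `extract_location` and the bucket name are hypothetical.

```python
import json
from urllib.parse import quote_plus, unquote_plus


def build_s3_event(bucket: str, key: str) -> dict:
    """Build a pared-down S3 'object created' event (real events carry more fields)."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket},
                    # S3 URL-encodes object keys in event notifications;
                    # slashes are left as-is.
                    "object": {"key": quote_plus(key, safe="/")},
                },
            }
        ]
    }


def extract_location(event: dict) -> tuple:
    """Return (bucket, decoded key) from the first record of an S3 event."""
    s3_info = event["Records"][0]["s3"]
    return s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


# "my-bucket" is a placeholder; substitute the bucket name you noted earlier.
event = build_s3_event("my-bucket", "collection_analysis/holdingsByOCLCNumber.csv")
print(json.dumps(event, indent=2))
print(extract_location(event))
```

When editing the JSON files by hand, the bucket name and object key are the two values that need to match your own bucket and sample file.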
$ serverless deploy
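Deployment relies on the key, secret, and region configured for the AWS command line tools earlier. As a quick sanity check, the sketch below parses text in the format of the CLI's `credentials` and `config` files and reports which of those settings are missing. The function name `missing_settings` and the sample values are illustrative only; the key names (`aws_access_key_id`, `aws_secret_access_key`, `region`) are the ones the AWS CLI uses.

```python
import configparser

REQUIRED_CREDENTIAL_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_settings(credentials_text: str, config_text: str,
                     profile: str = "default") -> list:
    """Return the names of required settings absent for the given profile."""
    credentials = configparser.ConfigParser()
    credentials.read_string(credentials_text)
    config = configparser.ConfigParser()
    config.read_string(config_text)

    # In the config file, non-default profiles use a "profile <name>" section.
    config_section = "default" if profile == "default" else f"profile {profile}"

    missing = [key for key in REQUIRED_CREDENTIAL_KEYS
               if not credentials.has_option(profile, key)]
    if not config.has_option(config_section, "region"):
        missing.append("region")
    return missing


# Placeholder values, not real credentials.
sample_credentials = (
    "[default]\n"
    "aws_access_key_id = EXAMPLEKEYID\n"
    "aws_secret_access_key = EXAMPLESECRET\n"
)
sample_config = "[default]\nregion = us-east-1\n"

print(missing_settings(sample_credentials, sample_config))
print(missing_settings(sample_credentials, "[default]\noutput = json\n"))
```

To check a real setup, read the contents of `~/.aws/credentials` and `~/.aws/config` and pass them in place of the sample strings.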