diff --git a/README.md b/README.md index 2237130..569fd53 100644 --- a/README.md +++ b/README.md @@ -4,45 +4,108 @@ Automated Metadata Service -This package will update the metadata in our repositories from external sources. This package is currently in development and will have additional sources and matchers added over time. +Manage metadata from different sources. The examples in the package are +specific to Caltech repositories, but could be generalized. This package +is currently in development and will have additional sources and matchers +added over time. -Requires: +## Install: -Python 3 (Recommended via [Anaconda](https://www.anaconda.com/download)) with [requests](https://pypi.python.org/pypi/requests) library and [Dataset](https://github.com/caltechlibrary/dataset). +If you just need functions (like codemeta_to_datacite): `pip install ames` +If you want to run operations, download the whole repo to get examples + +## Requirements: + +Python 3.7 (Recommended via [Anaconda](https://www.anaconda.com/download)) + +You should have requests and datacite: `pip install requests datacite` + +Harvesting requires [Dataset](https://github.com/caltechlibrary/dataset). CaltechDATA integration requires [caltechdata_api](https://github.com/caltechlibrary/caltechdata_api) -## Harvesters +## Organization + +### Harvesters -- datacite_refs - Harvest references in datacite metadata from crossref event data - crossref_refs - Harvest references in datacite metadata from crossref event data -- cd_github - Harvest GitHub repos from CaltechDATA +- caltechdata - Harvest metadata from CaltechDATA +- cd_github - Harvest GitHub repos and codemeta files from CaltechDATA +- matomo - Harvest web statistics from matomo -## Matchers +### Matchers - caltechdata - Match content in CaltechDATA +- update_datacite - Match content in DataCite + +## Example Operations + +The run scripts show examples of using ames to perform a specific update +operation. 
+ +### CodeMeta management + +In the test directory there is an example of using the codemeta_to_datacite +function to convert a codemeta file to DataCite standard metadata -## CodeMeta Updating +### CodeMeta Updating + +Collect GitHub records in CaltechDATA, search for a codemeta.json file, and +update CaltechDATA with new metadata. #### Setup You need to set an environmental variable with your token to access CaltechDATA `export TINDTOK=` #### Usage -Type `python run_codemeta.py`. This will harvest all the repos present in -CaltechDATA, search them for codemeta.json files, and update the metadata -in CaltechDATA. There are more fields that could be mapped in the future. +Type `python run_codemeta.py`. + +### CaltechDATA Citation Alerts + +Harvest citation data from the Crossref Event Data API, records in +CaltechDATA, match records, update metadata in CaltechDATA, and send email to +user. + +#### Setup +You need to set environmental variables with your token to access +CaltechDATA `export TINDTOK=` and Mailgun `export MAILTOK=`. + +#### Usage + +Type `python run_event_data.py`. You'll be prompted for confirmation if any +new citations are found. + +### Media Updates + +Update media records in DataCite that indicate the files associated with a DOI. + +#### Setup +You need to set an environmental variable with your password for your DataCite +account using `export DATACITE=` + +#### Usage +Type `python run_media_update.py`. + +### CaltechDATA metadata updates + +This will run checks on the quality of metadata in CaltechDATA. Currently this +verifies whether redundant links are present in the related identifier section. + +#### Setup +You need to set environmental variables with your token to access +CaltechDATA `export TINDTOK=` + +#### Usage +Type `python run_caltechdata_updates.py`. -## CaltechDATA Citation Alerts +### Matomo downloads + +This will harvest download information from matomo. Very experimental. 
#### Setup You need to set environmental variables with your token to access -CaltechDATA `export TINDTOK=` and Mailgun `export MAILTOK=`. Access to data on S3 is currently -restricted to Caltech Library staff and your S3 configuration needs to be set up -following the instructions in Dataset. +Matomo `export MATTOK=` #### Usage -Type `python run_event_data.py`. You will automatically generate citation alerts for all DOIs in the CaltechDATA repository. -This script collects citation data from the Crossref Event Data API, matches DOIs with those -in CaltechDATA, updates the metadata in CaltechDATA, and sends an email alert to the contact -person for the data record. You'll be prompted if any citations are found. +Type `python run_downloads.py`. + diff --git a/codemeta.json b/codemeta.json index ea61ba7..87f47d0 100644 --- a/codemeta.json +++ b/codemeta.json @@ -1,12 +1,12 @@ { "@context": "https://doi.org/10.5063/schema/codemeta-2.0", "@type": "SoftwareSourceCode", - "description": "AMES automates updating repository metadata.", + "description": "Manage metadata from different sources.", "name": "AMES: Automated Metadata Service", "codeRepository": "https://github.com/caltechlibrary/ames", "issueTracker": "https://github.com/caltechlibrary/ames/issues", "license": "https://data.caltech.edu/license", - "version": "0.0.7", + "version": "0.1.1", "author": [ { "@type": "Person", @@ -17,7 +17,7 @@ "@id": "https://orcid.org/0000-0001-9266-5146" }], "developmentStatus": "active", - "downloadUrl": "https://github.com/caltechlibrary/ames/archive/0.0.7.zip", + "downloadUrl": "https://github.com/caltechlibrary/ames/archive/0.1.1.zip", "keywords": [ "GitHub", "metadata", diff --git a/setup.py b/setup.py index 329c2fa..eeae3ec 100644 --- a/setup.py +++ b/setup.py @@ -18,7 +18,7 @@ EMAIL = 'tmorrell@caltech.edu' AUTHOR = 'Tom Morrell' REQUIRES_PYTHON = '>=3.7.0' -VERSION = '0.1.0' +VERSION = '0.1.1' # What packages are required for this module to be executed? REQUIRED = [