Documentation Updates
tmorrell committed Nov 29, 2018
1 parent 641fa71 commit bd87e7a
Showing 3 changed files with 86 additions and 23 deletions.
101 changes: 82 additions & 19 deletions README.md
@@ -4,45 +4,108 @@

Automated Metadata Service

This package will update the metadata in our repositories from external sources. This package is currently in development and will have additional sources and matchers added over time.
Manage metadata from different sources. The examples in the package are
specific to Caltech repositories, but could be generalized. This package
is currently in development and will have additional sources and matchers
added over time.

Requires:
## Install:

Python 3 (Recommended via [Anaconda](https://www.anaconda.com/download)) with [requests](https://pypi.python.org/pypi/requests) library and [Dataset](https://github.com/caltechlibrary/dataset).
If you just need functions (like codemeta_to_datacite): `pip install ames`
If you want to run operations, download the whole repo to get examples

## Requirements:

Python 3.7 (Recommended via [Anaconda](https://www.anaconda.com/download))

You should have requests and datacite: `pip install requests datacite`

Harvesting requires [Dataset](https://github.com/caltechlibrary/dataset).

CaltechDATA integration requires [caltechdata_api](https://github.com/caltechlibrary/caltechdata_api)

## Harvesters
## Organization

### Harvesters

- datacite_refs - Harvest references in datacite metadata from crossref event data
- crossref_refs - Harvest references in datacite metadata from crossref event data
- cd_github - Harvest GitHub repos from CaltechDATA
- caltechdata - Harvest metadata from CaltechDATA
- cd_github - Harvest GitHub repos and codemeta files from CaltechDATA
- matomo - Harvest web statistics from matomo

## Matchers
### Matchers

- caltechdata - Match content in CaltechDATA
- update_datacite - Match content in DataCite

## Example Operations

The run scripts show examples of using ames to perform a specific update
operation.

### CodeMeta management

In the test directory there is an example of using the codemeta_to_datacite
function to convert a codemeta file to DataCite standard metadata

## CodeMeta Updating
### CodeMeta Updating

Collect GitHub records in CaltechDATA, search for a codemeta.json file, and
update CaltechDATA with new metadata.

#### Setup
You need to set an environmental variable with your token to access
CaltechDATA `export TINDTOK=`

#### Usage
Type `python run_codemeta.py`. This will harvest all the repos present in
CaltechDATA, search them for codemeta.json files, and update the metadata
in CaltechDATA. There are more fields that could be mapped in the future.
Type `python run_codemeta.py`.

### CaltechDATA Citation Alerts

Harvest citation data from the Crossref Event Data API, records in
CaltechDATA, match records, update metadata in CaltechDATA, and send email to
user.

#### Setup
You need to set environmental variables with your token to access
CaltechDATA `export TINDTOK=` and Mailgun `export MAILTOK=`.

#### Usage

Type `python run_event_data.py`. You'll be prompted for confirmation if any
new citations are found.

### Media Updates

Update media records in DataCite that indicate the files associated with a DOI.

#### Setup
You need to set an environmental variable with your password for your DataCite
account using `export DATACITE=`

#### Usage
Type `python run_media_update.py`.

### CaltechDATA metadata updates

This will run checks on the quality of metadata in CaltechDATA. Currently this
verifies whether redundant links are present in the related identifier section.

#### Setup
You need to set environmental variables with your token to access
CaltechDATA `export TINDTOK=`

#### Usage
Type `python run_caltechdata_updates.py`.
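
The redundant-link check described in this section could look roughly like the sketch below. It is not the code from `run_caltechdata_updates.py`: the function name is hypothetical, and it assumes that "redundant" means the same `relatedIdentifier` and `relationType` pair appearing more than once in a DataCite-style `relatedIdentifiers` list.

```python
from collections import Counter


def redundant_related_identifiers(related):
    """Return (identifier, relation type) pairs that occur more than once.

    `related` is assumed to be a list of DataCite-style dictionaries with
    `relatedIdentifier` and `relationType` keys. Identifiers are compared
    case-insensitively, which is an assumption of this sketch.
    """
    counts = Counter(
        (item.get("relatedIdentifier", "").strip().lower(), item.get("relationType"))
        for item in related
    )
    # Counter keeps first-seen order, so duplicates are reported in that order.
    return [key for key, count in counts.items() if count > 1]


if __name__ == "__main__":
    sample = [
        {"relatedIdentifier": "10.1234/ABC", "relationType": "IsCitedBy"},
        {"relatedIdentifier": "10.1234/abc", "relationType": "IsCitedBy"},
        {"relatedIdentifier": "10.1234/xyz", "relationType": "IsCitedBy"},
    ]
    for identifier, relation in redundant_related_identifiers(sample):
        print(f"redundant: {identifier} ({relation})")
```

The identifiers in the sample are placeholders, not real DOIs.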

## CaltechDATA Citation Alerts
### Matomo downloads

This will harvest download information from matomo. Very experimental.

#### Setup
You need to set environmental variables with your token to access
CaltechDATA `export TINDTOK=` and Mailgun `export MAILTOK=`. Access to data on S3 is currently
restricted to Caltech Library staff and your S3 configuration needs to be set up
following the instructions in Dataset.
CaltechDATA `export MATTOK=`

#### Usage
Type `python run_event_data.py`. You will automatically generate citation alerts for all DOIs in the CaltechDATA repository.
This script collects citation data from the Crossref Event Data API, matches DOIs with those
in CaltechDATA, updates the metadata in CaltechDATA, and sends an email alert to the contact
person for the data record. You'll be prompted if any citations are found.
Type `python run_downloads.py`.
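
Each run script above depends on one or more environment variables, so a small pre-flight check can catch a missing token before any harvesting starts. The sketch below is not part of ames: the `REQUIRED_TOKENS` table and the `missing_tokens` helper are hypothetical, and the script-to-variable mapping is copied from the Setup sections of this README.

```python
import os

# Script name -> environment variables named in the README's Setup sections.
REQUIRED_TOKENS = {
    "run_codemeta.py": ["TINDTOK"],
    "run_event_data.py": ["TINDTOK", "MAILTOK"],
    "run_media_update.py": ["DATACITE"],
    "run_caltechdata_updates.py": ["TINDTOK"],
    "run_downloads.py": ["MATTOK"],
}


def missing_tokens(script, environ=None):
    """Return the variables that `script` needs but that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_TOKENS.get(script, []) if not environ.get(name)]


if __name__ == "__main__":
    for script in REQUIRED_TOKENS:
        missing = missing_tokens(script)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{script}: {status}")
```

Passing a dictionary as `environ` makes the helper easy to exercise without touching the real environment.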

6 changes: 3 additions & 3 deletions codemeta.json
@@ -1,12 +1,12 @@
{
"@context": "https://doi.org/10.5063/schema/codemeta-2.0",
"@type": "SoftwareSourceCode",
"description": "AMES automates updating repository metadata.",
"description": "Manage metadata from different sources.",
"name": "AMES: Automated Metadata Service",
"codeRepository": "https://github.com/caltechlibrary/ames",
"issueTracker": "https://github.com/caltechlibrary/ames/issues",
"license": "https://data.caltech.edu/license",
"version": "0.0.7",
"version": "0.1.1",
"author": [
{
"@type": "Person",
@@ -17,7 +17,7 @@
"@id": "https://orcid.org/0000-0001-9266-5146"
}],
"developmentStatus": "active",
"downloadUrl": "https://github.com/caltechlibrary/ames/archive/0.0.7.zip",
"downloadUrl": "https://github.com/caltechlibrary/ames/archive/0.1.1.zip",
"keywords": [
"GitHub",
"metadata",
2 changes: 1 addition & 1 deletion setup.py
@@ -18,7 +18,7 @@
EMAIL = '[email protected]'
AUTHOR = 'Tom Morrell'
REQUIRES_PYTHON = '>=3.7.0'
VERSION = '0.1.0'
VERSION = '0.1.1'

# What packages are required for this module to be executed?
REQUIRED = [
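
This commit bumps the version in both codemeta.json and setup.py, and the old values (0.0.7 and 0.1.0) had drifted apart. A check along the following lines could flag that kind of drift. It is a minimal sketch, not something in the repository: `check_versions` is a hypothetical helper, and it assumes the `downloadUrl` ends in `/<version>.zip` and that setup.py assigns `VERSION` as a quoted string on its own line, as both files do in this diff.

```python
import json
import re


def check_versions(codemeta_text, setup_text):
    """Return a list of mismatch messages; an empty list means consistent."""
    problems = []
    codemeta = json.loads(codemeta_text)
    version = codemeta.get("version")
    if version is None:
        return ["codemeta.json has no version field"]

    # The download URL is assumed to end in /<version>.zip.
    url = codemeta.get("downloadUrl", "")
    if not url.endswith(f"/{version}.zip"):
        problems.append(f"downloadUrl {url!r} does not match version {version}")

    # setup.py is assumed to contain a line such as: VERSION = '0.1.1'
    match = re.search(r"^VERSION\s*=\s*['\"]([^'\"]+)['\"]", setup_text, re.MULTILINE)
    if match is None:
        problems.append("setup.py has no VERSION assignment")
    elif match.group(1) != version:
        problems.append(f"setup.py VERSION {match.group(1)} != codemeta version {version}")
    return problems


if __name__ == "__main__":
    with open("codemeta.json") as codemeta_file, open("setup.py") as setup_file:
        for problem in check_versions(codemeta_file.read(), setup_file.read()):
            print(problem)
```

Run from the repository root, the script prints nothing when the three values agree.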
