
Troubleshooting

Liz Krznarich edited this page Nov 3, 2023 · 4 revisions

ROR services are very stable and rarely experience issues or downtime.

Common issues

| Issue | Common causes | Action(s) |
|---|---|---|
| ROR API is down | Elasticsearch at 100% CPU usage | Typically nothing; the app will recover on its own. If it does not recover, force a restart with a deployment to ror-api or (in case of emergency) an AWS CLI request like `aws ecs update-service --force-new-deployment --cluster CLUSTER_NAME --service SERVICE_NAME` |
| ror-site won’t deploy | Dependency issues | Review the Actions log; check the dependencies pulled in during the Actions run. Trigger the deployment again if needed |
| No app logs from ECS containers (not really an issue itself, but makes it hard to troubleshoot) | Nginx logs are not being forwarded (bug in Phusion Passenger: https://github.com/phusion/passenger-docker/issues/224) | SSH to the container (see below) and restart nginx-log-forwarder |
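
The emergency restart from the table can be wrapped in a small helper so the command is reviewed before it runs. This is a minimal sketch, not an official ROR script: the function name `force_restart` and the `DRY_RUN` variable are hypothetical, and `CLUSTER_NAME` and `SERVICE_NAME` remain placeholders that must be replaced with the real values. By default the helper only prints the command.

```shell
# Sketch: force a new deployment of an ECS service.
# force_restart and DRY_RUN are hypothetical names used for illustration.
# With DRY_RUN=1 (the default) the command is printed, not executed.
force_restart() {
  cluster="$1"
  service="$2"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "aws ecs update-service --force-new-deployment --cluster $cluster --service $service"
  else
    aws ecs update-service --force-new-deployment --cluster "$cluster" --service "$service"
  fi
}

# Prints the command for review; set DRY_RUN=0 to actually run it.
force_restart CLUSTER_NAME SERVICE_NAME
```

Defaulting to a dry run is a deliberate choice here, since forcing a new deployment restarts the running tasks.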

Reverting/overwriting Elasticsearch index

In case of issues with the data release process, it's possible to delete and recreate the Elasticsearch index from a data dump.

  1. SSH to a running ECS container; see the Bastion host entry in 1Password.

  2. Run the setup command and pass the filename of the data dump you want to index (no file extension). The file must exist in ror-data.

     python manage.py setup v1.0-2022-03-17-ror-data
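
Because the setup command expects the dump name without a file extension, a small wrapper can strip the extension before building the command. This is a sketch under stated assumptions: the function name `reindex_from_dump` and the `DRY_RUN` variable are hypothetical, and the `.zip` extension is an assumption about how dumps are named in ror-data. It is meant to be run inside the container, from the directory containing `manage.py`.

```shell
# Sketch: build the setup command from a dump filename.
# reindex_from_dump and DRY_RUN are hypothetical names used for illustration.
# Assumes dump files end in .zip; other names are passed through unchanged.
reindex_from_dump() {
  dump="${1%.zip}"   # drop a trailing .zip, if present
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "python manage.py setup $dump"
  else
    python manage.py setup "$dump"
  fi
}

# Prints the command for review; set DRY_RUN=0 to actually run it.
reindex_from_dump v1.0-2022-03-17-ror-data.zip
```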
    