cvat_redis_inmem restart loop #8892

Closed
2 tasks done
antortjim opened this issue Dec 30, 2024 · 5 comments
Labels
bug Something isn't working

antortjim commented Dec 30, 2024

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

Download cvat

git clone https://github.com/cvat-ai/cvat
cd cvat
git checkout v2.10.1 # a33f7f57088744bab61f18e8a8cf6528a0c22fd2

Run it for an indefinite time and reboot your machine. For some reason I now get the error below (it's not the first time I have rebooted the machine, though).

Output of docker logs cvat_redis_inmem

1:C 30 Dec 2024 15:17:12.930 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see jemalloc/jemalloc#1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:C 30 Dec 2024 15:17:12.930 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 30 Dec 2024 15:17:12.930 * Redis version=7.2.3, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 30 Dec 2024 15:17:12.930 * Configuration loaded
1:M 30 Dec 2024 15:17:12.930 * monotonic clock: POSIX clock_gettime
1:M 30 Dec 2024 15:17:12.931 * Running mode=standalone, port=6379.
1:M 30 Dec 2024 15:17:12.932 * Server initialized
1:M 30 Dec 2024 15:17:12.932 * Reading RDB base file on AOF loading...
1:M 30 Dec 2024 15:17:12.932 * Loading RDB produced by version 7.2.3
1:M 30 Dec 2024 15:17:12.932 * RDB age 1554244 seconds
1:M 30 Dec 2024 15:17:12.932 * RDB memory usage when created 1.80 Mb
1:M 30 Dec 2024 15:17:12.932 * RDB is base AOF
1:M 30 Dec 2024 15:17:12.932 * Done loading RDB, keys loaded: 33, keys expired: 0.
1:M 30 Dec 2024 15:17:12.932 * DB loaded from base file appendonly.aof.6.base.rdb: 0.000 seconds
1:M 30 Dec 2024 15:17:13.206 # Bad file format reading the append only file appendonly.aof.6.incr.aof: make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>

Expected Behavior

No response

Possible Solution

  1. Stop cvat with docker-compose down

  2. Comment out the custom command under cvat_redis_inmem in the docker compose file.

#    command: [
#      "redis-server",
#      "--save", "60", "100",
#      "--appendonly", "yes",
#    ]

I have no idea what the repercussions of not running the custom command are, but commenting it out seems to make cvat work again. I would like confirmation to be sure.

  3. Start it again with docker-compose up -d

Context

I am unable to use cvat because work is no longer being saved, and I get an error notification in the GUI:

Could not fetch models meta information
Error: Request failed with status code 500. "\n<!doctype html>\n<html lang="en">\n<head>\n <title>Server Error (500)</title>\n</head>\n<body>\n <h1>Server Error (500)</h1><p></p>\n</body>\n</html>\n

Environment

Git branch version

commit a33f7f57088744bab61f18e8a8cf6528a0c22fd2 (HEAD -> master, tag: v2.10.1, origin/master)
Merge: d66d043e2 c21062f5f
Author: cvat-bot[bot] <147643061+cvat-bot[bot]@users.noreply.github.com>
Date:   Thu Jan 18 11:10:58 2024 +0000

    Merge pull request #7372 from opencv/release-2.10.1

    Release v2.10.1

Docker version 20.10.21, build 20.10.21-0ubuntu1~22.04.3

Linux cv3 6.8.0-49-generic #49~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Nov  6 17:42:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
antortjim added the bug label Dec 30, 2024
@antortjim
Author

Forgot to attach the output of docker ps, which I think may be informative

CONTAINER ID   IMAGE                                       COMMAND                  CREATED          STATUS                          PORTS                                                                                          NAMES
4510c1d62a46   cvat/ui:v2.10.1                             "/docker-entrypoint.…"   23 minutes ago   Up 23 minutes                   80/tcp                                                                                         cvat_ui
6947f8021e4a   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_worker_export
c0dcf8acf906   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_server
06b0218d76f5   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_worker_annotation
d634d15a6d93   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_worker_webhooks
b72b82684f9d   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_utils
b3cababa1ec0   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_worker_import
600b5056fed7   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_worker_quality_reports
c2fc683a5aeb   cvat/server:v2.10.1                         "./backend_entrypoin…"   23 minutes ago   Up 23 minutes                   8080/tcp                                                                                       cvat_worker_analytics_reports
1af4fed58ce5   timberio/vector:0.26.0-alpine               "/usr/local/bin/vect…"   23 minutes ago   Up 23 minutes                                                                                                                  cvat_vector
fbcb6bf66748   grafana/grafana-oss:10.1.2                  "sh -euc 'mkdir -p /…"   23 minutes ago   Up 23 minutes                   3000/tcp                                                                                       cvat_grafana
2ba607a4ed25   postgres:15-alpine                          "docker-entrypoint.s…"   23 minutes ago   Up 23 minutes                   5432/tcp                                                                                       cvat_db
6ab4a45beae9   traefik:v2.10                               "/entrypoint.sh trae…"   23 minutes ago   Up 23 minutes                   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 80/tcp, 0.0.0.0:8090->8090/tcp, :::8090->8090/tcp   traefik
5a5d4d4d36d4   apache/kvrocks:2.7.0                        "kvrocks -c /var/lib…"   23 minutes ago   Up 23 minutes (healthy)         6666/tcp                                                                                       cvat_redis_ondisk
e247add61bc6   clickhouse/clickhouse-server:23.11-alpine   "/entrypoint.sh"         23 minutes ago   Up 23 minutes                   8123/tcp, 9000/tcp, 9009/tcp                                                                   cvat_clickhouse
55131759a286   redis:7.2.3-alpine                          "docker-entrypoint.s…"   23 minutes ago   Restarting (1) 37 seconds ago                                                                                                  cvat_redis_inmem
51e55f2bd450   openpolicyagent/opa:0.45.0-rootless         "/opa run --server -…"   23 minutes ago   Up 23 minutes                                                                                                                  cvat_opa

@zhiltsov-max
Contributor

Hi, I think some of the redis files might be corrupted because of the system restarts. Have you tried following this message: 1:M 30 Dec 2024 15:17:13.206 # Bad file format reading the append only file appendonly.aof.6.incr.aof: make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>?

@antortjim
Author

@zhiltsov-max thanks for looking into this
I cannot access the container with docker exec -it cvat_redis_inmem /bin/sh, I get this error message:

Error response from daemon: Container 55131759a286318154bc62a382243d6a9ceeb4922b2eeb0bf89b6fb012da3fcb is restarting, wait until the container is running

I tried running

docker run --rm -it -v cvat_inmem_db:/data redis:7.2.3-alpine sh

and then, inside the container

find / -regex .*manifest
gave me no result

find / -regex *.aof
gave me only ./usr/local/bin/redis-check-aof

So I don't know how to follow that message unfortunately!

@zhiltsov-max
Contributor

zhiltsov-max commented Dec 31, 2024

Have you tried removing the volume (docker volume remove cvat_cvat_inmem_db)? If that's not acceptable, consider starting a separate redis container with the entrypoint changed (--entrypoint) to /bin/sh and with this volume mounted.
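
The non-destructive route (a separate container with the volume mounted) could be sketched roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes the volume is named cvat_cvat_inmem_db, that Redis 7 keeps its multi-part AOF in the default appendonlydir/ directory under /data, and that the manifest is called appendonly.aof.manifest. The function name aof_repair_commands is hypothetical, and it only prints the docker commands so they can be reviewed before anything is run.

```shell
# Hypothetical helper: print (not run) the commands for an offline AOF repair.
# Assumed: volume name, image tag and AOF layout; adjust to your setup.
aof_repair_commands() {
    volume="${1:-cvat_cvat_inmem_db}"
    image="${2:-redis:7.2.3-alpine}"
    # Assumed default location of the Redis 7 multi-part AOF manifest.
    manifest="/data/appendonlydir/appendonly.aof.manifest"

    # 1. Stop the restarting container so nothing touches the volume.
    printf '%s\n' "docker stop cvat_redis_inmem"
    # 2. Back up the AOF directory inside the volume, as the log message advises.
    printf '%s\n' "docker run --rm --entrypoint cp -v ${volume}:/data ${image} -a /data/appendonlydir /data/appendonlydir.bak"
    # 3. Run the checker interactively; it asks before modifying any file.
    printf '%s\n' "docker run --rm -it --entrypoint redis-check-aof -v ${volume}:/data ${image} --fix ${manifest}"
}

aof_repair_commands
```

redis-check-aof --fix works by truncating the file at the last valid command, so it may lose the most recent writes and may not succeed for every kind of corruption. Removing the volume remains the simpler option when the cached data is disposable.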

@antortjim
Author

I did run docker volume rm cvat_inmem_db and it didn't fix the problem. But now I just tried your suggestion of docker volume remove cvat_cvat_inmem_db and it works! I don't see the error in the logs or in the GUI! Thank you 🎉
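
The difference between the two volume names is most likely down to Docker Compose's naming scheme: a named volume declared in a compose file is created as <project>_<volume>, and the project name defaults to the name of the directory holding the compose file. Assuming the checkout directory was cvat and the compose file declares a volume called cvat_inmem_db, the volume the container actually uses is cvat_cvat_inmem_db. That would also explain the empty search earlier in the thread: docker run -v cvat_inmem_db:/data creates and mounts a fresh, empty volume when no volume of that name exists. The helper below is a hypothetical illustration of the naming rule, not part of Docker or CVAT.

```shell
# Hypothetical helper: derive the volume name Compose would create.
# Assumes no explicit `name:` on the volume and no -p / COMPOSE_PROJECT_NAME override.
compose_volume_name() {
    project_dir="$1"   # directory holding docker-compose.yml
    volume_key="$2"    # key under `volumes:` in the compose file
    # Simplified: Compose normalizes the directory name; only lowercasing is sketched here.
    project="$(basename "$project_dir" | tr '[:upper:]' '[:lower:]')"
    printf '%s_%s\n' "$project" "$volume_key"
}

compose_volume_name /home/user/cvat cvat_inmem_db   # prints cvat_cvat_inmem_db
```

To see the real names on a given machine, docker volume ls --filter name=inmem lists every volume whose name contains "inmem".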
