Not all type of events are saved in Clickhouse #8783

Open
2 tasks done
wandeder opened this issue Dec 6, 2024 · 2 comments
Labels
bug Something isn't working

Comments

wandeder commented Dec 6, 2024

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

  1. Deploy CVAT via Helm in k8s.
  2. Turn on analytics by setting analytics.enabled=true.

Expected Behavior

All types of events should be stored in ClickHouse via backend -> Vector -> ClickHouse.

Possible Solution

No response

Context

When I turn on analytics, only a few types of events are saved in ClickHouse (such as create:job and update:task from the server source), and nothing more. User activity, working time, create/delete project, and delete task/job events do not appear in ClickHouse.
The server does appear to receive the client events via POST, though:

2024-12-06 10:27:53,189 DEBG 'uvicorn-0' stderr output:
{"scope":"save:job","obj_name":null,"obj_id":null,"obj_val":null,"source":"client","timestamp":"1733480873.184401","count":null,"duration":2,"project_id":38,"task_id":257,"job_id":890,"user_id":68,"user_name":"user","user_email":"[email protected]","org_id":2,"org_slug":"org","payload":"{\"client_id\": \"889802\", \"is_active\": true, \"working_time\": 0, \"username\": \"user\"}"}

2024-12-06 10:27:53,190 DEBG 'uvicorn-0' stdout output:
INFO:     10.42.0.0:0 - "POST /api/events?org=org HTTP/1.0" 201 Created

But nothing shows up in the Vector logs or the ClickHouse DB.
When I create a new task, I see the logs in Vector and a new row in the DB, but only in this case.
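
To see exactly which event types made it into the table, the rows can be counted per source and scope through the ClickHouse HTTP interface on port 8123. The following is only a sketch: the database `cvat`, the table `events` and the `user`/`user` credentials come from the config quoted in this issue, while the `localhost` URL (for example behind a `kubectl port-forward`) and the function names are assumptions made for illustration.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoint, e.g. reachable after a port-forward to the ClickHouse service.
CLICKHOUSE_URL = "http://localhost:8123"

# Count stored events per (source, scope); db and table names are from this issue.
QUERY = (
    "SELECT source, scope, count() AS n "
    "FROM cvat.events GROUP BY source, scope ORDER BY source, scope "
    "FORMAT TabSeparated"
)


def build_request(base_url, query, user, password):
    """Build a GET request for the ClickHouse HTTP interface using basic auth."""
    url = base_url + "/?" + urllib.parse.urlencode({"query": query})
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def parse_rows(body):
    """Turn TabSeparated output into (source, scope, count) tuples."""
    rows = []
    for line in body.splitlines():
        if line:
            source, scope, n = line.split("\t")
            rows.append((source, scope, int(n)))
    return rows


def fetch_counts(base_url=CLICKHOUSE_URL, user="user", password="user"):
    """Run the query against a live ClickHouse and return the parsed rows."""
    req = build_request(base_url, QUERY, user, password)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_rows(resp.read().decode())
```

Calling `fetch_counts()` against a reachable instance returns one tuple per event type. If no row has the source `client`, that would match the symptom described here.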

I found a similar issue: Am I the only one whose client events are not being saved in k8s? · Issue #8205 · cvat-ai/cvat

Any ideas?

Environment

Logs from `cvat-server`:
2024-12-06 10:27:53,189 DEBG 'uvicorn-0' stderr output:
{"scope":"save:job","obj_name":null,"obj_id":null,"obj_val":null,"source":"client","timestamp":"1733480873.184401","count":null,"duration":2,"project_id":38,"task_id":257,"job_id":890,"user_id":68,"user_name":"user","user_email":"[email protected]","org_id":2,"org_slug":"org","payload":"{\"client_id\": \"889802\", \"is_active\": true, \"working_time\": 0, \"username\": \"user\"}"}

2024-12-06 10:27:53,190 DEBG 'uvicorn-0' stdout output:
INFO:     10.42.0.0:0 - "POST /api/events?org=org HTTP/1.0" 201 Created

I used the default values.yaml:

analytics:
  enabled: true
  clickhouseDb: cvat
  clickhouseUser: user
  clickhousePassword: user
  clickhouseHost: "{{ .Release.Name }}-clickhouse"

vector:
  envFrom:
    - secretRef:
        name: cvat-analytics-secret
  existingConfigMaps:
    - cvat-vector-config
  dataDir: "/vector-data-dir"
  containerPorts:
    - name: http
      containerPort: 80
      protocol: TCP
  service:
    ports:
      - name: http
        port: 80
        protocol: TCP
  image:
    tag: "0.26.0-alpine"

clickhouse:
  shards: 1
  replicaCount: 1
  extraEnvVarsSecret: cvat-analytics-secret
  initdbScriptsSecret: cvat-clickhouse-init
  auth:
    username: user
    existingSecret: cvat-analytics-secret
    existingSecretKey: CLICKHOUSE_PASSWORD
  # Consider enabling zookeeper if a distributed configuration is used
  zookeeper:
    enabled: false

wandeder added the bug label on Dec 6, 2024

@satyatejaswini1234

The configuration looks mostly correct for enabling analytics, but the issue might lie in one of the following:

  • The Vector configuration, which might not be set up to capture all types of events.
  • The ClickHouse schema, which might need adjustments to store additional event types.
  • Connectivity or processing issues between Vector and ClickHouse.

Check the Vector configuration for missing event types and make sure ClickHouse is set up to store all event data. Reviewing the logs from both services (Vector and ClickHouse) can also show why some events are not being stored.
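
The connectivity point can be checked with a plain TCP probe run from inside the cluster, for example from the CVAT backend pod. This is a minimal sketch: ports 80 and 8123 are the ones that appear in the configs in this thread, but the service names `cvat-vector` and `cvat-clickhouse` are hypothetical placeholders and should be replaced with the names reported by `kubectl get svc`.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


# Hypothetical service names; the ports are taken from the configs in this issue.
TARGETS = [("cvat-vector", 80), ("cvat-clickhouse", 8123)]


def check_all(targets=TARGETS):
    """Probe every target and report reachability keyed by 'host:port'."""
    return {f"{host}:{port}": can_connect(host, port) for host, port in targets}
```

A `False` for the Vector entry would point at service naming or networking. `True` only shows that the port is open, not that events are actually delivered.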

@wandeder

But I use the default Vector config and the initial ClickHouse script from this repo. For example, here is the Vector config:

[sources.http-events]
type = "http_server"
address = "0.0.0.0:80"
encoding = "json"

# Uncomment for debug
[sinks.console]
type = "console"
inputs = [ "http-events" ]
target = "stdout"

[sinks.console.encoding]
codec = "json"

[sinks.clickhouse]
inputs = [ "http-events" ]
type = "clickhouse"
database = "${CLICKHOUSE_DB}"
table = "events"
auth.strategy = "basic"
auth.user = "${CLICKHOUSE_USER}"
auth.password = "${CLICKHOUSE_PASSWORD}"
endpoint = "http://${CLICKHOUSE_HOST}:8123"
request.concurrency = "adaptive"
request.retry_attempts = 3
request.retry_backoff_secs = 1
encoding.only_fields = [
    "scope",
    "obj_name",
    "obj_id",
    "obj_val",
    "source",
    "timestamp",
    "count",
    "duration",
    "project_id",
    "task_id",
    "job_id",
    "user_id",
    "user_name",
    "user_email",
    "org_id",
    "org_slug",
    "payload",
]
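
As a sanity check that the client event itself is not the problem, the logged `save:job` line can be compared against the `encoding.only_fields` list above. This sketch uses the exact log line from the issue (the e-mail placeholder is kept as it appears there); the helper name `missing_fields` is made up for illustration.

```python
import json

# Field list copied from encoding.only_fields in the Vector config above.
ONLY_FIELDS = [
    "scope", "obj_name", "obj_id", "obj_val", "source", "timestamp",
    "count", "duration", "project_id", "task_id", "job_id", "user_id",
    "user_name", "user_email", "org_id", "org_slug", "payload",
]

# The client event as logged by cvat-server in this issue.
SAMPLE_LOG_LINE = (
    r'{"scope":"save:job","obj_name":null,"obj_id":null,"obj_val":null,'
    r'"source":"client","timestamp":"1733480873.184401","count":null,'
    r'"duration":2,"project_id":38,"task_id":257,"job_id":890,"user_id":68,'
    r'"user_name":"user","user_email":"[email protected]","org_id":2,'
    r'"org_slug":"org","payload":"{\"client_id\": \"889802\", '
    r'\"is_active\": true, \"working_time\": 0, \"username\": \"user\"}"}'
)


def missing_fields(event, expected=ONLY_FIELDS):
    """Return the expected fields that are absent from the event."""
    return [name for name in expected if name not in event]


event = json.loads(SAMPLE_LOG_LINE)
payload = json.loads(event["payload"])  # payload is a JSON string inside the JSON

print("missing:", missing_fields(event))
print("source:", event["source"], "| working_time:", payload["working_time"])
```

The logged event carries every field the sink expects, which suggests the event shape is fine and the gap is more likely in how (or whether) client events are forwarded to Vector.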

There is nothing unusual in the logs either. I enabled error output in Vector, but it shows no errors.

It looks like the events are not reaching Vector, but then why does the request to CVAT still succeed (201 Created in the log above)?
Could you tell me where exactly the message exchange between CVAT and Vector is configured?
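
One way to narrow this down is to post a test event straight to Vector's `http_server` source and watch the console sink. The success response in the server log is for the browser's `POST /api/events` to CVAT, so it presumably only shows that CVAT accepted the events, not that they were forwarded. The sketch below assumes a Vector URL such as `http://cvat-vector:80/` (hypothetical service name) and a made-up scope `debug:manual`; the field names follow the event logged in this issue.

```python
import json
import time
import urllib.request


def build_test_event(scope="debug:manual", source="client"):
    """Build a minimal event; scope value is a made-up marker for this test."""
    return {
        "scope": scope,
        "source": source,
        "timestamp": str(time.time()),
        "payload": json.dumps({"note": "manual test event"}),
    }


def post_event(url, event, timeout=5.0):
    """POST one JSON event to the given URL and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Running `post_event("http://cvat-vector:80/", build_test_event())` from the backend pod (with the real service name) should make the event appear in Vector's stdout if the source is reachable. Whether the ClickHouse sink then accepts a row with so few fields depends on the table schema, so the console sink is the more reliable signal here.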
