-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(data-exploration): data exploration backend API with types #13933
Conversation
…n, remove timestamp in old events query
Ready for a better look. Things I already know to do... likely in a followup:
|
* This file acts as the source of truth for: | ||
* | ||
* - frontend/src/queries/schema.json | ||
* - generated from typescript via "pnpm run generate:schema:json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If schema.json
and schema.py
are only generated from this file, do we need to commit them? I can see this being a bit easier, but I'm a bit worried about them getting of out of sync as it's possible to commit changes to the wrong file. Plus they're really big. Could be compiled in package.json's prepare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Says the guy who just committed gigabytes of PNGs to the repo :D.
Just like snapshots, I'm afraid if we don't commit them to git, we won't see any unexpected changes in the schema. Since this is a very important schema that'll be used to generate public facing docs, tooling for users, etc, I'd prefer to not take any chances.
It also makes development easier, if the files are already there. Unrelated, I've also been playing around with a Kea typegen inline types mode that would also commit the types into the repo... for clean, fast and reproducible builds.
Also, I still want to add a github action that throws a wrench when the schema is out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the kea types ship with git would help a lot for reviews etc. where switching branches takes a bit of time at the moment, so +1 for a solution there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't schema.ts
the snapshot in this case? The derived .json
and .py
versions ones are just distractions in a code review.
But fair enough, should be okay IF we have a wrench-throwing action.
offset: int = 0, | ||
) -> Union[List, EventsQueryResponse]: | ||
) -> List: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A List
of what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... this is how it was 😅
@@ -168,7 +216,10 @@ def query_events_list( | |||
fragment = fragment[1:] | |||
order_by_list.append(translate_hogql(fragment, hogql_context) + " " + order_direction) | |||
else: | |||
order_by_list.append(select_columns[0] + " ASC") | |||
if "count(*)" in select_columns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is count()
a valid option too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not right now, as this is the generated output SQL where count()
hogql -> count(*)
clickhouse sql... but since count()
is also valid clickhouse SQL, and we might accidentally change this down the line, I added both formats here just to future proof.
) | ||
|
||
|
||
def query_events_list_v2( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like run_data_table_query
? This is more than an "events list v2"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to run_events_query
, as a Team
and an EventsQuery
are its only inputs. I'd move this to a new file as well if I'd make all the changes I can foresee...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work! Am I right that we'd write types for the backend response in schema.ts as well and thus have full type checking for the frontend <-> backend interaction? (will ask this at standup :))
Leaving the backend review to more knowledgeable people :)
frontend/src/queries/nodes/DataTable/ColumnConfigurator/ColumnConfigurator.tsx
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ but let's make sure the schemas don't diverge with some PR check.
Thanks! Going to merge this in to reduce the number of spinning plates... and add one more with that CI check. |
* feat(data-exploration): backend json schema validation * Update snapshots * Update snapshots * Update snapshots * Update snapshots * Update snapshots * Update snapshots * Update snapshots * Update snapshots * refactor * switch to pydantic * split into v2 events list query * add backend query endpoint * Update snapshots * make events query work * bring back HogQLExpression, inline all __root__ only classes in python, remove timestamp in old events query * Update snapshots * Update snapshots * schema and mypy * Update snapshots * Update snapshots * restore /fake/ orderBy (only "timestamp" and "-timestamp" are supported) * Update snapshots * Update snapshots * Update snapshots * raise if invalid json * Update snapshots * remove __root__ hack * improve comments, in and out of quotes * shrug * future proofing * comment * rename to "run_events_query" * simplify Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Problem
People are already using our API to get data exploration event responses, even though that backend part is undocumented and not finished.
Changes
/api/project/:id/events
endpoint. The old endpoint works like it did before. Eventually we can deprecate it./api/project/:id/query
endpoint, which accepts any node that matchesQuerySchema
(so any query).pydantic
, generate Python classes from the JSON schema (which is generated from TypeScript definitions)query
endpoint, and in thev2
events query.Replace the typeHogQLExpression
with juststring
, as it was causing trouble in the backend.test_event.py
tests intotest_query.py
without changing the contentquery_event_list.py
file. I'd like to still refactor it substantially, but the point now was to do the smallest change on the query itself.How did you test this code?
Tested in the interface locally, and moved some tests around to the new format.