
Teradata SQL Grammar #4330

Draft
wants to merge 6 commits into
base: master

DmitryMikhailovich
Contributor

A grammar for the Teradata SQL dialect, based on version 17.10 of Teradata Vantage.

By my estimate, this grammar is about 90% complete. We used it to parse a codebase of tens of thousands of objects for enterprise data lineage.

I started adding examples only at a later stage of development, so the examples directory lacks many essential statements, such as CRUD.

Q: Why is "Teradata SQL" in the name, not just "Teradata"?
A: I did this deliberately, because this grammar covers only SQL, not the BTEQ scripting language (I think BTEQ scripts should have a separate grammar).

@@ -0,0 +1,31 @@
# Teradata SQL Grammar

An [ANTLR4](https://www.antlr.org/) grammar for Teradata SQL. Based on a grammar of Teradata Database version 17.10.
Contributor

Please include a permalink to the actual grammar for version 17.10. Is it a bison grammar? Is there a lex file?

Contributor Author

Sorry, I don't understand what you mean. Should I include links to a formal grammar? I have none, and I haven't found one on the RDBMS vendor's website. I wrote this grammar from the official, publicly available documentation.

What information should I include in the README in my case? Links to the documentation for version 17.10?

Contributor

kaby76 commented Dec 3, 2024

Any of these?

Contributor Author

I've added links to the 17.10 docs. Those doc URLs end with the /July-2021 subpath.

numeric_data_type
: BYTEINT
| SMALLINT
| (INTEGER|INT)
Contributor

Parentheses are used in places that don't need them. Why? E.g., (INTEGER | INT) instead of INTEGER | INT.

Contributor Author

Fixed.

cursor_declaration*
condition_handler*
procedure_stat*
END (label_name)?
Contributor

Ditto on (label_name)?. Why this instead of label_name??
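
For anyone who wants to hunt down the remaining cases mechanically, here is a rough Python sketch (not part of this PR) that rewrites groups such as (label_name)? to label_name?. The function name and the regular expression are my own assumptions. It only handles a group wrapping a single bare identifier, so it does not touch (INTEGER|INT), and it does not skip string literals, comments, or embedded actions; review its output by hand.

```python
import re

# A parenthesized group holding exactly one bare identifier, not preceded by
# a word character, so lexer commands such as `channel(HIDDEN)` are left alone.
SINGLE_ELEMENT_GROUP = re.compile(r"(?<!\w)\(\s*(\w+)\s*\)")


def strip_single_element_parens(grammar_text: str) -> str:
    """Replace `(name)` with `name`; any suffix such as `?` or `*` is kept."""
    return SINGLE_ELEMENT_GROUP.sub(r"\1", grammar_text)


if __name__ == "__main__":
    # Single-element group: parentheses are dropped, the `?` suffix survives.
    print(strip_single_element_parens("END (label_name)?"))
    # Two-element group: left as-is, because the parentheses are required here.
    print(strip_single_element_parens("(label_name ':')? BEGIN"))
```

The lookbehind is the main design choice: without it, a lexer command like `-> channel(HIDDEN)` would be mangled.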

Contributor Author

Fixed. It looks like I copied (label_name ':')? from the beginning of the same compound_stat rule and just removed the ':'.

(ON logging_item (',' logging_item)* )?
;

operation
Contributor

Please sort and uniq this list. You have duplicates, which cause an ambiguity to be detected in examples/ddl/logging/begin_logging.sql. For example, two INSERT, two SELECT, two UPDATE.
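
As an illustration of the "sort and uniq" check, here is a minimal Python sketch that reports repeated alternatives in a flat, '|'-separated rule body. The helper names and the sample list are hypothetical; the sample is a shortened stand-in, not the real alternatives of the operation rule. It assumes the rule body has no nested groups.

```python
from collections import Counter


def split_alternatives(rule_body: str) -> list[str]:
    """Split the body of a flat rule (no nested groups) on '|'."""
    return [alt.strip() for alt in rule_body.split("|") if alt.strip()]


def duplicate_alternatives(rule_body: str) -> list[str]:
    """Return the alternatives that occur more than once, sorted."""
    counts = Counter(split_alternatives(rule_body))
    return sorted(alt for alt, n in counts.items() if n > 1)


def sort_and_uniq(rule_body: str) -> list[str]:
    """Rough equivalent of `sort | uniq` over the alternatives."""
    return sorted(set(split_alternatives(rule_body)))


# Hypothetical, shortened stand-in for the real `operation` alternatives.
SAMPLE_OPERATION = "INSERT | SELECT | UPDATE | DELETE | INSERT | SELECT | UPDATE"

if __name__ == "__main__":
    print(duplicate_alternatives(SAMPLE_OPERATION))
    print(" | ".join(sort_and_uniq(SAMPLE_OPERATION)))
```

Each duplicate gives the parser two identical ways to match the same token, which is why the ambiguity shows up on the example file.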

Contributor Author

Fixed. Thank you for the review. I copied the list for the operation rule from the docs and didn't notice that it had duplicates across the sub-lists of common logging operations and row-level security logging operations.

Contributor

kaby76 commented Nov 16, 2024

(On the build: the macOS runners on GitHub are terrible, hanging during the tests for Java. I'm not sure when they will finish, but the run is stalled in parsing for the Java target, even though that takes about 9s on my machine. My PR #4327 removes the parsing tests on macOS to minimize hangs like this.)


3 participants