Change individual to grouped parsing #2988

Closed
kaby76 opened this issue Jan 3, 2023 · 1 comment

kaby76 commented Jan 3, 2023

Testing in the grammars-v4 repo parses one input file per run of the test program, which I call individual parsing. But we can save a good deal of time by parsing multiple input files per run of the test program, which I call grouped parsing. Here is what individual parsing looks like in the output from a build:

Testing java/java9

~/issues/issue-2988/before/java/java9/Generated ~/issues/issue-2988/before
bash test.sh
../examples/AllInOne8.java
Time: 00:00:05.9106526
Parse succeeded.
../examples/helloworld.java
Time: 00:00:00.5270543
Parse succeeded.
../examples/IdentifierTest.java
Time: 00:00:00.6478677
Parse succeeded.
../examples/Instanceof.java
Time: 00:00:00.2960225
Parse succeeded.
../examples/ManyStringsConcat.java
Time: 00:00:00.1571820
Parse succeeded.
../examples/module-info.java
Time: 00:00:00.0614978
Parse succeeded.
../examples/TryWithResourceDemo.java
Time: 00:00:00.7459264
Parse succeeded.
../examples/Unicode.java
Time: 00:00:00.5391444
Parse succeeded.
Duration: 0 hours 0 minutes 12 seconds

With grouped parsing, the run times for inputs after the first are much shorter:

Testing java/java9

~/issues/issue-2988/grammars-v4/java/java9/Generated-CSharp ~/issues/issue-2988/grammars-v4
bash test.sh
CSharp 0 ../examples/AllInOne8.java success 6.1742676
CSharp 1 ../examples/helloworld.java success 0.0255556
CSharp 2 ../examples/IdentifierTest.java success 0.1810686
CSharp 3 ../examples/Instanceof.java success 0.0866523
CSharp 4 ../examples/ManyStringsConcat.java success 0.0374591
CSharp 5 ../examples/module-info.java success 0.0031916
CSharp 6 ../examples/TryWithResourceDemo.java success 0.1530053
CSharp 7 ../examples/Unicode.java success 0.044574
Total Time: 6.987732
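
To put a number on the difference, the per-file times from the two logs above can be compared directly. This is only a rough sketch: it uses the per-file times as printed and leaves out the "Duration" and "Total Time" lines, which presumably also include overhead outside the parse itself.

```python
# Per-file parse times in seconds, copied from the two logs above.
individual = [5.9106526, 0.5270543, 0.6478677, 0.2960225,
              0.1571820, 0.0614978, 0.7459264, 0.5391444]
grouped = [6.1742676, 0.0255556, 0.1810686, 0.0866523,
           0.0374591, 0.0031916, 0.1530053, 0.044574]


def summarize(times):
    """Return (time of the first parse, sum of all later parses)."""
    return times[0], sum(times[1:])


first_ind, rest_ind = summarize(individual)
first_grp, rest_grp = summarize(grouped)

print(f"first file:  individual {first_ind:.2f}s, grouped {first_grp:.2f}s")
print(f"other files: individual {rest_ind:.2f}s, grouped {rest_grp:.2f}s")
print(f"speed-up on the other files: {rest_ind / rest_grp:.1f}x")
```

The first file costs about the same either way (roughly 6 seconds), while the remaining seven files drop from about 2.97 seconds to about 0.53 seconds, a speed-up of roughly 5.6x on this one sample.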

Grouped parsing will help for Java, CSharp, and probably a few other targets. However, it may not help targets like PHP, which show no speed-up from warm-up. See antlr/antlr-php-runtime#36.

Implementing grouped parsing requires the newest version of Trash trgen, and the templates in this repo need to be updated.
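
The real drivers are generated from the trgen templates, so the following is not their code. It is only a language-neutral sketch of the grouped idea: one process loops over all inputs and prints one result line per file. The names `run_grouped` and `fake_parse` are hypothetical, `fake_parse` stands in for the generated parser, and the `fail` status word is an assumption (the log above only shows `success`).

```python
import time
from typing import Callable, Iterable


def run_grouped(files: Iterable[str],
                parse_file: Callable[[str], bool],
                target: str = "Sketch") -> float:
    """Parse every file in one process and print one result line per file.

    parse_file is a hypothetical stand-in for the generated parser: it takes
    a path and returns True when the parse succeeds.
    """
    total = 0.0
    for index, path in enumerate(files):
        start = time.perf_counter()
        ok = parse_file(path)
        elapsed = time.perf_counter() - start
        total += elapsed
        status = "success" if ok else "fail"  # "fail" is an assumed label
        print(f"{target} {index} {path} {status} {elapsed:.7f}")
    # Here the total is just the sum of the per-file times.
    print(f"Total Time: {total:.7f}")
    return total


def fake_parse(path: str) -> bool:
    """Placeholder parser: any readable file counts as a successful parse."""
    try:
        with open(path, "rb") as handle:
            handle.read()
        return True
    except OSError:
        return False
```

Calling `run_grouped(["../examples/helloworld.java"], fake_parse)` prints lines in the same shape as the grouped log. The point of the design is that whatever start-up and warm-up cost the runtime has is paid once per run rather than once per input file.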

Generating and compiling the drivers still takes a significant amount of time, so the builds will remain slow.

kaby76 commented Jan 4, 2023

There is one problem with warm-up parsing: the script is designed to produce one .errors file and one .tree file per input when needed. For example, in antlr/antlr4/examples, the grammar three.g4 does not parse and should produce the file three.g4.errors. Likewise, in arithmetic/examples, there are several .tree files containing the parse trees for those files.

I think the solution is to have an option to shunt output to an .errors and/or .tree file when parsing, instead of sending everything to stderr/stdout.
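
A minimal sketch of what such an option could look like, written in Python for illustration rather than in any of the actual driver templates. The function name `emit_results` and its `to_files` flag are hypothetical. Naming the .errors file as the input name plus a suffix follows the three.g4.errors example above; using the same scheme for the .tree file is an assumption.

```python
import sys
from pathlib import Path
from typing import List, Optional


def emit_results(input_path: str, errors: List[str], tree: Optional[str],
                 to_files: bool) -> None:
    """Send parse errors and the parse tree to files or to stderr/stdout.

    With to_files=True, errors go to "<input>.errors" and the tree goes to
    "<input>.tree" (assumed naming). A file is written only when there is
    something to put in it.
    """
    if to_files:
        if errors:
            # Each error message is followed by exactly one newline.
            text = "".join(message + "\n" for message in errors)
            Path(input_path + ".errors").write_text(text, encoding="utf-8")
        if tree is not None:
            # The tree is written as-is, with no newline appended.
            Path(input_path + ".tree").write_text(tree, encoding="utf-8")
    else:
        for message in errors:
            print(message, file=sys.stderr)
        if tree is not None:
            print(tree)
```

With per-input files, a grouped run over many inputs can still leave one .errors and one .tree file next to each input, which is what the existing checks expect.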

kaby76 changed the title from "Change individual to warm-up parsing" to "Change individual to grouped parsing" on Jan 4, 2023

teverett pushed a commit that referenced this issue on Feb 4, 2023:
* Changes for #2988

* Prior to this PR, .errors files were hand-constructed. This is not sustainable. All .errors and .tree files that are tracked in the Git repo must be generated by the program itself, exactly and consistently. Each error in an .errors file must be terminated by exactly one newline. The parse tree in a .tree file is not terminated by a newline.

* Renamed interfering Test.java sources; trgen is a much better tester.

* Actually, you can't ignore .errors files even when they are tracked, because the .gitignore entry causes any diffs in the tracked files to be ignored as well, completely defeating the whole purpose. This makes absolutely zero sense on the part of the writers of Git, because .gitignore is a "broad brush" setting, and tracking a specific file ***should be*** a fine brush that overrides the broad brush.

* Build failed with absolutely no output and no data on why it failed. This file changed last and caused a massive failure in GitHub Actions. Pushing something to test again.

* Remove mvn builds.
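
The newline rules in the commit notes above can be expressed as two small checks. This is a sketch only: the function names are hypothetical, and it assumes each error message occupies a single line.

```python
def errors_text_ok(text: str) -> bool:
    """True if every error is terminated by exactly one newline."""
    if text == "":
        return True  # no errors at all
    if not text.endswith("\n"):
        return False  # last error lacks its terminating newline
    # After dropping the final newline, an empty piece means a doubled
    # newline (or an empty error message) somewhere in the text.
    return all(line != "" for line in text[:-1].split("\n"))


def tree_text_ok(text: str) -> bool:
    """True if the parse tree text has no terminating newline."""
    return not text.endswith("\n")
```

For example, `errors_text_ok("line 1:0 oops\n")` is true, while `errors_text_ok("line 1:0 oops\n\n")` and `tree_text_ok("(file_ x)\n")` are both false.
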

kaby76 closed this as completed on Feb 21, 2023