Introduce storageStruct #29908

Open
wants to merge 25 commits into
base: dev

Contributor

@Spiri0 Spiri0 commented Nov 16, 2024

Introduce storageStruct

This is still a draft because there is some work left to do, but it is already fully functional in wgslFn and for compute shaders.
Of course, the whole thing also has to work for Fn, which I still have to test, and likewise for vertex and fragment shaders. The remaining work lies essentially in the name management of the structs, which still needs improvement, and in the flow adjustment for the vertex and fragment shaders (small but necessary things).
The struct mechanism itself works exactly as I imagined, which means that complex structs (single structs and arrays of structs) as well as atomics can be used. I have already tested this extensively.

I'm open to suggestions, but in the meantime I'll move on.

I will also create a small example for this. I have already prepared the basis for it on CodePen
https://codepen.io/Spiri0/pen/QWePNeL so that I only need to integrate the struct, but here too I would like to have the shaders in Fn.

@sunag I closed the old PR because it went in the wrong direction from the start

Attila Schroeder and others added 2 commits November 16, 2024 09:18

github-actions bot commented Nov 16, 2024

📦 Bundle size

Full ESM build, minified and gzipped.

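| | Before (kB, minified / gzipped) | After (kB, minified / gzipped) | Diff (minified / gzipped) |
|---|---|---|---|
| WebGL | 339.13 / 79 | 339.13 / 79 | +0 B / +0 B |
| WebGPU | 482.48 / 133.72 | 484.49 / 134.46 | +2.01 kB / +741 B |
| WebGPU Nodes | 481.95 / 133.62 | 483.96 / 134.36 | +2.01 kB / +743 B |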

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

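| | Before (kB, minified / gzipped) | After (kB, minified / gzipped) | Diff (minified / gzipped) |
|---|---|---|---|
| WebGL | 464.59 / 111.96 | 464.59 / 111.96 | +0 B / +0 B |
| WebGPU | 550.54 / 149.19 | 552.26 / 149.84 | +1.72 kB / +645 B |
| WebGPU Nodes | 506.42 / 138.91 | 508.14 / 139.54 | +1.72 kB / +631 B |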

@Spiri0 Spiri0 marked this pull request as draft November 16, 2024 09:14
src/renderers/webgpu/nodes/WGSLNodeBuilder.js (fixed)
const bufferName = nodeBuffers[i];
const struct = structs[i];

resultMap.set(bufferName, struct.structName);

Check failure (Code scanning / CodeQL): Incomplete string escaping or encoding, severity High

This replaces only the first occurrence of '&'.
Contributor Author


fixed: .map( part => part.replace( /&/g, '' ) ) instead of .map( part => part.replace( '&', '' ) )
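To illustrate the CodeQL finding: `String.prototype.replace` with a string pattern replaces only the first match, while a regular expression with the `g` flag replaces every match. A minimal sketch, using a made-up list of parts rather than the actual builder data:

```javascript
// Hypothetical parts, loosely modelled on the generated compute call arguments
const parts = [ '&NodeBuffer_554', '&&NodeBuffer_558', 'instanceIndex' ];

// String pattern: only the first '&' in each part is removed
const firstOnly = parts.map( part => part.replace( '&', '' ) );

// Regular expression with the g flag: every '&' in each part is removed
const allRemoved = parts.map( part => part.replace( /&/g, '' ) );

console.log( firstOnly ); // [ 'NodeBuffer_554', '&NodeBuffer_558', 'instanceIndex' ]
console.log( allRemoved ); // [ 'NodeBuffer_554', 'NodeBuffer_558', 'instanceIndex' ]
```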

Contributor Author

Spiri0 commented Nov 19, 2024

I had this review comment in the code, but I removed it and am posting it here instead because the comment causes lint issues. I tested this with compute, vertex and fragment shaders. The custom structs are not always at the beginning of the shader; they are shown that way here only for easier viewing. Since TSL does not force the user to choose a struct name, as is the case with the ptr in WGSL, I have to think about how best to handle this. To access the structs you have to know their names, so a name must be specified.

/*
REVIEW COMMENT

example from my test app:

before reduceFlow:
compute( &NodeBuffer_554.nodeUniform0, &NodeBuffer_558.nodeUniform1, &NodeBuffer_555.nodeUniform2, &NodeBuffer_556.nodeUniform3, &NodeBuffer_557.nodeUniform4, &NodeBuffer_559.nodeUniform5, instanceIndex, object.nodeUniform6 );

after reduceFlow: reduceFlow checks whether there is a storageStruct and if so then the
postfix is removed so that the pointer points to the struct and not to a content in the struct
compute( &NodeBuffer_554, &NodeBuffer_558, &NodeBuffer_555.nodeUniform2, &NodeBuffer_556.nodeUniform3, &NodeBuffer_557.nodeUniform4, &NodeBuffer_559.nodeUniform5, instanceIndex, object.nodeUniform6 );

extractPointerNames reads the names of the reduced pointers and stores them in an array
Array(2)
0: "NodeBuffer_554"
1: "NodeBuffer_558"

getCustomStructNameFromShader at the beginning of buildCode() reads the structNames from the shader header
Array(2)
0: {name: 'drawBuffer', structName: 'DrawBuffer'}
1: {name: 'meshletInfo', structName: 'MeshletInfo'}


createStructNameMapping links the automatically generated WGSLNodeBuilder name for each struct with the
custom struct name specified by the user. This is necessary because in wgslFn the user can choose any
name for structs in the ptr parameters of the shader.

Map(2)
[[Entries]]
0: {"NodeBuffer_554" => "DrawBuffer"}
1: {"NodeBuffer_558" => "MeshletInfo"}

replaceStructNames then replaces the names in the uniforms in the custom structs that the WGSLNodeBuilder
created with the name chosen by the user.

before replaceStructNames:

struct NodeBuffer_554Struct {
	vertexCount: u32,
	instanceCount: atomic<u32>,
	firstVertex: u32,
	firstInstance: u32
};
@binding( 0 ) @group( 0 )
var<storage, read_write> NodeBuffer_554 : NodeBuffer_554Struct;

after replaceStructNames:

struct DrawBuffer {
	vertexCount: u32,
	instanceCount: atomic<u32>,
	firstVertex: u32,
	firstInstance: u32
};
@binding( 0 ) @group( 0 )
var<storage, read_write> NodeBuffer_554 : DrawBuffer;
*/
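
The last two steps of the review comment can be sketched as follows. The function names follow the comment, but the bodies are assumptions written for illustration and are not the code from this PR; the sketch also assumes that pointer names and custom structs arrive in the same order.

```javascript
// Sketch only: names follow the review comment, bodies are assumptions.

// Pointer names after reduceFlow, as collected by extractPointerNames
const pointerNames = [ 'NodeBuffer_554', 'NodeBuffer_558' ];

// Struct names read from the wgslFn header by getCustomStructNameFromShader
const customStructs = [
	{ name: 'drawBuffer', structName: 'DrawBuffer' },
	{ name: 'meshletInfo', structName: 'MeshletInfo' }
];

// Link each generated buffer name with the user-chosen struct name
function createStructNameMapping( nodeBuffers, structs ) {

	const resultMap = new Map();

	for ( let i = 0; i < nodeBuffers.length; i ++ ) {

		resultMap.set( nodeBuffers[ i ], structs[ i ].structName );

	}

	return resultMap;

}

// Replace the generated "<bufferName>Struct" type name with the custom name
function replaceStructNames( code, mapping ) {

	for ( const [ bufferName, structName ] of mapping ) {

		code = code.replace( new RegExp( `\\b${ bufferName }Struct\\b`, 'g' ), structName );

	}

	return code;

}

const generated = `struct NodeBuffer_554Struct {
	vertexCount: u32,
	instanceCount: atomic<u32>,
	firstVertex: u32,
	firstInstance: u32
};
@binding( 0 ) @group( 0 )
var<storage, read_write> NodeBuffer_554 : NodeBuffer_554Struct;`;

const mapping = createStructNameMapping( pointerNames, customStructs );
const result = replaceStructNames( generated, mapping );

console.log( result );
```

The variable name `NodeBuffer_554` is left untouched; only the struct type name changes, as in the "after replaceStructNames" listing.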

Here is one more example to illustrate:

struct NodeBuffer_568Struct {
	nodeUniform6 : array< vec3<f32> >

};
@binding( 3 ) @group( 1 )
var<storage, read> NodeBuffer_568 : NodeBuffer_568Struct;

struct NodeBuffer_569Struct {
	nodeUniform7 : array< vec2<f32> >

};
@binding( 4 ) @group( 1 )
var<storage, read> NodeBuffer_569 : NodeBuffer_569Struct;

struct MeshletInfo {
	cone_apex: vec4<f32>,
	cone_axis: vec4<f32>,
	cone_cutoff: f32,
	boundingSphere: vec4<f32>,
	parentBoundingSphere: vec4<f32>,
	error: vec4<f32>,
	parentError: vec4<f32>,
	lod: vec4<f32>,
	bboxMin: vec4<f32>,
	bboxMax: vec4<f32>
};
@binding( 5 ) @group( 1 )
var<storage, read> NodeBuffer_570 : array<MeshletInfo>;


A user could give a struct member exactly the same name as the struct itself. This is unusual, but it is allowed. To keep such a member name from being replaced as well, I would have to adapt the function.
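
One possible adaptation, sketched here as an assumption and not taken from the PR, is to replace the generated name only where it appears in a type position: directly after the `struct` keyword or directly after a `:`. The helper name `replaceStructTypeName` and the sample shader text are hypothetical.

```javascript
// Hypothetical helper: replace a struct name only in type positions,
// i.e. after the "struct" keyword or after a ":" type annotation.
function replaceStructTypeName( code, generatedName, customName ) {

	const pattern = new RegExp( `(struct\\s+|:\\s*)${ generatedName }\\b`, 'g' );

	// A replacer function avoids any "$" handling in the custom name
	return code.replace( pattern, ( match, prefix ) => prefix + customName );

}

// Made-up shader text in which a member has the same name as the struct
const source = `struct NodeBuffer_554Struct {
	NodeBuffer_554Struct: u32
};
var<storage, read_write> NodeBuffer_554 : NodeBuffer_554Struct;`;

const replaced = replaceStructTypeName( source, 'NodeBuffer_554Struct', 'DrawBuffer' );

console.log( replaced );
```

In this sketch the member keeps its name because it is followed by a `:` instead of being preceded by one.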

If no name is specified in TSL, the WGSLNodeBuilder currently uses its automatically generated names, e.g. NodeBuffer_554 and NodeBuffer_558. As a user you never see these names, and you would want to be able to use meaningful ones. So I need a TSL-specific way to specify names.

getCustomStructNameFromShader( source ) {

const functionRegex = /fn\s+\w+\s*\(([\s\S]*?)\)/g; // filter shader header
const parameterRegex = /(\w+)\s*:\s*(ptr<\s*([\w]+),\s*(?:array<([\w<>]+)>|(\w+))[^>]*>|[\w<>,]+)/g; // filter parameters
Collaborator


It's strange to me to find regular expressions here. This shouldn't be needed when code is generated from nodes, rather than the other way around or reprocessed.

Contributor Author


I originally had in mind the parser that you also use for the pointers, because it is part of the WGSLNodeBuilder. However, the parser does not recognize stageData.codes as WGSL code, so it always threw an error. At first I thought this would only be the case with compute shaders, because the -> void expression is missing in stageData.codes; _getWGSLComputeCode doesn't have it. But with vertex and fragment shaders, too, the parser fell into the else branch because it didn't recognize the shaders from stageData.codes as WGSL shaders.

So far stageData.codes is not used anywhere; it is only created. In your opinion, does the code in stageData.codes have to be compatible with the parser?
I then concentrated on reading only the header, i.e. the part inside the ( ), since the rest is not important. That seemed to me the safest and most stable way. If you see a better way to extract the data from the shaders, I welcome your ideas. I originally wanted to do without this function entirely and use the parser instead.
Is the process clear from the review text?

Contributor Author

Spiri0 commented Nov 21, 2024

@sunag what do you think about adjusting the declarationRegexp?

//actual
const declarationRegexp = /^[fn]*\s*([a-z_0-9]+)?\s*\(([\s\S]*?)\)\s*[\-\>]*\s*([a-z_0-9]+(?:<[\s\S]+?>)?)/i;

//work for -> void and without 
const declarationRegexp = /^[fn]*\s*([a-z_0-9]+)?\s*\(([\s\S]*?)\)\s*(?:->\s*([a-z_0-9]+(?:<[\s\S]+?>)?))?/i;

This also affects the vertex and fragment shaders. The declarationRegexp is only used by the parser, so changing it this way is not critical, and I could then use it as well. Even though it is a small change that I could include directly in this PR, perhaps a separate PR would be appropriate for better traceability?
If not, I can adjust it in this PR.

Collaborator

sunag commented Nov 21, 2024

It is certainly better to create a new PR for declarationRegexp.
The best way is to not reprocess the code that was just generated, so the change may not be small and may take a while.

@Spiri0 Spiri0 marked this pull request as ready for review November 21, 2024 06:49
Contributor Author

Spiri0 commented Nov 21, 2024

I have set this to ready for review. I asked @RenaudRohlinger if he could kindly help me translate the compute shader in my CodePen into TSL, but that is probably not so easy, as it is currently only possible with storageObject. There are also special drawIndirect commands for Metal on Macs. Extending this to TSL will be easier with a working WGSL version, without risking interference from storageObject.
My parameterRegex is probably also suitable for global use, since this extension adds structs and the regex makes the struct names (custom and node-generated ones) more convenient to read. Of course it would then have to be renamed, but that is a small thing that can be done quickly.

@Spiri0 Spiri0 requested a review from sunag November 21, 2024 07:54
Collaborator

sunag commented Nov 21, 2024

Would you have a minimalist example of storageStruct() to share?

Contributor Author

Spiri0 commented Nov 21, 2024

Would you have a minimalist example of storageStruct() to share?

I prepared this as an example:
https://codepen.io/Spiri0/pen/QWePNeL

I will now use this to create an example with a struct for the drawBuffer.

The advantage of this example is that it shows the use of drawIndirect, storageBuffers, compute shaders, structs and atomics, so pretty much all the basic aspects are combined in one clear, short example.

But I'd like to do a separate PR for this, as I have only created one example so far and could use some practice creating examples.

I'll do that tonight.

With additional buffers, such as an instance storage buffer whose entries are also set in the compute shader, you can control exactly which instances the vertex shader should process. But that would be another example,
because that way you could no longer see what effect the indirectDrawBuffer has during frustum culling.

Contributor Author

Spiri0 commented Nov 21, 2024

When testing existing examples, I noticed that some of them no longer worked because of my extension. The cause is a trivial mistake that I will correct: in the isArray check at the beginning, where the bufferSnippets are created, I did not ensure that the input values exist. With the following check, all the examples run.

if ( bufferNode.value && bufferNode.value.array && bufferNode.value.itemSize ) {

	isArray = bufferType && bufferNode.value.array.length > bufferNode.value.itemSize;

}
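
The same guard can be tried in isolation. This is a minimal sketch: `isStructArray` is a hypothetical helper name, and the `bufferNode` objects are stand-ins that mimic only the properties used in the check above.

```javascript
// Hypothetical helper mirroring the guarded isArray check
function isStructArray( bufferNode, bufferType ) {

	const value = bufferNode.value;

	// Bail out early when the inputs the check relies on are missing
	if ( ! value || ! value.array || ! value.itemSize ) return false;

	// More data than one item means the buffer holds an array of items
	return Boolean( bufferType ) && value.array.length > value.itemSize;

}

// Stand-in nodes for illustration
const noValue = { value: null };
const singleItem = { value: { array: new Uint32Array( 4 ), itemSize: 4 } };
const manyItems = { value: { array: new Float32Array( 8 ), itemSize: 4 } };

console.log( isStructArray( noValue, 'DrawBuffer' ) ); // false
console.log( isStructArray( singleItem, 'DrawBuffer' ) ); // false
console.log( isStructArray( manyItems, 'MeshletInfo' ) ); // true
```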

Contributor Author

Spiri0 commented Nov 22, 2024

@sunag I added the example, but I still don't understand the screenshot topic. The E2E test only passes if there is hardly anything visible in the screenshot, but the moments where you can see a lot would be better in my opinion.

For TSL, choosing a struct name wouldn't even be necessary, if I understand correctly, since you don't need a pointer.
I would like to have the same example in TSL as well. Is atomicStore with a pointer already available in Fn?

Contributor Author

Spiri0 commented Nov 26, 2024

@sunag what do you think of the current status? Can it be merged? On Macs, atomics apparently require Metal-specific adjustments in the shader. With this extension I can test that together with someone who has a Mac, such as RenaudRohlinger.
I discovered that there is a limit of 8 storage buffers per shader. The structs solved that problem quickly: with interleaved buffers and structs the limit is easily avoided, and the shaders are much clearer.

Collaborator

sunag commented Nov 27, 2024

Sorry for the delay; I will dedicate myself to this in the next release. I think having structs is essential too. I need to check, because we now have a parser inside the builder, and that doesn't seem like the right place for it.

Contributor Author

Spiri0 commented Nov 28, 2024

No problem with the delay. I can continue working.
Here is a screenshot of what is possible with structs and drawIndirect.

image

@Mugen87 @RenaudRohlinger This might also be of interest to you since you have some topics with mass mesh calculations.
Each dragon has 1303 meshlets, and a visibility check is performed for every dragon and every meshlet. So the visibility check in the compute shader also runs for the much larger number of dragons and meshlets that are not in the frustum. The compute shader then supplies the vertex shader with only the visible vertices and instances.
The reason I haven't added more dragons is that I'm at the buffer size limit. Even at that limit (134217728), the frame rate is still 120 fps. The three.js performance with structs + drawIndirect is so high that it is no longer the limiting factor. Without structs this would not be possible, because there is also a limit of 8 on the number of buffers that can be used in the compute shader. But even with a higher buffer limit (without structs I would need over 30 buffers), structs are far better because they let you bundle a large amount of data very efficiently in a few buffers. In addition, the shader is much clearer and easier to work with when using structs. I can't stop being amazed by the enormous performance.
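
As a rough illustration of bundling data into few buffers, the sketch below packs several per-meshlet attributes into one interleaved typed array, so that a single storage binding holding an array of structs can replace several separate bindings. It is a simplified assumption and not the code behind the screenshot: every field is taken to be a vec4<f32>, so no padding between members is needed, and the helper name and sample values are made up.

```javascript
// Sketch: interleave several vec4 attributes into one Float32Array.
// Assumes each field holds 4 floats per element (vec4<f32>), so the
// packed layout matches a WGSL struct made only of vec4<f32> members.
function interleaveVec4Fields( fields, count ) {

	const stride = fields.length * 4; // floats per struct element
	const packed = new Float32Array( count * stride );

	for ( let i = 0; i < count; i ++ ) {

		fields.forEach( ( field, f ) => {

			// Copy the 4 components of element i of this field
			packed.set( field.subarray( i * 4, i * 4 + 4 ), i * stride + f * 4 );

		} );

	}

	return packed;

}

// Two made-up attributes for two meshlets
const boundingSphere = new Float32Array( [ 0, 0, 0, 1, 5, 5, 5, 2 ] );
const bboxMin = new Float32Array( [ - 1, - 1, - 1, 0, 4, 4, 4, 0 ] );

const packed = interleaveVec4Fields( [ boundingSphere, bboxMin ], 2 );

console.log( packed.length ); // 16
console.log( Array.from( packed ) );
```

Structs that mix scalar and vector members, such as the MeshletInfo struct shown earlier with its f32 member, additionally need padding according to the WGSL alignment rules, which this sketch leaves out.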

Spiri0 added a commit to Spiri0/three.js that referenced this pull request Dec 2, 2024
Arrays are not currently taken into account by the wgslTypeLib. However, with the struct extension mrdoob#29908, arrays will also become important as a type.
Contributor Author

Spiri0 commented Dec 6, 2024

Is there anything else I can do to lighten your load? It seems like you're using the opportunity for a more extensive revision.

sunag pushed a commit that referenced this pull request Dec 8, 2024
* Update WGSLNodeBuilder.js

Arrays are not currently taken into account by the wgslTypeLib. However, with the struct extension #29908, arrays will also become important as a type.

* enables the use of samplers in compute shaders

Since textureSampleLevel is usable in compute shaders, this small PR allows sampler to be used in compute shaders for this purpose

* Update WGSLNodeBuilder.js

* Update WGSLNodeBuilder.js
sunag pushed a commit that referenced this pull request Dec 13, 2024
* Update WGSLNodeBuilder.js

Arrays are not currently taken into account by the wgslTypeLib. However, with the struct extension #29908, arrays will also become important as a type.

* enables the use of samplers in compute shaders

Since textureSampleLevel is usable in compute shaders, this small PR allows sampler to be used in compute shaders for this purpose

* Update webgpu_pmrem_scene.html

add background to environment reflection

* Update webgpu_pmrem_scene.html

* Update webgpu_pmrem_scene.html

* Update webgpu_pmrem_scene.html

* Update WGSLNodeBuilder.js

* Add files via upload
Contributor

Makio64 commented Dec 26, 2024

This feature seems amazing and could improve performance a lot for advanced use cases. What is blocking it, or what is needed for a merge? @mrdoob @Mugen87 @RenaudRohlinger

Collaborator

sunag commented Dec 26, 2024

That's the reason for the block. #29908 (comment)

Contributor Author

Spiri0 commented Dec 26, 2024

That's the reason for the block. #29908 (comment)

When I click on your link I see my comment. Do you mean your comment with the delay above?

Owner

mrdoob commented Dec 27, 2024

That's the reason for the block. #29908 (comment)

When I click on your link I see my comment. Do you mean your comment with the delay above?

When I click on the link I get this:

Screenshot_20241227-093123

Contributor Author

Spiri0 commented Dec 27, 2024

ah ok, I was just a bit confused because the link made my comment seem so centered.

4 participants