Remove pool2d MLRoundingType - Simplify the operand layout support of conv2d and pooling 2d operations #770

Open
wants to merge 8 commits into
base: main
52 changes: 14 additions & 38 deletions index.bs
@@ -871,7 +871,7 @@ dictionary MLComputeResult {
interface MLContext {
Promise<MLComputeResult> compute(
MLGraph graph, MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);

MLOpSupportLimits opSupportLimits();
};
</script>
@@ -5283,19 +5283,14 @@ partial dictionary MLOpSupportLimits {
### Pooling operations ### {#api-mlgraphbuilder-pool2d}
Compute a pooling operation across all the elements within the moving window over the input tensor.
<script type=idl>
enum MLRoundingType {
"floor",
"ceil"
};

dictionary MLPool2dOptions : MLOperatorOptions {
sequence<[EnforceRange] unsigned long> windowDimensions;
sequence<[EnforceRange] unsigned long> padding;
sequence<[EnforceRange] unsigned long> strides;
sequence<[EnforceRange] unsigned long> dilations;
MLInputOperandLayout layout = "nchw";
MLRoundingType roundingType = "floor";
sequence<[EnforceRange] unsigned long> outputSizes;
required sequence<[EnforceRange] unsigned long> outputSizes;
};

partial interface MLGraphBuilder {
@@ -5346,16 +5341,16 @@ partial dictionary MLOpSupportLimits {
- input tensor: *[batches, height, width, inputChannels]*
- output tensor: *[batches, height, width, outputChannels]*

: <dfn>roundingType</dfn>
::
The rounding function used to compute the output shape.

: <dfn>outputSizes</dfn>
::
A list of length 2.
Specifies the sizes of the two spacial dimensions of the output tensor. When the output sizes are explicitly specified, the {{MLPool2dOptions/roundingType}} is ignored.
A list of length 2: *[outputHeight, outputWidth]*
Specifies the sizes of the two spatial dimensions of the output tensor.

If not specified, the output sizes are automatically computed.
The spatial dimensions of the output tensor can be calculated as follows:

*output size = ((input size - filter size + beginning padding + ending padding) / stride) + 1*

Then the caller either applies a floor or ceiling depending on whether partial window results are desired.
</dl>
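
With the rounding option removed, the caller is expected to do the rounding before passing `outputSizes`. A minimal sketch of that caller-side computation follows, using the formula quoted above. The helper name `pool2dOutputSize` and the `Rounding` type are hypothetical and not part of WebNN, and dilation is left out because the quoted formula does not mention it.

```typescript
// Rounding is chosen by the caller; it is not an API option in this proposal.
type Rounding = "floor" | "ceil";

// Hypothetical helper, not part of WebNN: computes one spatial output size
// from the formula in the outputSizes description.
function pool2dOutputSize(
  inputSize: number,
  windowSize: number,
  beginningPadding: number,
  endingPadding: number,
  stride: number,
  rounding: Rounding,
): number {
  const unrounded =
    (inputSize - windowSize + beginningPadding + endingPadding) / stride + 1;
  // "ceil" keeps a trailing partial window, "floor" drops it.
  return rounding === "floor" ? Math.floor(unrounded) : Math.ceil(unrounded);
}

// 7-wide input, 2-wide window, no padding, stride 2:
// (7 - 2 + 0 + 0) / 2 + 1 = 3.5
const floorSize = pool2dOutputSize(7, 2, 0, 0, 2, "floor"); // 3
const ceilSize = pool2dOutputSize(7, 2, 0, 0, 2, "ceil"); // 4
```

The two results differ only when the window does not tile the padded input evenly, which is the case the removed rounding option used to cover.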

<div dfn-for="MLGraphBuilder/averagePool2d(input, options), MLGraphBuilder/l2Pool2d(input, options), MLGraphBuilder/maxPool2d(input, options)" dfn-type=argument>
@@ -5366,13 +5361,8 @@ partial dictionary MLOpSupportLimits {

**Returns:** an {{MLOperand}}. The output 4-D tensor that contains the
result of the reduction. The logical shape is interpreted according to the
value of *layout*. More specifically, if the *options.roundingType* is {{MLRoundingType/"floor"}}, the spatial dimensions of the output tensor can be calculated as follows:

`output size = floor(1 + (input size - filter size + beginning padding + ending padding) / stride)`

or if *options.roundingType* is {{MLRoundingType/"ceil"}}:

`output size = ceil(1 + (input size - filter size + beginning padding + ending padding) / stride)`
value of *layout*, taking the batch and channel count from the input with
the spatial sizes from *outputSizes*.
</div>
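
As an illustration of the revised return description, the sketch below assembles the 4-D output shape from the input shape and `outputSizes` for each layout. The names `Layout` and `pool2dOutputShape` are hypothetical, and the `"nchw"` ordering *[batches, channels, height, width]* is assumed from the usual meaning of that layout rather than quoted from this diff.

```typescript
type Layout = "nchw" | "nhwc";
type Shape4 = [number, number, number, number];

// Hypothetical helper, not part of WebNN: batch and channel counts come from
// the input, spatial sizes come from outputSizes.
function pool2dOutputShape(
  layout: Layout,
  inputShape: Shape4,
  outputSizes: [number, number],
): Shape4 {
  const [outputHeight, outputWidth] = outputSizes;
  if (layout === "nchw") {
    // Input is [batches, channels, height, width].
    const [batches, channels] = inputShape;
    return [batches, channels, outputHeight, outputWidth];
  }
  // Input is [batches, height, width, channels].
  const batches = inputShape[0];
  const channels = inputShape[3];
  return [batches, outputHeight, outputWidth, channels];
}

const nchwShape = pool2dOutputShape("nchw", [1, 3, 224, 224], [112, 112]);
const nhwcShape = pool2dOutputShape("nhwc", [1, 224, 224, 3], [112, 112]);
```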

{{MLOpSupportLimits}} has the following members for pooling operations:
@@ -5395,7 +5385,7 @@ partial dictionary MLOpSupportLimits {

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate pool2d output sizes</dfn> given {{MLInputOperandLayout}} |layout|, [=/list=] of 4 unsigned integers |inputShape|, {{MLRoundingType}} |roundingType|, [=/list=] of 2 unsigned integers |windowDimensions|, [=/list=] of 4 unsigned integers |padding|, [=/list=] of 2 unsigned integers |strides|, [=/list=] of 2 unsigned integers |dilations|, and optional [=/list=] of 2 unsigned integers |outputSizes|, perform these steps. They return a [=/list=] of 4 unsigned integers.
To <dfn for=MLGraphBuilder>calculate pool2d output sizes</dfn> given {{MLInputOperandLayout}} |layout|, [=/list=] of 4 unsigned integers |inputShape|, [=/list=] of 2 unsigned integers <var ignore>windowDimensions</var>, [=/list=] of 4 unsigned integers <var ignore>padding</var>, [=/list=] of 2 unsigned integers <var ignore>strides</var>, [=/list=] of 2 unsigned integers <var ignore>dilations</var>, and optional [=/list=] of 2 unsigned integers |outputSizes|, perform these steps. They return a [=/list=] of 4 unsigned integers.
Contributor

If the windowDimensions, padding, strides, and dilations parameters are not used in the algorithm steps, they can probably be removed.

Also, outputSizes is no longer optional, so it should not be described as optional here.

Member

Since this algorithm is only invoked in one place (the "create pooling operations" steps), can we just move it inline?

</summary>
1. Switch on |layout|:
<dl class=switch>
@@ -5404,21 +5394,7 @@ partial dictionary MLOpSupportLimits {
: {{MLInputOperandLayout/"nhwc"}}
:: Let « |batches|, |inputHeight|, |inputWidth|, |channels| » be |inputShape|.
</dl>
1. If |outputSizes| is given, then let « |outputHeight|, |outputWidth| » be |outputSizes|.
1. Otherwise:
1. Let |outputSizes| be the result of [=MLGraphBuilder/calculating conv2d output sizes=] given |inputHeight|, |inputWidth|, |windowDimensions|[0], |windowDimensions|[1], |padding|, |strides|, and |dilations|.
1. Let « |outputHeight|, |outputWidth| » be |outputSizes|.
1. Switch on |roundingType|:
huningxin marked this conversation as resolved.
<dl class=switch>
: {{MLRoundingType/"floor"}}
::
1. Set |outputWidth| to floor(|outputWidth|).
1. Set |outputHeight| to floor(|outputHeight|).
: {{MLRoundingType/"ceil"}}
::
1. Set |outputWidth| to ceiling(|outputWidth|).
1. Set |outputHeight| to ceiling(|outputHeight|).
</dl>
1. Let « |outputHeight|, |outputWidth| » be |outputSizes|.
Contributor

Should we add validation steps for user-supplied outputSizes? The Chromium prototype does that: https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/public/cpp/graph_validation_utils.cc;l=1179;drc=60130be50f68c89220d8ace62c87155025267a2c;bpv=1;bpt=1

If we add outputSizes validation steps, the windowDimensions, padding, strides, and dilations parameters would still be needed by the algorithm.
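
One possible shape for such a validation step is sketched below: accept a requested size only if it equals the floor or the ceiling of the computed size. This is an assumption about what the check could look like, not a transcription of the Chromium code linked above. The helper name `isValidOutputSize` is hypothetical, and folding dilation into an effective window size of `(windowSize - 1) * dilation + 1` follows the common convolution convention rather than anything stated in this diff.

```typescript
// Hypothetical validation helper, not part of WebNN or Chromium.
function isValidOutputSize(
  requested: number,
  inputSize: number,
  windowSize: number,
  beginningPadding: number,
  endingPadding: number,
  stride: number,
  dilation: number = 1,
): boolean {
  // Assumed convention: dilation stretches the window without adding taps.
  const effectiveWindow = (windowSize - 1) * dilation + 1;
  const span = inputSize - effectiveWindow + beginningPadding + endingPadding;
  // The padded input must hold at least one full window.
  if (span < 0) return false;
  const unrounded = span / stride + 1;
  return (
    requested === Math.floor(unrounded) || requested === Math.ceil(unrounded)
  );
}

// 7-wide input, 2-wide window, stride 2 gives an unrounded size of 3.5,
// so 3 and 4 are both acceptable and 5 is not.
const acceptsFloor = isValidOutputSize(3, 7, 2, 0, 0, 2);
const acceptsCeil = isValidOutputSize(4, 7, 2, 0, 0, 2);
const rejectsOther = !isValidOutputSize(5, 7, 2, 0, 0, 2);
```

A check of this kind needs the window, padding, stride, and dilation values, which is why those parameters would have to stay in the algorithm's signature.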

1. Switch on |layout|:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
@@ -5451,7 +5427,7 @@ partial dictionary MLOpSupportLimits {
1. If |options|.{{MLPool2dOptions/dilations}}'s [=list/size=] is not 2, then [=exception/throw=] a {{TypeError}}.
1. If any value in |options|.{{MLPool2dOptions/dilations}} is not greater than 0, then [=exception/throw=] a {{TypeError}}.
1. Let |desc| be a copy of |input|.{{MLOperand/[[descriptor]]}}.
1. Let |outputShape| be the result of [=MLGraphBuilder/calculating pool2d output sizes=] given |options|.{{MLPool2dOptions/layout}}, |input|'s [=MLOperand/shape=], |options|.{{MLPool2dOptions/roundingType}}, |options|.{{MLPool2dOptions/windowDimensions}}, |options|.{{MLPool2dOptions/padding}}, |options|.{{MLPool2dOptions/strides}}, |options|.{{MLPool2dOptions/dilations}}, and |options|.{{MLPool2dOptions/outputSizes}} (if it [=map/exists=]).
1. Let |outputShape| be the result of [=MLGraphBuilder/calculating pool2d output sizes=] given |options|.{{MLPool2dOptions/layout}}, |input|'s [=MLOperand/shape=], |options|.{{MLPool2dOptions/windowDimensions}}, |options|.{{MLPool2dOptions/padding}}, |options|.{{MLPool2dOptions/strides}}, |options|.{{MLPool2dOptions/dilations}}, and |options|.{{MLPool2dOptions/outputSizes}} (if it [=map/exists=]).

The outputSizes validation above in 13.2 doesn't appear to be correct. Shouldn't it be similar to this logic?

1. If any [=list/item=] in |outputShape| is not a [=valid dimension=], then [=exception/throw=] a {{TypeError}}.
1. Set |desc|.{{MLOperandDescriptor/shape}} to |outputShape|.
1. *Make graph connections:*