-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of UriHelper.GetDisplayUrl #55611
Improve performance of UriHelper.GetDisplayUrl #55611
Conversation
ab31619
to
fdfe440
Compare
.Append(path) | ||
.Append(queryString) | ||
.ToString(); | ||
return string.Create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meaningfully better than just calling the span Concat
overload?
return string.Concat((ReadOnlySpan<string?>)[
request.Scheme,
SchemeDelimiter,
request.Host.Value,
request.PathBase.Value,
request.Path.Value,
request.QueryString.Value
]);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because string.Concat
with ReadOnlySpan<char>
only goes up to 5 parameters and 6 are needed, as you figured out, an array allocation is needed (the API you "used" doesn't exist).
I haven't benchmarked this one, but I don't expect it to be better than string.Create
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does exist in .Net 9, might be fairly recent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still can't see how that could be better than string.Create
. have you benchmarked it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better by way of being 10x shorter and thus more readable/maintainable.
Unless rolling out the custom Concat is meaningfully faster, I don't think it's worth the extra logic.
Judging by the implementation, I would expect the two to perform very similarly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's around 3 times faster and uses around 4 times less memory and is a pattern already used in the code base.
I'm not saying we shouldn't make any changes here. Thank you for looking into this and improving things.
Going through string.Create
definitely is faster than StringBuilder
, that isn't being disputed.
I'm saying that using an existing helper string.Concat(ReadOnlySpan<string>)
should be very similar perf-wise to the significantly more logic needed to go through string.Create
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. for a sample input from your benchmark
Method | Mean | Allocated |
---|---|---|
StringBuilder | 262.35 ns | 904 B |
String_Interpolation | 111.77 ns | 216 B |
Concat | 88.82 ns | 216 B |
String_Create | 75.50 ns | 216 B |
Then the question becomes whether the extra LOC are worth it for 10 ns on GetDisplayUrl
.
I'll leave that up to the maintainers of this repo.
Side note: I wouldn't be surprised if Interpolation
ends up compiling to the same thing as Concat
in future compiler versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems odd that Concat
is slower, it should basically be doing the same thing as string.Create
here.
Is it possible that the CopyStringContent
calls are adding overhead that could be reduced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That helper is most likely getting inlined. My guess would be it's a combination of
- extra branches to guard against null values (both when calculating length and before copying)
- extra branch to check for length overflow (which theoretically the manual
string.Create
should also be doing) - extra branches for length checks before copying (defensive in case the backing values changed)
- extra branch at the end to check if the length is correct
- branches from having the loops at all instead of being effectively manually unrolled
- One extra Memmove call for
"://"
which would otherwise be turned into a couplemov
s when used directly in aCopyTo
like in thestring.Create
impl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems odd that
Concat
is slower, it should basically be doing the same thing asstring.Create
here.Is it possible that the
CopyStringContent
calls are adding overhead that could be reduced?
public static string Concat(/params/ ReadOnlySpan<string?> values) is not yet in .NET 9.0.0-preview.3.24172.9. As soon as it is in a preview,
The current implementation of string.Concat
that does not require the allocation of a string[]
only accepts up to 4 string
s, and this requires 5.
I'll update my benchmarks to use it.
That helper is most likely getting inlined. My guess would be it's a combination of
- extra branches to guard against null values (both when calculating length and before copying)
- extra branch to check for length overflow (which theoretically the manual
string.Create
should also be doing)- extra branches for length checks before copying (defensive in case the backing values changed)
- extra branch at the end to check if the length is correct
- branches from having the loops at all instead of being effectively manually unrolled
- One extra Memmove call for
"://"
which would otherwise be turned into a couplemov
s when used directly in aCopyTo
like in thestring.Create
impl
When hot path performance is involved, I tend to not rely on guessing. I rely on personal and community experience and concrete measurements.
I'm going to be sad if this is the most efficient way to concatenate six strings. |
Up to 9.0.0-preview.3.24175.3, it is. |
Upated to use the new in .NET 9.0 Always slower than |
Remaining questions, @MihaZupan @BrennanConroy? |
|
I don't see a difference anywhere close to what you're seeing. For me, the string.Create case is just a few ns faster than the string.Concat (which makes sense, as it's effectively manually unrolling a loop the latter has), but it's also ~50 lines of code compared to the ~1 line of code for string.Concat. And it's going to contribute to a larger NativeAOT footprint, due to the use of the ValueTuple`5, which may not be otherwise used in the app and which brings with it a non-trivial footprint. (I also wouldn't be surprised if the string.Create benchmark were being skewed a bit by dynamic PGO, which can devirtualize this delegate call but would be much less likely in a real app where different delegates were being used with it.) This path is already allocating, so it can't be an ultra hot path. I'd just do the super simple and basically as efficient thing; just use Concat. |
Hi @stephentoub,
I was comparing the current implementation using In the meantime the original idea of using In the interest of measuring and comparing, I've added: [Benchmark]
[ArgumentsSource(nameof(Data))]
public string String_Concat2(string scheme, HostString host, PathString basePath, PathString path, QueryString query)
{
if (!query.HasValue)
{
return string.Concat(scheme, "://", host.Value, basePath.Value, path.Value);
}
else if (!basePath.HasValue)
{
return string.Concat(scheme, "://", host.Value, path.Value, query.Value);
}
else if (!path.HasValue)
{
return string.Concat(scheme, "://", host.Value, basePath.Value, query.Value);
}
else
{
return string.Concat((ReadOnlySpan<string>)[scheme, "://", host.Value, basePath.Value, path, query.Value]);
}
} But it doesn't perform better:
I haven't doen't any work with AOT, so that completely overlooked that. But, I tried creating a struct for the arguments: [Benchmark]
[ArgumentsSource(nameof(Data))]
public string String_Create2(string scheme, HostString host, PathString basePath, PathString path, QueryString query)
{
var schemeValue = scheme ?? string.Empty;
var hostValue = host.Value ?? string.Empty;
var basePathValue = basePath.Value ?? string.Empty;
var pathValue = path.Value ?? string.Empty;
var queryValue = query.Value ?? string.Empty;
var length =
+schemeValue.Length
+ SchemeDelimiter.Length
+ hostValue.Length
+ basePathValue.Length
+ pathValue.Length
+ queryValue.Length;
return string.Create(
length,
new UriParts( scheme, hostValue, basePathValue, pathValue, queryValue),
static (buffer, uriParts) =>
{
if (uriParts.Scheme.Length > 0)
{
uriParts.Scheme.CopyTo(buffer);
buffer = buffer.Slice(uriParts.Scheme.Length);
}
SchemeDelimiter.CopyTo(buffer);
buffer = buffer.Slice(SchemeDelimiter.Length);
if (uriParts.Host.Length > 0)
{
uriParts.Host.CopyTo(buffer);
buffer = buffer.Slice(uriParts.Host.Length);
}
if (uriParts.BasePath.Length > 0)
{
uriParts.BasePath.CopyTo(buffer);
buffer = buffer.Slice(uriParts.BasePath.Length);
}
if (uriParts.Path.Length > 0)
{
uriParts.Path.CopyTo(buffer);
buffer = buffer.Slice(uriParts.Path.Length);
}
if (uriParts.Query.Length > 0)
{
uriParts.Query.CopyTo(buffer);
}
});
}
private readonly struct UriParts
{
public readonly string Scheme;
public readonly string Host;
public readonly string BasePath;
public readonly string Path;
public readonly string Query;
public UriParts(string scheme, string host, string basePath, string path, string query)
{
this.Scheme = scheme;
this.Host = host;
this.BasePath = basePath;
this.Path = path;
this.Query = query;
}
} And it's very close that using a tuple (
Whether this marginally improvement over The best option would be to add an overload to |
13f0072
to
562d2be
Compare
Replaced `StringBuilder` concatenation with the more efficient `string.Create` method for creating new strings by concatenating `scheme`, `host`, `pathBase`, `path`, and `queryString`. This method reduces overhead by avoiding multiple string concatenations and allocating the correct buffer size for the new string. The `CopyTo` method is used in a callback function to copy each string to the new string, slicing the buffer to remove the copied part. The `SchemeDelimiter` is always copied to the new string, regardless of its length. This change enhances the performance of the code.
449d2c1
to
643eeee
Compare
643eeee
to
ea0ba89
Compare
Given @stephentoub's comment and |
Anything blocking this PR? |
@paulomorgado No blockers, just needs a review approval. FYI the team is pretty focused on wrapping up 9.0 but we'll give this another look soon. |
Thanks for your patience @paulomorgado. This will make it in soon and show up for .NET 10 Preview 1 :) |
Replaced
StringBuilder
concatenation with the more efficientstring.Create
method for creating new strings by concatenatingscheme
,host
,pathBase
,path
, andqueryString
. This method reduces overhead by avoiding multiple string concatenations and allocating the correct buffer size for the new string. TheCopyTo
method is used in a callback function to copy each string to the new string, slicing the buffer to remove the copied part. TheSchemeDelimiter
is always copied to the new string, regardless of its length. This change enhances the performance of the code.Improve performance of UriHelper.GetDisplayUrl
Summary
UriHelper.GetDisplayUrl
uses a non-pooledStringBuilder
that is instantiated on every invocation. Although optimized in size, it is a heap allocation with an intermediary buffer.Motivation and goals
This method is frequently used in hot paths like redirect and rewrite rules.
From the benchmarks below, we can see that, compared to the current implementation using a
StringBuilder
with enough capacity, string interpolation is around 3 times better in terms of duration and around 4 times in memory used.String.Create
is even more performant.Benchmarks
StringBuilder
This benchmark uses the same implementation as
UriHelper.GetDisplayUrl
.String_Interpolation
This benchmark uses string interpolation to build the URL.
String_Concat
This benchmark uses the new in .NET 9.0
String.Concat(ReadOnlySpan<string?> values)
to build the URL.String_Create
This benchmark uses
String.Create
and spans to build the URL.Code
{Detail}
Fixes #28906