-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[performance] add API load testing script #4237
base: main
Are you sure you want to change the base?
Conversation
765c888
to
0b989d4
Compare
0b989d4
to
a8e08ff
Compare
Rebase to |
9821cc5
to
b3589ff
Compare
b3589ff
to
4f8b013
Compare
@@ -39,6 +39,8 @@ FILE_API_KEY_DONCASTER=👻 | |||
FILE_API_KEY_GLOUCESTER=👻 | |||
FILE_API_KEY_TEWKESBURY=👻 | |||
|
|||
# Used to circumvent API rate limiting for development purposes (e.g. load testing) | |||
SKIP_RATE_LIMIT_SECRET=👻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've already generated a 40char alphanumeric secret and pushed it up to S3, so it's ready to pull down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can confirm I can pull this down ✔️
@@ -14,7 +19,29 @@ export default async (): Promise<Authenticator> => { | |||
const customPassport = new passport.Passport(); | |||
|
|||
// instantiate Microsoft OIDC client, and use it to build the related strategy | |||
const microsoftIssuer = await Issuer.discover(MICROSOFT_OPENID_CONFIG_URL); | |||
// we also keep said config as a fixture to enable offline local development |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change was literally just to allow me to run the API for development purposes (i.e. do pnpm dev
) while I was on a train with no internet, so it doesn't necessarily need to make it into the codebase, but could be useful for others?
afaik fetching the config for the Microsoft OIDC stuff is the only call the code makes when spinning up, so it was possible to run pnpm dev
to completion while offline before we added that SSO option, but now it isn't (without this change)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say no harm in keeping this around! Fixture may additionally come in handy for future E2E test mocks if we ever want Microsoft SSO coverage?
privateDownloadController, | ||
); | ||
|
||
router.delete( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would definitely like to hear whether anyone thinks there's a risk associated with exposing this function? I've made it accessible only to anyone with platform level admin powers.
the utility of it here is that we can upload as many files to staging S3 as we like during load testing without worrying about incurring additional cost, because we can also delete them immediately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exposing to platformAdmins
seems fine by me !
"Public" files are pretty exclusively flow content (eg vector drawings like roof pitch examples) used throughout services uploaded by editors; all have original copies on OSL's Google Drive, so low risk here! (Whereas "/file/private/..." are uploaded by applicants, so it's ideal to keep deletion of those restricted to session sanitation jobs.)
@@ -23,7 +23,7 @@ function useMinio() { | |||
// Points to Minio | |||
return { | |||
endpoint: `http://minio:${process.env.MINIO_PORT}`, | |||
s3ForcePathStyle: true, | |||
forcePathStyle: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this took me a good while to hunt down! a pesky little breaking change resulting from our move to v3 @aws-sdk/client-s3
in #3989 that was causing 500 errors with the following msg:
ServerError: Failed to upload public file: getaddrinfo ENOTFOUND pizza-user-uploads.minio
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for shaking some of these out - went unnoticed for a good while!
@@ -7,6 +7,19 @@ const apiLimiter = rateLimit({ | |||
max: 250, | |||
standardHeaders: true, | |||
legacyHeaders: false, | |||
skip: (req: Request, _res: Response) => { | |||
// add a mechanism for skipping rate limit when load testing (on local or staging only) | |||
if (process.env.APP_ENVIRONMENT == "production") return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without this (or a more convoluted solution like hosting Locust workers on remote servers with different IPs) I can't exceed the rate limit of 250 reqs per IP per 10 minute period, which of course is not sufficient for load testing purposes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only thing I'm not totally following here is why we need a request header and env var? Setting both and checking their equality is just an extra layer of assurance that we're bypassing intentionally?
return resp.failure("Failed to delete file") | ||
resp.success() | ||
|
||
# TODO: add another task which hits the /upload-submission endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't managed to implement this part, but I think the above task should suffice for our purposes for this round of load testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks great here - one minor comment to address around Pulumi secret-getting, but otherwise think this is good to merge & test on staging 🙂
Happy to have another go with the Python script instructions once merged 👍
@@ -39,6 +39,8 @@ FILE_API_KEY_DONCASTER=👻 | |||
FILE_API_KEY_GLOUCESTER=👻 | |||
FILE_API_KEY_TEWKESBURY=👻 | |||
|
|||
# Used to circumvent API rate limiting for development purposes (e.g. load testing) | |||
SKIP_RATE_LIMIT_SECRET=👻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can confirm I can pull this down ✔️
@@ -14,7 +19,29 @@ export default async (): Promise<Authenticator> => { | |||
const customPassport = new passport.Passport(); | |||
|
|||
// instantiate Microsoft OIDC client, and use it to build the related strategy | |||
const microsoftIssuer = await Issuer.discover(MICROSOFT_OPENID_CONFIG_URL); | |||
// we also keep said config as a fixture to enable offline local development |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say no harm in keeping this around! Fixture may additionally come in handy for future E2E test mocks if we ever want Microsoft SSO coverage?
@@ -71,15 +72,15 @@ export const publicUploadController: UploadController = async ( | |||
} | |||
}; | |||
|
|||
export const downloadFileSchema = z.object({ | |||
export const hostedFileSchema = z.object({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 indeed a bit more semantically clear !
privateDownloadController, | ||
); | ||
|
||
router.delete( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exposing to platformAdmins
seems fine by me !
"Public" files are pretty exclusively flow content (eg vector drawings like roof pitch examples) used throughout services uploaded by editors; all have original copies on OSL's Google Drive, so low risk here! (Whereas "/file/private/..." are uploaded by applicants, so it's ideal to keep deletion of those restricted to session sanitation jobs.)
@@ -23,7 +23,7 @@ function useMinio() { | |||
// Points to Minio | |||
return { | |||
endpoint: `http://minio:${process.env.MINIO_PORT}`, | |||
s3ForcePathStyle: true, | |||
forcePathStyle: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for shaking some of these out - went unnoticed for a good while!
@@ -7,6 +7,19 @@ const apiLimiter = rateLimit({ | |||
max: 250, | |||
standardHeaders: true, | |||
legacyHeaders: false, | |||
skip: (req: Request, _res: Response) => { | |||
// add a mechanism for skipping rate limit when load testing (on local or staging only) | |||
if (process.env.APP_ENVIRONMENT == "production") return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only thing I'm not totally following here is why we need a request header and env var? Setting both and checking their equality is just an extra layer of assurance that we're bypassing intentionally?
{ | ||
name: "SKIP_RATE_LIMIT_SECRET", | ||
value: config.requireSecret("skip-rate-limit-secret"), | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will fail on our production deploy's Pulumi up
step because there's no equivalent defined in Pulumi.production.yaml
and value: config.requireSecret(...)
is expecting a defined value
?
Alternatively, it looks like config.getSecret()
should do what we want here and handle this as a staging-only secret ! Docs here https://www.pulumi.com/docs/iac/concepts/secrets/#creating-secrets-programmatically
|
||
The API load testing script requires some files to work with, which are in `/samples`. Care should be taken to make sure anything added there is in the public domain. | ||
|
||
For example, I used [Unsplash](https://unsplash.com/s/photos/tree?license=free) to search for an image with a ['free' license](https://unsplash.com/license), and printed a page from the [WikiHouse](https://www.wikihouse.cc/) site as a PDF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Related to this ticket.
Main thrust of this PR is the addition of the
test_api.py
load testing script (see final commit).However, to get it working I had to make some other changes, such as adding a mechanism to bypass rate limiting, and exposing a file deletion endpoint. They are probably best considered on a per commit basis. I'm happy to split this up into multiple PRs if that's preferable :)
I'll also do a review and tag this up with some additional context.
Once this is merged we should be able to run the new script against the API on staging, to get an idea of whether it can handle the same load we are confident the data layer can (i.e. Hasura + RDS).
When we do that I'll track results here.