chore(ci): run benchmarks for each storage in separate workers #1536
base: main
Walkthrough
This pull request introduces a new Bash script
Changes
Sequence Diagram
sequenceDiagram
participant CI as CI Environment
participant Script as changed-modules.sh
participant Modules as Repository Modules
CI->>Script: Trigger with modified files
Script->>Script: Analyze changed files
Script->>Modules: Determine modules to build
Script-->>CI: Return list of modules
|
thank you, i also had such an idea for a short time, but then lost sight of it again |
I ran into trouble while contributing the testcontainers support in the other PR, and noticed that the benchmark was failing for other reasons. I'm not sure about the implications of these changes for the benchmarks website, so please feel free to take over and change what you need |
Marked it as ready to review, as the only failure now is caused by a panic in the MySQL module. |
Actionable comments posted: 1
🧹 Nitpick comments (8)
s3/init_test.go (2)
32-33: Context handling
You have set up a Background() context for container orchestration. Consider whether you need a WithTimeout context to avoid potential indefinite waits in case of container runtime failures.
43-47: Error handling
Panicking on container connection errors is acceptable for test code, but you may want to provide additional logging or use t.Fatal for clarity.
.github/scripts/changed-modules.sh (6)
17-17: Shellcheck SC2155
It's generally recommended to declare and assign variables in separate steps to avoid potential masking of return values in Bash. Here’s an example adjustment:
- readonly ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)
+ cd "$(dirname "${BASH_SOURCE[0]}")/../.." || exit 1
+ readonly ROOT_DIR="$(pwd)"
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
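A minimal sketch of the masking that SC2155 warns about, assuming nothing beyond Bash itself; the function names are hypothetical. When declaration and assignment share one statement, `$?` reports the builtin (`local`, `readonly`, `declare`) rather than the command substitution, so a failing `cd` would go unnoticed. Keep in mind that a variable already marked readonly cannot be assigned afterwards, so in the split form the assignment has to come first and `readonly` last.

```shell
#!/usr/bin/env bash

# Combined form: $? is the status of `local`, so the failure of `false` is masked.
status_combined() {
  local value=$(false)
  echo "$?"
}

# Split form: $? is the status of the command substitution itself.
status_split() {
  local value
  value=$(false)
  echo "$?"
}

echo "combined: $(status_combined)"   # prints "combined: 0"
echo "split: $(status_split)"         # prints "split: 1"
```

The same split applies to a readonly global: assign with `VAR=$(...) || exit 1`, then run `readonly VAR` on the next line.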
27-27: Shellcheck SC2053
Quote the string when comparing with == to avoid unintended pattern matching.
- if [[ $1 == $excluded_module ]]; then
+ if [[ "$1" == "$excluded_module" ]]; then
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
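To make the difference concrete, here is a small sketch, with hypothetical function names and a made-up exclusion entry that happens to contain a glob character. Inside `[[ ]]`, an unquoted right-hand side of `==` is treated as a pattern, while a quoted one is compared literally.

```shell
#!/usr/bin/env bash

# Unquoted right-hand side: $2 is treated as a glob pattern.
matches_unquoted() { [[ $1 == $2 ]]; }

# Quoted right-hand side: $2 is compared as a literal string.
matches_quoted() { [[ $1 == "$2" ]]; }

excluded='s*'   # hypothetical entry containing a glob character

matches_unquoted "s3" "$excluded" && echo "unquoted: s3 matches $excluded"
matches_quoted "s3" "$excluded" || echo "quoted: s3 does not match $excluded"
```

With plain module names the two forms behave the same; the quoted form only matters once an entry contains `*`, `?`, or `[`.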
36-36: Shellcheck SC2044
Using for with find output can be fragile. Consider a while-based approach or -exec.
-for modFile in $(find "${ROOT_DIR}" -name "go.mod" ...); do
+find "${ROOT_DIR}" -name "go.mod" ... | while read -r modFile; do ... done
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
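One caveat with the piped form: unless `lastpipe` is enabled, Bash runs the `while` loop of a pipeline in a subshell, so an array filled inside the loop is empty afterwards. The sketch below avoids that by feeding the loop through process substitution. The function name, the `modules` array, and the demo directory layout are assumptions for illustration, not the script's actual code.

```shell
#!/usr/bin/env bash

# Collect the directory name of every go.mod under $1 into the global array `modules`.
# NUL-delimited find output keeps paths with spaces intact, and process
# substitution keeps the loop in the current shell so the array survives.
collect_modules() {
  local root=$1 mod_file
  modules=()
  while IFS= read -r -d '' mod_file; do
    modules+=("$(basename "$(dirname "$mod_file")")")
  done < <(find "$root" -name go.mod -print0)
}

# Demo on a throwaway tree (hypothetical layout).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/s3" "$demo_root/my module"
touch "$demo_root/s3/go.mod" "$demo_root/my module/go.mod"

collect_modules "$demo_root"
printf 'found: %s\n' "${modules[@]}"
rm -rf "$demo_root"
```

The order of the results follows `find`, which is not guaranteed, so a sort step is still needed afterwards.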
45-45: Shellcheck SC2207
When sorting an array in Bash, mapfile or read loops are safer than splitting on IFS.
- IFS=$'\n' modules=($(sort <<<"${modules[*]}"))
+ mapfile -t modules < <(printf '%s\n' "${modules[@]}" | sort)
Also applies to: 52-52
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
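A short sketch of the mapfile-based sort, using made-up module names. It assumes Bash 4 or newer, since `mapfile` is not available in older versions.

```shell
#!/usr/bin/env bash

modules=("s3" "redis" "my module" "badger")   # hypothetical module names

# One element per line into sort, one line per element back out of mapfile.
# No word splitting or glob expansion happens on the way.
mapfile -t modules < <(printf '%s\n' "${modules[@]}" | sort)

printf '%s\n' "${modules[@]}"
```

One edge case worth knowing: `printf '%s\n'` with no arguments still prints a single newline, so sorting an empty array this way yields an array with one empty element.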
72-72: Regex-based array checks
Using =~ on quoted strings in Bash can result in literal matches. Consider unquoting the pattern or checking membership with a loop.
🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
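A loop-based membership check could look like the sketch below; `contains` and the module names are hypothetical. It also shows why the regex check is risky: the array is flattened into one string, so a short name matches inside a longer one.

```shell
#!/usr/bin/env bash

# Return 0 if $1 equals one of the remaining arguments exactly.
contains() {
  local needle=$1 item
  shift
  for item in "$@"; do
    [[ $item == "$needle" ]] && return 0
  done
  return 1
}

modified=("s3-compat" "redis")   # hypothetical module names

# The regex check sees one concatenated string, so "s3" matches inside "s3-compat".
[[ "${modified[*]}" =~ s3 ]] && echo "regex: s3 found (false positive)"
contains "s3" "${modified[@]}" || echo "loop: s3 not found"
```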
84-84: Output formatting
In echo "["$(IFS=,; echo "${modified_modules[*]}" | sed 's/ /,/g')"]", the inner quotes close the string before the command substitution, which leaves the substitution unquoted and subject to word splitting. You might want to quote the entire argument instead.
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
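As an illustration, the JSON-like list can be assembled without any unquoted expansion. This is a sketch under stated assumptions: `to_json_array` is a hypothetical helper, it adds the double quotes itself, and it does no JSON escaping, so it only suits names without quotes, backslashes, or control characters.

```shell
#!/usr/bin/env bash

# Print the arguments as a JSON array of strings.
# Assumption: items need no JSON escaping (no quotes, backslashes, control characters).
to_json_array() {
  local out="" item
  for item in "$@"; do
    # Prefix a comma only when something has already been appended.
    out+="${out:+,}\"${item}\""
  done
  printf '[%s]\n' "$out"
}

to_json_array s3 redis "my module"   # prints ["s3","redis","my module"]
to_json_array                        # prints []
```

Because every expansion is quoted, an item containing spaces stays a single array element and no `sed` pass is needed.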
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
.github/workflows/benchmark.yml is excluded by !**/*.yml
.github/workflows/test-s3.yml is excluded by !**/*.yml
s3/go.mod is excluded by !**/*.mod
s3/go.sum is excluded by !**/*.sum, !**/*.sum
📒 Files selected for processing (2)
.github/scripts/changed-modules.sh (1 hunks)
s3/init_test.go (2 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
.github/scripts/changed-modules.sh
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[warning] 67-67: Variable was used as an array but is now assigned a string.
(SC2178)
[warning] 67-67: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
🔇 Additional comments (7)
s3/init_test.go (6)
4-4: Use of context package: good practice for test container management
Leveraging the context package for controlling container lifecycles is a great best practice.
9-9: Dependency import
Importing testcontainers-go for spinning up MinIO containers is appropriate here and helps keep tests self-contained.
18-24: Configurable MinIO environment variables
Defining constants for MinIO credentials and enabling override via TEST_MINIO_IMAGE is a robust way to parameterize tests.
27-30: Environmental override
Smart approach to override the default MinIO image from an environment variable. This ensures flexibility across environments.
34-38: Proper container setup
The container is configured with custom credentials. This is clean and makes the test more realistic.
51-51: Dynamic credentials
Replacing hard-coded credentials with the newly defined constants is a good improvement for maintainability.
Also applies to: 54-55
.github/scripts/changed-modules.sh (1)
57-57: Shellcheck SC2124
${ALL_CHANGED_FILES[@]} can collapse arrays into strings, possibly losing data. If ALL_CHANGED_FILES is expected to be a string, consider capturing it differently.
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
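If the variable arrives as a plain string, one way to get a real array is to split it explicitly. The sketch below assumes that ALL_CHANGED_FILES holds paths separated by single spaces and that no path contains a space; the sample paths and the `changed_files` name are made up for illustration.

```shell
#!/usr/bin/env bash

# Assumption: paths are separated by single spaces and contain no spaces themselves.
ALL_CHANGED_FILES="s3/init_test.go redis/redis.go .github/workflows/benchmark.yml"

# Split the string into an array instead of assigning "${...[@]}" to a scalar.
IFS=' ' read -r -a changed_files <<< "$ALL_CHANGED_FILES"

echo "count: ${#changed_files[@]}"
for file in "${changed_files[@]}"; do
  echo "file: $file"
done
```

Setting `IFS` only for the `read` command keeps the change local to that one line, so there is nothing to restore afterwards.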
Shellcheck SC2178 / SC2124 Assigning arrays to a string variable can lead to unexpected behavior.
Shellcheck SC2207 When sorting an array in Bash, mapfile or read loops are safer than splitting on IFS.
Actionable comments posted: 2
🧹 Nitpick comments (4)
.github/scripts/changed-modules.sh (4)
16-21: Improve error handling in ROOT_DIR setup
Split the declaration and assignment to properly handle potential errors in directory resolution.
-readonly ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)
+declare -r ROOT_DIR
+ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) || exit 1
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
44-54: Improve array sorting reliability
Use mapfile for more reliable array sorting.
-IFS=$'\n' modules=($(sort <<<"${modules[*]}"))
-unset IFS
+mapfile -t modules < <(printf '%s\n' "${modules[@]}" | sort)
-IFS=$'\n' allModules=($(sort <<<"${allModules[*]}"))
-unset IFS
+mapfile -t allModules < <(printf '%s\n' "${allModules[@]}" | sort)
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
80-84: Improve output formatting reliability
The current output formatting might break with special characters.
-echo "["$(IFS=,; echo "${modified_modules[*]}" | sed 's/ /,/g')"]"
+printf '[%s]\n' "$(IFS=,; echo "${modified_modules[*]}")"
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
1-84: Excellent approach to CI optimization!
The script effectively implements the PR's objective of running benchmarks for each storage in separate workers by:
- Intelligently detecting modified modules
- Supporting selective testing based on changes
- Enabling parallel execution of benchmarks
This will significantly improve the contribution experience by:
- Reducing CI execution time
- Isolating benchmark failures to specific modules
- Preventing unrelated test failures
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/scripts/changed-modules.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
.github/scripts/changed-modules.sh
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
🔇 Additional comments (1)
.github/scripts/changed-modules.sh (1)
1-15: Well-documented script with clear test instructions!
The documentation effectively covers different test scenarios with expected outputs, making it easier for contributors to validate changes.
Actionable comments posted: 0
🧹 Nitpick comments (2)
.github/scripts/changed-modules.sh (2)
16-18: Enhance error handling in ROOT_DIR assignment
The current implementation could mask potential failures. Consider separating the declaration and assignment:
-readonly ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)
+declare -r ROOT_DIR
+ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) || exit 1
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
77-81: Improve output formatting robustness
The current output formatting could be improved to handle special characters and ensure proper JSON-like output.
-echo "["$(IFS=,; echo "${modified_modules[*]}" | sed 's/ /,/g')"]"
+printf '[%s]\n' "$(IFS=,; echo "${modified_modules[*]}")"
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 81-81: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 81-81: Quote this to prevent word splitting.
(SC2046)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/scripts/changed-modules.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
.github/scripts/changed-modules.sh
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 62-62: Expanding an array without an index only gives the first element.
(SC2128)
[error] 69-69: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 81-81: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 81-81: Quote this to prevent word splitting.
(SC2046)
🔇 Additional comments (4)
.github/scripts/changed-modules.sh (4)
1-15: Well-documented script with clear examples!
The script documentation effectively covers the main use cases and provides clear examples of how to test the functionality.
34-51: Clean and efficient module discovery implementation!
The code demonstrates good practices:
- Safe file processing with while read loop
- Proper array handling
- Efficient sorting using mapfile
62-75: ⚠️ Potential issue
Fix array handling and pattern matching
There are several issues in the modified files processing:
- The for loop doesn't properly handle filenames with spaces
- Array comparison is not robust
- Pattern matching could be improved
Apply these changes:
-for file in $modified_files; do
+for file in "${modified_files[@]}"; do
   if [[ $file == .github/* ]]; then
     modified_modules=("${allModules[@]}")
     break
   fi
-  module_name=$(echo $file | cut -d'/' -f1)
+  module_name=$(echo "$file" | cut -d'/' -f1)
-  if [[ ! ${modified_modules[@]} =~ ${module_name} ]]; then
+  if ! printf '%s\n' "${modified_modules[@]}" | grep -Fxq "\"$module_name\""; then
     if is_excluded "$module_name"; then
       continue
     fi
     modified_modules+=("\"$module_name\"")
   fi
 done
Likely invalid or redundant comment.
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 62-62: Expanding an array without an index only gives the first element.
(SC2128)
[error] 69-69: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
1-81: Verify CI workflow integration
Since this script is crucial for the new CI workflow that runs benchmarks in separate workers, please ensure:
- The CI workflow correctly sets the ALL_CHANGED_FILES environment variable
- The script's output is properly parsed by the matrix job configuration
- Error cases (no modules found, invalid paths) are handled gracefully by the CI workflow
✅ Verification successful
Let me check the content of the benchmark workflow file to verify the integration.
CI workflow integration is properly configured
The script is correctly integrated with the CI workflow:
- The ALL_CHANGED_FILES environment variable is properly set using the tj-actions/changed-files action output
- The script's output is correctly parsed in the matrix job configuration using fromJSON(needs.detect-modules.outputs.modules)
- Error handling is in place through the workflow's structure:
  - The detect-modules job captures and validates the output
  - The matrix strategy uses fail-fast: false to handle failures gracefully
  - The modules count is tracked and displayed for verification
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify CI workflow configuration
# Check if the workflow file exists and contains the necessary configuration
gh api repos/{owner}/{repo}/contents/.github/workflows -q '.[].name' | grep -q "benchmark"
# Verify the workflow file contains the required environment variable
rg -l "ALL_CHANGED_FILES" .github/workflows/
Length of output: 169
Script:
#!/bin/bash
# Check the benchmark workflow configuration
cat .github/workflows/benchmark.yml
Length of output: 7899
I've seen instability in how the different technologies run the benchmarks. I'm pretty sure the test code is not waiting for the service to be running, which should be easy to fix once a testcontainers module with a proper wait strategy backs the service. Other than that, I think this is ready for review, as I addressed the majority of the suggestions from CodeRabbit (great tool!). I intentionally did not address some of them, as they caused the script to not work as intended. |
What does this PR do?
It adds an initial job to the benchmark workflow in order to identify which modules were modified for a given build.
To detect the changes, we have created a shell script, .github/scripts/changed-modules.sh. This script receives a single env var as its input parameter, ALL_CHANGED_FILES, which on CI is provided by a GitHub action, https://github.com/tj-actions/changed-files. The action fills it with the list of modified files, comparing the current PR changeset with the parent branch (main). We can tune this to always check the latest main branch (open to discussion here).
The script compares all existing modules (looking up all the go.mod files of interest, no test files) with the modified files, building an array of modified modules. As soon as there is a modified file in the .github directory, all the modules are included in the build.
Finally, for running the benchmarks, we are adding a matrix that runs them for the modified storage modules, with the following rules:
ubuntu-latest.
UPDATE: I've added the minio module to S3, so that the benchmarks for S3 work again.
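The detection flow described above (changed files in, JSON list of modules out) can be sketched roughly as follows. This is not the script from the PR: `changed_modules` is a hypothetical function, the module list is passed as arguments instead of being discovered from go.mod files, the exclusion list is left out, and the changed files are assumed to be space-separated paths without spaces.

```shell
#!/usr/bin/env bash

# changed_modules "<space-separated changed files>" module...
# Prints a JSON array of the known modules touched by the changed files.
# Any change under .github/ selects every module.
changed_modules() {
  local files_string=$1 file name known out=""
  shift
  local -a all_modules=("$@") files=() selected=()
  IFS=' ' read -r -a files <<< "$files_string"

  for file in "${files[@]}"; do
    if [[ $file == .github/* ]]; then
      selected=("${all_modules[@]}")
      break
    fi
    name=${file%%/*}                       # first path component
    for known in "${all_modules[@]}"; do
      [[ $known == "$name" ]] || continue  # skip files outside known modules
      # Assumption: module names contain no spaces, so this check is exact.
      [[ " ${selected[*]} " == *" $name "* ]] || selected+=("$name")
    done
  done

  for name in "${selected[@]}"; do
    out+="${out:+,}\"${name}\""
  done
  printf '[%s]\n' "$out"
}

changed_modules "s3/init_test.go redis/redis.go" s3 redis badger    # prints ["s3","redis"]
changed_modules ".github/workflows/benchmark.yml" s3 redis badger   # prints ["s3","redis","badger"]
```

An output in this shape is what a matrix job can consume through fromJSON, which is why an empty result is printed as `[]` rather than as an empty string.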
Why is this important?
Separation of concerns and easier troubleshooting of benchmark issues: at the moment, all benchmarks are executed in series, which 1) slows down the build, and 2) could cause a PR to fail if a benchmark for another module fails.
With these changes we try to improve the contribution experience when working on one module.