kernel_upgrade.sh script can skip reboots even when it should #28

Open
bsper2 opened this issue Jun 27, 2024 · 0 comments


bsper2 commented Jun 27, 2024

The kernel_upgrade.sh script has three bugs and one minor edge case, each of which can cause a system to skip a reboot when we would normally want it to reboot.

Bug 1

PKG_UPDATES_REQ_REBOOT

LASTREBOOTDATE=`last reboot | head -1 | awk '{ print $5  " "  $6  " "  $7 }'`
LASTBOOTDATE=`date -d "$LASTREBOOTDATE" +"%d %b %Y"`
TODAYDATE=`date +"%d %b %Y"`
PKG_UPDATES_REQ_REBOOT=`rpm -qa --last | grep -B1000 "$LASTBOOTDATE" | grep -v "$LASTBOOTDATE" | egrep -i "$PKGS_REQ_REBOOT" |  wc -l`

The issue is that PKG_UPDATES_REQ_REBOOT will be 0 whenever the node was last rebooted on a day when no packages were installed. The problem is in this part of the pipeline: rpm -qa --last | grep -B1000 "$LASTBOOTDATE". If no line of the rpm output contains the boot date, grep prints nothing, so the count is 0 no matter how many packages have been installed since the reboot. On stateless nodes this would be extremely uncommon, but on stateful nodes it can definitely happen.

Bug 2

PKG_UPDATES

PKG_UPDATES=`rpm -qa --last | grep -B1000 "$LASTBOOTDATE" | grep -v "$LASTBOOTDATE" | wc -l`

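PKG_UPDATES has the same issue as PKG_UPDATES_REQ_REBOOT: it relies on the same grep -B1000 "$LASTBOOTDATE" match, so it is also 0 when no package was installed on the day of the last reboot.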
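One possible way to avoid both Bug 1 and Bug 2 is to compare install timestamps against the boot time numerically instead of grepping for a date string. The sketch below is only an illustration, not the script's current behavior: the helper name count_pkgs_since_boot is hypothetical, and the wiring in the comments assumes rpm's INSTALLTIME query tag (epoch seconds) and the btime field of /proc/stat on Linux.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kernel_upgrade.sh).
# Reads lines of "<install_epoch> <package>" on stdin and prints how many
# packages were installed after the given boot time.
#   $1 = boot time in epoch seconds
#   $2 = optional case-insensitive extended regex applied to the package name
count_pkgs_since_boot() {
    local boot_epoch=$1
    local pattern=${2:-.}    # "." matches every non-empty package name
    # Keep only packages whose install time is later than the boot time,
    # then count the ones matching the pattern. grep -c exits non-zero on a
    # count of 0, so "|| true" keeps the function safe under "set -e".
    awk -v boot="$boot_epoch" '$1 > boot { print $2 }' \
        | grep -Eic -- "$pattern" || true
}

# Example wiring (assumption: run on a Linux host with rpm installed):
#   BOOT_EPOCH=$(awk '/^btime/ { print $2 }' /proc/stat)
#   QF='%{INSTALLTIME} %{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n'
#   PKG_UPDATES=$(rpm -qa --qf "$QF" | count_pkgs_since_boot "$BOOT_EPOCH")
#   PKG_UPDATES_REQ_REBOOT=$(rpm -qa --qf "$QF" | count_pkgs_since_boot "$BOOT_EPOCH" "$PKGS_REQ_REBOOT")
```

Because the comparison is numeric, the result does not depend on whether any package happens to share an install date with the last reboot.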

Bug 3

KERNEL_UPDATES_TODAY

KERNEL_UPDATES_TODAY=`rpm -qa --last | grep -B1000 "$TODAYDATE" | egrep -i 'kernel-[0-9]' | wc -l`

We set KERNEL_UPDATES_TODAY as above, and then later use this condition to see if a reboot is needed:

[[ $((KERNEL_UPDATES_TODAY)) -gt 1 ]]

The issue is that the egrep -i 'kernel-[0-9]' filter usually matches only one package per kernel update, so KERNEL_UPDATES_TODAY equals 1 and the -gt 1 condition is false. The installed packages typically look something like this:

kernel-4.18.0-553.5.1.el8_10.x86_64            # Matches the regex and is counted
kernel-modules-4.18.0-553.5.1.el8_10.x86_64    # Does not get counted
kernel-core-4.18.0-553.5.1.el8_10.x86_64       # Does not get counted

So we probably want the condition to be something like this instead:

[[ $((KERNEL_UPDATES_TODAY)) -ge 1 ]]
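The difference between the two conditions can be checked with a small self-contained sketch. It feeds the three package names listed above through the same filter the script uses; the variable names installed, old_check and new_check are only for illustration and are not from kernel_upgrade.sh.

```shell
#!/usr/bin/env bash
# Package names copied from the example above; a typical kernel update
# installs these three together.
installed='kernel-4.18.0-553.5.1.el8_10.x86_64
kernel-modules-4.18.0-553.5.1.el8_10.x86_64
kernel-core-4.18.0-553.5.1.el8_10.x86_64'

# Same regex as the script: only the bare "kernel-<version>" line matches.
KERNEL_UPDATES_TODAY=$(printf '%s\n' "$installed" | grep -Eic 'kernel-[0-9]' || true)

# Current condition (-gt 1) versus the proposed one (-ge 1).
if [[ $KERNEL_UPDATES_TODAY -gt 1 ]]; then old_check=reboot; else old_check=skip; fi
if [[ $KERNEL_UPDATES_TODAY -ge 1 ]]; then new_check=reboot; else new_check=skip; fi

echo "count=$KERNEL_UPDATES_TODAY old=$old_check new=$new_check"
# prints: count=1 old=skip new=reboot
```

With a single matching package, the current -gt 1 test skips the reboot while the -ge 1 test requests it.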

Edge Case 1

PKG_UPDATES_TODAY

This one is definitely an edge case rather than a bug, but I want to mention it since it caused issues in SVC-24690. As long as we fix the three bugs above, leaving this edge case unchanged is probably not a big deal.

If packages were already updated a day or more before the scheduled run of this script, the script won't trigger an automatic reboot when the number of packages the scheduled run itself updates is less than 5 (that threshold can be changed on the command line). Again, this part of the script works as intended, so it's probably not a huge deal.

bsper2 changed the title from "Edge cases and bugs in kernel_upgrade.sh script" to "kernel_upgrade.sh script can skip reboots even when it should" on Jun 27, 2024