-
Notifications
You must be signed in to change notification settings - Fork 195
ci/openshift-ci: Enable selinux in CI runs #5798
Conversation
Oups, there is still some issue, let me debug it first |
I'm still getting some issues, let's not merge this just yet. |
It seems to be working, I'm not sure about the use of |
image: alpine | ||
securityContext: | ||
privileged: true | ||
command: ["/bin/sh", "-c", "nsenter --target 1 --mount bash -c \"ls -alZ /opt/kata/bin; semanage fcontext -a -t bin_t '/(.*/)?opt/kata/bin(/.*)?'; semanage fcontext -a -t bin_t '/(.*/)?opt/kata/libexec(/.*)?'; semanage fcontext -a -t bin_t '/(.*/)?opt/kata/runtime-rs/bin(/.*)?'; restorecon -Rv /opt/kata; ls -alZ /opt/kata/bin\"; sleep infinity"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ldoktor I think you will need to pass -e
to /bin/sh
so to immediately exit error on nsenter
failure. Otherwise it will always sleep
and exit success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: loop through the directories names, like:
for dir in bin libexec runtime-rs/bin; do echo semanage fcontext -a -t bin_t '/(.*/)?opt/kata/'$dir'(/.*)?'; done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I started with one dir but the list keeps growing :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the return code, we don't really care as the pod might be executed multiple times so IMO best effort way should do. Note we might be running on system without selinux completely and in such case it should also succeed.
|
||
# FIXME: Remove when https://github.com/kata-containers/kata-containers/pull/8417 is resolved | ||
# Selinux context is currently not handled by kata-deploy | ||
oc apply -f ${deployments_dir}/relable_selinux.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be relabel_selinux instead of relable?
Here you also will need to check the pod is running to ensure the relabel operation succeeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it's true we should perhaps wait for a message of completion before proceeding. As for the successfullness I don't think that's a good idea as we don't know it's a selinux capable system.
app: restorecon | ||
spec: | ||
serviceAccountName: kata-deploy-sa | ||
hostPID: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: is the hostPID needed so that it can nsenter --target 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, yes.
@ldoktor you will need the |
OK since there are multiple changes requested I added a new commit to better visualize the changes. The only thing I have not tackled is the return code of the relabel as at this point I'd rather use a best-effort approach as we might be running on non-selinux host or multiple times. If you insist I can add a check for |
Thanks for addressing my comments @ldoktor ! I don't have further suggestions, you can squash if you wish. |
Rebased |
ad19130
to
1d989e1
Compare
I see, the Fixes needs to be |
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @ldoktor !
image: alpine | ||
securityContext: | ||
privileged: true | ||
command: ["/bin/sh", "-c", "nsenter --target 1 --mount bash -xc \"for ENTRY in '/(.*/)?opt/kata/share/kata-.*(/.*)?(/.*)?' '/(.*/)?opt/kata/share/ovmf(/.*)?' '/(.*/)?opt/kata/share/tdvf(/.*)?' '/(.*/)?opt/kata/libexec(/.*)?'; do semanage fcontext -a -t qemu_exec_t \\\"\\$ENTRY\\\"; done; restorecon -v -R /opt/kata\"; echo NSENTER_FINISHED_WITH: $?; sleep infinity"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand the motivation behind each regexp. Also on a typical RHEL or RHCOS setup, only the QEMU binary gets the qemu_exec_t
label. Other files have different labels, e.g. virtiofsd
has bin_t
.
Can you clarify @ldoktor ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Greg, I used qemu_exec_t
as that one is defined in container-selinux
and contains the map
and execute
privileges, which is what we need for binaries and bios files. The bin_t
would do as well but might be too generic.
As for the /(.*/)?opt
part it's because selinux policy uses realpath
and /opt
is mounted from /var/opt
on azure so we need to allow this extra flexibility.
There might be few shortcuts as it should only serve the CI pipeline. Once we hammer-out the details the proper implementation should become part of kata-containers/kata-containers#8417 Anyway I'm listening so if you know how to do things properly, please let me know (I'm just a selinux user, not a policy writer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Greg, I used
qemu_exec_t
as that one is defined incontainer-selinux
and contains themap
andexecute
privileges, which is what we need for binaries and bios files. Thebin_t
would do as well but might be too generic.
Hi Lukas,
My point is that this CI is meant to test kata in an OCP environment. Why using different labels that the one you get on a real OCP setup ?
As for the
/(.*/)?opt
part it's because selinux policy usesrealpath
and/opt
is mounted from/var/opt
on azure so we need to allow this extra flexibility.
I'm not concerned by the /(.*/)?opt
part but by the rest of the regexps... how did you come up with them ?
There might be few shortcuts as it should only serve the CI pipeline. Once we hammer-out the details the proper implementation should become part of kata-containers/kata-containers#8417 Anyway I'm listening so if you know how to do things properly, please let me know (I'm just a selinux user, not a policy writer)
Same here... hence my questions ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the real OCP cluster we rely on system-wide installed tools (qemu) while in the CI pipeline we use the latest built versions that are deployed via kata-deploy
. The problem is that currently kata-deploy
does not cover the selinux
labels at all which results in selinux rejects.
For the rules I just collected the required privileges and found the "closest matching label" and, for consistency, enabled it on all executable/binary files delivered by the kata-deploy
script, although I only tested qemu. This was sufficient for e2e pipeline to succeed while keeping selinux enabled in the CI, which is a step forward.
Obviously after the kata-deploy
script is improved this CI-pipeline "hack" should be removed but for now it might serve as an incubator to see what all is needed in kata-deploy
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarifications @ldoktor.
hi @gkurz , we would like to get this PR merged so that it will be possible to test upstream kata on Openshift with SELinux turned on (current we turn it off). do you have a reason to nack this change? @beraldoleal mind to review this one too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except from @gkurz comments and two small nitpics, lgtm.
@@ -181,3 +181,11 @@ if [ ${SELINUX_PERMISSIVE} == "yes" ]; then | |||
# The new SELinux configuration will trigger another reboot. | |||
wait_for_reboot | |||
fi | |||
|
|||
# FIXME: Remove when https://github.com/kata-containers/kata-containers/pull/8417 is resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: My suggestion would be to convert this FIXME note into an Github issue, regardless if it will be solved soon or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to create an issue in the tests repo and to keep a FIXME mentioning this issue in the code for greater visibility. If this code is moved to the main repo before kata-containers/kata-containers#8417 is merged, the issue will need to be converted to a kata-containers issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
image: alpine | ||
securityContext: | ||
privileged: true | ||
command: ["/bin/sh", "-c", "nsenter --target 1 --mount bash -xc \"for ENTRY in '/(.*/)?opt/kata/share/kata-.*(/.*)?(/.*)?' '/(.*/)?opt/kata/share/ovmf(/.*)?' '/(.*/)?opt/kata/share/tdvf(/.*)?' '/(.*/)?opt/kata/libexec(/.*)?'; do semanage fcontext -a -t qemu_exec_t \\\"\\$ENTRY\\\"; done; restorecon -v -R /opt/kata\"; echo NSENTER_FINISHED_WITH: $?; sleep infinity"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: To improve readability here, if possible, it would be nice to break the lines. Something like:
command: ["/bin/sh", "-c", "
set -e; # Exit on any error
nsenter --target 1 --mount bash -xc '
for ENTRY in \
\"/(.*/)?opt/kata/share/kata-.*(/.*)?(/.*)?\" \
\"/(.*/)?opt/kata/share/ovmf(/.*)?\" \
\"/(.*/)?opt/kata/share/tdvf(/.*)?\" \
\"/(.*/)?opt/kata/libexec(/.*)?\";
do
semanage fcontext -a -t qemu_exec_t \"$ENTRY\" || { echo \"Error in semanage command\"; exit 1; }
done;
restorecon -v -R /opt/kata || { echo \"Error in restorecon command\"; exit 1; }
';
echo NSENTER_FINISHED_WITH: $?;
sleep infinity
"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, looks better and provides more output. I'll add a Starting the relabel
message at the beginning and test everything again and hopefully we can merge this before the weekend to get few runs to see the stability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oups, I missed the /opt/kata/bin
between the rebases... I'll include it as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @gkurz , we would like to get this PR merged so that it will be possible to test upstream kata on Openshift with SELinux turned on (current we turn it off). do you have a reason to nack this change?
Not anymore. I needed some clarifications and I'm fine with @ldoktor's answers 😃
@beraldoleal mind to review this one too?
\lgtm with @beraldoleal 's suggestions.
Thanks @ldoktor !
as kata-deploy does not currently handles selinux, this requires manual relabel of the /opt/kata folder where custom binaries are deployed. Fixes: kata-containers#5802 Signed-off-by: Lukáš Doktor <[email protected]>
Changes:
Tested on azure 4.14, worked well. |
/test |
as kata-deploy does not currently handles selinux, this requires manual relabel of the /opt/kata folder where custom binaries are deployed.
Fixes: #5802