Introduce privileged-mode #9017

Open: A1kmm wants to merge 2 commits into base: master

@A1kmm A1kmm commented Oct 24, 2024

The privileged-mode setting lets admins decide what level of privilege tasks running as privileged should have. This gives the ability to lock down privileged access to a level that isn't equivalent to full root on the host.

There are three proposed levels:

  • full: the status quo. This has multiple vectors to take over the host, including by loading modules into the kernel.
  • fuse-only: enough to work with containers using tools like buildah and podman if they are configured appropriately. As long as the Concourse worker is run in a user namespace on an up-to-date Linux kernel, this shouldn't be enough access to escape the container.
  • ignore: privileged tasks have the same access as normal tasks.
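As a rough illustration of the three levels, a worker launch wrapper could reject unknown values before exporting the setting. This is only a sketch: `validate_privileged_mode` is a hypothetical helper name, not something this PR adds, and the only name taken from the PR is the `CONCOURSE_CONTAINERD_PRIVILEGED_MODE` environment variable.

```shell
# Hypothetical helper (not part of this PR): accept only the three
# levels proposed above and reject anything else.
validate_privileged_mode() {
  case "$1" in
    full|fuse-only|ignore)
      return 0
      ;;
    *)
      echo "unknown privileged mode: '$1' (expected full, fuse-only or ignore)" >&2
      return 1
      ;;
  esac
}

# Example use in a wrapper script before starting the worker:
#   mode="fuse-only"
#   validate_privileged_mode "$mode" && export CONCOURSE_CONTAINERD_PRIVILEGED_MODE="$mode"
```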

To get podman and buildah working, a few more syscalls need to be allowed through seccomp. A few harmless ones have been added to the general allow list, while others related to mounting and unsharing are only added for fuse-only mode.
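When testing the seccomp changes, it can help to first confirm that a task really runs under a seccomp filter. The sketch below reads the `Seccomp` field from `/proc/self/status` (0 is disabled, 1 is strict mode, 2 is filter mode). It does not show which syscalls are allowed, and `seccomp_mode` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print the seccomp mode recorded in a
# /proc/<pid>/status style file (defaults to the current process).
seccomp_mode() {
  local status_file="${1:-/proc/self/status}"
  local code
  # Match the field name exactly so "Seccomp_filters:" is not picked up.
  code=$(awk '$1 == "Seccomp:" { print $2 }' "$status_file") || return 2
  case "$code" in
    0) echo "disabled" ;;
    1) echo "strict" ;;
    2) echo "filter" ;;
    *) echo "unknown" ;;
  esac
}

# Example use inside a task:
#   echo "seccomp: $(seccomp_mode)"
```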

Changes proposed by this PR

  • Implement privileged-mode
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: full, can create container with buildah and run with podman.
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: fuse-only, can create container with buildah and run with podman. The task has fewer capabilities than in full mode.
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: fuse-only, cannot escape container using cgroup release_agent (note: escape is still possible if the worker is not run in a new user namespace; setting release_agent fails when the worker runs in a new user namespace).
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: ignore, cannot create container with buildah and run with podman, as expected.
  • Write automated tests for functionality.
  • Convert from draft PR to normal PR.
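Because the fuse-only result depends on whether the worker runs in a new user namespace, a quick check from inside the worker or a task can be useful during manual testing. The sketch below is a heuristic, not part of this PR: it treats the identity mapping `0 0 4294967295` in `/proc/self/uid_map` as a sign of the initial user namespace. `in_initial_userns` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) when the uid_map looks like the
# identity mapping of the initial user namespace, fail (exit 1) otherwise.
# Exit 2 means the map could not be read.
in_initial_userns() {
  local map_file="${1:-/proc/self/uid_map}"
  local inside outside count
  # Only the first line is inspected; the initial namespace has one line.
  read -r inside outside count < "$map_file" || return 2
  [ "$inside" = "0" ] && [ "$outside" = "0" ] && [ "$count" = "4294967295" ]
}

# Example use:
#   if in_initial_userns; then
#     echo "warning: not running in a new user namespace" >&2
#   fi
```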

Notes to reviewer

This pipeline is helpful for manual testing:

jobs:
  - name: build-container
    public: false
    plan:
    - task: build
      privileged: true
      config:
        platform: linux
        image_resource:
          type: registry-image
          source:
            repository: quay.io/buildah/stable
        run:
          path: /bin/bash
          args:
            - "-c"
            - |
              capsh --print &&\
              yum -y install podman &&\
              mkdir container-storage &&\
              ls -l /dev/fuse /usr/bin/fuse-overlayfs $(pwd) $(pwd)/container-storage &&\
              PODMAN_ROOT=$(pwd)/container-storage &&\
              echo FROM mirror.gcr.io/alpine:latest >Dockerfile &&\
              echo CMD echo Hello World >>Dockerfile &&\
              buildah bud --root=$PODMAN_ROOT -t helloworld &&\
              echo "[containers]" >/etc/containers/containers.conf &&\
              echo "keyring = false" >>/etc/containers/containers.conf &&\
              podman run --rm --uts=host --network=host --userns=host --root=$PODMAN_ROOT --cgroups=disabled -it helloworld
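The `capsh --print` step in the pipeline above prints the task's capabilities for inspection by eye. For a scripted comparison between modes, a sketch like the following could test a single capability bit in the `CapEff` mask from `/proc/self/status`. `has_cap` is a hypothetical helper, and which capabilities each mode actually grants is not stated here; the bit numbers used in the example (16 for CAP_SYS_MODULE, 21 for CAP_SYS_ADMIN) are the standard Linux values.

```shell
# Hypothetical helper: succeed when capability bit $1 is set in the
# effective capability mask of a /proc/<pid>/status style file.
# Exit 2 means the CapEff field could not be read.
has_cap() {
  local bit="$1"
  local status_file="${2:-/proc/self/status}"
  local hex
  hex=$(awk '$1 == "CapEff:" { print $2 }' "$status_file") || return 2
  [ -n "$hex" ] || return 2
  # CapEff is a hexadecimal bitmask; shift the wanted bit into position 0.
  [ "$(( (0x$hex >> bit) & 1 ))" -eq 1 ]
}

# Example use inside a task:
#   has_cap 16 && echo "CAP_SYS_MODULE present (kernel modules can be loaded)"
#   has_cap 21 && echo "CAP_SYS_ADMIN present"
```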

Release Note

  • Added a new --privileged-mode option to the worker, which accepts full (the default, matching previous behaviour), fuse-only (privileged: true tasks can use tools like buildah and podman, but can't escape if user namespaces are used to run the worker), or ignore (privileged: true tasks have no extra access compared to privileged: false tasks).

A1kmm added 2 commits October 25, 2024 20:55
Signed-off-by: Andrew Miller <[email protected]>
@A1kmm A1kmm marked this pull request as ready for review October 25, 2024 09:56
@A1kmm A1kmm requested a review from a team as a code owner October 25, 2024 09:56
@taylorsilva taylorsilva added this to the v7.13.0 milestone Dec 4, 2024