Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DEV] Add grouplevel key-value pairs to the metadata for use on the node #17

Closed
alexlovelltroy opened this issue Oct 22, 2024 · 6 comments · Fixed by #19
Closed

[DEV] Add grouplevel key-value pairs to the metadata for use on the node #17

alexlovelltroy opened this issue Oct 22, 2024 · 6 comments · Fixed by #19
Assignees

Comments

@alexlovelltroy
Copy link
Member

Short Description
SMD can provide the cloud-init server with a set of group names that an individual node is a part of. It cannot provide the key/value pairs that are needed by the node at boot time. We need to store key value pairs in the cloud-init server that are then added to the payload.

For example, we might want a different remote syslog aggregator for each rack of nodes. On the SMD side, the admin can create a group that represents that rack, possibly even by xname, (x3000) and when cloud-init retrieves the list of groups per node, it will be included.

"meta-data": {
  "groups": ["x3000",],
}

If the admin then also adds a key/value pair to the cloud-init server named for the group, that can be included in the client payload as well.

"meta-data": {
  "groups": [
      { "x3000": {
       {"syslog_aggregator": "192.168.0.1"},
     }}]
}

The mechanics of adding key/value pairs for a group should be via POST with standard CRUD operations.

/cloud-init/groups/ PUT --> {key: value, key: value}
/cloud-init/groups POST --> {key:value, key:value}

No need to query SMD in order to add/update. This means that admins can put typo groups into cloud-init and nothing will stop them.

require jwt for changes and log operations with information about who made the changes through JWT interrogation

@alexlovelltroy
Copy link
Member Author

The current merge logic that allows any group to specify any portion of the cloud-init payload may be problematic for admins who make a change to one group, expecting it to manifest, but then a merge removes their changes.

A concrete usecase that we need to support is a default cloud-init for all nodes in the compute group with additional information provided because the node is part of the x3000 group representing the rack, as well as a canary_123 group which defines a set of nodes to test rolling out a new image. All three of these may provide information about the rsyslog aggregator. Which one actually gets written to the filesystem and how? How should configurator be involved?

The current logic allows an admin to target a node directly with a singleton cloud-init payload which will supersede all group related cloud-init information. We should review this from an admin perspective to identify desired behavior when a custom cloud-init for a particular node is no longer useful.

Another thought for consideration is how to handle things like writefiles and runcmd. Are they purely additive? Should one group be able to override a different group?

@davidallendj
Copy link
Contributor

davidallendj commented Oct 23, 2024

I was able to get achieve something similar just using the /cloud-init endpoint running locally with curl http://127.0.0.1:27777/cloud-init -d @data.json. The data.json looks like the following:

{
  "name": "IDENTIFIER",
  "cloud-init": {
    "userdata": {
      "write_files": [
        {
          "content": "hello world",
          "path": "/etc/hello"
        }
      ]
    },
    "metadata": {
      "groups": [
        {
          "x3000": {
            "syslog_aggregator": "192.168.0.1"
          }
        }
      ]
    }
  }
}

Then, doing curl 'http://127.0.0.1:27777/cloud-init/IDENTIFIER/meta-data' gives me this:

groups:
- x3000:
    syslog_aggregator: 192.168.0.1

Is this already doing what is expected? I think we'd still want to the following:

  • address the merge behavior
  • remove the SMD query
  • add logging
  • add a PUT to update (there's already an endpoint for this with /{id})

Other than that, it seems to me that this is already doing the above.

@travisbcotton
Copy link
Collaborator

Why are we removing the SMD query again?

@davidallendj
Copy link
Contributor

Why are we removing the SMD query again?

I think the idea here is that since group names can be any arbitrary thing without any validation from SMD, there's no need to do a query.

@travisbcotton
Copy link
Collaborator

The query finds the SMD groups the requesting node is a member of. I don't know how you are going to provide cloud-init data to groups of nodes without making that query

@alexlovelltroy
Copy link
Member Author

when generating the payload to send to the client, the cloud-init server needs to call SMD to find out what groups the node is a part of.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants