Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement: add password to configdrive vendor_data.json #10061

Open
wants to merge 1 commit into
base: 4.20
Choose a base branch
from

Conversation

phsm
Copy link
Contributor

@phsm phsm commented Dec 8, 2024

Description

This PR adds the VM password to the vendor_data.json generated when ConfigDrive is used.
This way the ConfigDrive can be used as a full substitute for Virtual Router-based metadata servers, and it works with the ConfigDrive datasource of cloud-init. The existing password file cloudstack/password/vm-password.txt is left for compatibility.

Fixes: #9404

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Downloaded a standard Ubuntu qcow2 image, that already has the cloud-init with ConfigDrive datasource in it: https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
The network configuration, the hostname, ssh key work, as well as the password for the user "ubuntu" is set to the Cloudstack-generated one.

How did you try to break this feature and the system with this change?

I don't see how it can be broken, I specifically made sure the writeVendorEmptyJsonFile() is only called with the password when it was found among the vmData fields. However, if you have suggestions how to test it, then I can perform those tests.

@weizhouapache
Copy link
Member

Good
This should fix #9404

@weizhouapache
Copy link
Member

@blueorangutan package

@blueorangutan
Copy link

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Copy link

codecov bot commented Dec 8, 2024

Codecov Report

Attention: Patch coverage is 15.78947% with 16 lines in your changes missing coverage. Please review.

Project coverage is 16.13%. Comparing base (e57a82a) to head (2634694).
Report is 14 commits behind head on 4.20.

Files with missing lines Patch % Lines
...dstack/storage/configdrive/ConfigDriveBuilder.java 15.78% 14 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #10061      +/-   ##
============================================
+ Coverage     16.03%   16.13%   +0.10%     
- Complexity    12814    12988     +174     
============================================
  Files          5637     5637              
  Lines        493506   497550    +4044     
  Branches      59831    61561    +1730     
============================================
+ Hits          79129    80303    +1174     
- Misses       405601   408349    +2748     
- Partials       8776     8898     +122     
Flag Coverage Δ
uitests 4.10% <ø> (+0.07%) ⬆️
unittests 16.97% <15.78%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@phsm phsm changed the title ehancement: add password to configdrive vendor_data.json enhancement: add password to configdrive vendor_data.json Dec 8, 2024
@blueorangutan
Copy link

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11752

@weizhouapache
Copy link
Member

@blueorangutan test

@blueorangutan
Copy link

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@weizhouapache
Copy link
Member

@phsm
have you tested the scenario below ?

  • create vm with password in config drive
  • when vm is started, it gets the userdata from config drive ISO and change the password
  • reset the root password again inside the vm
  • reboot the vm (not stop/start), will the vm password be reset to the password in config drive ISO ?

@phsm
Copy link
Contributor Author

phsm commented Dec 9, 2024

@phsm have you tested the scenario below ?

* create vm with password in config drive

* when vm is started, it gets the userdata from config drive ISO and change the password

* reset the root password again inside the vm

* reboot the vm (not stop/start), will the vm password be reset to the password in config drive ISO ?

Hi Wei,

I tested what you've asked and a bit more on the top of it.

Test 1: change the password, and rebooting the VM

  1. change the password from within the Virtual Machine.
  2. reboot the virtual machine.
  3. The new password works. So it was not reverted back

Test 2: Stop and start the VM after the test 1

  1. After the password was changed in the test 1, stop the virtual machine.
  2. Start the virtual machine.
  3. Try logging in using the new password that was manually set in the Test 1. It worked.

Test 3: Change password from Cloudstack UI/API

  1. Reset the password for the VM from Cloudstack.
  2. Try to login using the new password. Does not work. Therefore, the password reset does not happen.
  3. Try to login using the password that was manually set in the test 1. Works.

A bit more details about the Test 3:
The configdrive ISO was actually updated with the new password. However, the default behavior for the set_password cloud-init module is that it only runs once per instance, i.e. only during the first boot.
I tried changing the frequency for the cloud-init set_password module to always, but cloud-init just uses the password cached deep in the /var/lib/cloud directory. Conclusion: not working password reset is not Cloudstack's fault, but cloud-init's.

@weizhouapache
Copy link
Member

@phsm have you tested the scenario below ?

* create vm with password in config drive

* when vm is started, it gets the userdata from config drive ISO and change the password

* reset the root password again inside the vm

* reboot the vm (not stop/start), will the vm password be reset to the password in config drive ISO ?

Hi Wei,

I tested what you've asked and a bit more on the top of it.

Test 1: change the password, and rebooting the VM

  1. change the password from within the Virtual Machine.
  2. reboot the virtual machine.
  3. The new password works. So it was not reverted back

Test 2: Stop and start the VM after the test 1

  1. After the password was changed in the test 1, stop the virtual machine.
  2. Start the virtual machine.
  3. Try logging in using the new password that was manually set in the Test 1. It worked.

Test 3: Change password from Cloudstack UI/API

  1. Reset the password for the VM from Cloudstack.
  2. Try to login using the new password. Does not work. Therefore, the password reset does not happen.
  3. Try to login using the password that was manually set in the test 1. Works.

A bit more details about the Test 3: The configdrive ISO was actually updated with the new password. However, the default behavior for the set_password cloud-init module is that it only runs once per instance, i.e. only during the first boot. I tried changing the frequency for the cloud-init set_password module to always, but cloud-init just uses the password cached deep in the /var/lib/cloud directory. Conclusion: not working password reset is not Cloudstack's fault, but cloud-init's.

thanks @phsm for the testing

Test1 and Test2 look fine.
For Test3, yes, it is a cloud-init issue. You can try with cloud-init clean --machine-id -s, stop the VM, reset password and then start the VM.

@blueorangutan
Copy link

[SF] Trillian test result (tid-11866)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 53288 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10061-t11866-kvm-ol8.zip
Smoke tests completed. 141 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@DaanHoogland
Copy link
Contributor

@phsm , does this apply to 4.19 as well? If so, can you rebase onto the 4.19 branch?

@phsm
Copy link
Contributor Author

phsm commented Dec 11, 2024

@DaanHoogland I can try. How shall I do it: as a separate PR to the 4.19 branch?

@DaanHoogland
Copy link
Contributor

@DaanHoogland I can try. How shall I do it: as a separate PR to the 4.19 branch?

no, you can rebase your current branch. It is not good practice to name your branch main, but it doesn't hurt and you can still base it on 4.19. We do not need separate PRs as we usually merge older fixes forward to newer branches.

@phsm
Copy link
Contributor Author

phsm commented Dec 12, 2024

I checked the state of 4.19 branch: this code can't be easily merged into it, as it depends on the changes from PR #9329 that is about network configuration support for the config drive.

Of course, it can be ported to the code at its v4.19 state (its just injecting the right text into a file), but having a working password while the network is not working seems to not add any value.

Thus, perhaps its better to rebase it to 4.20 branch?

@weizhouapache
Copy link
Member

I checked the state of 4.19 branch: this code can't be easily merged into it, as it depends on the changes from PR #9329 that is about network configuration support for the config drive.

Of course, it can be ported to the code at its v4.19 state (its just injecting the right text into a file), but having a working password while the network is not working seems to not add any value.

Thus, perhaps its better to rebase it to 4.20 branch?

agree @phsm
4.20 is better

@phsm
Copy link
Contributor Author

phsm commented Dec 12, 2024

Done, rebased to 4.20.

@weizhouapache
Copy link
Member

Done, rebased to 4.20.

thanks @phsm

@blueorangutan package

@blueorangutan
Copy link

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@DaanHoogland DaanHoogland added this to the 4.20.1 milestone Dec 12, 2024
@blueorangutan
Copy link

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11783

@weizhouapache
Copy link
Member

@blueorangutan test

@blueorangutan
Copy link

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan
Copy link

[SF] Trillian test result (tid-11887)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 50870 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10061-t11887-kvm-ol8.zip
Smoke tests completed. 141 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Copy link
Contributor

@DaanHoogland DaanHoogland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clgtm

@weizhouapache
Copy link
Member

I know some users use the cloudstack script in the vm template
https://github.com/apache/cloudstack/blob/main/setup/bindir/cloud-set-guest-sshkey-password-userdata-configdrive.in

at some points I think the script may be better

  • the username can be changed in the script (by default it uses root)
  • the script gets the password every time the vm starts. It fixes the Test case 3 in @phsm 's testing

cc @phsm @DaanHoogland

@phsm
Copy link
Contributor Author

phsm commented Dec 13, 2024

I specifically left the old functionality, so the users can choose whether to use the script you mentioned, or the cloud-init's native functionality. So this code will not break the script functionality.

Regarding the "the username can be changed in the script (by default it uses root)": cloud-init supports custom usernames too, it is a separate parameter in the cloud.cfg that is set to "ubuntu" in Ubuntu cloud image.

I think the main issue with the cloud-init set_password module is that it was meant to set the initial password for the VM, hence the default "force changing the password on the first login" behavior.
If the need arises, the PR to cloud-init can be made for it to compare the cached password with the password in the config drive ISO, and change user's password if needed.

@DaanHoogland
Copy link
Contributor

@phsm your PR is against branch 4.20 but seems to contain main. Can you have a look?

@DaanHoogland
Copy link
Contributor

@phsm your PR is against branch 4.20 but seems to contain main. Can you have a look?

It seems your own code is actually not in here anymore.

@phsm
Copy link
Contributor Author

phsm commented Dec 20, 2024

Apologies,
I wanted to synchronize my fork with the main branch, and forgot that I had an unmerged PR. I've reverted it.

Next time I will create a different branch for my PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cloud-init not running when guest network use UsedData: ConfigDrive
4 participants