Hi,

Resizes and cold migrations follow the same workflow, which goes like this:

  • the VM is unclustered
  • the instance files are moved to a temporary location (with the "_revert" suffix). These are the original files, which can be used to revert the migration/resize if needed
  • the original VM is deleted from the Hyper-V inventory (but not its files)
  • the instance files are then copied to their destination (which may or may not be on the same host/location)
  • image files (as well as other resources) are resized, if requested, and the instance is imported on the destination and clustered again
  • the backup files are removed when the migration is confirmed
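To make the file handling above more concrete, here is a minimal Python sketch of the move / copy / confirm / revert sequence. It is only a simplified model and not the driver's actual code: the function names are made up for illustration, and it leaves out everything related to Hyper-V, clustering and Nova.

```python
import shutil
import tempfile
from pathlib import Path


def start_migration(instance_dir: Path, dest_dir: Path) -> Path:
    """Move the original files aside, then copy them to the destination."""
    revert_dir = instance_dir.with_name(instance_dir.name + "_revert")
    # Keep the untouched originals under the "_revert" suffix.
    shutil.move(str(instance_dir), str(revert_dir))
    # The destination gets its own copy (it may be the same path as before).
    shutil.copytree(str(revert_dir), str(dest_dir))
    return revert_dir


def confirm_migration(revert_dir: Path) -> None:
    """Confirming the migration drops the backup copy."""
    shutil.rmtree(str(revert_dir))


def revert_migration(revert_dir: Path, instance_dir: Path, dest_dir: Path) -> None:
    """Reverting discards the destination copy and restores the originals."""
    shutil.rmtree(str(dest_dir), ignore_errors=True)
    shutil.move(str(revert_dir), str(instance_dir))


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    src = root / "instance-0001"  # hypothetical instance dir
    src.mkdir()
    (src / "disk.vhdx").write_text("data")
    backup = start_migration(src, root / "dest" / "instance-0001")
    print("backup kept at:", backup)
```

Until the confirm step runs, the "_revert" directory still holds the unaltered files, which is what makes a manual recovery possible.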

The main advantages of this workflow are that:

  • it allows recovering the unaltered instance files in case of resize/migration failures
  • it doesn't rely on Hyper-V assisted migrations or shared storage, simplifying the deployment and its requirements
  • the destination host pulls the files, so the source doesn't have to know where the instance will be placed, nor have access to the new location
  • the workflow is more or less the same whether the instance is resized on the same host or migrated without an actual resize

We're aware that this comes with a couple of disadvantages that may be addressed in the future:

  • the instance files are copied to a backup location even if shared storage is used and the instance is only migrated, not resized
  • the instance is removed from the failover cluster and Hyper-V inventories. This comes from the fact that we're not using the standard Hyper-V migration APIs.

To answer your questions: it's possible to recover the instance. The required actions depend on which step failed:

  • are the instance files already copied to their destination? If not (or if you'd like to use the original files), copy them from the '*_revert' dir.
  • import the VM into the Hyper-V inventory as well as the failover cluster. You may simply use Import-VM and Add-VMToCluster
  • update the Nova state (double check the VM state as well as the host set in the Nova DB)
  • are you using OVS? If so, you'd need to either re-attach the OVS ports manually or simply restart nova-compute after importing the VM to have Nova automatically reattach the port
  • double check your instance. If everything worked as expected, feel free to remove the backup files
  • for more recent releases that make use of the placement service, the placement allocations have to be checked as well, making sure that they are set against the right provider (host) and that the resource quantities are correct.
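Since the required actions depend on how far the migration got, the checklist above can be summed up as a small decision helper. This is just a sketch: the `InstanceState` fields and the step descriptions are made up for illustration, and you'd have to gather the actual state yourself (file system, Hyper-V, cluster, Nova DB).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceState:
    """What you found after the failed migration (hypothetical fields)."""
    files_at_destination: bool
    in_hyperv_inventory: bool
    clustered: bool
    nova_record_correct: bool
    uses_ovs: bool
    uses_placement: bool


def recovery_steps(state: InstanceState) -> List[str]:
    """Return the manual steps that still apply, in the order listed above."""
    steps = []
    if not state.files_at_destination:
        steps.append("copy the instance files from the '*_revert' dir")
    if not state.in_hyperv_inventory:
        steps.append("import the VM (Import-VM)")
    if not state.clustered:
        steps.append("add the VM to the failover cluster (Add-VMToCluster)")
    if not state.nova_record_correct:
        steps.append("fix the VM state and host in the Nova DB")
    if state.uses_ovs:
        steps.append("restart nova-compute or re-attach the OVS ports")
    if state.uses_placement:
        steps.append("check the placement allocations")
    # Always last: only drop the backup once the instance is verified.
    steps.append("verify the instance, then remove the backup files")
    return steps
```

For example, if the files were already copied and the VM is registered but not clustered, only the clustering, Nova and cleanup steps remain.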

The nova resize-revert command won't help much in this situation. While some docs state that it can be used to recover failed resizes, it doesn't accept instances that are in the ERROR state or instances that have been cold migrated but not resized (which may be considered a Nova limitation).

There are a couple of (ugly) workarounds if you don't want the instance to be removed from the Hyper-V/failover cluster inventories during cold migrations:

  • simply use live migration
  • use the failover cluster APIs directly (e.g. Move-ClusterGroup) to move the instance and let Nova handle it as an unexpected move/failover. It will automatically pick up the instance and do the plumbing on the destination host.

There aren't many migration-related config options. Here's the full list of Nova config options that are specific to the Hyper-V driver [1]. You may be interested in the live migration timeouts (these don't apply to cold migrations or resizes, although we may add such an option in the future) or the failover settings. Apart from those, there is an option that lets you force the instance files to use the same location after cold migrations/resizes (some people use that when having multiple CSVs). It's called "move_disks_on_cold_migration" and it's True by default.
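If you want to check what a given nova.conf ends up with for that option, something like the sketch below works. It assumes the option is spelled move_disks_on_cold_migration and lives in the [hyperv] section (please double check against [1] for your release). It uses Python's configparser only as a stand-in; Nova itself reads its config through oslo.config. The sample config text is made up.

```python
import configparser

# Made-up nova.conf fragment that overrides the default.
SAMPLE_NOVA_CONF = """
[hyperv]
# Assumed option name; see the option list in [1].
move_disks_on_cold_migration = False
"""


def moves_disks_on_cold_migration(conf_text: str) -> bool:
    """Return the effective value, falling back to the default of True."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean(
        "hyperv", "move_disks_on_cold_migration", fallback=True
    )


if __name__ == "__main__":
    print(moves_disks_on_cold_migration(SAMPLE_NOVA_CONF))
```

When the section or the option is missing, the helper returns True, matching the default mentioned above.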

Let me know if you have further questions. Feel free to post the actual error message; we may be able to help debug the actual cause of the migration failure.

Best regards,

Lucian Petrut

[1] https://compute-hyperv.readthedocs.io/en/latest/configuration/config.html#hyperv