Migrating to Cloud and Running Hybrid: Part 3 - Guest OS & Replication

Sat, Oct 12, 2024

If the environment being moved to a new platform is not VMware-based, or if vVols are not an option for some reason, then we move to the next layer down and look at performing data migrations from within the operating system. This is done by enabling and configuring iSCSI within Windows or Linux, creating a host object on the FlashArray with the IQN of the iSCSI initiator, and then mapping a volume to this new host object. Once the device is visible within the operating system, the raw device should be formatted with the appropriate file system for the intended usage, and this newly formatted device can then be used for data migration. At this point, we need to discuss a few options and considerations, which differ between Windows and Linux operating systems. Always have proper planning and backups in place prior to data conversions or migrations.
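As a rough illustration of the Linux side of this step, the sketch below pulls the initiator IQN out of the text of an initiatorname.iscsi file (commonly found at /etc/iscsi/initiatorname.iscsi with open-iscsi) and builds the iscsiadm discovery and login commands. The function names and the portal address are hypothetical, and nothing is executed; the commands are only assembled so they can be reviewed first.

```python
import re
from typing import List


def parse_initiator_iqn(text: str) -> str:
    """Return the IQN from the contents of an initiatorname.iscsi file."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip comment lines
        match = re.match(r"InitiatorName\s*=\s*(\S+)", line)
        if match:
            return match.group(1)
    raise ValueError("no InitiatorName entry found")


def iscsi_login_commands(portal: str) -> List[List[str]]:
    """Build (but do not run) the discovery and login commands for one portal."""
    return [
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        ["iscsiadm", "-m", "node", "-p", portal, "--login"],
    ]


if __name__ == "__main__":
    # Illustrative file contents; a real system has its own IQN.
    sample = "## example file\nInitiatorName=iqn.1994-05.com.redhat:abc123\n"
    print(parse_initiator_iqn(sample))
    # 192.0.2.10 is a documentation-only address, used here as a placeholder.
    for cmd in iscsi_login_commands("192.0.2.10:3260"):
        print(" ".join(cmd))
```

The IQN returned here is the value that would be entered when creating the host object on the array.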

Within Windows, copying data is most likely going to be performed with robocopy, which can copy large filesystems with options for recursion, permissions, and selection of specific properties to copy from source to destination. If you have had to clone or migrate any significant amount of data before, you are most likely no stranger to robocopy. If you are new to Windows migrations, understanding all the options robocopy offers for bulk copy operations in the console takes some practice.
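To make the robocopy options a little more concrete, here is a minimal sketch that assembles one plausible command line for a bulk copy. The helper names, paths, and the particular retry and thread values are assumptions chosen for illustration, not a recommended configuration; the switches themselves (/E, /COPYALL, /R, /W, /MT, /LOG) are standard robocopy options.

```python
from typing import List


def robocopy_command(source: str, destination: str, log_file: str,
                     threads: int = 8) -> List[str]:
    """Build (but do not run) a robocopy command for a recursive copy."""
    if not 1 <= threads <= 128:
        raise ValueError("robocopy /MT accepts 1 to 128 threads")
    return [
        "robocopy", source, destination,
        "/E",               # recurse into subdirectories, including empty ones
        "/COPYALL",         # copy data, attributes, timestamps, ACLs, owner, auditing info
        "/R:2",             # retry a failed file twice (assumed value)
        "/W:5",             # wait 5 seconds between retries (assumed value)
        f"/MT:{threads}",   # multithreaded copy
        f"/LOG:{log_file}", # write output to a log file
    ]


def robocopy_succeeded(exit_code: int) -> bool:
    """Robocopy exit codes below 8 mean no files failed to copy."""
    return 0 <= exit_code < 8


if __name__ == "__main__":
    # Hypothetical paths: D: is the existing data disk, E: the new iSCSI volume.
    print(" ".join(robocopy_command(r"D:\Data", r"E:\Data", r"C:\Temp\copy.log")))
```

Because robocopy returns nonzero exit codes even on a successful copy, a check like robocopy_succeeded is useful when the command is wrapped in a script.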

Sometimes people look for simpler GUI options, trading some control for ease of use. A decent number of open source tools have emerged (such as EasyCopy), along with tools offering free and paid versions that have great functionality (such as TeraCopy).

Beyond the data copying itself, you should also understand that this method of cloning data on Windows can be applied either to entire disks or just to directories. This is especially useful if you want to move data to external storage without changing the current layout of the file structure on your disk, or if you need to move data that sits on your OS disk without introducing a new secondary drive letter. This relies on the ability of Windows to mount a drive as a folder, which lets you create an empty folder on an NTFS volume and mount a drive at that folder path.
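One way to mount a volume at a folder path from the console is the built-in mountvol command, which takes the folder and a volume name of the form \\?\Volume{GUID}\. The sketch below only assembles that command; the helper name, folder, and GUID are hypothetical placeholders, and the real volume name would come from running mountvol with no arguments on the system in question.

```python
from typing import List


def mountvol_command(folder: str, volume_name: str) -> List[str]:
    """Build (but do not run) a mountvol command that mounts a volume at a folder."""
    # Volume names look like \\?\Volume{GUID}\ including the trailing backslash.
    if not (volume_name.startswith("\\\\?\\Volume{") and volume_name.endswith("}\\")):
        raise ValueError("expected a volume name like \\\\?\\Volume{GUID}\\")
    return ["mountvol", folder, volume_name]


if __name__ == "__main__":
    # Placeholder GUID for illustration only.
    volume = r"\\?\Volume{01234567-89ab-cdef-0123-456789abcdef}" + "\\"
    print(" ".join(mountvol_command(r"C:\AppData\Files", volume)))
```

The target folder needs to exist and be empty before the volume is mounted there.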

Now, the reality of moving data within Windows to a new device is that you will probably face some downtime. Depending on your applications and how the system is serving data, this can involve stopping and restarting applications or services, or a full system reboot.

When we look at Linux, things might be easier depending on how the existing system is configured. If logical volumes are in use, the Logical Volume Manager (LVM) has functionality and commands to handle creating volume groups, creating mirrored logical volumes, and adding disks to or removing them from a mirror. Before you begin, confirm your multipathing configuration and any aliases in use on your system.

Essentially the rough steps for this process are:

Use pvcreate to create a new physical volume (PV) from the new block device.
Use vgextend to add the new PV to the existing volume group (VG).
Use lvconvert -m 1 to add the new disk as a mirror to the existing logical volume (LV).
Wait for the mirror synchronization to complete.
Confirm that the disks are working as mirrors.
Use lvconvert -m 0, naming the original PV, to remove the original disk from the mirror within the LV.
Use vgreduce to remove the original PV from the VG.
Use pvremove to remove the original block device as a PV.
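
The steps above can be sketched as an ordered command plan. This is a minimal illustration under stated assumptions: the function name, the volume group and logical volume names, and the multipath device paths are all hypothetical, and the commands are only printed, not run. The lvs call stands in for both the wait and the confirm steps, since it can be repeated until the copy percentage reaches 100 and both devices appear under the mirror.

```python
import shlex
from typing import List


def lvm_mirror_migration_plan(vg: str, lv: str,
                              new_dev: str, old_dev: str) -> List[List[str]]:
    """Return the LVM commands, in order, for moving an LV to a new device via a mirror."""
    target = f"{vg}/{lv}"
    return [
        ["pvcreate", new_dev],                       # new block device becomes a PV
        ["vgextend", vg, new_dev],                   # add the new PV to the existing VG
        ["lvconvert", "-m", "1", target, new_dev],   # add a mirror copy on the new PV
        # Repeat until the copy percentage shows 100 and both devices are listed.
        ["lvs", "-a", "-o", "name,copy_percent,devices", vg],
        ["lvconvert", "-m", "0", target, old_dev],   # drop the copy on the original PV
        ["vgreduce", vg, old_dev],                   # remove the original PV from the VG
        ["pvremove", old_dev],                       # wipe the PV label from the old device
    ]


if __name__ == "__main__":
    # Hypothetical names: adjust to the multipath aliases on the actual system.
    plan = lvm_mirror_migration_plan(
        "vg_data", "lv_app", "/dev/mapper/mpathb", "/dev/mapper/mpatha"
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

Printing the plan first makes it easy to double-check which device is being removed before anything destructive is run.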

In the end, we are just trying to get your data disentangled from your operating system so that it can be handled in an easier fashion: through replication. When we discuss migration to the cloud, potentially back to on-prem, or a hybrid model, we are focusing on asynchronous replication.

The good and the bad of discussing replication in regard to Pure Storage is that the core functionality and initial setup are ridiculously easy. We can demonstrate setting up replication between two arrays within a few minutes. For any two FlashArrays, or a FlashArray and a Cloud Block Storage instance, we need network connectivity between the arrays; then we simply copy the connection key and the management and replication addresses of the source array, and enter these details into the destination array.

Once our arrays are connected, we create a protection group (pgroup) on our source array and add the replication target of our second array. We specify the snapshot and replication schedules that meet our needs. Once we add our volumes as members, our replication will begin between our arrays based on the schedules.

OK, that is a lot of discussion for the high-level overview of how we can separate our data and replicate it to and from the cloud. In the next few blog posts, we will take a look at how we migrate our virtual machines themselves to AWS or Azure.