Objective 3.3 – Configure vSphere Storage Multipathing and Failover
Continuing the storage objective, here is Objective 3.3 – Configure vSphere Storage Multipathing and Failover, all sorted as per the VCP6.5-DCV certification blueprint.
Happy Revision
Simon
Objective 3.3 – Configure vSphere Storage Multipathing and Failover
Explain common multi-pathing components
To maintain a constant connection between a host and its storage, ESXi supports multipathing. Multipathing is a technique that lets you use more than one physical path that transfers data between the host and an external storage device.
In case of a failure of any element in the SAN network, such as an adapter, switch, or cable, ESXi can switch to another physical path, which does not use the failed component. This process of path switching to avoid failed components is known as path failover.
In addition to path failover, multipathing provides load balancing. Load balancing is the process of distributing I/O loads across multiple physical paths. Load balancing reduces or removes potential bottlenecks.
Failover with Fibre Channel
To support multipathing, your host typically has two or more HBAs available. This configuration supplements the SAN multipathing configuration that generally provides one or more switches in the SAN fabric and one or more storage processors on the storage array device itself.
Host-Based Failover with iSCSI
When setting up your ESXi host for multipathing and failover, you can use multiple iSCSI HBAs or multiple NICs depending on the type of iSCSI adapters on your host.
When you use multipathing, specific considerations apply.
- ESXi does not support multipathing when you combine an independent hardware adapter with software iSCSI or dependent iSCSI adapters in the same host.
- Multipathing between software and dependent adapters within the same host is supported.
- On different hosts, you can mix both dependent and independent adapters.
Failover with Hardware iSCSI
With hardware iSCSI, the host typically has two or more hardware iSCSI adapters available, from which the storage system can be reached using one or more switches. Alternatively, the setup might include one adapter and two storage processors so that the adapter can use a different path to reach the storage system.
On the Host-Based Path Failover illustration, Host1 has two hardware iSCSI adapters, HBA1 and HBA2, that provide two physical paths to the storage system. Multipathing plug-ins on your host, whether the VMkernel NMP or any third-party MPPs, have access to the paths by default and can monitor the health of each physical path. If, for example, HBA1 or the link between HBA1 and the network fails, the multipathing plug-ins can switch the path over to HBA2.
Failover with Software iSCSI
With software iSCSI, as shown on Host 2 of the Host-Based Path Failover illustration, you can use multiple NICs that provide failover and load balancing capabilities for iSCSI connections between your host and storage systems.
For this setup, because multipathing plug-ins do not have direct access to physical NICs on your host, you first need to connect each physical NIC to a separate VMkernel port. You then associate all VMkernel ports with the software iSCSI initiator using a port binding technique. As a result, each VMkernel port connected to a separate NIC becomes a different path that the iSCSI storage stack and its storage-aware multipathing plug-ins can use.
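As a minimal sketch of that port binding step (the adapter and VMkernel port names below are examples only), the bindings can also be created with esxcli:
esxcli iscsi networkportal add --adapter=vmhba65 --nic=vmk1
esxcli iscsi networkportal add --adapter=vmhba65 --nic=vmk2
Each bound VMkernel port then appears as a separate path under the software iSCSI adapter.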
Differentiate APD and PDL states
Permanent Device Loss (PDL):
- A datastore is shown as unavailable in the Storage view.
- A storage adapter indicates the Operational State of the device as Lost Communication.
All-Paths-Down (APD):
- A datastore is shown as unavailable in the Storage view.
- A storage adapter indicates the Operational State of the device as Dead or Error.
If PDL SCSI sense codes are not returned from a device (when unable to contact the storage array, or with a storage array that does not return the supported PDL SCSI codes), then the device is in an All-Paths-Down (APD) state, and the ESXi host continues to send I/O requests until the host receives a response.
As the ESXi host is not able to determine if the device loss is permanent (PDL) or transient (APD), it indefinitely retries SCSI I/O, including:
- Userworld I/O (hostd management agent)
- Virtual machine guest I/O
Due to the nature of an APD situation, there is no clean way to recover.
- The APD situation needs to be resolved at the storage array/fabric layer to restore connectivity to the host.
- All affected ESXi hosts may require a reboot to remove any residual references to the affected devices that are in an APD state.
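How a host handles APD is governed by advanced settings; assuming the standard Misc.APDHandlingEnable and Misc.APDTimeout options, they can be inspected with:
esxcli system settings advanced list -o /Misc/APDHandlingEnable
esxcli system settings advanced list -o /Misc/APDTimeout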
Compare and contrast Active Optimized vs. Active non-Optimized port group states
An active non-optimized path is a path to the LUN through the storage processor that does not own the LUN. This situation occurs on arrays that are asymmetric active-active (ALUA).
- Optimized paths are the paths to the Storage Processor which owns the LUN.
- Unoptimized paths are the paths to the Storage Processor which does not own the LUN; that processor reaches the LUN via an interconnect between the processors.
Explain features of Pluggable Storage Architecture (PSA)
vSphere APIs for Multipathing
Known as the Pluggable Storage Architecture (PSA), these APIs allow storage partners to create and deliver multipathing and load-balancing plug-ins that are optimized for each array. Plug-ins communicate with storage arrays and determine the best path selection strategy to increase I/O performance and reliability from the ESXi host to the storage array.
Managing Multiple Paths
To manage storage multipathing, ESXi uses a collection of Storage APIs, also called the Pluggable Storage Architecture (PSA). The PSA is an open, modular framework that coordinates the simultaneous operation of multiple multipathing plug-ins (MPPs). The PSA allows third-party software developers to design their own load-balancing techniques and failover mechanisms for a particular storage array, and to insert their code directly into the ESXi storage I/O path.
The VMkernel multipathing plug-in that ESXi provides by default is the VMware Native Multipathing PlugIn (NMP). The NMP is an extensible module that manages sub plug-ins. There are two types of NMP sub plug-ins, Storage Array Type Plug-Ins (SATPs), and Path Selection Plug-Ins (PSPs). SATPs and PSPs can be built-in and provided by VMware, or can be provided by a third party.
If more multipathing functionality is required, a third party can also provide an MPP to run in addition to, or as a replacement for, the default NMP.
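To see which plug-ins a given host actually has, the MPPs and the NMP sub plug-ins can be listed with esxcli:
esxcli storage core plugin list --plugin-class=MP
esxcli storage nmp satp list
esxcli storage nmp psp list
The first command lists the multipathing plug-ins (the NMP plus any third-party MPPs); the other two list the available SATPs and PSPs.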
When coordinating the VMware NMP and any installed third-party MPPs, the PSA performs the following tasks:
- Loads and unloads multipathing plug-ins.
- Hides virtual machine specifics from a particular plug-in.
- Routes I/O requests for a specific logical device to the MPP managing that device.
- Handles I/O queueing to the logical devices.
- Implements logical device bandwidth sharing between virtual machines.
- Handles I/O queueing to the physical storage HBAs.
- Handles physical path discovery and removal.
- Provides logical device and physical path I/O statistics.
Multiple third-party MPPs can run in parallel with the VMware NMP. When installed, the third-party MPPs replace the behavior of the NMP and take complete control of the path failover and the load-balancing operations for specified storage devices.
The multipathing modules perform the following operations:
- Manage physical path claiming and unclaiming.
- Manage creation, registration, and deregistration of logical devices.
- Associate physical paths with logical devices.
- Support path failure detection and remediation.
- Process I/O requests to logical devices, selecting an optimal physical path for the request.
- Depending on a storage device, perform specific actions necessary to handle path failures and I/O command retries.
- Support management tasks, such as reset of logical devices.
Understand the effects of a given claim rule on multipathing and failover
When you start your ESXi host or rescan your storage adapter, the host discovers all physical paths to storage devices available to the host. Based on a set of claim rules, the host determines which multipathing plug-in (MPP) should claim the paths to a particular device and become responsible for managing the multipathing support for the device.
By default, the host performs a periodic path evaluation every 5 minutes, causing any unclaimed paths to be claimed by the appropriate MPP.
The claim rules are numbered. For each physical path, the host runs through the claim rules starting with the lowest number first. The attributes of the physical path are compared to the path specification in the claim rule. If there is a match, the host assigns the MPP specified in the claim rule to manage the physical path. This continues until all physical paths are claimed by corresponding MPPs, either third-party multipathing plug-ins or the native multipathing plug-in (NMP).
For the paths managed by the NMP module, a second set of claim rules is applied. These rules determine which Storage Array Type Plug-In (SATP) should be used to manage the paths for a specific array type, and which Path Selection Plug-In (PSP) is to be used for each storage device.
Use the vSphere Web Client to view which SATP and PSP the host is using for a specific storage device and the status of all available paths for this storage device. If needed, you can change the default VMware PSP using the client. To change the default SATP, you need to modify claim rules using the vSphere CLI.
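The same per-device detail is also available from the CLI; the following lists the SATP and PSP in use for every device claimed by the NMP (add -d with a device identifier to narrow it to one device):
esxcli storage nmp device list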
Claim Rule Options
These options belong to the esxcli storage core claimrule add command (the -s|--satp option comes from esxcli storage nmp satp rule add):
-A|--adapter=<str>
Indicate the adapter of the paths to use in this operation.
-u|--autoassign
The system will auto assign a rule ID.
-C|--channel=<long>
Indicate the channel of the paths to use in this operation.
-c|--claimrule-class=<str>
Indicate the claim rule class to use in this operation. Valid values are: MP, Filter, VAAI.
-d|--device=<str>
Indicate the device Uid to use for this operation.
-D|--driver=<str>
Indicate the driver of the paths to use in this operation.
-f|--force
Force claim rules to ignore validity checks and install the rule anyway.
--if-unset=<str>
Execute this command if this advanced user variable is not set to 1.
-i|--iqn=<str>
Indicate the iSCSI Qualified Name for the target to use in this operation.
-L|--lun=<long>
Indicate the LUN of the paths to use in this operation.
-M|--model=<str>
Indicate the model of the paths to use in this operation.
-P|--plugin=<str>
Indicate which PSA plugin to use for this operation. (required)
-r|--rule=<long>
Indicate the rule ID to use for this operation.
-s|--satp=<str>
The SATP for which a new rule will be added.
-T|--target=<long>
Indicate the target of the paths to use in this operation.
-R|--transport=<str>
Indicate the transport of the paths to use in this operation. Valid values are: block, fc, iscsi, iscsivendor, ide, sas, sata, usb, parallel, unknown.
-t|--type=<str>
Indicate which type of matching is used for claim/unclaim or claimrule. Valid values are: vendor, location, driver, transport, device, target. (required)
-V|--vendor=<str>
Indicate the vendor of the paths to use in this operation.
--wwnn=<str>
Indicate the World-Wide Node Number for the target to use in this operation.
--wwpn=<str>
Indicate the World-Wide Port Number for the target to use in this operation.
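As a worked sketch of these options (the rule number, vendor, and model strings are placeholders), a new claim rule can be added and then loaded into the runtime:
esxcli storage core claimrule add --rule=500 --type=vendor --vendor=NewVend --model=NewMod --plugin=NMP
esxcli storage core claimrule load
The load step matters: rules added with esxcli are written to the configuration file and do not take effect until loaded.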
Explain the function of claim rule elements:
Use the esxcli command to list available multipathing claim rules.
Claim rules indicate which multipathing plug-in, the NMP or any third-party MPP, manages a given physical path. Each claim rule identifies a set of paths based on the following parameters:
- Vendor/model strings
- Transport, such as SATA, IDE, Fibre Channel, and so on
- Adapter, target, or LUN location
- Device driver, for example, Mega-RAID
In the procedure, --server=server_name specifies the target server. The specified target server prompts you for a user name and password. Other connection options, such as a configuration file or session files, are supported.
Vendor
-V|--vendor=<str>
Indicate the vendor of the paths to use in this operation.
Model
-M|--model=<str>
Indicate the model of the paths to use in this operation.
Device ID
-d|--device=<str>
Indicate the device Uid to use for this operation.
SATP
-s|--satp=<str>
The SATP for which a new rule will be added.
PSP
-P|--psp=<str>
The default PSP to set for devices matching the rule (an option to esxcli storage nmp satp rule add).
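Putting these elements together (the vendor, model, SATP, and PSP values below are examples only), a SATP rule that also sets the default PSP looks like:
esxcli storage nmp satp rule add --satp=VMW_SATP_DEFAULT_AA --vendor=NewVend --model=NewMod --psp=VMW_PSP_RR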
Change the Path Selection Policy using the UI
Generally, you do not need to change the default multipathing settings your host uses for a specific storage device. However, if you want to make any changes, you can use the Edit Multipathing Policies dialog box to modify a path selection policy and specify the preferred path for the Fixed policy. You can also use this dialog box to change multipathing for SCSI-based protocol endpoints.
- Browse to the host in the vSphere Web Client navigator.
- Click the Configure tab.
- Under Storage, click Storage Devices or Protocol Endpoints.
- Select the item whose paths you want to change and click the Properties tab.
- Under Multipathing Policies, click Edit Multipathing.
- Select a path policy.
By default, VMware supports the following path selection policies. If you have a third-party PSP installed on your host, its policy also appears on the list.
- Fixed (VMware)
- Most Recently Used (VMware)
- Round Robin (VMware)
- For the Fixed policy, specify the preferred path.
- Click OK to save your settings and exit the dialog box.
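The equivalent change can be made per device from the CLI (the device identifier below is a placeholder):
esxcli storage nmp device set --device=naa.xxxxxxxx --psp=VMW_PSP_RR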
Determine required claim rule elements to change the default PSP
Use the esxcli command to list available multipathing claim rules.
Claim rules indicate which multipathing plug-in, the NMP or any third-party MPP, manages a given physical path. Each claim rule identifies a set of paths using the parameters listed earlier on this page.
Prerequisites
Install vCLI or deploy the vSphere Management Assistant (vMA) virtual machine.
- Run the esxcli --server=server_name storage core claimrule list --claimrule-class=MP command to list the multipathing claim rules.
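To then change the default PSP for a SATP, the command takes this form (the SATP and PSP names are examples; match them to your array):
esxcli --server=server_name storage nmp satp set --satp=VMW_SATP_DEFAULT_AA --default-psp=VMW_PSP_RR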
Determine the effect of changing PSP on multipathing and failover
Changing the default pathing policy for a SATP affects every storage array claimed by that SATP. If other storage arrays use the same SATP, their pathing policy and failover behaviour will also change, which can create issues for those arrays.
Determine the effects of changing SATP on relevant device behaviour
VMware provides a SATP for every type of array on the HCL. The SATP monitors the health of each physical path and can respond to error messages from the storage array to handle path failover. If you change the SATP for an array, the PSP may also change, which might produce unexpected failover results.
Configure/Manage Storage load balancing
Multipathing provides load balancing. Load balancing is the process of distributing I/O loads across multiple physical paths. Load balancing reduces or removes potential bottlenecks.
Load balancing is the process of spreading server I/O requests across all available SPs and their associated host server paths. The goal is to optimize performance in terms of throughput (I/O per second, megabytes per second, or response times).
Configuration via the web client can be found under Storage > Datastore > Manage > Settings > Connectivity and Multipathing.
Differentiate available Storage load balancing options
SAN storage arrays require continual redesign and tuning to ensure that I/O is load balanced across all storage array paths. To meet this requirement, distribute the paths to the LUNs among all the SPs to provide optimal load balancing. Close monitoring indicates when it is necessary to rebalance the LUN distribution.
Tuning statically balanced storage arrays is a matter of monitoring the specific performance statistics (such as I/O operations per second, blocks per second, and response time) and distributing the LUN workload to spread the workload across all the SPs.
Differentiate available Storage multipathing policies
For each storage device, the ESXi host sets the path selection policy based on the claim rules.
By default, VMware supports the following path selection policies. If you have a third-party PSP installed on your host, its policy also appears on the list.
Fixed (VMware)
The host uses the designated preferred path, if it has been configured. Otherwise, it selects the first working path discovered at system boot time. If you want the host to use a particular preferred path, specify it manually. Fixed is the default policy for most active-active storage devices.
Most Recently Used (VMware)
The host selects the path that it used most recently. When the path becomes unavailable, the host selects an alternative path. The host does not revert to the original path when that path becomes available again. There is no preferred path setting with the MRU policy. MRU is the default policy for most active-passive storage devices.
Round Robin (VMware)
The host uses an automatic path selection algorithm rotating through all active paths when connecting to active-passive arrays, or through all available paths when connecting to active-active arrays. RR is the default for a number of arrays and can be used with both active-active and active-passive arrays to implement load balancing across paths for different LUNs.
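With Round Robin, the number of IOPS sent down a path before rotating can also be tuned per device (the device identifier is a placeholder, and an IOPS value of 1 is a common vendor recommendation rather than a universal setting):
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxx --type=iops --iops=1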
Configure Storage Policies including vSphere Storage APIs for Storage Awareness
For entities represented by storage (VASA) providers, verify that an appropriate provider is registered. After the storage providers are registered, the VM Storage Policies interface becomes populated with information about datastores and data services that the providers represent.
Entities that use the storage provider include Virtual SAN, Virtual Volumes, and I/O filters. Depending on the type of the entity, some providers are self-registered. Other providers, for example, the Virtual Volumes storage provider, must be manually registered. After the storage providers are registered, they deliver the following data to the VM Storage Policies interface:
- Storage capabilities and characteristics for such datastores as Virtual Volumes and Virtual SAN
- I/O filter characteristics
- Browse to vCenter Server in the vSphere Web Client navigator.
- Click the Configure tab, and click Storage Providers.
- In the Storage Providers list, view the storage providers registered with vCenter Server.
The list shows general information including the name of the storage provider, its URL and status, storage entities that the provider represents, and so on. To display more details, select a specific storage provider or its component from the list.
Locate failover events in the UI
Failover events can be located within the Events tab of the Monitor view, reached from the vCenter Server navigator pane.