
Guest Instruction Simulation Improvements

Abstract

In APAR VM66467 IBM shipped an improvement to how z/VM simulates the instruction whose opcode is x'B9D3'. Guests issue this instruction when they drive virtual PCIe functions. In our workload the improvement resulted in about a 25% decrease in transaction response time and about a 24% decrease in steal percent. Customer results will vary.

Introduction

An IBM Z CPC supports a kind of communication adapter called an RDMA over Converged Ethernet (RoCE) adapter. This adapter lets the host computer use DMA operations to move data on an Ethernet network. An example of such an adapter is the IBM 10GbE RoCE Express2 feature.

Software drives the RoCE adapter by setting up memory buffers and then issuing an instruction whose opcode is x'B9D3'. The instruction operand informs the adapter of where in memory the DMA buffers are located.

Guests do not issue this instruction only when driving an RoCE adapter. Rather, they issue it for any virtual PCIe function they might be manipulating.

In the virtualized computing environment z/VM provides, the natural question is what it costs to virtualize this mechanism. When a z/VM guest tries to change its buffer locations, the z/VM Control Program must intervene, discern what the guest is trying to accomplish, and then issue its own instructions to inform the hardware accordingly. After z/VM finishes manipulating the hardware, z/VM returns control to the guest. In this respect supporting this guest function is similar to supporting guest Start Subchannel (SSCH) instructions: in both cases the hypervisor must simulate the guest's request, manipulate the hardware as needed, and then return control to the guest.

As with any simulation, z/VM's intervention takes time. The time z/VM uses affects the guest in these important ways:

  1. z/VM's having to intervene can elongate transaction response time.
  2. While z/VM is doing simulation, the guest sees itself as running but not accruing CPU time. The percent of elapsed time the guest perceives itself to be in this state is reported by Linux as a phenomenon called steal percent. The more frequent, lengthy, or complex z/VM's simulation is, the higher the steal percent will be.
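To make the steal percent figure concrete, here is a minimal sketch of how it can be derived from two samples of the aggregate "cpu" line in a Linux guest's /proc/stat. The function names are hypothetical, and this is an illustration of the arithmetic only, not necessarily the exact calculation SADC performs. It assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice) and a kernel that reports the steal field.

```python
def parse_cpu_line(line):
    """Return the tick counters from an aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before, after):
    """Percent of the interval between two samples that was accounted as steal.

    Assumed field order: user nice system idle iowait irq softirq steal
    guest guest_nice. Guest time is already counted inside user and nice,
    so only the first eight counters are summed for the total.
    """
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total  # index 7 is the steal counter
```

Sampling the line at the start and end of a measurement interval and passing both strings to steal_percent gives the percentage for that interval.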

The operand the guest designates on its x'B9D3' instruction informs the adapter of which memory buffers are now invalid and which ones are now valid. When the guest's operand does nothing more than make pages valid, the hypervisor does not need to issue a corresponding instruction to the hardware. This is because in a later operation, when the guest actually uses the buffers, the hardware will sense the buffers are valid.

With the APAR applied, z/VM no longer informs the hardware when there is no need to do so, which makes the simulation more efficient. More efficient simulation can decrease transaction response time and decrease steal percent.
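The decision described above can be sketched as a toy model. Everything in it is an assumption made for illustration: the RefreshRequest and FakeHardware classes, the simulate_refresh function, and the skip_valid_only flag are hypothetical names, and the sketch does not represent z/VM's actual code or the real operand format of the x'B9D3' instruction. It only shows the shape of the idea: forward the request to the hardware unless the guest's operand makes pages valid and nothing else.

```python
from dataclasses import dataclass


@dataclass
class RefreshRequest:
    """Hypothetical summary of what a guest's operand asks for."""
    pages_made_valid: int
    pages_made_invalid: int


class FakeHardware:
    """Stand-in for the adapter; it only counts the refreshes it receives."""

    def __init__(self):
        self.refreshes_issued = 0

    def refresh(self, request):
        self.refreshes_issued += 1


def simulate_refresh(request, hardware, skip_valid_only=True):
    """Return True if the hypervisor issued its own instruction to the hardware.

    skip_valid_only=False models the behavior before the improvement, where
    every guest request is forwarded. skip_valid_only=True models the
    behavior after it, where a request that only validates pages is not
    forwarded because the hardware will sense the valid buffers later.
    """
    valid_only = request.pages_made_invalid == 0
    if skip_valid_only and valid_only:
        return False
    hardware.refresh(request)
    return True
```

In this model a request that invalidates at least one page is always forwarded, so the saving comes entirely from the valid-only case.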

Method

To measure the effect of the enhancement, IBM created an environment and workload that stress this particular aspect of simulation. The environment consisted of a 3906-M04 CPC with two partitions. One partition ran Linux on z/VM and the other ran Linux alone. The two Linux images ran the Uperf network performance tool with a workload that very closely resembles AWM RR-200x1000. SADC data was collected on the client side. Figure 1 illustrates the experiment.

Figure 1. Measurement Environment.
Picture of measurement environment
Notes: z14 3906-M04. Client side was shared LPAR, 16 cores, entitlement 1377, 64 GB host real, IBM 25GbE RoCE Express2, z/VM 7.1 without or with VM66467, SMT-2, one Linux guest, 8 VCPUs, 8 GB guest real, RHEL 8.1, Uperf 1.04, RR-200x1000. Server side was same CEC, shared LPAR, IBM 25GbE RoCE Express2, Linux running without z/VM, RHEL 8.1, Uperf 1.04, responding to client.

Results and Discussion

The z/VM APAR reduced transaction response time as illustrated in Figure 2.

Figure 2. Comparison of Transaction Response Time.
Graph of effect on response time
Notes: Same configuration as in the notes for Figure 1.

The z/VM APAR also reduced steal percent as illustrated in Figure 3.

Figure 3. Comparison of Steal Percent.
Graph of effect on steal percent
Notes: Same configuration as in the notes for Figure 1.

Summary

Avoiding unnecessary instructions improved both transaction response time and steal percent. In our workload transaction response time was reduced by an average of 25% and steal percent was reduced by an average of 24%.

Though our experiment used the IBM RoCE adapter, the improvement will apply to guest use of any virtual PCIe function.
