
Enhanced Large Real Storage Exploitation

This section summarizes the results of a number of new measurements that were designed to demonstrate the performance benefit of the Enhanced Large Real Storage Exploitation support. This support includes:

  • The CP virtual address space no longer has a direct identity mapping to real storage
  • The CP control block that maps real storage (frame table) was moved above 2G
  • Some additional CP modules were changed to 64-bit addressing
  • Some CP modules (including those involved in Linux virtual I/Os) were changed to use access registers to reference guest pages that reside above 2G

These changes remove the need for CP to move user pages below the 2G line, thereby providing a large benefit to workloads that were previously constrained below the 2G line. Removing this constraint lets z/VM 5.2.0 use large amounts of real storage.

For more information about these improvements, refer to the Performance Improvements section. For guidelines on detecting a below-2G constraint, refer to the CP Regression section.

The benefit of these enhancements will be demonstrated using three separate sets of measurements. The first set will show the benefit of z/VM 5.2.0 for a below-2G-constrained workload in a small configuration with three processors and 3G of real storage. The second set will show that z/VM 5.2.0 continues to scale as workload, real storage, and processors are increased. The third set will demonstrate that all 128G of supported real storage can be used efficiently.

Improvements in 3 GB of Real Storage

The Apache workload was used to create a z/VM 5.1.0 below-2G-constrained workload in 3G of real storage and to demonstrate the z/VM 5.2.0 improvements.

The following table contains the Apache workload parameter settings.

Apache workload parameters for measurements in this section

Server virtual machines 3
Client virtual machines 2
Client connections per server 1
Number of 1M files 5000
Location of the files Xstor MDC
Server virtual storage 1024M
Server virtual processors 1
Client virtual processors 1

This configuration is a good example of the basic value of these enhancements because it demonstrates a z/VM 5.1.0 below-2G constraint even without a large amount of storage above 2G.

Here is a summary of the z/VM 5.2.0 results compared to the z/VM 5.1.0 measurement.

  • Transaction rate increased 13%
  • Below-2G paging was eliminated
  • Xstor page rate decreased by 89%
  • User resident pages above 2G increased by 770%
  • CP microseconds (µsec) per transaction decreased by 23%
  • Virtual µsec per transaction decreased by 4.6%
  • Total µsec per transaction decreased by 12%

The following table compares z/VM 5.1.0 and z/VM 5.2.0 measurements for this workload.

Apache workload selected measurement data

z/VM Release 5.1.0 5.2.0
Run ID 3GBS9220 3GBTA190
Tx rate 54.591 62.064
Below-2G Page Rate 83 0
Xstor Total Rate 29327 3209
Resident Pages above 2G 28373 246924
Resident Pages below 2G 503540 508752
Total µsec/Tx 62050 54490
CP µsec/Tx 25674 19769
Emul µsec/Tx 36377 34721
2064-116; 3 dedicated processors; 3G central storage; 8G expanded storage; 6 connections
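As an arithmetic check, the percentage deltas in the summary can be recomputed from the table values. A minimal Python sketch (the run data is copied from the table; small differences from the reported figures come from rounding in the published rates):

```python
# Recompute the summary deltas from the table values above.
def pct_change(old, new):
    """Percent change from old to new; negative means a decrease."""
    return (new - old) / old * 100.0

tx_delta       = pct_change(54.591, 62.064)    # Tx rate: ~ +13.7% (reported as 13%)
xstor_delta    = pct_change(29327, 3209)       # Xstor total rate: ~ -89%
resident_delta = pct_change(28373, 246924)     # resident pages above 2G: ~ +770%
cp_delta       = pct_change(25674, 19769)      # CP µsec/Tx: ~ -23%
```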

Scaling by Number of Processors

The Apache workload was used to create a z/VM 5.1.0 below-2G-constrained workload to demonstrate the z/VM 5.2.0 relief and to demonstrate that z/VM 5.2.0 would scale correctly as processors are added.

Since, in this example, the z/VM 5.1.0 below-2G constraint is created by Linux I/O, constraint relief can alternatively be provided by the Linux Fixed I/O Buffers feature. This minimizes the number of guest pages Linux uses for I/O at the cost of additional data moves inside the guest.

z/VM 5.1.0 preliminary experiments, not included in this report, had shown that a 5-way was the optimal configuration for this workload. Since the objective of this study was to show that z/VM 5.2.0 scales beyond z/VM 5.1.0, a 5-way was chosen as the starting point for this processor scaling study. z/VM 5.1.0 measurements of this workload with 9 processors or 16 processors could not be successfully completed because the below-2G-line constraint caused response times greater than the Application Workload Modeler (AWM) timeout value.

The following table contains the Apache workload parameter settings.

Apache workload parameters for measurements in this section

  5-way 9-way 16-way
Client virtual machines 3 6 9
Server virtual machines 10 12 12
5000 files; 1 megabyte URL file size; Xstor MDC is primary location of the files; 1024M server virtual storage size; 1 server virtual processor; 1 client virtual processor; 1 client connection per server

Figure 1 shows a graph of transaction rate for all the measurements in this section.

Figure 1. Transaction rates for all measurements in this section (figure not available).

Here is a summary of the z/VM 5.2.0 results compared to the z/VM 5.1.0 measurements.

  • With 5 processors, z/VM 5.2.0 provided a 73% increase in transaction rate over z/VM 5.1.0 with "Fixed I/O Buffers" OFF
  • With 5 processors, z/VM 5.2.0 provided a 6.5% increase in transaction rate over z/VM 5.1.0 with "Fixed I/O Buffers" ON
  • With 9 processors, z/VM 5.2.0 provided an 11% increase in transaction rate over z/VM 5.1.0 with "Fixed I/O Buffers" ON
  • With 16 processors, z/VM 5.2.0 provided a 15% increase in transaction rate over z/VM 5.1.0 with "Fixed I/O Buffers" ON

On z/VM 5.2.0, the transaction rate increased appropriately as processors were added to the partition. Moving from five processors to nine, perfect scaling would forecast an 80% increase in transaction rate; we achieved a 72% increase, a scaling efficiency of 90%. Similarly, moving from five processors to sixteen, we achieved a scaling efficiency of 73%. Both of these scaling efficiencies are better than the corresponding efficiencies obtained on z/VM 5.1.0 with fixed Linux I/O buffers.
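The scaling-efficiency figures quoted above follow from a simple calculation: the achieved increase in transaction rate divided by the increase that perfect linear scaling would predict. A small Python sketch using the 5-way to 9-way numbers from the text:

```python
# Scaling efficiency: achieved throughput increase relative to the
# increase that perfect (linear) scaling would predict.
def scaling_efficiency(base_procs, new_procs, observed_increase):
    """observed_increase is fractional, e.g. 0.72 for a 72% increase."""
    ideal_increase = new_procs / base_procs - 1.0   # 9/5 - 1 = 0.80
    return observed_increase / ideal_increase

# Moving from 5 to 9 processors with a 72% observed increase:
eff = scaling_efficiency(5, 9, 0.72)   # 0.72 / 0.80 = 0.90, i.e. 90%
```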

Here is a summary of the 5-way z/VM 5.2.0 results compared to the 5-way z/VM 5.1.0 with "Fixed I/O Buffers" OFF measurement.

  • Transaction rate increased 73%
  • Below-2G paging was eliminated
  • Xstor/DASD paging was eliminated
  • User resident pages above 2G increased by 3796%
  • CP µsec per transaction decreased by 38.2%
  • Virtual µsec per transaction decreased by 13.8%
  • Total µsec per transaction decreased by 24.1%

Here is a summary of the 5-way z/VM 5.2.0 results compared to the 5-way z/VM 5.1.0 with "Fixed I/O Buffers" ON measurement.

  • Transaction rate increased 6.5%
  • Below-2G paging was eliminated
  • Xstor/DASD paging was eliminated
  • User resident pages above 2G increased by 51%
  • CP µsec per transaction decreased by 2.1%
  • Virtual µsec per transaction decreased by 7.0%
  • Total µsec per transaction decreased by 5.3%

The following table compares z/VM 5.1.0, z/VM 5.1.0 with "Fixed I/O Buffers", and z/VM 5.2.0 for the 5-way measurements.

Apache workload selected measurement data

z/VM Release 5.1.0 5.1.0 5.2.0
Fixed I/O Buffers No Yes No
Fixed I/O Buffers tuning n/a Default n/a
Run ID FIXS9222 FIXS9223 FIXT3290
Tx rate 62.734 102.444 109.083
Below-2G Page Rate 1983 118 0
Xstor Total Rate 48970 6562 0
Xstor Migr Rate 3604 0 0
Total Pg to/from DASD 8578 150 0
Resident Pages above 2G 76464 1966146 2979288
Resident Pages below 2G 463104 457132 182
Total µsec/Tx 70726 56676 53654
CP µsec/Tx 30079 18993 18598
Emul µsec/Tx 40647 37683 35055
2064-116; 5 dedicated processors, 30G central storage, 8G expanded storage; 30 connections

Here is a summary of the 16-way z/VM 5.2.0 results compared to the 16-way z/VM 5.1.0 with "Fixed I/O Buffers" ON measurement.

  • Transaction rate increased 15.3%
  • Below-2G paging was eliminated
  • Xstor/DASD paging was eliminated
  • User resident pages above 2G increased by 57%
  • CP µsec per transaction decreased by 9.5%
  • Virtual µsec per transaction decreased by 11.2%
  • Total µsec per transaction decreased by 10.6%

The following table compares z/VM 5.1.0 and z/VM 5.2.0 for the 16-way measurements.

Apache workload selected measurement data

z/VM Release 5.1.0 5.2.0
Fixed I/O Buffers Yes No
Fixed I/O Buffers tuning Default n/a
Run ID FIXS9229 FIXT3292
Tx rate 241.568 278.414
Below-2G Page Rate 933 0
Xstor Total Rate 19078 0
Xstor Migr Rate 0 0
Total Pg to/from DASD 1036 0
Resident Pages above 2G 2473126 3890865
Resident Pages below 2G 431460 198
Total µsec/Tx 73040 65291
CP µsec/Tx 25720 23288
Emul µsec/Tx 47319 42003
2064-116; 16 dedicated processors, 48G central storage, 14G expanded storage; 108 connections

Scaling to 128G of Storage

The Apache workload was used to create a z/VM 5.2.0 storage usage workload and to demonstrate that z/VM 5.2.0 can fully utilize all 128G of supported real storage. Real storage was overcommitted in the 128G measurement, and Xstor paging was used as evidence that all 128G was actually in use. A base measurement using the same Apache workload parameters was also completed in a much smaller configuration to show that the transaction rate scaled appropriately. There are no z/VM 5.1.0 measurements of this Apache workload scenario.

Since the purpose of this workload is to use real storage, not to create a z/VM 5.1.0 below-2G constraint, the server virtual machines were defined large enough so that nearly all of the URL files could reside in the Linux page cache. The number of servers controlled the amount of real storage used and the number of clients controlled the total number of connections necessary to use all the processors.

The following table contains the Apache workload parameter settings.

Apache workload parameters for measurements in this section

  22G 128G
Client virtual machines 2 7
Server virtual machines 2 13
10000 files; 1 megabyte URL file size; 10240M server virtual storage size; Linux page cache is primary location of the files; 1 server virtual processor; 1 client virtual processor; 1 client connection per server

The results show that all 128G of storage was in use, with substantial Xstor paging, and that all processors ran at 100.0% utilization during the steady-state intervals. Xstor paging increased by a higher percentage than the other metrics because real storage was more overcommitted in the 128G measurement than in the 22G base measurement. Total µsec per transaction remained nearly identical, although the mix shifted: CP µsec per transaction decreased while virtual µsec per transaction increased. Transaction rate scaled at 99% efficiency relative to the number of processors, which is sufficient to demonstrate efficient utilization of 128G.

Here is a summary of the 128G results compared to the 22G base results.

  • Transaction rate increased 446%
  • Xstor page rate increased by 1716%
  • User resident pages above 2G increased by 531%
  • CP µsec per transaction decreased by 10%
  • Virtual µsec per transaction increased by 7%
  • Total µsec per transaction remained nearly identical

The following table compares the 22G and the 128G measurements.

Apache storage usage workload selected measurement data

Processors 2 11
Total Real 22528 131072
Total CP Xstor 16384 31744
Total connections 4 91
Run ID 128T8157 128T8154
Tx rate 101.291 553.336
Xstor Total Rate 723 13139
Resident Pages above 2G 5187091 32755866
Resident Pages below 2G 6749 57618
Total µsec/Tx 19997 19990
CP µsec/Tx 8267 7436
Emul µsec/Tx 11731 12555
PGMBK Frames 46744 282k
SXS Available Pages 516788 515725
2094-738; z/VM 5.2.0
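The 99% scaling-efficiency figure can be reproduced from the table above: the observed throughput ratio between the two runs divided by the ratio of their processor counts. A minimal Python sketch with the values copied from the table:

```python
# Transaction-rate scaling from the 22G (2-way) run to the 128G (11-way) run.
base_procs, base_tx = 2, 101.291      # 22G measurement
new_procs, new_tx = 11, 553.336       # 128G measurement

speedup = new_tx / base_tx            # observed throughput ratio, ~5.46x
ideal = new_procs / base_procs        # perfect scaling would give 5.5x
efficiency = speedup / ideal          # ~0.993, i.e. 99%
```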

z/VM 5.2.0 should be able to scale other applications to this level unless limited by some other factor.

Page Management Blocks (PGMBKs) must still reside below 2G and will become a limiting factor before 256G of virtual storage is in use. For the 128G measurement, about 53% of the below-2G storage was used for PGMBKs. See the Performance Considerations section for further discussion.
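The "about 53%" figure can be checked directly: 2G of real storage holds 524288 4K page frames, and the table reports roughly 282k PGMBK frames. A small Python sketch (282,000 is the rounded value from the table):

```python
# Fraction of below-2G real storage occupied by PGMBKs in the 128G run.
frames_below_2g = 2 * 1024**3 // 4096   # 2 GiB / 4 KiB = 524288 frames
pgmbk_frames = 282_000                  # "282k" from the table (rounded)
fraction = pgmbk_frames / frames_below_2g   # ~0.538, i.e. about 53%
```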

Available pages in the System Execution Space (SXS) do not appear to be approaching any limitation since they did not increase between the 22G and the 128G measurement. In both measurements, more than 90% of the SXS pages are still available.
