z/VM ITR Scaling, z9 to z10
This document discusses the factors that affect how z/VM workloads
scale from z9 to z10, the data available to detect these factors,
and how to estimate a scaling ratio from them.
At the time this document was created, the
z/VM LSPR ITR Ratios for IBM Processors
showed a z/VM scaling ratio from the z9 to the z10 that ranged from
1.63 on a 1-way to 1.30 on a 16-way. These LSPR measurements were
completed on dedicated systems, and real storage was increased by the
same ratio as the number of processors.
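As a point of reference, a scaling ratio of this kind can be sketched as the quotient of two internal throughput rates (ITRs). This document does not define ITR; the sketch assumes the usual LSPR convention that ITR is the external throughput rate divided by processor utilization. The function names and the sample numbers are hypothetical and are not taken from any measurement here.

```python
def itr(etr: float, cpu_utilization: float) -> float:
    """Internal throughput rate.

    etr: external throughput rate (units of work per second).
    cpu_utilization: processor busy fraction, 0 < value <= 1.
    Assumes the usual LSPR convention ITR = ETR / utilization.
    """
    if not 0.0 < cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be in (0, 1]")
    return etr / cpu_utilization


def scaling_ratio(itr_z9: float, itr_z10: float) -> float:
    """z9-to-z10 scaling ratio: the z10 ITR relative to the z9 ITR."""
    if itr_z9 <= 0.0:
        raise ValueError("itr_z9 must be positive")
    return itr_z10 / itr_z9


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    z9 = itr(etr=100.0, cpu_utilization=0.5)
    z10 = itr(etr=150.0, cpu_utilization=0.5)
    print(scaling_ratio(z9, z10))
```

Dividing by utilization normalizes both runs to a fully busy processor, so two measurements taken at different utilizations can still be compared.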
The scaling ratios discussed in this document were measured in
dedicated LPARs but not on fully dedicated systems, and are thus
subject to a variable amount of interference from other LPARs
for shared resources.
The workloads discussed in this document have widely varying
characteristics and demonstrate a number of factors that affect the
expected scaling ratio between the z9 and z10. Most ratios fall within
the range published in the LSPR data, and this document shows which
workload characteristics fall in which portion of that range.
Workloads that scaled better than the LSPR range were single-guest,
compute-intensive workloads with very little storage access and no
storage overcommitment.
Workloads that scaled worse than the LSPR range contained many of the
documented scaling factors, and all had storage overcommitment.
The following table summarizes the scaling ranges and the workload
factors that fall into each range.
| Scaling Ranges | Scaling Factors |
| --- | --- |
| Above 1.6 | uniprocessor, compute-intensive with very small storage reference pattern and very few system interactions (low T/V ratio) |
| 1.5 to 1.6 | MP, non-storage-overcommitment with small data movement or uniprocessor with no storage overcommitment but increased data movement |
| 1.4 to 1.5 | small number of processors (i.e., less than 8) with small storage overcommitment and limited data movement |
| 1.3 to 1.4 | MP workload with mild storage overcommitment, average storage references, and average system interactions |
| Below 1.3 | MP workloads with high storage overcommitment that causes high storage management activity |
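The summary table can be expressed as a small lookup, which may be handy when sorting measured ratios into ranges. This is a sketch only: the function name is hypothetical, and assigning a boundary value such as 1.5 to the higher range is an assumption, because the table does not say which range owns its endpoints.

```python
# (lower bound, range label) pairs taken from the summary table,
# ordered from the highest range to the lowest.
SCALING_RANGES = [
    (1.6, "Above 1.6"),
    (1.5, "1.5 to 1.6"),
    (1.4, "1.4 to 1.5"),
    (1.3, "1.3 to 1.4"),
]


def scaling_range(ratio: float) -> str:
    """Return the summary-table range label for a z9-to-z10 scaling ratio.

    Assumption: a value exactly on a boundary (1.3, 1.4, 1.5) is placed
    in the higher range, and exactly 1.6 is placed in "1.5 to 1.6".
    """
    if ratio > 1.6:
        return "Above 1.6"
    for lower_bound, label in SCALING_RANGES[1:]:
        if ratio >= lower_bound:
            return label
    return "Below 1.3"


if __name__ == "__main__":
    # Ratios that appear in this document.
    for measured in (1.63, 1.53, 1.41, 1.31, 1.23):
        print(measured, scaling_range(measured))
```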
The following sections give more detail about the factors known to
affect the scaling ratios.
The example used for any individual scaling factor may also be
affected by other scaling factors.
In these cases, no attempt was made to quantify the exact
effect of each scaling factor.
Number of Processors
Everything else being equal, the scaling ratio
varies inversely with the number of processors.
This was validated by measuring the same workload with different
numbers of processors.
Unlike the LSPR workload, storage reference was not increased between
the 2-way and the 8-way measurements.
The number of processors is available as "CPU online"
in the SYSSUMLG Performance Toolkit for VM report.
| Workload | Number of Processors | z9 to z10 Scaling Ratio |
| --- | --- | --- |
| Apache CPU-intensive | 2 | 1.43 |
| Apache CPU-intensive | 8 | 1.41 |
Storage References
Everything else being equal, the scaling ratio
varies inversely with the amount of storage that is referenced.
This was validated by varying the number of URL files used
for a measurement, the virtual storage size of the servers,
and the number of servers.
For our workload, the amount of storage being referenced is
calculated from the >System< userid information on the
UPAGE Performance Toolkit for VM report
using the formula (Nr of Users) * ("R<2GB" + "R>2GB"). If not all users
are included in this report, an alternate method is required.
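The formula above can be sketched as follows. The function and parameter names are hypothetical, the inputs must be read manually from the >System< line of the UPAGE report, and the conversion to gigabytes assumes the 4 KB page size used by z/VM.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KB pages


def user_resident_pages(nr_users: int,
                        r_below_2gb: float,
                        r_above_2gb: float) -> float:
    """(Nr of Users) * ("R<2GB" + "R>2GB").

    nr_users:     "Nr of Users" on the >System< line of UPAGE.
    r_below_2gb:  "R<2GB" value on the same line.
    r_above_2gb:  "R>2GB" value on the same line.
    """
    if nr_users < 0 or r_below_2gb < 0 or r_above_2gb < 0:
        raise ValueError("inputs must be non-negative")
    return nr_users * (r_below_2gb + r_above_2gb)


def pages_to_gib(pages: float) -> float:
    """Convert a page count to GiB (2**30 bytes)."""
    return pages * PAGE_SIZE_BYTES / 2**30


if __name__ == "__main__":
    # Hypothetical report values, for illustration only.
    pages = user_resident_pages(nr_users=14, r_below_2gb=1000, r_above_2gb=2000)
    print(pages, pages_to_gib(pages))
```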
| Workload | User Resident Pages | z9 to z10 Scaling Ratio |
| --- | --- | --- |
| Apache | 5971948 | 1.41 |
| Apache | 39317553 | 1.23 |
Data Movement
Everything else being equal, the scaling ratio
varies inversely with the amount of data movement. This was validated
by measuring our AWM to Apache application using 1 MB files versus
using two small files (10 KB and 20 KB). Moving large amounts of data
generally requires references to real storage. Not all data movement
can be determined from the Performance Toolkit information that is
available for data moved across FICON channels, across virtual
networks, and through various communication services.
| Workload | Average File Size | z9 to z10 Scaling Ratio |
| --- | --- | --- |
| Apache (2 clients and 12 servers) | 15 KB | 1.53 |
| Apache (2 clients and 12 servers) | 1024 KB | 1.31 |
Virtual I/O to Real Devices
Everything else being equal, the scaling ratio
varies inversely with the amount of I/O that is issued to real devices.
This was validated by changing the amount of data that
could be cached in the Linux file caches.
The way we created the virtual I/O also reduced the amount of
referenced storage. This should have improved the scaling ratio
and thus offset some of the effect of the virtual I/Os.
The virtual I/O rate is obtained from "Virtual I/O rate" on the CPU
Performance Toolkit for VM report.
| Workload | Virtual I/Os per Second | z9 to z10 Scaling Ratio |
| --- | --- | --- |
| Apache (1.5 GB virtual servers) | 26 | 1.41 |
| Apache (256 MB virtual servers) | 515 | 1.35 |
Searches
Everything else being equal, the scaling ratio
varies inversely with the amount of storage that is referenced by
long searches.
Performance Toolkit for VM reports do not provide search lengths for
most storage services, but they do contain information that implies
the length of certain storage management searches, so one of these
was selected to evaluate the scaling ratio of storage searches.
The specific search used for validation was the
"Emergency Scan-Page Frames" counters
from the DEMNDLOG Performance Toolkit for VM report.
Application searches may have a higher scaling ratio than storage
management searches because storage management searches involve
storage key manipulations.
Everything else being equal, the scaling ratio
varies inversely with the number of storage key operations.
| Workload | Emergency Scan Frames per Second | z9 to z10 Scaling Ratio |
| --- | --- | --- |
| Apache (2 clients and 12 servers) | 38055 | 1.53 |
| Apache (2 clients and 12 servers) | 12000000 | 1.31 |
Storage Overcommitment
Everything else being equal, the scaling ratio
varies inversely with the amount of storage overcommitment.
This was validated by measuring the same workload in different
real storage sizes. Performance Toolkit for VM provides extensive
information about storage overcommitment.
| Workload | Storage Size | z9 to z10 Scaling Ratio |
| --- | --- | --- |
| CMMA 64 servers | 6 GB | 1.32 |
| CMMA 64 servers | 3 GB | 1.23 |
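To compare the examples side by side, the sketch below collects the paired measurements from the tables in this document and computes how far the scaling ratio dropped within each pair. The helper names are hypothetical. Because each example may also be affected by other scaling factors, the result only describes these measurements and does not isolate the effect of any single factor.

```python
# (factor, first setting, ratio, second setting, ratio), copied from the
# tables in this document. The second setting is the more stressful one.
MEASUREMENTS = [
    ("Number of processors", "2", 1.43, "8", 1.41),
    ("User resident pages", "5971948", 1.41, "39317553", 1.23),
    ("Average file size", "15 KB", 1.53, "1024 KB", 1.31),
    ("Virtual I/Os per second", "26", 1.41, "515", 1.35),
    ("Emergency scan frames per second", "38055", 1.53, "12000000", 1.31),
    ("Real storage size", "6 GB", 1.32, "3 GB", 1.23),
]


def ratio_drop(first_ratio: float, second_ratio: float) -> float:
    """Drop in scaling ratio between the two measurements of a pair."""
    return round(first_ratio - second_ratio, 2)


def rank_by_drop(rows):
    """Return (factor, drop) pairs, largest drop first."""
    drops = [(row[0], ratio_drop(row[2], row[4])) for row in rows]
    return sorted(drops, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for factor, drop in rank_by_drop(MEASUREMENTS):
        print(f"{factor}: {drop:.2f}")
```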
Last revised March 18, 2008 (Virg)