z/VM HyperPAV Support
Abstract
In z/VM 5.3, the Control Program (CP) can use the
HyperPAV feature of the IBM System Storage DS8000 line of storage
controllers.
The HyperPAV feature is similar to IBM's PAV
(Parallel Access Volumes) feature in that HyperPAV
offers the host system more than one device number for a volume,
thereby enabling per-volume I/O concurrency.
Further, z/VM's use of HyperPAV is like its use of PAV:
the support is for
ECKD disks only, the bases and aliases must all be ATTACHed
to SYSTEM, and only guest minidisk I/O or I/O provoked by
guest actions (such as MDC full-track reads) is parallelized.
We used our
PAV measurement workload to study the performance of HyperPAV
aliases as compared to classic PAV aliases.
We found, as we
expected, that HyperPAV aliases match the performance of
classic PAV aliases.
However, HyperPAV aliases require
different management and tuning techniques than classic
PAV aliases do.
This section discusses the differences
and illustrates how to monitor and tune a z/VM system that
uses PAV or HyperPAV aliases.
Introduction
In May 2006 IBM equipped z/VM 5.2 with the ability to use Parallel
Access Volumes (PAV) aliases so as to parallelize I/O to
user extents (minidisks) on SYSTEM-attached volumes. In its
PAV section, this report describes
the performance characteristics of z/VM's PAV support, under
various workloads, on several different IBM storage subsystems.
Readers not familiar with PAV or not familiar with z/VM's
PAV support should read that section and our
PAV technology description
before continuing here.
With z/VM 5.3, IBM extended z/VM's PAV capability so as to
support the IBM 2107's HyperPAV feature. Like PAV,
HyperPAV offers the host system the opportunity to use
many different device numbers to address
the same disk volume, thereby
enabling per-volume I/O concurrency. Recall that
with PAV, each alias device is affiliated with
exactly one base, and it remains with that base
until the system programmer reworks the I/O
configuration.
With HyperPAV,
though, the base and alias devices are grouped into
pools, the rule being that an alias device in a
given pool
can perform I/O on behalf of any base device in said pool.
This lets the
host system achieve per-volume
I/O concurrency while potentially consuming fewer
device numbers for alias devices.
IBM's performance objective for z/VM's HyperPAV support was
that with equivalent numbers of aliases,
HyperPAV disk
performance should equal PAV disk performance. Measurements
showed z/VM 5.3 meets this criterion, to within a very
small margin. The study revealed, though, that the
performance management techniques necessary to exploit
HyperPAV effectively are not the same as the techniques
one would use to exploit PAV. Rather than discussing
the performance of HyperPAV aliases, this section
describes the performance management techniques necessary
to use HyperPAV effectively. For completeness' sake,
this section also discusses tuning techniques appropriate
for classic PAV.
Customers must apply VM64248 (UM32072) to z/VM
5.3 for its HyperPAV support to work correctly.
This fix is not
on the z/VM 5.3 GA RSU. Customers must order
it from IBM.
z/VM Performance Toolkit does not calculate
volume response times correctly for base or alias
devices,
either classic PAV or HyperPAV.
Service times, I/O rates, and queue depths are correct.
In this section, DEVICE report excerpts
for classic PAV scenarios
have been hand-corrected to show accurate response time
values.
DEVICE report excerpts
for HyperPAV scenarios
have not been corrected.
Understanding Disk Performance
Broadly speaking, z/VM disk performance can be understood
by looking at the amount of time a guest machine perceives
is required to do a single I/O operation to a single disk
volume. This time, called response time, consists
of two main components.
The first,
queue time (aka
wait time),
is the
time the guest's I/O spends waiting for access to the
appropriate real volume.
The second component,
service time, is the
time required for the System z I/O subsystem to perform
the real I/O, once z/VM starts it.
Technologies like PAV and HyperPAV can help reduce queue time
in that they provide the System z host with means to run
more than one real I/O to a volume concurrently. This is
similar to there being more than one teller window operating
at the local bank. Up to a certain point, adding tellers
helps decrease the amount of time a customer stands in line
waiting for a teller to become available. In a similar
fashion, PAV and HyperPAV aliases help decrease the amount
of time a guest I/O waits in queue for access to the real
volume.
This idea -- that PAV and HyperPAV offer I/O concurrency
and thereby decrease the likelihood of I/Os queueing at a
volume -- leads us to our first principle as regards using
PAV or HyperPAV to adjust volume performance. If a
volume is not experiencing queueing, adding aliases
for the volume will not help the volume's performance.
Consider adding
aliases for a volume only if there is an I/O queue
for the volume.
In the bank metaphor,
once a given customer
has reached a teller,
the number of other tellers working does not appreciably
change the time needed to perform a customer's transaction.
With PAV and HyperPAV, though, IBM has seen evidence that
in some environments, increasing the number of aliases for
a volume
can increase service time for the volume. Most of
the time, the decrease in wait time outweighs the
increase in service time, so response time improves.
At worst, service time increases exactly as wait time
decreases, so response time stands unchanged.
This trait
-- that adding aliases will generally change
the
split between wait time and service time, but
will generally not increase
their sum --
leads us to our second principle for using PAV
or HyperPAV. If a queue is forming at a volume,
add aliases until you run out of alias capability,
or until the queue disappears.
Depending on the workload, it might take several
aliases before things start to get better.
A performance analyst can come to an understanding of the right
number of PAV or HyperPAV aliases for his environment by examining
the disk performance statistics z/VM emits in its monitor data.
Performance monitoring products such as IBM's z/VM Performance
Toolkit comment on device performance and thus are invaluable in
tuning and configuring PAV or HyperPAV.
The Basic DEVICE Report
z/VM Performance Toolkit emits a report called DEVICE
which comments on the performance statistics for the z/VM
system's real devices. This report is the analyst's primary
tool for understanding disk performance. Below is an excerpt
from the DEVICE report for one of the disk exercising workloads
we use for PAV and HyperPAV measurements,
run with no aliases.
Readers: please note that due to browser window width
limitations, all of the report excerpts in this section
are truncated on the right,
after the "Req. Qued" column. The rest of the columns are
interesting, but not in this discussion. Ed.
FCX108 Run 2007/06/05 14:00:12 DEVICE
General I/O Device Load and Performance
From 2007/05/27 15:00:14
To 2007/05/27 15:10:14
For 600 Secs 00:10:00 Result of Y040180P Run
________________________________________________________________________________
. . . ___ . . . . . . . .
<-- Device Descr. -->  Mdisk Pa-  <-Rate/s-> <------- Time (msec) -------> Req.
Addr Type   Label/ID   Links ths   I/O Avoid Pend Disc Conn Serv Resp CUWt Qued
522A 3390-3 BWPVS0         0   4   755    .0   .2   .2   .9  1.3  3.7   .0 1.84
The following columns are interesting in this discussion:
- I/O is the I/O rate for the device,
  in operations per second.
- Serv (service time) is the average
  amount of time (milliseconds) the
  device is using in performing a single I/O.
- Resp (response time) is the average
  amount of time (milliseconds) the guest
  machine perceives is required to perform
  an I/O to its minidisk on the volume.
  Response time includes service time plus
  wait time (aka queue time).
- Req. Qued is the average length
  of the wait queue for the real
  device. This is the average
  number of I/Os waiting in line to use the volume.
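As a worked example of how these columns relate, the sketch below derives wait time as Resp minus Serv for the 522A row above. It then cross-checks the reported queue length using Little's law, which is an assumption added here for illustration and is not something the report itself invokes. The function names are hypothetical.

```python
def wait_time_msec(resp_msec: float, serv_msec: float) -> float:
    """Wait (queue) time is the part of response time not spent in service."""
    return resp_msec - serv_msec


def expected_queue_depth(io_rate_per_sec: float, wait_msec: float) -> float:
    """Little's law: average queue length = arrival rate * average wait."""
    return io_rate_per_sec * (wait_msec / 1000.0)


# Values from the 522A row of the DEVICE excerpt above.
wait = wait_time_msec(resp_msec=3.7, serv_msec=1.3)
queue = expected_queue_depth(io_rate_per_sec=755, wait_msec=wait)
print(f"wait time is about {wait:.1f} msec")
print(f"estimated queue depth is about {queue:.2f}; the report shows 1.84")
```

The estimate lands close to the reported Req. Qued value, which suggests the columns are mutually consistent for this run.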
The excerpt above shows that device 522A has a wait queue and low
pending time. This suggests an opportunity to tune the volume by
using PAV or HyperPAV. Let's look at the two approaches.
DASD Tuning via Classic PAV
For classic PAV, the strategy is to add aliases for the volume
until the volume's I/O rate maximizes or the volume's wait
queue disappears, whichever comes first. Ordinarily, we
would expect these to happen simultaneously.
First, let's
use Performance Toolkit's DEVICE report to estimate how
many aliases the volume will need.
The Req. Qued column
gives us the number we seek.
For a given volume, the estimate for aliases needed is just the
queue depth, rounding any fractional part up to the next
integer.
In the excerpt above, device 522A is reporting a queue depth of
1.84. This suggests
two aliases will be needed to tune the
volume. Keep this in mind as we work through the tuning
exercise.
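The estimating rule above amounts to a ceiling operation on the Req. Qued value. A minimal sketch, with a hypothetical function name:

```python
import math


def aliases_needed(queue_depth: float) -> int:
    """Round the average queue depth up to the next whole alias."""
    return math.ceil(queue_depth)


# Queue depth reported for device 522A in the no-alias run.
print(aliases_needed(1.84))
```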
Starting small, we first
added one alias to the workload.
Here is the corresponding DEVICE excerpt, showing how the
performance of the 522A volume changed, now that one alias
is helping.
FCX108 Run 2007/06/05 14:03:05 DEVICE
General I/O Device Load and Performance
From 2007/05/27 14:48:26
To 2007/05/27 14:58:26
For 600 Secs 00:10:00 Result of Y040181P Run
________________________________________________________________________________
. . . ___ . . . . . . . .
<-- Device Descr. -->  Mdisk Pa-  <-Rate/s-> <------- Time (msec) -------> Req.
Addr Type   Label/ID   Links ths   I/O Avoid Pend Disc Conn Serv Resp CUWt Qued
522A 3390-3 BWPVS0         0   4   477    .0   .2   .3  1.4  1.9  2.8   .0  .91
5249 ->522A BWPVS0         0   4   465    .0   .2   .3  1.5  2.0  2.9   .0  .00
Notice several things about this example:
- In this situation, there is one base device, 522A, and there is one
  classic PAV alias device for it, as notated in the second column by
  ->522A.
- The 522A volume,
  although it now has one alias, is still experiencing queueing.
  We could have forecast this, given our estimate that two aliases
  would be needed.
- The alias device does not have a wait queue. When CP owns
  the base and alias devices,
  volume
  queueing happens only on the volume's base device, not on
  its alias devices. Guest minidisk I/O never queues on
  an alias device.
- Volume I/O rate has increased from 755/sec
  to (477+465) = 942/sec.
- The service time has increased from 1.3 msec to about 1.9 msec.
  However, because wait time is reduced, response time improved
  from 3.7 msec to about 2.8 msec.
Adding this one alias increased volume I/O rate and decreased
volume response time. We made progress.
Because there's still a wait queue at base device 522A,
and because we'd estimated that two aliases would be needed
to tune the volume, let's keep going.
Let's see what happens if we add another classic PAV alias for
volume 522A.
FCX108 Run 2007/06/05 14:27:46 DEVICE
General I/O Device Load and Performance
From 2007/05/27 14:36:37
To 2007/05/27 14:46:37
For 600 Secs 00:10:00 Result of Y040182P Run
________________________________________________________________________________
. . . ___ . . . . . . . .
<-- Device Descr. -->  Mdisk Pa-  <-Rate/s-> <------- Time (msec) -------> Req.
Addr Type   Label/ID   Links ths   I/O Avoid Pend Disc Conn Serv Resp CUWt Qued
522A 3390-3 BWPVS0         0   4   552    .0   .3   .2  1.2  1.7  1.7   .0  .02
5249 ->522A BWPVS0         0   4   545    .0   .3   .2  1.2  1.7  1.7   .0  .00
524C ->522A BWPVS0         0   4   522    .0   .3   .2  1.3  1.8  1.8   .0  .00
By adding another PAV alias, we
increased the volume I/O rate to (552+545+522) = 1619/sec.
Note we also
decreased response time to about 1.7 msec.
Because the 522A wait queue is now gone,
adding more aliases will not further improve volume performance.
The overall result was that we tuned 522A from 755/sec and 3.7
msec response time to 1619/sec and 1.7 msec response time.
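The classic PAV bookkeeping used in this exercise can be sketched as follows. Summing the base and alias I/O rates follows the text directly. Weighting each device's response time by its I/O rate to get a single volume figure is an assumption made here for illustration. The function names are hypothetical.

```python
# (device, I/O rate per second, response time in msec) from the two-alias excerpt
devices = [
    ("522A", 552, 1.7),  # base
    ("5249", 545, 1.7),  # classic PAV alias
    ("524C", 522, 1.8),  # classic PAV alias
]


def volume_io_rate(rows):
    """Volume I/O rate is the sum of the base and alias device rates."""
    return sum(rate for _, rate, _ in rows)


def volume_resp_msec(rows):
    """I/O-weighted average of the per-device response times."""
    total = volume_io_rate(rows)
    return sum(rate * resp for _, rate, resp in rows) / total


print(volume_io_rate(devices))
print(round(volume_resp_msec(devices), 1))
```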
DASD Tuning via HyperPAV
With
HyperPAV, base devices and alias devices are organized into pools.
Each alias in the pool can perform I/O on behalf of any base device in its
same pool.
To reduce queueing at a base device, we add an alias to the pool in which
the base resides. However, we must remember that said alias will be used
to parallelize I/O for all bases in the pool. It follows that with HyperPAV,
there really isn't any notion of "volume tuning" per se. Rather, we
tune the pool.
Usually, some base devices in a pool will be experiencing little queueing
while others will be experiencing more. The design of HyperPAV makes it
possible to add just enough aliases to satisfy the I/O concurrency level
for the pool. Usually this will result in needing fewer aliases, as
compared to having to equip each base with its own aliases.
For example, in a pool having ten base devices, it might be possible to
satisfy the I/O concurrency requirements for all ten bases by adding
merely five aliases to the pool. This lets us conserve device numbers.
IBM is aware that in large environments, conservation of device numbers
is an important requirement.
Let's look at a DEVICE report excerpt for a measurement involving
our DASD volumes 522A-5231.
FCX108 Run 2007/06/05 11:01:14 DEVICE
General I/O Device Load and Performance
From 2007/02/04 13:28:34
To 2007/02/04 13:38:35
For 600 Secs 00:10:00 Result of Y032180H Run
_______________________________________________________________________________
. . . ___ . . . . . . . .
<-- Device Descr. -->  Mdisk Pa-  <-Rate/s-> <------- Time (msec) -------> Req.
Addr Type   Label/ID   Links ths   I/O Avoid Pend Disc Conn Serv Resp CUWt Qued
522A 3390   BWPVS0         0   4   711    .0   .2   .2   .9  1.3  3.9   .0  1.8
522B 3390   BWPVS1         0   4   745    .0   .2   .2   .9  1.3  3.8   .0  1.8
522C 3390   BWPVS2         0   4   744    .0   .2   .2   .9  1.3  3.8   .0  1.8
522D 3390   BWPVS3         0   4   745    .0   .2   .2   .9  1.3  3.8   .0  1.8
522E 3390   BWPVT0         0   4   769    .0   .2   .2   .8  1.2  3.6   .0  1.8
522F 3390   BWPVT1         0   4   740    .0   .2   .2   .9  1.3  3.7   .0  1.8
5230 3390   BWPVT2         0   4   716    .0   .2   .2   .9  1.3  3.9   .0  1.8
5231 3390   BWPVT3         0   4   719    .0   .2   .2   .9  1.3  3.9   .0  1.8
In this workload we see that each of the eight volumes is experiencing
queueing, with pending time not being an issue. Again, volume tuning
looks promising.
Notice also that each volume is experiencing
an I/O rate of about 735/sec and
a response time of about 3.8 msec.
When we are done tuning this pool, we will take
another look at these, to see what happened.
Because we are going to use HyperPAV this time, we will not be tuning
these volumes individually. Rather, we will be tuning them as a group.
Noticing that the total queue depth for the group is (1.8*8) = 14.4, we
can estimate that 15 HyperPAV aliases should suffice to tune the pool.
Let's start by adding eight HyperPAV aliases 5249-5250 and see what
happens.
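For a pool, the estimate is made over the summed queue depths of all bases rather than per volume. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
import math


def pool_aliases_needed(base_queue_depths):
    """Round the pool's total queue depth up to the next whole alias."""
    return math.ceil(sum(base_queue_depths))


# Req. Qued for each of the eight bases 522A-5231 in the no-alias run.
no_alias_run = [1.8] * 8
print(pool_aliases_needed(no_alias_run))
```

The same function can be reapplied after each tuning step, using the queue depths from the newest DEVICE report, to see how many more aliases the pool might still need.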
FCX108 Run 2007/06/05 14:33:16 DEVICE
General I/O Device Load and Performance
From 2007/02/04 13:16:46
To 2007/02/04 13:26:47
For 600 Secs 00:10:00 Result of Y032181H Run
_______________________________________________________________________________
. . . ___ . . . . . . . .
<-- Device Descr. -->  Mdisk Pa-  <-Rate/s-> <------- Time (msec) -------> Req.
Addr Type   Label/ID   Links ths   I/O Avoid Pend Disc Conn Serv Resp CUWt Qued
522A 3390-3 BWPVS0         0   4   665    .0   .2   .2  1.0  1.4  2.8   .0  .90
522B 3390-3 BWPVS1         0   4   651    .0   .2   .2  1.0  1.4  2.9   .0  .98
522C 3390-3 BWPVS2         0   4   658    .0   .2   .2  1.0  1.4  2.8   .0  .90
522D 3390-3 BWPVS3         0   4   644    .0   .2   .2  1.0  1.4  2.8   .0  .90
522E 3390-3 BWPVT0         0   4   597    .0   .2   .2  1.1  1.5  3.1   .0  .93
522F 3390-3 BWPVT1         0   4   721    .0   .2   .1   .9  1.2  2.4   .0  .89
5230 3390-3 BWPVT2         0   4   606    .0   .2   .2  1.1  1.5  3.1   .0  .95
5231 3390-3 BWPVT3         0   4   649    .0   .2   .2  1.0  1.4  2.8   .0  .94
5249 3390-3                0   4   608    .0   .2   .2  1.1  1.5  1.5   .0  .00
524A 3390-3                0   4   621    .0   .2   .2  1.0  1.4  1.4   .0  .00
524B 3390-3                0   4   611    .0   .2   .2  1.0  1.4  1.4   .0  .00
524C 3390-3                0   4   601    .0   .2   .2  1.1  1.5  1.5   .0  .00
524D 3390-3                0   4   578    .0   .2   .2  1.1  1.5  1.5   .0  .00
524E 3390-3                0   4   615    .0   .2   .2  1.1  1.5  1.5   .0  .00
524F 3390-3                0   4   562    .0   .2   .2  1.2  1.6  1.6   .0  .00
5250 3390-3                0   4   592    .0   .2   .2  1.1  1.5  1.5   .0  .00
There are lots of interesting things in this report, such as:
- Base devices 522A-5231 are still
  experiencing queueing. So some benefit could
  likely be had by adding more HyperPAV aliases to the pool.
  We predicted this.
- Alias devices are not showing "->" notation to indicate base affiliation.
  That's because the
  base with which a HyperPAV
  alias is affiliated changes for every I/O the alias
  does.
- Alias devices 5249-5250 are not showing volume labels.
  Again, affiliation changes with every I/O, so an alias has no long-lived
  volume label.
- Alias devices 5249-5250 are not showing device queues. This is correct.
  Again, I/O queues form only on base devices.
One note about I/O rates needs mention. When we tuned via classic PAV,
it was easy to calculate the aggregate I/O rate for a volume. All we
did was add up the rates for the volume's base and alias devices.
By doing this summing, we could see the volume I/O rates rise as we
added aliases.
With
HyperPAV, though, an alias does I/Os for all of the bases in the pool.
Thus there is no way from the DEVICE report to calculate the aggregate
I/O rate for a specific
volume. There is relief in the raw monitor data, though.
More on this later.
Bear in mind also that z/VM Performance Toolkit
does not calculate response times correctly in
PAV or HyperPAV situations, so we can't really see
how well we're doing at this interim step.
Again, there
is relief in the raw monitor data. More on this later, too.
To continue to tune this pool, we can add
some more HyperPAV aliases. Again summing
the queue depths for the pool's base devices
yields a sum of 7.39 I/Os
still queued for these bases.
Let's add eight more HyperPAV aliases
for this pool at device numbers 5251-5258
and see what happens. For convenience, we have sorted the report
by device number.
FCX108 Run 2007/06/05 14:39:47 DEVICE
General I/O Device Load and Performance
From 2007/02/04 13:04:57
To 2007/02/04 13:14:57
For 600 Secs 00:10:00 Result of Y032182H Run
________________________________________________________________________________
. . . ___ . . . . . . . .
<-- Device Descr. -->  Mdisk Pa-  <-Rate/s-> <------- Time (msec) -------> Req.
Addr Type   Label/ID   Links ths   I/O Avoid Pend Disc Conn Serv Resp CUWt Qued
522A 3390-3 BWPVS0         0   4   536    .0   .3   .2  1.2  1.7  1.8   .0  .03
522B 3390-3 BWPVS1         0   4   553    .0   .3   .2  1.2  1.7  1.7   .0  .00
522C 3390-3 BWPVS2         0   4   576    .0   .3   .2  1.1  1.6  1.6   .0  .00
522D 3390-3 BWPVS3         0   4   570    .0   .3   .2  1.1  1.6  1.6   .0  .01
522E 3390-3 BWPVT0         0   4   548    .0   .3   .2  1.2  1.7  1.7   .0  .00
522F 3390-3 BWPVT1         0   4   573    .0   .3   .2  1.1  1.6  1.6   .0  .01
5230 3390-3 BWPVT2         0   4   584    .0   .3   .2  1.1  1.6  1.6   .0  .00
5231 3390-3 BWPVT3         0   4   572    .0   .3   .2  1.1  1.6  1.6   .0  .00
5249 3390-3                0   4   558    .0   .3   .2  1.2  1.7  1.7   .0  .00
524A 3390-3                0   4   569    .0   .3   .2  1.1  1.6  1.6   .0  .00
524B 3390-3                0   4   562    .0   .3   .2  1.1  1.6  1.6   .0  .00
524C 3390-3                0   4   566    .0   .3   .2  1.1  1.6  1.6   .0  .00
524D 3390-3                0   4   564    .0   .3   .2  1.1  1.6  1.6   .0  .00
524E 3390-3                0   4   538    .0   .3   .2  1.2  1.7  1.7   .0  .00
524F 3390-3                0   4   563    .0   .3   .2  1.1  1.6  1.6   .0  .00
5250 3390-3                0   4   548    .0   .3   .2  1.2  1.7  1.7   .0  .00
5251 3390-3                0   4   524    .0   .3   .2  1.2  1.7  1.7   .0  .00
5252 3390-3                0   4   535    .0   .3   .2  1.2  1.7  1.7   .0  .00
5253 3390-3                0   4   568    .0   .3   .2  1.1  1.6  1.6   .0  .00
5254 3390-3                0   4   570    .0   .3   .2  1.1  1.6  1.6   .0  .00
5255 3390-3                0   4   557    .0   .3   .2  1.2  1.7  1.7   .0  .00
5256 3390-3                0   4   543    .0   .3   .2  1.2  1.7  1.7   .0  .00
5257 3390-3                0   4   544    .0   .3   .2  1.2  1.7  1.7   .0  .00
5258 3390-3                0   4   574    .0   .3   .2  1.1  1.6  1.6   .0  .00
We see that by adding the eight HyperPAV aliases, we have eliminated queueing
at the eight bases, which was our objective.
Further, now that there is no queueing,
we can
assess volume response time by inspecting the
service times in the DEVICE report.
For this example, we can conclude that we
reduced volume
response time in this pool from
about 3.8 msec to
about 1.7 msec.
Because this pool consists only of bases 522A-5231 and aliases
5249-5258, summing the device
I/O rates gives 13395/sec aggregate I/O rate to the
pool. We can approximate the volume I/O rate by dividing by 8, because
there are eight bases. This gives us a volume I/O rate of about 1674/sec,
which is an increase from our original value of 735/sec.
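The pool arithmetic above can be reproduced from the I/O column of the final DEVICE excerpt. This is a sketch only; dividing evenly by the number of bases is the same approximation the text makes, and the variable names are illustrative.

```python
# I/O rates (per second) from the sixteen-alias DEVICE excerpt.
base_rates = [536, 553, 576, 570, 548, 573, 584, 572]    # 522A-5231
alias_rates = [558, 569, 562, 566, 564, 538, 563, 548,   # 5249-5250
               524, 535, 568, 570, 557, 543, 544, 574]   # 5251-5258

# Every device in the pool contributes to the pool's aggregate rate.
pool_rate = sum(base_rates) + sum(alias_rates)

# Approximate per-volume rate, assuming the load is spread evenly over the bases.
per_volume_rate = pool_rate / len(base_rates)

print(pool_rate, round(per_volume_rate))
```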
Regarding HyperPAV pools, one other point needs mention. The span of
a HyperPAV pool is typically the logical subsystem (LSS) (aka logical
control unit, or LCU) within the IBM 2107. IBM anticipates that customers
using HyperPAV will have more than one LSS (LCU) configured in HyperPAV
mode, and so keeping track of the base-alias relationships can become
a bit challenging. Unfortunately, z/VM Performance Toolkit does not
report on the base-alias relationships for HyperPAV, so the system
administrator must resort to other means. The CP command
QUERY PAV yields comprehensive console output that describes the
organization of HyperPAV bases and aliases into pools, thereby telling
the system administrator what he needs to know to interpret Performance
Toolkit reports and subsequently tune the I/O subsystem. Customers
interpreting raw monitor data will notice that the MRIODDEV record
has new bits IODDEV_RDEVHPBA and IODDEV_RDEVHPAL which tell
whether the device is a HyperPAV base or alias respectively.
If one of those bits is set,
a new field, IODDEV_RDEVHPPL, gives the pool number in which
the device resides.
Another Look at HyperPAV Alias Provisioning
New monitor record MRIODHPP (domain 6, record 28) comments on the
configuration of a HyperPAV pool.
From a pool tuning perspective, perhaps the most important fields
in this record are IODHPP_HPPTRIES and IODHPP_HPPFAILS. The former
counts the number of times CP went to the pool to try to
get an alias
to do an I/O on behalf of a base. The latter counts the number
of those tries where CP came up empty, that is, there were no
aliases available.
Trend analysis on IODHPP_HPPTRIES and IODHPP_HPPFAILS reveals
whether there are enough aliases in the pool. If HPPTRIES is
increasing but HPPFAILS is remaining constant, there are enough
aliases.
If both are rising, there are not enough aliases.
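The trend test described above could be coded along these lines. This is a sketch under stated assumptions: the counters are taken to be cumulative, counter wrap is ignored, the sample values are hypothetical, and the function names are invented for illustration. Extracting the fields from raw MRIODHPP records is not shown.

```python
from typing import List, Tuple

# One sample per MRIODHPP record: (IODHPP_HPPTRIES, IODHPP_HPPFAILS).
Sample = Tuple[int, int]


def interval_deltas(samples: List[Sample]) -> List[Sample]:
    """Turn cumulative counters into per-interval (tries, fails) deltas."""
    return [
        (cur[0] - prev[0], cur[1] - prev[1])
        for prev, cur in zip(samples, samples[1:])
    ]


def enough_aliases(samples: List[Sample]) -> bool:
    """True if HPPFAILS stayed constant across every interval examined."""
    return all(fails == 0 for _, fails in interval_deltas(samples))


# Hypothetical counter values, not taken from a real measurement.
healthy = [(1000, 5), (4000, 5), (9000, 5)]
starved = [(1000, 5), (4000, 900), (9000, 2600)]
print(enough_aliases(healthy), enough_aliases(starved))
```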
Fields IODHPP_HPPMINCT and IODHPP_HPPMAXCT are low-water and
high-water marks on free alias counts. CP updates these
fields each time it tries to get an alias from the pool, and
it resets them each time it cuts an MRIODHPP record. Thus
each MRIODHPP record
comments on the minimal and maximal number of
free aliases CP found in the pool since the previous MRIODHPP
record. If IODHPP_HPPMINCT is consistently large, we can
conclude the pool probably has too many aliases. If our
I/O configuration is suffering for device numbers, some of
those aliases could be removed and the device numbers
reassigned for other purposes.
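One conservative way to read the low-water mark is to take the smallest IODHPP_HPPMINCT seen over a representative observation period. Roughly that many aliases were free every time CP went to the pool. The sketch below assumes the values have already been extracted from the monitor data. The sample numbers and the function name are hypothetical.

```python
from typing import Iterable


def never_needed_aliases(min_free_counts: Iterable[int]) -> int:
    """Smallest low-water mark across the MRIODHPP records examined.

    Roughly this many aliases were still free whenever CP asked the pool
    for one, so they are candidates for removal. Returns 0 for no data.
    """
    return min(min_free_counts, default=0)


# Hypothetical IODHPP_HPPMINCT values from consecutive MRIODHPP records.
observed = [6, 5, 7, 5, 6]
print(never_needed_aliases(observed))
```

Whether to actually remove that many aliases depends on how representative the observation period is of peak load.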
z/VM Performance Toolkit does not yet report on
the MRIODHPP record.
The customer must use other means, such as the
MONVIEW package on
our download page, to inspect
it.
A Unified Look at Volume I/O Rate and Volume Response Time
In z/VM 5.3, IBM has extended the MRIODDEV record so that it
comments on the I/O contributions made by alias devices, no
matter which aliases contributed. These
additional fields make it simple to calculate volume performance
statistics, such as volume I/O rate, volume service time, and
volume response time.
Analysts familiar with fields
IODDEV_SCGSSCH, IODDEV_SCMCNTIM, IODDEV_SCMFPTIM,
and friends already
know how to use those fields to calculate
device I/O rate,
device pending time,
device connect time,
device service time,
and so on.
In z/VM 5.3,
this set of fields continues to have the same meaning,
but it's important to realize that in a PAV or HyperPAV
situation, those fields comment on the behavior of
only the base device for the volume.
The new MRIODDEV fields IODDEV_PAVSSCH, IODDEV_PAVCNTIM,
IODDEV_PAVFPTIM, and friends comment on the aggregate
corresponding phenomena
for all aliases ever acting for
this base, regardless of PAV or HyperPAV, and regardless
of alias device number.
What this means is that by looking
at MRIODDEV and doing the appropriate arithmetic, a
reduction program can calculate volume behavior quite
easily, by weighting the base and aggregate-alias
contributions
according to their respective I/O rates.
For example, if the traditional
MRIODDEV base device fields show
an I/O rate of 400/sec and a connect time of 1.2 msec,
and the same calculational technique applied to the
new aggregate-alias fields reveals an I/O rate of
700/sec and a connect time of 1.4 msec, the
expected value of the
volume's
connect time is calculated to be
(400*1.2 + 700*1.4) / (400 + 700),
or 1.33 msec.
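The weighting in this example generalizes to any of the per-I/O time components. A minimal sketch, assuming the base and aggregate-alias rates and times have already been derived from the MRIODDEV fields; the function name is hypothetical:

```python
def volume_time_msec(base_rate: float, base_msec: float,
                     alias_rate: float, alias_msec: float) -> float:
    """I/O-rate-weighted average of a base time and an aggregate-alias time."""
    total_rate = base_rate + alias_rate
    if total_rate == 0:
        return 0.0  # no I/O in the interval, so there is nothing to average
    return (base_rate * base_msec + alias_rate * alias_msec) / total_rate


# The connect-time example from the text.
print(round(volume_time_msec(400, 1.2, 700, 1.4), 2))
```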
Authors of reduction programs can update their software so
as to calculate and report on volume behavior.
z/VM Performance Toolkit does not yet report on
the new MRIODDEV fields.
The customer must use other means, such as the
MONVIEW package on
our download page, to
examine them.
I/O Parallelism: Other Thoughts
In this section we have discussed how, for guest
I/O to minidisks, z/VM CP can exploit PAV or HyperPAV to
parallelize the corresponding real
I/O, thereby reducing or eliminating CP's
serialization on real volumes.
This support is useful in removing I/O queueing
in the case where several guests require access to the same
real volume, each such guest manipulating only its own slice
(minidisk) of the volume.
Depending on the environment or configuration, other opportunities
exist in a z/VM system
for disk I/O to become serialized inadvertently, and
other remedies exist too.
For example, even though z/VM can itself use PAV or HyperPAV to run
several guest minidisk I/Os concurrently to a single real volume,
each such guest still can do only one I/O at a time to any given
minidisk.
Depending on the workload inside the
guest, the guest itself might be maintaining its own I/O queue for
the minidisk. z/VM Performance Toolkit would not ordinarily report
on such a queue, nor would z/VM's PAV or HyperPAV support be useful
in removing it.
A holistic approach to removing I/O queueing requires an end-to-end
analysis of I/O performance
and the application of appropriate relief measures at each
step. Returning to the earlier example, if guest I/O queueing is the
real concern, and if the guest is PAV-aware, it might make sense to
move the guest's data to a dedicated volume, and then attach the
volume's base and alias device numbers to the guest. Such an
approach would give the guest an opportunity to use PAV or HyperPAV
to do its own I/O
scheduling and thereby mitigate its queueing problem.
If the guest is PAV-aware, another tool for removing a guest I/O
queue is to have z/VM create some virtual PAV aliases for the
guest's minidisk. z/VM can virtualize either classic PAV aliases
or HyperPAV aliases for the minidisk. This approach lets the
guest's data continue to reside on a minidisk but also lets the
guest itself achieve I/O concurrency for the minidisk.
Depending on software levels, guests
running Linux for System z can use PAV
to parallelize I/O to volumes containing Linux file
systems,
so as to achieve file system
I/O concurrency. Again, whether this is useful or appropriate
depends on the results of a comprehensive study of the I/O habits
of the guest.
As always, customers must study performance data and apply tuning
techniques appropriate to the bottlenecks discovered.