
z/VM PAV Exploitation

Starting in z/VM 5.2.0 with APAR VM63855, z/VM exploits IBM's Parallel Access Volumes (PAV) technology so as to expedite guest minidisk I/O. In this article we give some background about PAV, describe z/VM's exploitation of it, and show the results of some measurements we ran to assess the exploitation's impact on minidisk I/O performance.

2007-06-15: With z/VM 5.3 comes HyperPAV support. For performance information about HyperPAV, and for performance management advice about the use of both PAV and HyperPAV, see our HyperPAV chapter.

Introduction

A zSeries data processing machine lets software perform only one I/O to a given device at a time. For DASD, this means only one I/O can be in progress to a given disk volume at a time.

In some environments, this can have limiting effects. For example, think of a real 3390 volume on z/VM, carved up into N user minidisks, each minidisk being a CMS user's 191 disk. There is no reason why we couldn't have N concurrent I/Os in progress at once, one to each minidisk. There would be no data integrity exposure, because the minidisks are disjoint. As long as there was demand, and as long as the DASD subsystem could keep up, we might experience increased I/O rates to the volume and thereby increase performance.

Since 1999 zSeries DASD subsystems (such as the IBM TotalStorage Enterprise Storage Server 800) have supported technology called Parallel Access Volumes, or PAV. With PAV, the DASD subsystem can offer the host processor more than one device number per disk volume. For a given volume, the first device number is called the "base" and the rest are called "aliases". If there are N-1 aliases, the host can have N I/Os in progress to the volume concurrently, one to each device number.

DASD subsystems offering PAV do so in a static fashion. The IBM CE or other support professional uses a DASD subsystem configuration utility program to equip selected volumes with selected fixed PAV alias device numbers. The host can sense the aliases' presence when it varies the devices online. In this way, the host operating system can form a representation of the base-alias relationships present in the DASD subsystem and exploit those relationships if it chooses.

z/VM's first support for PAV, shipped as APAR VM62295 on VM/ESA 2.4.0, was to let guests exploit PAV. When real volumes had PAV, and when said volumes were DEDICATEd or ATTACHed to a guest, z/VM could pass its PAV knowledge to the guest, so the guest could exploit it. But z/VM itself did not exploit PAV at all.

With APAR VM63855 to z/VM 5.2.0, z/VM can now exploit PAV for I/O to PERM extents (user minidisks) on volumes attached to SYSTEM. This support lets z/VM exploit a real volume's PAV configuration on behalf of guests doing virtual I/Os to minidisks defined on the real volume. For example, if 20 users have minidisks on a volume, and if the volume has a few PAV aliases associated with it, and if those users generate sufficient I/O demand for the volume, the Control Program will use the aliases to drive more than one I/O to the volume concurrently. This support is not limited to driving one I/O per minidisk. If 20 users are all linked to the same minidisk, and I/O workload to that one minidisk demands it, z/VM will use the real volume's PAV aliases to drive more than one I/O to the single minidisk concurrently.

To measure the effect of z/VM's PAV exploitation, we crafted an I/O-intensive workload whose concurrency level and read-write mix we could control. We shut off minidisk cache and then ran the workload repeatedly, varying its concurrency level, its read-write mix, the PAV configuration of the real volumes, and the kind of DASD subsystem. We looked for changes in three I/O performance metrics -- I/O response time, I/O rate, and I/O service time -- as a function of these variables. This article documents our findings.

Executive Summary

Adding PAV aliases helps improve a real DASD volume's performance only if I/O requests are queueing at the volume. We can tell whether this is happening by comparing the volume's I/O response time to its I/O service time. As long as response time equals service time, adding PAV aliases will not change the volume's performance. However, if I/O response time is greater than I/O service time, queueing is happening and adding some PAV capability for the volume might be helpful.

Results when using PAV will depend on the amount of I/O concurrency in the workload, the fraction of the I/Os that are reads, and the kind of DASD subsystem in use. In our scenarios, workloads with a very low percentage of reads or a very high I/O concurrency level tended not to improve as much as workloads where the concurrency level exactly matched the number of aliases available or the read percentage was high. Also, modern storage subsystems, such as the IBM DS8100, tended to do better with PAV than IBM's older offerings.

Measurement Environment

IO3390 Workload

Our exerciser IO3390 is a CMS application that uses Start Subchannel (SSCH) to perform random one-block I/Os to an 83-cylinder minidisk formatted at 4 KB block size. The random block numbers are drawn from a uniform distribution over [0..size_of_minidisk-1].
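
To make the exerciser's behavior concrete, here is a minimal sketch of its block-selection logic in Python. The 3390 geometry arithmetic (15 tracks per cylinder, 12 4 KB blocks per track) is standard; the function names are our own, and the real IO3390 issues its I/O with SSCH from CMS, which we do not model here.

    import random

    CYLINDERS = 83          # size of each test minidisk
    TRACKS_PER_CYL = 15     # 3390 geometry
    BLOCKS_PER_TRACK = 12   # 4 KB blocks per 3390 track
    SIZE_OF_MINIDISK = CYLINDERS * TRACKS_PER_CYL * BLOCKS_PER_TRACK  # 14940 blocks

    def next_block():
        """Draw a block number uniformly from [0..size_of_minidisk-1]."""
        return random.randint(0, SIZE_OF_MINIDISK - 1)

    def next_op(read_fraction):
        """Choose read or write according to the run's read percentage."""
        return "read" if random.random() < read_fraction else "write"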

We organized the IO3390 machines' minidisks onto real volumes so that as we logged on additional virtual machines, we added load to the real volumes equally. For example, with eight virtual machines running, we had one IO3390 instance assigned to each real volume. With sixteen virtual machines we had two IO3390s per real volume. Using this scheme, we ran 1, 2, 3, 4, 5, 10, and 20 IO3390s per volume.
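
The placement scheme amounts to dealing workers out to the volumes round-robin; a one-line sketch, with the zero-based numbering being our own illustration:

    VOLUMES = 8

    def volume_for_worker(n):
        """Worker n (numbered from zero) exercises volume n mod 8, so
        logging on workers in order loads the eight volumes evenly."""
        return n % VOLUMES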

For each number of concurrent IO3390 instances per volume, we varied the aliases per volume in the range [0..4].

For each combination of number of IO3390s and number of aliases, we tried four different I/O mixes: 0% reads, 33% reads, 66% reads, and 100% reads.

Each IO3390 agent is a uniprocessor CMS virtual machine with 24 MB of virtual storage.

System Configuration

Processor: 2084-C24, model-capacity indicator 322, 2 GB central, 2 GB XSTORE, 2 dedicated processors. Two 3390-3 paging volumes.

IBM TotalStorage ESS F20 (2105-F20) DASD: 2105-F20, 16 GB cache. Two 1 Gb FICON chpids leading to a FICON switch, then two 1 Gb FICON chpids from the switch to the 2105. Four 3390-3 volumes in one LSS and four 3390-3 volumes in a second LSS. Four aliases defined for each volume.

IBM TotalStorage DS8100 (2107-921) DASD: 2107-921, 32 GB cache. Four 1 Gb FICON chpids leading to a FICON switch, then four 1 Gb FICON chpids from the switch to the 2107. Eight 3390-3 volumes in a single LSS. Four aliases defined for each volume.

IBM TotalStorage DS6800 (1750-511) DASD: 1750-511, 4 GB cache. Two 1 Gb FICON chpids leading to a FICON switch, then two 1 Gb FICON chpids from the switch to the 1750. Eight 3390-3 volumes in a single LSS. Four aliases defined for each volume.

With these configurations, each of our eight real volumes has up to four aliases the z/VM Control Program can use to parallelize I/O. By using CP VARY OFF to shut off some of the aliases, we can control the amount of parallelism available for each volume.

We ran all measurements with z/VM 5.2.0 plus APAR VM63855, with CP SET MDCACHE SYSTEM OFF in effect.

Metrics

For each experiment, we measured I/O rate, I/O service time, and I/O response time.

I/O rate is the rate at which I/Os are completing at a volume. For example, a volume might experience an I/O rate of 20 I/Os per second. As long as the size of the I/Os remains constant, using PAV to achieve a higher I/O rate for a volume is a performance improvement, because we move more data each second.

For a PAV volume, we assess the I/O rate for the volume by summing the I/O rates of the device numbers mapping the volume. For example, if the base device number experiences 50/sec and each of three alias devices experiences 15/sec, the volume experiences 95/sec. This summing is how we compute a volume's I/O rate throughout this article.

I/O service time is the amount of time it takes for the DASD subsystem to perform the requested operation, once the host system starts the I/O. Factors influencing I/O service time include channel speed, load on the DASD subsystem, amount of data being moved in the I/O, whether the I/O is a read or a write, and the presence or availability of cache memory in the controller, just to name a few.

For a PAV volume, we measure the I/O service time for the volume by computing the average I/O service time for the device numbers mapping the volume. The calculation takes into account the I/O rate at each device number and the I/O service time incurred at each device number, so as to form an estimate (aka expected value) of the I/O service time a hypothetical I/O to the volume would incur. For example, if the base device is doing 100/sec with service time 5 msec, and the lone alias is doing 50/sec with service time 7 msec, the I/O service time for the volume is calculated to be (100*5 + 50*7) / 150, or 5.7 msec.
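
Putting the two volume-level calculations together, here is a minimal sketch in Python; the exposure counts and figures are the hypothetical ones from the examples above:

    def volume_io_rate(rates):
        """Volume I/O rate: sum over the exposures (base plus aliases)."""
        return sum(rates)

    def volume_service_time(rates, stimes):
        """Rate-weighted average service time over the exposures."""
        return sum(r * s for r, s in zip(rates, stimes)) / sum(rates)

    rates = [100.0, 50.0]    # base doing 100/sec, one alias doing 50/sec
    stimes = [5.0, 7.0]      # service time (msec) at each exposure
    print(volume_io_rate(rates))               # 150.0 I/Os per second
    print(volume_service_time(rates, stimes))  # (100*5 + 50*7)/150, about 5.7 msec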

I/O response time is the total amount of time a guest virtual machine perceives it takes to do an I/O to its minidisk. This comprises I/O service time, explained previously, plus wait time. As a real device becomes busy, guest I/O operations destined for that real volume wait a little while in the real volume's I/O wait queue before they start. Time spent in the wait queue, called I/O wait time, is added to the I/O service time so as to produce the value called I/O response time.

For a PAV volume owned by SYSTEM, I/Os queued to a volume spend their waiting time queued on the base device number. When the I/O gets to the front of the line, it is pulled off the queue by the first device (base or one of its aliases) that becomes free. For a PAV volume, then, I/O response time is equal to the wait time spent in the base device queue plus the expected value of the I/O service time for the volume, the calculation of which was explained previously.

Of these three metrics, the most interesting ones from an application performance perspective are I/O rate and I/O response time. Changes in I/O service time, while indicative of storage server performance, are not too important to the application as long as they do not cause increases in I/O response time.

We ran each configuration for ten minutes, with CP Monitor set to emit sample records at one-minute intervals. To calculate average performance of a volume over the ten-minute interval, we threw away the first minute's and the last minute's values (so as to discard samples possibly affected by the run's startup and shutdown behaviors) and then averaged the remaining eight minutes' worth of samples. We used Performance Toolkit's interim FCX168 reports as the raw input for our calculations.
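
The data reduction is simple enough to sketch; the sample values here are hypothetical:

    def reduce_run(samples):
        """Average one metric over a run: ten one-minute samples in,
        first and last discarded, middle eight averaged."""
        if len(samples) != 10:
            raise ValueError("expected ten one-minute samples")
        middle = samples[1:-1]   # drop the startup and shutdown minutes
        return sum(middle) / len(middle)

    iort_msec = [4.1, 3.5, 3.4, 3.5, 3.6, 3.4, 3.5, 3.5, 3.4, 2.9]
    print(reduce_run(iort_msec))   # 3.475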

Tabulated Results

The cells in the tables below state the average values of the three I/O metrics over the eight volumes being exercised.

IBM TotalStorage ESS F20 (2105)

IO3390 4K, 0% reads, Suite K411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       282.18    281.90    282.60    282.65    281.83
            iost        3.46      3.48      3.45      3.46      3.47
            iort        3.46      3.48      3.45      3.46      3.47
2           ior       159.55    159.31    159.65    159.41    159.98
            iost        6.24     12.47     12.46     12.48     12.45
            iort       12.16     12.47     12.46     12.48     12.45
3           ior       138.63    138.35    138.54    138.63    138.22
            iost        7.18     14.42     21.56     21.56     21.64
            iort       21.21     21.21     21.56     21.56     21.64
4           ior       133.50    133.17    133.64    133.90    133.56
            iost        7.47     14.99     22.40     29.76     29.87
            iort       29.46     29.56     29.36     29.76     29.87
5           ior       131.93    132.64    132.99    132.15    132.41
            iost        7.56     15.05     22.50     30.20     37.59
            iort       37.31     37.16     37.01     37.39     37.59
10          ior       123.68    125.93    131.49    131.14    130.38
            iost        8.25     15.96     22.76     30.43     38.23
            iort       79.98     78.73     75.49     75.71     76.25
20          ior       123.04    124.97    127.02    125.83    126.10
            iost        8.12     15.98     23.56     31.71     39.53
            iort      161.94    159.46    156.76    158.34    158.03

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2105-F20, 16 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 33% reads, Suite L411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       361.50    366.49    360.53    360.74    361.94
            iost        2.70      2.65      2.69      2.69      2.68
            iort        2.70      2.65      2.69      2.69      2.68
2           ior       237.59    237.64    237.89    237.37    238.06
            iost        4.17      8.42      8.41      8.44      8.41
            iort        8.06      8.42      8.41      8.44      8.41
3           ior       205.52    206.21    205.64    205.62    205.71
            iost        4.81      9.64     14.49     14.50     14.50
            iort       14.20     14.12     14.49     14.50     14.50
4           ior       199.29    198.59    199.55    198.99    198.45
            iost        4.97     10.02     14.97     19.98     20.06
            iort       19.65     19.73     19.64     19.98     20.06
5           ior       197.91    196.34    197.74    197.82    196.63
            iost        5.00     10.14     15.14     20.13     25.29
            iort       24.85     25.04     24.86     24.81     25.29
10          ior       195.79    191.78    195.03    195.58    189.62
            iost        5.10     10.41     15.36     20.41     26.42
            iort       50.69     51.70     50.88     50.65     52.52
20          ior       189.59    189.75    187.96    188.96    149.15
            iost        5.27     10.53     15.94     21.14     21.78
            iort      105.01    105.01    105.98    105.45     89.40

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2105-F20, 16 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 66% reads, Suite M411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       559.96    574.78    572.96    559.90    568.88
            iost        1.77      1.72      1.71      1.77      1.73
            iort        1.77      1.72      1.71      1.77      1.73
2           ior       464.41    464.27    465.49    465.51    466.16
            iost        2.17      4.24      4.23      4.23      4.23
            iort        4.11      4.24      4.23      4.23      4.23
3           ior       403.43    402.61    402.89    402.18    403.46
            iost        2.49      4.95      7.40      7.40      7.39
            iort        7.22      7.19      7.40      7.40      7.39
4           ior       389.54    388.27    388.64    388.94    388.97
            iost        2.58      5.14      7.71     10.23     10.23
            iort       10.04     10.06     10.06     10.23     10.23
5           ior       387.09    387.98    386.14    385.67    384.91
            iost        2.60      5.14      7.76     10.36     12.93
            iort       12.70     12.62     12.70     12.70     12.93
10          ior       386.50    381.45    383.04    383.08    380.56
            iost        2.56      5.23      7.82     10.43     13.12
            iort       25.62     25.96     25.84     25.86     26.05
20          ior       376.91    356.04    372.14    370.56    370.46
            iost        2.64      5.71      8.04     10.77     13.47
            iort       52.79     55.75     53.46     53.69     53.73

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2105-F20, 16 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 100% reads, Suite N411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       898.79    899.90    899.40    899.69    898.47
            iost        1.05      1.03      1.03      1.04      1.05
            iort        1.05      1.03      1.03      1.04      1.05
2           ior       916.39   1023.45   1023.28   1023.46   1023.17
            iost        1.08      1.93      1.95      1.94      1.96
            iort        2.06      1.93      1.95      1.94      1.96
3           ior       918.34   1022.59   1023.95   1023.74   1023.48
            iost        1.08      2.00      2.90      2.90      2.90
            iort        3.14      2.87      2.90      2.90      2.90
4           ior       916.25   1021.84   1022.39   1011.00   1011.09
            iost        1.08      2.00      2.90      3.90      3.90
            iort        4.24      3.85      3.78      3.90      3.90
5           ior       916.41   1022.07   1023.18   1010.14   1006.32
            iost        1.10      2.00      2.90      3.90      4.90
            iort        5.35      4.82      4.77      4.78      4.90
10          ior       915.96   1019.32   1019.66   1007.75   1003.11
            iost        1.09      2.00      2.90      3.90      4.90
            iort       10.81      9.74      9.68      9.74      9.80
20          ior       911.61   1016.30   1017.43   1005.77   1002.09
            iost        1.10      2.00      2.90      3.90      4.91
            iort       21.82     19.58     19.49     19.68     19.77

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2105-F20, 16 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IBM TotalStorage DS8100 (2107)

IO3390 4K, 0% reads, Suite O411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       557.79    571.71    564.97    568.99    572.72
            iost        1.78      1.76      1.76      1.74      1.74
            iort        1.78      1.76      1.76      1.74      1.74
2           ior       589.14   1283.96   1271.95   1293.98   1281.78
            iost        1.78      1.55      1.55      1.52      1.54
            iort        3.33      1.55      1.55      1.52      1.54
3           ior       538.08   1070.25   1306.80   1313.59   1318.88
            iost        1.55      1.91      2.23      2.22      2.22
            iort        4.55      2.75      2.23      2.22      2.22
4           ior       607.13    903.36    958.98    957.63    969.61
            iost        1.68      2.23      3.13      4.11      4.06
            iort        6.48      4.34      4.08      4.12      4.06
5           ior       640.50    748.93    767.05    759.23    757.54
            iost        1.60      2.69      3.90      5.25      6.50
            iort        7.70      6.57      6.42      6.45      6.55
10          ior       401.39    396.25    390.16    396.46    399.00
            iost        3.40      5.07      7.69     10.08     12.49
            iort       25.65     25.07     25.44     25.10     24.88
20          ior       279.17    263.96    255.56    266.08    256.57
            iost        3.77      7.66     11.86     15.09     19.65
            iort       71.51     75.54     78.02     74.82     77.83

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2107-921, 32 GB cache, 4 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 33% reads, Suite P411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       716.52    700.03    697.48    713.63    695.38
            iost        1.36      1.41      1.42      1.36      1.41
            iort        1.36      1.41      1.42      1.36      1.41
2           ior       643.95   1346.79   1312.92   1360.22   1341.15
            iost        1.59      1.47      1.50      1.44      1.45
            iort        3.00      1.47      1.50      1.44      1.45
3           ior       711.50   1168.07   1870.02   1857.06   1868.13
            iost        1.45      1.72      1.54      1.55      1.55
            iort        4.13      2.49      1.54      1.55      1.55
4           ior       600.47   1169.05   1391.02   1455.05   1445.03
            iost        1.42      1.72      2.12      2.74      2.74
            iort        5.59      3.34      2.78      2.74      2.75
5           ior       690.64   1065.95   1128.43   1138.34   1141.86
            iost        1.48      1.88      2.63      3.46      4.35
            iort        7.13      4.60      4.32      4.26      4.37
10          ior       529.96    590.32    580.55    596.66    585.64
            iost        1.60      3.39      5.14      6.66      8.49
            iort       16.35     16.80     17.09     16.60     16.95
20          ior       390.48    396.70    390.43    403.78    382.22
            iost        2.59      5.09      7.75      9.93     13.19
            iort       51.03     50.25     51.02     49.28     52.21

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2107-921, 32 GB cache, 4 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 66% reads, Suite Q411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       906.39    913.47    916.39    917.06    928.57
            iost        1.05      1.05      1.03      1.04      1.02
            iort        1.05      1.05      1.03      1.04      1.02
2           ior       832.63   1427.73   1408.68   1423.42   1431.67
            iost        1.19      1.37      1.37      1.37      1.36
            iort        2.28      1.37      1.37      1.37      1.36
3           ior       787.54   1265.54   2028.43   2010.89   2020.28
            iost        1.27      1.59      1.46      1.48      1.47
            iort        3.69      2.29      1.46      1.48      1.47
4           ior       803.98   1178.73   1682.16   2498.16   2510.04
            iost        1.27      1.69      1.73      1.55      1.53
            iort        4.87      3.30      2.26      1.55      1.54
5           ior       736.06   1183.04   1689.36   2154.82   2243.75
            iost        1.15      1.67      1.73      1.85      2.15
            iort        5.75      4.11      2.85      2.25      2.16
10          ior       903.84   1128.23   1157.41   1155.38    984.05
            iost        1.11      1.75      2.56      3.39      4.06
            iort       10.95      8.75      8.51      8.51      8.39
20          ior       784.86    751.38    789.00    760.05    771.30
            iost        1.27      2.66      3.77      5.29      6.49
            iort       25.35     26.46     25.16     26.19     25.78

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2107-921, 32 GB cache, 4 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 100% reads, Suite R411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior      1981.82   1979.01   1973.69   1992.94   1982.42
            iost        0.49      0.50      0.49      0.49      0.50
            iort        0.49      0.50      0.49      0.49      0.50
2           ior      2033.59   3030.20   3038.26   3020.83   3028.31
            iost        0.50      0.61      0.61      0.61      0.62
            iort        0.93      0.61      0.61      0.61      0.62
3           ior      2035.20   3102.04   3802.91   3768.40   3799.37
            iost        0.50      0.60      0.70      0.73      0.70
            iort        1.42      0.87      0.70      0.73      0.70
4           ior      2030.54   3078.61   3845.68   4255.41   4258.65
            iost        0.50      0.64      0.70      0.90      0.90
            iort        1.92      1.23      0.91      0.90      0.90
5           ior      2036.59   3053.20   3852.45   4298.41   4521.93
            iost        0.50      0.66      0.74      0.90      1.00
            iort        2.40      1.59      1.20      1.08      1.00
10          ior      2039.14   3024.77   3843.86   4301.13   4566.20
            iost        0.50      0.70      0.76      0.90      1.00
            iort        4.85      3.29      2.53      2.23      2.04
20          ior      2032.16   3048.54   3837.41   3767.62   4562.11
            iost        0.50      0.68      0.77      0.85      1.00
            iort        9.77      6.52      5.14      4.52      4.22

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 2107-921, 32 GB cache, 4 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IBM TotalStorage DS6800 (1750)

IO3390 4K, 0% reads, Suite S411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       124.59    124.35    123.98    125.73    123.88
            iost        7.99      8.00      8.03      7.91      8.04
            iort        7.99      8.00      8.03      7.91      8.04
2           ior       105.12     92.64     93.11     92.29     92.59
            iost        9.55     21.58     21.49     21.68     21.61
            iort       18.66     21.58     21.49     21.68     21.61
3           ior        94.05     84.37     84.00     83.82     83.64
            iost       11.25     23.69     35.60     35.72     35.75
            iort       32.19     35.20     35.60     35.72     35.75
4           ior        74.04     81.68     81.05     81.38     80.94
            iost       13.53     24.41     36.88     48.88     49.12
            iort       53.47     48.52     48.96     48.90     49.12
5           ior        70.07     81.09     79.20     79.89     79.52
            iost       14.29     24.60     37.74     49.78     62.21
            iort       70.91     61.31     62.77     62.18     62.36
10          ior        64.71     78.13     77.65     77.47     77.68
            iost       15.47     25.57     38.47     51.30     63.65
            iort      154.16    127.46    128.41    128.61    128.23
20          ior        61.37     76.92     77.11     76.59     76.26
            iost       16.30     25.96     38.74     51.80     64.91
            iort      325.46    259.48    258.90    260.54    261.67

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 1750-511, 4 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 33% reads, Suite T411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       189.57    187.74    186.79    187.52    187.67
            iost        5.22      5.27      5.29      5.27      5.27
            iort        5.22      5.27      5.29      5.27      5.27
2           ior       171.32    138.61    138.69    138.37    138.03
            iost        5.82     14.41     14.41     14.46     14.47
            iort       11.38     14.41     14.41     14.46     14.47
3           ior       100.65    123.77    118.28    118.84    118.63
            iost        9.94     16.12     25.26     25.14     25.20
            iort       29.48     23.88     25.26     25.14     25.20
4           ior        87.79    114.66    113.12    113.85    112.65
            iost       11.41     17.41     26.44     34.95     35.33
            iort       45.13     34.58     35.03     34.95     35.33
5           ior        81.54    108.62    111.77    109.85    109.94
            iost       12.27     18.36     26.74     36.23     45.07
            iort       60.93     45.64     44.32     45.14     45.12
10          ior        74.21     98.14    106.29    105.58    105.54
            iost       13.49     20.33     28.11     37.69     47.00
            iort      134.31    101.45     93.69     94.50     94.38
20          ior        70.52     94.85    103.93    103.25    103.46
            iost       14.22     21.06     28.77     38.41     47.90
            iort      283.22    210.32    192.05    193.23    192.82

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 1750-511, 4 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 66% reads, Suite U411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       375.39    376.21    374.62    375.01    376.18
            iost        2.60      2.59      2.60      2.59      2.58
            iort        2.60      2.59      2.60      2.59      2.58
2           ior       350.75    312.01    313.63    313.47    312.67
            iost        2.84      6.36      6.34      6.33      6.35
            iort        5.54      6.36      6.34      6.33      6.35
3           ior       123.57    198.30    284.12    284.59    282.09
            iost        8.12     10.94     10.50     10.48     10.56
            iort       23.87     15.69     10.50     10.48     10.56
4           ior       107.14    148.86    181.33    181.10    181.52
            iost        9.35     13.41     16.95     21.96     21.93
            iort       36.92     26.49     21.34     21.96     21.93
5           ior        99.76    137.98    158.92    173.53    170.09
            iost       10.05     14.47     18.80     22.95     29.18
            iort       49.75     35.90     31.13     28.53     29.19
10          ior        88.47    125.57    141.51    156.22    154.40
            iost       11.34     15.90     21.13     25.49     32.20
            iort      112.60     79.22     70.32     63.68     64.48
20          ior        83.60    117.53    136.74    149.32    149.33
            iost       11.99     17.00     21.86     26.64     33.30
            iort      238.79    169.70    145.84    133.63    133.51

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 1750-511, 4 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

IO3390 4K, 100% reads, Suite V411

Workers                        Aliases per volume
per volume  Metric         0         1         2         3         4
1           ior       889.96    880.92    877.05    883.99    862.92
            iost        1.09      1.07      1.08      1.07      1.11
            iort        1.09      1.07      1.08      1.07      1.11
2           ior       602.93    774.32    755.17    763.51    747.06
            iost        1.76      2.52      2.59      2.55      2.60
            iort        3.25      2.52      2.59      2.55      2.60
3           ior       166.82    269.46    346.04    334.15    336.34
            iost        6.01      7.40      8.59      8.92      8.87
            iort       17.62     10.81      8.59      8.92      8.87
4           ior       142.04    221.17    274.77    323.06    315.78
            iost        7.07      9.05     10.89     12.29     12.58
            iort       27.83     17.81     14.24     12.29     12.58
5           ior       128.05    203.27    249.85    284.71    284.98
            iost        7.83      9.84     11.98     13.99     17.45
            iort       38.66     24.26     19.70     17.24     17.45
10          ior       108.90    171.49    212.76    238.77    240.59
            iost        9.22     11.66     14.08     16.68     20.70
            iort       91.34     57.90     46.64     41.53     41.20
20          ior       100.21    159.64    198.28    223.60    221.90
            iost       10.01     12.53     15.11     17.82     22.43
            iort      199.04    124.81    100.35     88.95     89.65

Note: 2084-324, 2 dedicated processors, 2 GB central, 2 GB XSTORE. 1750-511, 4 GB cache, 2 FICON chpids. z/VM 5.2.0 + PAV SPE. ior is I/O rate (/sec). iost is I/O service time (msec). iort is I/O response time (msec).

Discussion

Expectations

In general, we expected that as we added aliases to a configuration, we would experience improvement in one or more I/O metrics, provided enough workload existed to exploit the aliases, and provided no other bottleneck limited the workload. For example, with only one IO3390 worker per volume, we would not expect adding aliases to help anything. However, as we increase IO3390 workers per volume, we would expect adding aliases to help matters, if the configuration is not otherwise limited. We also expected that adding aliases would help only up to the workload's ability to drive I/Os concurrently. For example, with only three workers per volume, we would not expect four exposures (one base plus three aliases) to perform better than three exposures.

We also expected that when the number of exposures was greater than or equal to the concurrency level, I/O response time would equal I/O service time. In other words, in such configurations, we expected device wait queues to disappear.
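
A simple closed-loop model captures both expectations. Each worker loops on synchronous I/O, so by Little's law the number of workers N equals the volume's I/O rate times its I/O response time. With E exposures (one base plus the aliases) and no other bottleneck, all E exposures stay busy while N > E, so response time is about N/E times service time; once E >= N, response time falls to service time. A sketch in Python, under those simplifying assumptions:

    def expected_iort(n_workers, n_exposures, iost_msec):
        """Idealized response time: equal to service time once exposures
        cover the workers; otherwise rate = E/iost, so response = N*iost/E."""
        if n_exposures >= n_workers:
            return iost_msec
        return n_workers * iost_msec / n_exposures

    # Three workers on a base-only volume with 7.18 msec service time:
    # the model predicts about 21.5 msec, close to the 21.21 msec measured
    # in the 2105 0%-read suite.
    print(expected_iort(3, 1, 7.18))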

IBM ESS F20 (2105)

For the 2105, we saw that adding aliases did not appreciably change I/O rate or I/O response time. The 100%-read workloads were the exception: for those runs, adding aliases did improve I/O rate and I/O response time. However, there was little improvement beyond the first alias; that is, two or more aliases offered about the same performance as one.

We also noticed that as we added aliases to a workload, I/O service time increased. However, we almost always saw offsetting reductions in wait time, so I/O response time remained about flat. We believe this suggests that this workload drives the 2105 intensely enough that some bottleneck within it comes into play. Because adding aliases did not increase I/O rate or decrease I/O response time, we believe all we did was move the I/O queueing from z/VM to inside the 2105. To investigate this suspicion, we spot-checked the components of I/O service time (pending time, disconnect time, and connect time) for some configurations. Generally we found that the increases in I/O service time were due to increases in disconnect time, which again suggests queueing inside the 2105. We did not check every case, nor did we tabulate our findings.

IBM DS8100 (2107)

For the 2107, we saw that adding aliases definitely caused improvements in I/O rate and I/O response time. In some cases, the improvements were dramatic.

As with the 2105, we saw that adding aliases to a workload on the 2107 tended to increase I/O service time. However, on the 2107 the increase in service time was more than offset by a decrease in wait time, so I/O response time decreased. This was true in all but the most extreme workloads (100% writes or large numbers of users). In those extreme cases, we believe we hit a 2107 limit, just as we did in most of the 2105 runs.

IBM DS6800 (1750)

The 1750, like the 2107, showed improvements in many workloads as we added aliases. However, the 1750 struggled with the 0%-read workload, and it did not do well with small numbers of users per volume. As workers per volume increased and as the fraction of reads increased, the effect of PAV became noticeable and positive.

Conclusions

For the DS8100 and the DS6800, we can recommend PAV when the workload contains enough concurrency, especially for workloads that are not 100% writes. We expect customers to see decreases in I/O response time and increases in I/O rate per volume. Exact results will depend heavily on the customer's workload.

For the ESS F20, we can recommend PAV only when the customer's workload has a high read percentage. For low and moderate read percentages, neither I/O rate nor I/O response time improves as we add aliases.

Workloads that might benefit from adding PAV aliases are characterized by I/O response time being greater than I/O service time -- in other words, a wait queue forming. Customers considering adding PAV aliases can add an alias or two to volumes showing this trait. A second measurement will confirm whether I/O rate or I/O response time improved.
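
As a sketch of that check, in Python; the 5% tolerance is our own arbitrary illustration of "greater than", and the sample figures are taken from the 2105 0%-read table above:

    # Sketch: does a volume show queueing, so that PAV aliases might help?
    def might_benefit_from_pav(iort_msec, iost_msec, tolerance=0.05):
        """True when I/O response time exceeds I/O service time, that is,
        when I/Os are spending time in the volume's wait queue."""
        return iort_msec > iost_msec * (1.0 + tolerance)

    print(might_benefit_from_pav(12.16, 6.24))  # True: queueing; aliases might help
    print(might_benefit_from_pav(3.46, 3.46))   # False: no queue to relieve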

We do not recommend adding PAV aliases past the point where the wait queue disappears.

A guest that does its own I/O scheduling, such as Linux or z/OS, might be maintaining device wait queues on its own. Such queues would be invisible to z/VM and to performance management products that consider only CP real device I/O. If your analysis of what's happening inside your guest shows you that wait queues are forming inside your guest, you might consider exploring whether your guest can exploit PAV (sometimes we call this being PAV-aware). If it does, you can use the new z/VM minidisk PAV support to give your guest more than one virtual I/O device number for the minidisk on which the guest is doing its own queueing. We did not do any measurements of such configurations, but we would expect to see some queueing relief like what we observed in the configurations we measured.
