Dump Improvements
Abstract
With APAR VM65989 for z/VM 6.4, dumping was changed so that the system operator can reduce the size of the dump by omitting data that is usually extraneous. z/VM 7.1 further reduces the size of the dump by dumping the map of real memory in a more efficient manner. z/VM 7.1 also includes CPU efficiency changes that let the dump complete in less CPU time.
In our experiments with a workload running in a 1024 GB LPAR, exploiting all improvements resulted in a 97% reduction in dump elapsed time and a 99% reduction in dump size, compared to exploiting none of them. Customers' results will vary according to system configuration and workload.
Introduction
In PTF UM35132 for APAR VM65989, z/VM 6.4 was changed to use a more efficient channel program for writing dumps. The change in the structure of the channel program improved the I/O performance of dumps. That change is described in a separate article.
Further dump improvements include more than just channel program optimization.
- PTF UM35132 also includes a new SNAPDUMP and SET DUMP operand, PGMBKS NONE, that lets the system operator omit page management blocks (PGMBKs) from the dump. PGMBKs, CP data structures that map guest real storage, are seldom useful in diagnosis. Omitting them from the dump decreases dump time and decreases dump size, usually without compromising the usefulness of the dump.
- z/VM 7.1 includes a new SNAPDUMP and SET DUMP operand, FRMTBL NO, that lets the system operator use an alternate method for dumping the information contained in the real frame table. Instead of writing the frame table itself, the new method writes a new data structure, the correlation table. The correlation table expresses the same information as the real frame table, but it is much smaller and so it can be dumped in less space and in less time.
- z/VM 7.1 also reduces the amount of CPU time required for calculating what to dump. It does this by using optimized subroutines and by using Prefetch Data (PFD) to have the processor prefetch real frame table rows, so that by the time they are needed they are already in cache. These changes are always in play; there is no command operand to invoke them. This fix is not in the z/VM 7.1 base but rather is found in APAR VM66176, available concurrently with the GA of z/VM 7.1.
This article describes the effects of all those enhancements.
Method
A workload was devised to populate storage. The workload was built in such a way that it would populate storage in about the same fashion each time it was run. A snap dump was then taken. After the snap dump was taken, the dump was loaded from spool onto minidisk. The loaded file was then analyzed to calculate how many 4 KB records were written during the dump, how much elapsed time was used, and how much CPU time was used.
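Because the dump is measured in 4 KB records, a record count converts directly to a dump size. The sketch below shows that arithmetic. It assumes "4 KB" means 4096 bytes, and the helper names `dump_size_gib` and `write_rate_mib_per_sec` are illustrative, not part of any z/VM tool. The sample figures are those of run PRB00007 in Table 1.

```python
# Assumption: one dump record is 4 KB = 4096 bytes.
RECORD_BYTES = 4 * 1024


def dump_size_gib(records: int) -> float:
    """Convert a count of 4 KB dump records to GiB."""
    return records * RECORD_BYTES / 2**30


def write_rate_mib_per_sec(records: int, elapsed_sec: float) -> float:
    """Average rate over the whole dump (elapsed time includes CPU work)."""
    return records * RECORD_BYTES / 2**20 / elapsed_sec


if __name__ == "__main__":
    # Run PRB00007 from Table 1: 6102214 records in 270.0 seconds.
    records, elapsed = 6102214, 270.0
    print(f"dump size: {dump_size_gib(records):.1f} GiB")
    print(f"average rate: {write_rate_mib_per_sec(records, elapsed):.0f} MiB/s")
```

The rate is only an overall average, because the elapsed time covers both I/O and the CPU time spent deciding what to dump.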
Dumps were done using z/VM 6.4 plus VM65989 and also using z/VM 7.1, exploiting increasing levels of the enhancements. To suppress PGMBKs, the SNAPDUMP operand PGMBKS NONE was used. To dump a correlation table instead of the frame table, the SNAPDUMP operand FRMTBL NO was used.
Results and Discussion
Effects of PGMBK Omission and Correlation Table
Table 1 shows the effects of PGMBK suppression and correlation table exploitation on dump size and dump time.
Table 1. Dump Performance, z/VM 7.1 back to z/VM 6.4+VM65989
Run ID | PRB00007 | PRB00006 | PRB00010 | PRB00017 |
Memory, GB | 512 | 512 | 512 | 512 |
z/VM level | VM65989 | VM65989 | 7.1- | 7.1- |
Kind of dump | SNAPDUMP | SNAPDUMP | SNAPDUMP | SNAPDUMP |
PGMBKs | ALL | none | none | none |
Storage table | FRAME | FRAME | FRAME | CORR |
Recs dumped | 6102214 | 1079539 | 1098659 | 50871 |
compared to PRB07 | | -82.3% | | |
compared to PRB10 | | | | -95% |
compared to PRB07 | | | | -99% |
Elapsed time, sec | 270.0 | 82.1 | 86.9 | 14.6 |
compared to PRB07 | | -69.6% | | |
compared to PRB10 | | | | -83% |
compared to PRB07 | | | | -95% |
Notes: 2964-NC9, four dedicated IFL cores, 512 GB central. 2412-951 with four FICON Express8S LX chpids. VM65989 is z/VM 6.4 plus VM65989. 7.1- is z/VM 7.1 of May 2018 with the CPU mitigation changes omitted.
Suppressing PGMBKs reduced the size of the dump by 82% and reduced the dump time by 70%. Changing from the frame table to the correlation table reduced the size of the dump by 95% and reduced the dump time by 83%. The effect of both changes used together was a 99% reduction in dump size and a 95% reduction in dump elapsed time.
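The percentages in Table 1 can be reproduced from the raw record counts and elapsed times. The sketch below does so. The helper `pct_change` and the dictionary layout are illustrative choices, while the numbers are taken from Table 1.

```python
def pct_change(new: float, base: float) -> float:
    """Percent change of `new` relative to `base` (negative means a reduction)."""
    return (new - base) / base * 100.0


# Figures from Table 1.
records = {"PRB00007": 6102214, "PRB00006": 1079539,
           "PRB00010": 1098659, "PRB00017": 50871}
elapsed = {"PRB00007": 270.0, "PRB00006": 82.1,
           "PRB00010": 86.9, "PRB00017": 14.6}

# (run, baseline, what the comparison isolates)
comparisons = [
    ("PRB00006", "PRB00007", "PGMBK suppression"),
    ("PRB00017", "PRB00010", "correlation table"),
    ("PRB00017", "PRB00007", "both together"),
]

if __name__ == "__main__":
    for run, base, label in comparisons:
        size = pct_change(records[run], records[base])
        time = pct_change(elapsed[run], elapsed[base])
        print(f"{label}: size {size:.1f}%, elapsed {time:.1f}%")
```

Note that the PGMBK comparison is made on z/VM 6.4 plus VM65989, while the correlation table comparison is made on the 7.1- level, so each pair of runs differs in only one setting.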
Effect of CPU Mitigation, 512 GB
Table 2 shows the effect of CPU mitigation on a 512 GB dump.
Table 2. Dump CPU Mitigation, 512 GB
Run ID | PRB00011 | PRB00012 |
Memory, GB | 512 | 512 |
z/VM level | 7.1- | 7.1 |
Kind of dump | SNAPDUMP | SNAPDUMP |
PGMBKs | none | none |
Storage table | CORR | CORR |
CPU mitigation | no | yes |
Recs dumped | 50916 | 52223 |
Elapsed time, sec | 14.5 | 8.6 |
compared to PRB11 | | -41% |
Notes: 2964-NC9, four dedicated IFL cores, 512 GB central. 2412-951 with four FICON Express8S LX chpids. 7.1- is z/VM 7.1 of May 2018 with the CPU mitigation changes omitted. 7.1 is z/VM 7.1 of May 2018 with the complete dump enhancements line item.
In our 512 GB measurement, CPU mitigation reduced elapsed time by 41%.
Effects, 1024 GB Workload
Table 3 shows the effects of the various enhancements on our 1024 GB experiment.
Table 3. Dump Performance, 1024 GB
Run ID | PRB00016 | PRB00015 | PRB00014 | PRB00013 |
Memory, GB | 1024 | 1024 | 1024 | 1024 |
z/VM level | 7.1- | 7.1 | 7.1- | 7.1 |
Kind of dump | SNAPDUMP | SNAPDUMP | SNAPDUMP | SNAPDUMP |
PGMBKs | ALL | ALL | none | none |
Storage table | FRAME | FRAME | CORR | CORR |
CPU mitigation | no | yes | no | yes |
Recs dumped | 12185504 | 12185705 | 78452 | 79715 |
compared to PRB16 | | | | -99% |
Elapsed time, sec | 469.5 | 465.9 | 26.8 | 14.8 |
compared to PRB16 | | | | -97% |
I/O time, sec | 440.2 | 447.6 | 2.8 | 2.9 |
compared to PRB16 | | | | -99% |
CPU time, sec | 29.3 | 18.3 | 23.9 | 11.9 |
compared to PRB16 | | | | -59% |
Notes: 2964-NC9, eight dedicated IFL cores, 1024 GB central. 2412-951 with four FICON Express8S LX chpids. 7.1- is z/VM 7.1 of May 2018 with the CPU mitigation changes omitted. 7.1 is z/VM 7.1 of May 2018 with the complete dump enhancements line item.
Compared to having no enhancements in play, having all enhancements in play reduced the size of the dump by 99% and reduced the dump elapsed time by 97%.
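The Table 3 figures also suggest where the time goes in each run. In these four runs, I/O time plus CPU time matches elapsed time to within rounding. That is an observation from the numbers, not something the measurements were designed to show. Under that reading, the sketch below computes the share of elapsed time spent on CPU. The dictionary layout and the helper `cpu_share_pct` are illustrative.

```python
# Figures from Table 3, in seconds.
runs = {
    "PRB00016": {"elapsed": 469.5, "io": 440.2, "cpu": 29.3},  # ALL/FRAME, no mitigation
    "PRB00015": {"elapsed": 465.9, "io": 447.6, "cpu": 18.3},  # ALL/FRAME, mitigation
    "PRB00014": {"elapsed": 26.8, "io": 2.8, "cpu": 23.9},     # none/CORR, no mitigation
    "PRB00013": {"elapsed": 14.8, "io": 2.9, "cpu": 11.9},     # none/CORR, mitigation
}


def cpu_share_pct(run: dict) -> float:
    """CPU time as a percentage of elapsed time."""
    return run["cpu"] / run["elapsed"] * 100.0


if __name__ == "__main__":
    for run_id, r in runs.items():
        gap = r["elapsed"] - (r["io"] + r["cpu"])
        print(f"{run_id}: CPU share {cpu_share_pct(r):.0f}%, "
              f"elapsed minus (I/O + CPU) = {gap:+.1f} s")
```

With full PGMBKs and the frame table, the dump is dominated by I/O, so CPU mitigation barely changes elapsed time (469.5 versus 465.9 seconds). Once the dump is small, most of the remaining time is CPU, which is consistent with CPU mitigation cutting elapsed time from 26.8 to 14.8 seconds.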
Summary and Conclusions
In our measurement in our 1024 GB LPAR, the PGMBK suppression operand, the correlation table operand, and the CPU mitigation improvements combined to reduce dump size by 99% and dump elapsed time by 97%. Customer experience will vary by hardware configuration and by workload.