
Dump Improvements

Abstract

In z/VM 6.4, APAR VM65989 changed dumping to let the system operator reduce the size of the dump by omitting data that is usually extraneous. z/VM 7.1 further reduces the size of the dump by dumping the map of real memory more efficiently. z/VM 7.1 also includes CPU efficiency changes that let the dump complete in less CPU time.

In our experiments with a workload running in a 1024 GB LPAR, exploiting all improvements resulted in a 97% reduction in dump elapsed time and a 99% reduction in dump size, compared to exploiting none of them. Customers' results will vary according to system configuration and workload.

Introduction

In PTF UM35132 for APAR VM65989, z/VM 6.4 was changed to use a more efficient channel program for writing dumps. The change in the structure of the channel program improved the I/O performance of dumps. That change is described in a separate article.

Further dump improvements include more than just channel program optimization.

  1. PTF UM35132 also includes a new SNAPDUMP and SET DUMP operand, PGMBKS NONE, that lets the system operator omit page management blocks (PGMBKs) from the dump. PGMBKs, CP data structures that map guest real storage, are seldom useful in diagnosis. Omitting them from the dump decreases dump time and decreases dump size, usually without compromising the usefulness of the dump.

  2. z/VM 7.1 includes a new SNAPDUMP and SET DUMP operand, FRMTBL NO, that lets the system operator use an alternate method for dumping the information contained in the real frame table. Instead of writing the frame table itself, the new method writes a new data structure, the correlation table. The correlation table expresses the same information as the real frame table, but it is much smaller and so it can be dumped in less space and in less time.

  3. z/VM 7.1 also reduces the amount of CPU time required for calculating what to dump. It does this by using optimized subroutines and by using Prefetch Data (PFD) to have the processor prefetch real frame table rows, so that they are already in cache by the time they are needed. These changes are always in effect; there is no command operand to invoke them. They are not in the z/VM 7.1 base but rather are delivered in APAR VM66176, available concurrently with the GA of z/VM 7.1.

This article describes the effects of all those enhancements.

Method

A workload was devised to populate storage. The workload was built so that it would populate storage in about the same fashion each time it was run. A snap dump was then taken and loaded from spool onto minidisk. The loaded file was then analyzed to calculate how many 4 KB records were written during the dump, how much elapsed time was used, and how much CPU time was used.
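Because each dump record is 4 KB, a record count converts directly to a dump size. The following is a minimal Python sketch of that conversion, not part of the measurement tooling. It assumes that 4 KB means 4096 bytes; the function name dump_size_gib is ours, and the two record counts are taken from Table 1 below.

```python
RECORD_BYTES = 4 * 1024  # assumption: a 4 KB dump record is 4096 bytes


def dump_size_gib(records: int) -> float:
    """Convert a count of 4 KB dump records to a size in GiB."""
    return records * RECORD_BYTES / 2**30


# Record counts for the largest and smallest dumps in Table 1.
for run_id, records in [("PRB00007", 6102214), ("PRB00017", 50871)]:
    print(f"{run_id}: {records} records = {dump_size_gib(records):.2f} GiB")
```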

Dumps were done using z/VM 6.4 plus VM65989 and also using z/VM 7.1, exploiting increasing levels of the enhancements. To suppress PGMBKs, the SNAPDUMP operand PGMBKS NONE was used. To dump a correlation table instead of the frame table, the SNAPDUMP operand FRMTBL NO was used.

Results and Discussion

Effects of PGMBK Omission and Correlation Table

Table 1 shows the effects of PGMBK suppression and correlation table exploitation on dump size and dump time.

Table 1. Dump Performance, z/VM 7.1 back to z/VM 6.4+VM65989
Run ID                   PRB00007   PRB00006   PRB00010   PRB00017
Memory, GB                    512        512        512        512
z/VM level                VM65989    VM65989       7.1-       7.1-
Kind of dump             SNAPDUMP   SNAPDUMP   SNAPDUMP   SNAPDUMP
PGMBKs                        ALL       none       none       none
Storage table               FRAME      FRAME      FRAME       CORR
Recs dumped               6102214    1079539    1098659      50871
  compared to PRB00007                -82.3%
  compared to PRB00010                                        -95%
  compared to PRB00007                                        -99%
Elapsed time, sec           270.0       82.1       86.9       14.6
  compared to PRB00007                -69.6%
  compared to PRB00010                                        -83%
  compared to PRB00007                                        -95%
Notes: 2964-NC9, four dedicated IFL cores, 512 GB central. 2412-951 with four FICON Express8S LX chpids. VM65989 is z/VM 6.4 plus VM65989. 7.1- is z/VM 7.1 of May 2018 with the CPU mitigation changes omitted.

Suppressing PGMBKs reduced the size of the dump by 82% and the dump time by 70%. With PGMBKs already suppressed, changing from the frame table to the correlation table reduced the size of the dump by a further 95% and the dump time by a further 83%. Used together, the two changes reduced dump size by 99% and dump elapsed time by 95%.
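The percentages quoted for Table 1 follow from the raw record counts and elapsed times. The following minimal Python sketch reproduces that arithmetic using only values from Table 1. The helper name pct_reduction and the dictionary layout are ours, for illustration only.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percent reduction when going from `before` to `after`."""
    return (before - after) / before * 100


# Values copied from Table 1, keyed by run ID.
recs = {"PRB00007": 6102214, "PRB00006": 1079539,
        "PRB00010": 1098659, "PRB00017": 50871}
secs = {"PRB00007": 270.0, "PRB00006": 82.1,
        "PRB00010": 86.9, "PRB00017": 14.6}

# (baseline run, comparison run, what the comparison isolates)
comparisons = [
    ("PRB00007", "PRB00006", "PGMBK suppression"),
    ("PRB00010", "PRB00017", "correlation table"),
    ("PRB00007", "PRB00017", "both together"),
]

for base, run, label in comparisons:
    size = pct_reduction(recs[base], recs[run])
    time = pct_reduction(secs[base], secs[run])
    print(f"{label}: size -{size:.1f}%, elapsed time -{time:.1f}%")
```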

Effect of CPU Mitigation, 512 GB

Table 2 shows the effect of CPU mitigation on a 512 GB dump.

Table 2. Dump CPU Mitigation, 512 GB
Run ID                   PRB00011   PRB00012
Memory, GB                    512        512
z/VM level                   7.1-        7.1
Kind of dump             SNAPDUMP   SNAPDUMP
PGMBKs                       none       none
Storage table                CORR       CORR
CPU mitigation                 no        yes
Recs dumped                 50916      52223
Elapsed time, sec            14.5        8.6
  compared to PRB00011                  -41%
Notes: 2964-NC9, four dedicated IFL cores, 512 GB central. 2412-951 with four FICON Express8S LX chpids. 7.1- is z/VM 7.1 of May 2018 with the CPU mitigation changes omitted. 7.1 is z/VM 7.1 of May 2018 with the complete dump enhancements line item.

In our 512 GB measurement, CPU mitigation reduced elapsed time by 41%.

Effects, 1024 GB Workload

Table 3 shows the effects of the various enhancements on our 1024 GB experiment.

Table 3. Dump Performance, 1024 GB
Run ID                   PRB00016   PRB00015   PRB00014   PRB00013
Memory, GB                   1024       1024       1024       1024
z/VM level                   7.1-        7.1       7.1-        7.1
Kind of dump             SNAPDUMP   SNAPDUMP   SNAPDUMP   SNAPDUMP
PGMBKs                        ALL        ALL       none       none
Storage table               FRAME      FRAME       CORR       CORR
CPU mitigation                 no        yes         no        yes
Recs dumped              12185504   12185705      78452      79715
  compared to PRB00016                                        -99%
Elapsed time, sec           469.5      465.9       26.8       14.8
  compared to PRB00016                                        -97%
I/O time, sec               440.2      447.6        2.8        2.9
  compared to PRB00016                                        -99%
CPU time, sec                29.3       18.3       23.9       11.9
  compared to PRB00016                                        -59%
Notes: 2964-NC9, eight dedicated IFL cores, 1024 GB central. 2412-951 with four FICON Express8S LX chpids. 7.1- is z/VM 7.1 of May 2018 with the CPU mitigation changes omitted. 7.1 is z/VM 7.1 of May 2018 with the complete dump enhancements line item.

Compared to having no enhancements in play, having all enhancements in play reduced the size of the dump by 99% and reduced the dump elapsed time by 97%.
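In Table 3, elapsed time is, to within rounding, the sum of I/O time and CPU time. Splitting it that way suggests why CPU mitigation has little visible effect on the full-size dumps but a large effect once PGMBK suppression and the correlation table have removed most of the I/O. The following minimal Python sketch computes the CPU share of elapsed time for each run. The values are copied from Table 3; the names runs and cpu_share are ours, for illustration only.

```python
# I/O, CPU, and elapsed seconds copied from Table 3, keyed by run ID.
runs = {
    "PRB00016": {"io": 440.2, "cpu": 29.3, "elapsed": 469.5},
    "PRB00015": {"io": 447.6, "cpu": 18.3, "elapsed": 465.9},
    "PRB00014": {"io": 2.8, "cpu": 23.9, "elapsed": 26.8},
    "PRB00013": {"io": 2.9, "cpu": 11.9, "elapsed": 14.8},
}


def cpu_share(run: dict) -> float:
    """CPU time as a percentage of elapsed time."""
    return run["cpu"] / run["elapsed"] * 100


for run_id, run in runs.items():
    # The difference between I/O + CPU and elapsed is rounding residue.
    residue = run["elapsed"] - (run["io"] + run["cpu"])
    print(f"{run_id}: CPU is {cpu_share(run):.0f}% of elapsed time "
          f"(rounding residue {residue:+.1f} sec)")
```

By this arithmetic, CPU time is a small fraction of elapsed time in the two runs that dump PGMBKs and the frame table, and the large majority of it in the two runs that do not.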

Summary and Conclusions

In our measurement in our 1024 GB LPAR, the PGMBK suppression operand, the correlation table operand, and the CPU mitigation improvements combined to reduce dump size by 99% and dump elapsed time by 97%. Customer experience will vary by hardware configuration and by workload.
