z/VM Reorder Processing

z/VM 6.3 News

This article was originally written prior to z/VM 6.3. The Reorder process no longer exists in z/VM 6.3 as that part of the memory management algorithms has been re-written. If you are on z/VM 6.3, you no longer need to be concerned about Reorder processing. If you are not on z/VM 6.3, continue reading, but consider moving to z/VM 6.3 or newer to avoid having to tune for reorder processing.

Background

In z/VM, each virtual machine has a list of the frames of real memory associated with it. This list, the User Frame Owned (UFO) list, is structured to make it easier to find frames with different characteristics during a process called Demand Scan. Demand Scan is most commonly started when z/VM detects that it has insufficient frames on its available lists.

Page Reorder doesn't really reorder pages; it is more accurate to think of it as reordering a set of pointers to pages. It is the process used to make sure the virtual machine's frame list (the UFO list) is valid. Part of this process deals with the hardware reference bit, so the time Reorder takes depends on the number of resident pages of the virtual machine (each resident page is mapped to a real frame of memory). A larger virtual machine is not a problem for Reorder unless its pages are resident.

While Page Reorder is running for a virtual machine, that virtual machine is stopped. For virtual machines with more than one virtual processor, all virtual processors are stopped during Reorder processing. All virtual machines go through Reorder at one time or another. A number of factors affect how long Reorder takes, but a very rough rule of thumb is 1 second for every 8GB of resident memory. Reorder can occur for multiple virtual machines at the same time, but each occurrence serializes only the individual virtual machine being reordered. Reorder processing does not serialize the z/VM system as a whole.

How often Reorder processing occurs is dependent on multiple factors. The two most significant are:

  • Processor time consumed by the virtual machine. The idea here is to allow the virtual machine to run long enough to have an opportunity to reference memory and develop a footprint. Because of this, virtual machines running a CPU burner type application tend to have Reorders occur at a very regular interval.
  • Overall system paging characteristics.

Identifying a Negative Impact from Reorder

In the Performance Toolkit User State Sampling reports (USTAT FCX114 or USTATLOG FCX164), there is a field called %CFW (console function wait). High percentages here can indicate a problem. An example showing part of the FCX164 report follows:

  Wait State Data Log for User PRDLIN01

  Interval
  End Time  %ACT %RUN %CPU %LDG %PGW %IOW %SIM %TIW %CFW
  >>Mean>>   100   33    1    0    0    0    1   64    0
  11:55:05   100   23    0    0    0    0    2   75    0
  11:56:05   100   18    2    0    0    0    3   77    0
  11:57:05   100   28    0    0    0    0    2   70    0
  11:58:05   100   58    3    0    0    0    3   35    0
  11:59:05   100   87    5    0    0    0    3    0    5
  12:00:05   100   92    7    0    0    0    2    0    0
  12:01:05   100   93    3    0    0    0    3    0    0
  12:02:05   100   62    2    0    0    0    2   35    0

In the above example, the percentage of samples in which the virtual machine is in console function wait is zero except for the interval that ends at 11:59:05, where it spikes to 5%. Note that the sampling frequency is not high enough to be exact, but it can give an indication of a possible problem.
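As a rough illustration of scanning such a log for %CFW spikes, the following Python sketch parses rows shaped like the FCX164 excerpt above. The column list is taken from the excerpt only (a real report has more columns), and the function name `cfw_spikes` and its `threshold` parameter are hypothetical, not part of Performance Toolkit.

```python
# Columns as they appear in the excerpt above; a real FCX164 report has more.
COLUMNS = ["%ACT", "%RUN", "%CPU", "%LDG", "%PGW", "%IOW", "%SIM", "%TIW", "%CFW"]

SAMPLE = """\
>>Mean>> 100 33 1 0 0 0 1 64 0
11:55:05 100 23 0 0 0 0 2 75 0
11:56:05 100 18 2 0 0 0 3 77 0
11:57:05 100 28 0 0 0 0 2 70 0
11:58:05 100 58 3 0 0 0 3 35 0
11:59:05 100 87 5 0 0 0 3 0 5
12:00:05 100 92 7 0 0 0 2 0 0
12:01:05 100 93 3 0 0 0 3 0 0
12:02:05 100 62 2 0 0 0 2 35 0
"""


def cfw_spikes(report_text, threshold=1):
    """Return (interval_end_time, %CFW) pairs where %CFW >= threshold.

    `threshold` is an illustrative value, not an IBM recommendation.
    """
    spikes = []
    for line in report_text.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the ">>Mean>>" summary row.
        if len(fields) != len(COLUMNS) + 1 or fields[0].startswith(">>"):
            continue
        values = dict(zip(COLUMNS, (int(f) for f in fields[1:])))
        if values["%CFW"] >= threshold:
            spikes.append((fields[0], values["%CFW"]))
    return spikes


if __name__ == "__main__":
    for end_time, cfw in cfw_spikes(SAMPLE):
        print(f"{end_time}: %CFW = {cfw}")
```

Because the sampling is coarse, a flagged interval is only a hint to look further, not proof that a Reorder occurred.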

One can look further to see how many pages might be resident for the virtual machine, by examining either the FCX113 UPAGE or the FCX163 UPAGELOG reports. Here is an example of a portion of the UPAGE report.

               <----------------- Number of Pages ----------------->
                             <-Resident->    <--Locked-->
  Userid       WSS  Resrvd   R<2GB   R>2GB   L<2GB   L>2GB   XSTOR    DASD
  PRDLIN01   4699k       0   56258   4643k       6     253       0       0

The fields of interest here are the resident page counts, "R<2GB" and "R>2GB", which give the number of 4KB pages resident in real memory below and above the 2GB bar. Combining these for our PRDLIN01 virtual machine gives approximately 4699258 pages, or roughly 18GB. Using the rule of thumb of 1 second per 8GB, a Reorder for this virtual machine could take about 2.25 seconds.
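The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb: the helper names are hypothetical, a trailing "k" is assumed to mean thousands (which matches the page total quoted in the text), and GB is treated as 2^30 bytes.

```python
PAGE_SIZE_BYTES = 4096        # 4KB pages, as stated in the text
GB = 2 ** 30                  # assumption: binary gigabytes
SECONDS_PER_GB = 1 / 8        # rule of thumb: about 1 second per 8GB resident


def parse_count(field):
    """Convert a report field such as '56258' or '4643k' to an integer.

    Assumes a trailing 'k' means thousands.
    """
    field = field.strip()
    if field.lower().endswith("k"):
        return int(field[:-1]) * 1000
    return int(field)


def resident_gb(r_below_2gb, r_above_2gb):
    """Total resident memory in GB from the R<2GB and R>2GB page counts."""
    pages = parse_count(r_below_2gb) + parse_count(r_above_2gb)
    return pages * PAGE_SIZE_BYTES / GB


def estimated_reorder_seconds(r_below_2gb, r_above_2gb):
    """Very rough Reorder duration estimate using the 1s-per-8GB rule of thumb."""
    return resident_gb(r_below_2gb, r_above_2gb) * SECONDS_PER_GB


if __name__ == "__main__":
    gb = resident_gb("56258", "4643k")
    secs = estimated_reorder_seconds("56258", "4643k")
    print(f"Resident: {gb:.1f} GB, estimated Reorder time: {secs:.2f} s")
```

The result is only as good as the rule of thumb itself, which the article describes as very rough.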

For more detail, a small utility called REORDMON is available from the VM Download page. This can be run against a Monwrite data file for all virtual machines, or in a real time fashion for a particular virtual machine. It provides results of this nature:

            Num. of   Average   Average
  Userid   Reorders Rsdnt(MB) Ref'd(MB) Reorder Times
  -------- -------- --------- --------- -----------------------
  PRDLIN01        2     18356     13090 15:59:05 16:15:05
  PRDLIN02        1     14277      5207 16:29:05

This shows that the data reduction found that two Reorders had occurred for PRDLIN01 and one for PRDLIN02. The average resident memory for PRDLIN01 throughout the data was 18356 MB. The time stamps given are the end times of the monitor intervals in which Reorders were found. Note that in extreme cases a Reorder may occur more than once in a monitor interval. These time stamps are in GMT, so in our example the Reorder at 15:59:05 actually corresponds to 11:59:05 as seen in the Performance Toolkit report.
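To line up REORDMON time stamps with Performance Toolkit intervals, the GMT values need to be shifted to local time. The sketch below assumes a fixed offset of four hours behind GMT, inferred only from the 15:59:05 to 11:59:05 example above; the function name and the offset constant are illustrative and should be adjusted for your own system.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is GMT-4, as implied by the example in the text.
LOCAL_OFFSET = timedelta(hours=-4)


def gmt_to_local(hhmmss, offset=LOCAL_OFFSET):
    """Convert an HH:MM:SS time stamp in GMT to local time at the given offset."""
    gmt_time = datetime.strptime(hhmmss, "%H:%M:%S").replace(tzinfo=timezone.utc)
    return gmt_time.astimezone(timezone(offset)).strftime("%H:%M:%S")


if __name__ == "__main__":
    for stamp in ("15:59:05", "16:15:05", "16:29:05"):
        print(f"{stamp} GMT -> {gmt_to_local(stamp)} local")
```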

APAR to Disable Reorder

The APAR for this problem, VM64774, was closed in September 2010. The PTF for z/VM 5.4.0 is UM33167 and the PTF for z/VM 6.1.0 is UM33169. Please use the normal service model to ensure you get the required pre-reqs. This support is in the base of z/VM 6.2.0. These PTFs replace a previous patch. The APAR allows one to disable or enable Reorder processing for individual virtual machines or system-wide. If you are unsure whether disabling Reorder is appropriate for your environment, please contact IBM z/VM Level 2.

The previous patch had been given to a small number of customers, who were running it in production. It was implemented differently from the PTF, through a user-defined CP command. IBM recommends that if you were using the patch, you move to the official PTF.

Impact of Disabling Page Reorder

Reorder processing has been in the system for decades and had significant value at one time. That value has diminished over time, particularly for some environments. Reorder is involved in managing the information about the characteristics of the various page frames for a virtual machine. Its most significant function is resetting the hardware reference bit, which helps z/VM memory management estimate which pages are least recently used. Even without the reference bit, z/VM has some information to use in this space.

In many Linux guest environments where the guests never drop to the dormant list, most demand scan processing occurs in later passes (i.e. Pass 2 or the Emergency Pass). In those cases, the memory management routines are very aggressive in taking pages and will most often ignore some of the information that Page Reorder would provide. To date, internal measurements and customer experience have shown no visible negative effects when Reorder is disabled, particularly in comparison to the impact of Reorder delaying a guest.

It is important to have expanded storage configured for z/VM paging, and even more important when Reorder is disabled. While z/VM estimates LRU when taking pages from central storage (real memory), it uses timestamps to determine order in expanded storage. So if z/VM, with Reorder off, makes some bad choices in taking pages, having expanded storage mitigates that.

The following are some metrics from Performance Toolkit that can be monitored to check for undesirable behaviour.
  • FCX114 USTAT report - look for excessive delays in %PGW and %PGA. The first field is for synchronous page faults and the second is for asynchronous page faults. You'll want to check these for all users. The next two metrics are system wide.
  • FCX143 PAGELOG report - focus on "PGIN/second" and "Page Reads/second". Even spikes in PGIN in the 1000s should be tolerable.
  • FCX254 AVAILLOG report - check most importantly that the "times empty" column stays zero. Also check "Scan Fail" and "Pct Emerg Scan" fields.
