z/VM Reorder Processing
z/VM 6.3 News
This article was originally written prior to z/VM 6.3. The Reorder process no longer exists in z/VM 6.3, because that part of the memory management algorithms has been rewritten. If you are on z/VM 6.3, you no longer need to be concerned about Reorder processing. If you are not on z/VM 6.3, continue reading, but consider moving to z/VM 6.3 or newer to avoid having to tune for Reorder processing.
Background
In z/VM, each virtual machine has a list of the frames of real memory that are associated with it. This list (UFO List - User Frame Owned List) is structured to make it easy to find frames of memory with different characteristics during a process called Demand Scan. Demand Scan is most commonly started when z/VM detects that it has insufficient frames on its available lists.
Page Reorder doesn't really reorder pages; it is more accurate to think of it as reordering a set of pointers to pages. It is the process used to make sure the virtual machine frame list (UFO) is valid. Part of this process deals with the hardware reference bit, so the time it takes for Reorder to run depends on the number of resident pages of the virtual machine (each resident page is mapped to a real frame of memory). A larger virtual machine is not a problem for Reorder unless all the pages are resident.
While Page Reorder is running for a virtual machine, that virtual machine is stopped. For virtual machines with more than one virtual processor, all virtual processors are stopped during Reorder processing. All virtual machines go through Reorder at one time or another. A number of factors affect how long Reorder may take, but the very rough rule of thumb is 1 second for every 8GB of resident memory. While Reorder could occur for multiple virtual machines at the same time, it would still result in serialization of the individual virtual machines being reordered. Reorder processing does not serialize the z/VM system as a whole.
How often Reorder processing occurs depends on multiple factors. The two most significant are:
- Processor time consumed by the virtual machine. The idea here is to allow the virtual machine to run long enough to have an opportunity to reference memory and develop a footprint. Because of this, virtual machines running a CPU burner type application tend to have Reorders occur at a very regular interval.
- Overall system paging characteristics.
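The rough rule of thumb from the Background discussion (about 1 second of Reorder time for every 8GB of resident memory) can be turned into a quick estimate of how long a guest might be stopped. The following is a minimal sketch only: the function name and the guest names are hypothetical, and real Reorder times depend on other factors that the rule of thumb does not capture.

```python
# Rough rule of thumb from this article: about 1 second of Reorder time
# per 8 GB of resident memory. Actual times vary with other factors.
SECONDS_PER_GB = 1.0 / 8.0


def estimate_reorder_seconds(resident_gb: float) -> float:
    """Estimate how long a guest is stopped for one Reorder pass.

    resident_gb is the guest's resident memory in GB, not its defined size.
    """
    if resident_gb < 0:
        raise ValueError("resident_gb must be non-negative")
    return resident_gb * SECONDS_PER_GB


if __name__ == "__main__":
    # Hypothetical guests and resident sizes, for illustration only.
    for name, resident_gb in [("LINUX01", 4), ("LINUX02", 32), ("LINUX03", 128)]:
        stopped = estimate_reorder_seconds(resident_gb)
        print(f"{name}: {resident_gb} GB resident -> about {stopped:.1f} s stopped")
```

Because the estimate scales with resident pages rather than defined virtual machine size, the input should come from a report of resident pages such as UPAGE.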
Identifying a Negative Impact from Reorder
In the Performance Toolkit User State Sampling reports (USTAT FCX114 or USTATLOG FCX164), there is a field called %CFW (console function wait). High percentages here can indicate a problem. An example showing part of the FCX164 report follows:
One can look further to see how many pages are resident for the virtual machine by examining either the FCX113 UPAGE or the FCX163 UPAGELOG report. Here is an example of a portion of the UPAGE report.
For more detail, a small utility called REORDMON is available from the VM Download page. This can be run against a Monwrite data file for all virtual machines, or in a real time fashion for a particular virtual machine. It provides results of this nature:
APAR to Disable Reorder
The APAR for this problem is VM64774, which was closed in September 2010. The PTF for z/VM 5.4.0 is UM33167 and the PTF for z/VM 6.1.0 is UM33169. Please use the normal service model to ensure you get the required pre-reqs. This support is in the base of z/VM 6.2.0. These PTFs replace a previous patch. The APAR allows one to disable or enable Reorder processing for individual virtual machines or system-wide. If you are unsure whether disabling Reorder applies to your environment, please contact IBM z/VM Level 2.
The previous patch had been given to a small number of customers, who were running it in production. It was implemented differently from the PTF, through a user-defined CP command. IBM recommends that if you were using the patch, you move to the official PTF.
Impact of Disabling Page Reorder
Reorder processing has been in the system for decades and had significant value at one time. That value has diminished over time, particularly for some environments. Reorder is involved with managing the information about the characteristics of the various page frames for a virtual machine. Its most significant function is resetting the hardware reference bit, which helps z/VM memory management estimate the least recently used pages. Even without the reference bit, z/VM has some information to use in this space. In many Linux guest environments where the guests never drop to the dormant list, most of the demand scan processing occurs in later passes (i.e. Pass 2 or the Emergency Pass). In those cases, the memory management routines are very aggressive in taking pages and most often ignore some of the information that Page Reorder would provide. To date, internal measurements and customer experience have shown no visible negative effects when Reorder was disabled, particularly in comparison to the impact of Reorder delaying a guest.
It is important to have expanded storage configured for z/VM paging, and it is even more important when Reorder is disabled. While z/VM estimates LRU when taking pages from central storage (real memory), it uses timestamps to determine reorder in expanded storage. So if z/VM, with Reorder off, makes some bad choices in taking pages, having expanded storage mitigates that.
The following are some metrics from Performance Toolkit that can be monitored to check for undesirable behaviour.
- FCX114 USTAT report - look for excessive delays in %PGW and %PGA. The first field is for synchronous page faults and the second is for asynchronous page faults. Review these fields for all users. The next two metrics are system wide.
- FCX143 PAGELOG report - focus on "PGIN/second" and "Page Reads/second". Even spikes in PGIN in the 1000s should be tolerable.
- FCX254 AVAILLOG report - most importantly, check that the "times empty" column stays zero. Also check the "Scan Fail" and "Pct Emerg Scan" fields.
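The checks in the list above can be gathered into one small routine that is run against each monitoring interval. This is a sketch under stated assumptions, not IBM guidance: the class and function names are hypothetical, the values are assumed to have been transcribed from the reports by hand or by a site-specific parser, and both numeric limits are placeholders to be tuned locally. The article gives a firm criterion only for "times empty" (it should stay zero) and says PGIN spikes in the 1000s should be tolerable. "Scan Fail" and "Pct Emerg Scan" are left out because the article gives no criterion for them.

```python
from dataclasses import dataclass


@dataclass
class IntervalSample:
    """Values for one interval, transcribed from Performance Toolkit reports."""
    times_empty: int     # FCX254 AVAILLOG "times empty"
    pgin_per_sec: float  # FCX143 PAGELOG "PGIN/second"
    pct_pgw: float       # FCX114 USTAT %PGW for one user
    pct_pga: float       # FCX114 USTAT %PGA for one user


def check_sample(sample, pgin_limit=10_000.0, page_wait_limit=10.0):
    """Return a list of warning strings; an empty list means nothing was flagged.

    pgin_limit and page_wait_limit are placeholder assumptions, not values
    taken from the article or from IBM documentation.
    """
    warnings = []
    # The article's one firm criterion: "times empty" should stay zero.
    if sample.times_empty != 0:
        warnings.append(f"available list empty {sample.times_empty} time(s)")
    # PGIN spikes in the 1000s are described as tolerable, so the
    # placeholder limit sits above that range.
    if sample.pgin_per_sec > pgin_limit:
        warnings.append(f"PGIN/second {sample.pgin_per_sec:.0f} above limit")
    if sample.pct_pgw > page_wait_limit:
        warnings.append(f"%PGW {sample.pct_pgw:.1f} above limit")
    if sample.pct_pga > page_wait_limit:
        warnings.append(f"%PGA {sample.pct_pga:.1f} above limit")
    return warnings
```

Comparing the warnings from intervals before and after Reorder is disabled for a guest is one way to apply the check. The %PGW and %PGA values are per user, so the routine would be called once per user of interest.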