Pre-z/VM 5.2.0 64-bit Considerations
The following discussion applies to the initial 64-bit support that
was first provided by z/VM 3.1.0 and that continues, with
incremental improvements, through z/VM 5.1.0. If you are currently
running on one of these releases with more than 2GB of central storage,
this discussion can help you to determine whether your system's
performance is being affected by 2GB-line constraints and, if so, what
actions you might take while still running on these releases.
z/VM 64-bit support was greatly enhanced in z/VM 5.2.0. Migration
to z/VM 5.2.0 is highly recommended for systems that are currently
experiencing significant 2GB-line constraints. See
z/VM 5.2.0 64-bit Considerations for further information.
Understanding Use of Memory below 2GB
While z/VM has 64-bit support for virtual storage and can support more
than 2GB of central storage (real memory), there are areas of the Control Program (CP) that
are limited to 31-bit (2GB). A guest page can reside above 2GB and have
instructions in it executed or data referenced without being moved.
However, pages that require certain CP processing
must reside below 2GB in z/VM's central storage (host real memory). This
includes things such as I/O channel programs and data (both traditional
SSCH and the newer QDIO), simulation of
instructions, and locked pages (e.g., QDIO structures for real devices).
For example, if a guest is executing an I/O where the channel program
and data area reside in a guest page which is in a frame (4KB of central
storage) above 2GB, then
when CP does page translation (in order to process the virtual I/O) the
guest page is moved to a frame of central storage below 2GB. In the case
of I/O, that page is locked until the I/O completes. Once the page is
moved below 2GB host real, it remains there until it is stolen or the page is
released (pages can be released explicitly or when the guest logs off or goes
through system reset).
This can occur for both guests that
run with 31-bit and 64-bit addressing. The determining factor is not the location
of the page within guest memory, but the location within z/VM (host real)
memory. A guest that is 31-bit can have its pages above 2GB in host real
memory.
A page can be stolen as part of the CP steal processing which is started
when the number of available frames below 2GB is too low. During steal
processing, CP makes various passes in an attempt to add frames to the
available list. Each pass represents a step in the hierarchy of page
value. For example, unreferenced pages of idle users will be stolen
before pages owned by CP. There is a separate steal process for pages
above 2GB. When stolen, a page will be moved to expanded storage if
expanded storage is available; otherwise it is paged to DASD. This is
one of the reasons configuring expanded storage is recommended.
If there is significant contention on the storage below 2GB, thrashing
can occur. The paging rates to expanded storage, and potentially DASD,
can grow very large (thousands of pages per second). In this
thrashing scenario, the pages stolen from below 2GB will often be ones
that will need to be brought back below 2GB in a short period of time.
Identifying a Constraint with 2GB
The symptoms of being constrained by 2GB are high paging activity and
having storage available above 2GB. The simplest way to see this is
with the CP INDICATE LOAD and CP QUERY FRAMES commands. Additionally,
since stealing and related processing is a CP system function, you will
see an increase in System processor time.
In the example that follows, the key values to look at are listed
after the command output. Here we have a high paging rate and a large
number of frames available above 2GB, so this is likely a system
constrained on storage below 2GB.
XSTORE-013301/SEC
MDC READS-000020/SEC WRITES-000000/SEC HIT RATIO-100%
STORAGE-030% PAGING-0112/SEC STEAL-091%
Q2-00001(00000) EXPAN-002 E2-00000(00000)
Q3-00038(00000) EXPAN-002 E3-00000(00000)
PROC 0000-009% PROC 0001-008%
SYSGEN REAL USABLE OFFLINE
524287 524287 524287 000000
V=R RESNUC PAGING TRACE RIO370
000000 000735 523272 000280 000000
AVAIL PAGNUC LOCKRS LOCKCP SAVE FREE LOCKRIO
504611 011941 000902 000000 000042 005776 000000
Storage >= 2G:
Online = 1048576 Available List = 58339
Not init = 0 Offline = 0
- Indicate Load's XSTORE - rate of paging to expanded storage (13301
in our example).
- Indicate Load's PAGING - rate of paging to DASD (112 in our example).
- Indicate Load's STEAL - percentage of pages stolen from active users
  (91% in our example). A high percentage is not necessarily bad if the
  rates for XSTORE and PAGING are low.
- Query Frame's >2G Available List - the number of frames on the
available list for use above 2GB (58339 in our example).
- Query Frame's AVAIL - different from the available list count
discussed above. It is the count of pages potentially in use by
guests below 2GB (504611 in our example).
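The rule of thumb above can be sketched as a small check. This is an
illustrative sketch: the function name, thresholds, and sample values
are assumptions for demonstration, not documented CP limits; the sample
values come from the example output.

```python
# Hedged sketch: flag a likely below-2GB constraint from INDICATE LOAD
# and QUERY FRAMES values. The thresholds are illustrative assumptions,
# not documented CP limits.
def likely_below_2gb_constrained(xstore_rate, dasd_page_rate,
                                 avail_above_2g_frames,
                                 paging_threshold=500,
                                 avail_threshold=10000):
    """High paging combined with plenty of free frames above 2GB
    suggests contention on storage below the 2GB line."""
    high_paging = (xstore_rate + dasd_page_rate) > paging_threshold
    free_above = avail_above_2g_frames > avail_threshold
    return high_paging and free_above

# Values from the example: XSTORE 13301/sec, PAGING 112/sec,
# >2G available list 58339 frames.
print(likely_below_2gb_constrained(13301, 112, 58339))  # True
```

A low paging rate with a large >2G available list would return False:
in that case the system is simply not storage constrained at all.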
Additional details on processing associated with storage below 2GB
can be found in VM monitor data. Various performance products may
report on these values.
How to Project Requirements for Storage Below 2GB
There are a number of factors that impact the requirements for
storage below 2GB, and they need to be understood to determine your
own requirements. Some can be influenced by configuration and other
tuning, while others are strictly workload dependent.
V=R Area is storage that is configured for V=R and V=F guests.
This area must reside below 2GB. Since CP and its control blocks must
also reside below 2GB, the V=R Area is limited to less than 2GB.
CP LOCK Command can be used to explicitly lock a guest's pages.
When this command is used, the pages are locked below 2GB. (The
CP SET RESERVE command is a better alternative to CP LOCK).
Real QDIO Network Devices when used with a Linux guest require
certain structures to be locked below 2GB. Because of the direct memory
access nature of the QDIO architecture, these pages are locked for as
long as the device is in use. The default settings for a Linux guest with
a real QDIO device result in about 8MB of storage for network devices.
(For QDIO GbE, there are 3 device addresses
associated with a connection; the 8MB is the total for all 3.)
More recent releases of Linux (SLES 9 and RHEL 4) allow one to configure
the QDIO setup so that the maximum number (128) of buffers are not
always used. The default has also been lowered significantly so that
the amount of locked memory is closer to 1MB.
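As a rough sanity check on those figures, the arithmetic below assumes
128 buffers per connection, each addressing 16 4KB pages; the geometry
constants are assumptions made for this sketch, not values taken from
the QDIO architecture documents.

```python
# Illustrative arithmetic for QDIO locked storage; the buffer geometry
# values below are assumptions for the sketch.
PAGE_SIZE = 4096
PAGES_PER_BUFFER = 16   # assumed: 16 4KB page slots per QDIO buffer
DEFAULT_BUFFERS = 128   # maximum (older default) buffer count

locked = DEFAULT_BUFFERS * PAGES_PER_BUFFER * PAGE_SIZE
print(locked // (1024 * 1024))  # 8 (MB), matching the figure above

# With a lowered buffer count (e.g. 16) as in newer Linux levels,
# the locked footprint is roughly 1MB:
print(16 * PAGES_PER_BUFFER * PAGE_SIZE // (1024 * 1024))  # 1
```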
For FCP SCSI devices, there is a single device. The storage locked
below 2GB is not as predictable, but 800 pages is a good planning number.
Virtual QDIO for things such as Guest LAN and Virtual Switch does
not require the structures to be permanently locked. However, while an
I/O is being processed, pages will be locked below 2GB. The number of
pages is dependent on the I/O.
Traditional SSCH I/O, except in assisted cases with V=R or V=F
guests, requires CCW translation by CP. During CCW translation, CP
needs the guest page resident below 2GB. Also, when the real I/O
is issued, the I/O data areas are locked below 2GB until the I/O
completes.
Other CP Control Blocks besides the control blocks referenced
above also reside below 2GB. While some of these control blocks are
pageable, most are not. Some of the control blocks that may consume the
most storage are -
- VMDBKs - Virtual Machine Description Block describes each virtual
processor per guest logged on or running disconnected on the VM system.
- PGMBKs - Page Management Blocks (page tables). User PGMBKs can be
paged out under certain criteria. The number of PGMBKs is proportional
to the amount of virtual storage used.
- Other DAT (Dynamic Address Translation) control blocks such as Region
and Segment tables for guests also consume storage below 2GB. Unlike
PGMBKs, these are not pageable. The number is also proportional to the
amount of virtual storage used.
- VDEVs - Virtual Device Blocks represent each virtual device defined.
- RDEVs - Real Device Blocks are created for each real device known to
the system.
- MDISKs - Minidisk Control Blocks are created for each minidisk in the
VM system. This control block only represents the minidisk, not the data
for that minidisk or related MDC structures.
- Spool Related control blocks on systems with a large number of spool
files (Reader, Printer, Punch, Console)
may also consume a significant amount of storage below 2GB.
- Frame Table - CP uses a frame table to manage central storage. There
is an entry in the frame table for every frame of central storage. So the
more central storage available to VM, the larger the frame table.
(4MB of frame table entries for each GB of central storage).
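The 4MB-per-GB figure implies roughly a 16-byte frame table entry per
4KB frame. The quick check below verifies that arithmetic; the entry
size is inferred from the stated ratio, not documented here.

```python
# Quick check of the frame table sizing rule (entry size is an
# inference from the 4MB-per-GB figure in the text).
PAGE_SIZE = 4096
ENTRY_BYTES = 16  # inferred: 4MB of entries per GB of central storage

frames_per_gb = (1024 ** 3) // PAGE_SIZE      # 262144 frames per GB
table_bytes_per_gb = frames_per_gb * ENTRY_BYTES
print(table_bytes_per_gb // (1024 * 1024))    # 4 (MB per GB)
```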
Number of Virtual Machines - since many of the control blocks
listed above are required for virtual machines, having large numbers of
extra virtual machines running on the system increases the storage
requirements below the line. Included in this are service virtual
machines that are started automatically, but not used.
Worksheet for Storage Requirements
The following worksheet can be used to help compute fixed
storage requirements below 2GB. All values are decimal, all results
are in 4KB (4096-byte) pages, and CP is assumed to be running in
64-bit mode.
Frame Table = 1 page per MB of Real storage
Trace Table = 100 pages + (number_CPUs - 1) * 75
V=R/F Area = V=R size from system config
Dedicated OSAs = 2000 * number_of_dedicated_devices (triplets)
(use 150 instead of 2000 for newer releases of Linux)
Dedicated FCPs = 800 * number_of_dedicated_devices
RDEVs = 0.125 * number_of_physical_devices
Spool Files = 0.05 * number_of_spool_files
Segment Tables = 0.001 * total_virtual_storage_MBs
Fixed PGMBKs = 2 * total_vdisk_size_MBs
Pageable PGMBKs= 2 * active_virtual_storage_MBs
VDEVs = 0.04 * total_number_virtual_devices
VMDBKs = 1 * total_number_virtual_processors
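The worksheet can be expressed as a small calculator. The function and
parameter names below are illustrative inventions, and the osa_pages
default follows the 2000-page figure above (pass 150 for newer Linux
levels).

```python
# Hedged sketch: sum the worksheet terms above. All parameter names
# are illustrative; results are in 4KB pages.
def fixed_pages_below_2gb(real_storage_mb, num_cpus, vr_area_pages,
                          dedicated_osas, dedicated_fcps,
                          physical_devices, spool_files,
                          total_virtual_mb, vdisk_mb,
                          active_virtual_mb, virtual_devices,
                          virtual_processors, osa_pages=2000):
    return (real_storage_mb                  # frame table: 1 page/MB
            + 100 + (num_cpus - 1) * 75      # trace table
            + vr_area_pages                  # V=R/F area
            + osa_pages * dedicated_osas     # dedicated OSA triplets
            + 800 * dedicated_fcps           # dedicated FCP devices
            + 0.125 * physical_devices       # RDEVs
            + 0.05 * spool_files             # spool files
            + 0.001 * total_virtual_mb       # segment tables
            + 2 * vdisk_mb                   # fixed PGMBKs (VDISK)
            + 2 * active_virtual_mb          # pageable PGMBKs
            + 0.04 * virtual_devices         # VDEVs
            + 1 * virtual_processors)        # VMDBKs
```

The total can then be compared against the roughly 524288 frames that
exist below 2GB to judge how much headroom remains.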
Methods to Lower Storage Requirements Below 2GB
There are several approaches to help reduce storage requirements
below 2GB. Many of them involve trade-offs with other resources or
characteristics. The following are some of the most common approaches,
listed in no particular order.
The following improvements can be found in various releases; more
details can be found in
the z/VM performance report.
- Linux Timer Kernel Patch reduces the timer pop frequency for idle
Linux guest virtual machines.
- z/VM 4.2.0 extended Minidisk Cache to 64-bit channel programs.
- z/VM 4.3.0 for certain cases, allowed pages stolen below 2GB to be
moved to central storage above 2GB instead of being paged out to expanded
storage or DASD. The control program is conservative in moving to central
storage above 2GB versus expanded storage.
- z/VM 4.4.0 changed how the dispatcher determines idle users.
- z/VM 4.4.0 allowed virtual disk in storage pages to be page faulted
in above 2GB.
- z/VM 5.1.0 further changed how the dispatcher determines idle users.
Other service to consider includes -
- VM63729 - improves efficiency when searching for pages below
2GB to steal. Applies to z/VM 4.4.0 and z/VM 5.1.0.
- VM63730 - improves management of contiguous frame requests for
storage below 2GB.
- VM63752 - allows steal processing for below 2GB to be more aggressive
about moving stolen pages to real memory above 2GB rather than to
expanded storage.
Consider using the Linux Fixed I/O Buffer feature
On Linux SLES 9 SP1 or RHEL 4 systems, there is the ability to
use the Fixed I/O Buffer feature. This minimizes the number of guest
pages Linux uses for I/O at the cost of additional data moves inside
the guest. This is described in greater detail on the
zSeries Linux Performance page.
Consider using a Guest LAN or Virtual Switch
In connecting Guests to the physical network, there are many choices.
Using a Guest LAN with a virtual router (or the Virtual Switch in
z/VM 4.4.0) instead of giving each guest its own GbE device reduces
the storage required below 2GB.
Consider defining additional Expanded Storage
By configuring central storage as expanded storage, you lower the
storage required by the frame table. Also, if you have a large amount
of storage on the available list above 2GB, then a portion of that
storage would probably be more effective as expanded storage.
Lower Number of Real and Virtual Devices in System
While it can be nice to have access to all devices on all systems,
this approach consumes additional storage below 2GB.
Clean up Unnecessary Spool Files
Process or Disable Accounting, Logrec, Symptom Records
Various CP system services generate system information that is kept
in storage below 2GB if not processed. These services can be disabled,
or the associated service virtual machines can be set up to process
the records.
Exploit Segments (DCSSs) Where Appropriate
One of the unique advantages of z/VM is the ability to use segments
and share them amongst guest virtual machines.
There are two things to consider in this area. First, there are a number
of segments that may be predefined on your system. If these are not used
then they should not be loaded into storage. The QUERY NSS MAP
command can be used to determine the number of users (#USERS field)
using a segment.
The second consideration is that when a segment is shared with multiple
virtual machines, there is only one set of PGMBKs (page tables) which
can reduce storage requirements under 2GB.
Using a segment
for the Linux kernel is an example of this.
Limit Virtual Machine Storage Sizes
Since the number of segment and page tables is determined by the amount
of virtual storage a guest uses, you can limit these control blocks by
lowering the virtual machine storage size. PGMBKs are pageable, but only
with certain criteria (e.g. all pages in the segment represented by the
PGMBK must be paged out already).
Review usage of Virtual Disk in Storage
A virtual disk in storage is a volatile FBA disk that is backed by
memory (a system utility space) and is pageable. Virtual disks in storage
are often used as swap disks for Linux. While the pages that make up
the virtual disk data blocks are pageable, the PGMBKs for the virtual
disk are not pageable. (Also prior to z/VM 4.4.0, the virtual disk in
storage pages were page faulted in below 2GB.) Therefore, it is good
practice to define virtual disks in storage no larger than they need to be.
Evaluate MDC Usage for Traditional (SSCH) Disk I/O
Since I/Os satisfied from MDC do not require pages to be moved below
2GB, getting the full benefit of MDC can help lower storage requirements
below 2GB. For I/Os to be satisfied by MDC, they must be reads that
reference data that already exists in the cache. Possible ways to
influence this are to ensure there is sufficient memory for MDC (either
expanded storage or central storage) and that disks are eligible.
Dedicated disks are not eligible for MDC. In addition, some advantage
may be found with using Record level MDC instead of Track level
(default). Record Level MDC requires diagnose I/O instead of SSCH I/O.
The Linux DASD device driver
can be configured to use diagnose x'250' for its disk I/O.
Ensure Idle Guest Virtual Machines Treated as Idle
There have been a number of improvements to help ensure that inactive
guest virtual machines appear idle to the VM scheduler. When this
occurs the VM storage management algorithms can be more effective in
reclaiming the most appropriate page frames. The Linux Kernel Timer
Patch and z/VM APAR VM63282 (in base of z/VM 4.4.0) are examples of
such improvements.
Splitting into Multiple LPARs
A more involved alternative is to split the VM system into multiple
z/VM LPARs, thus effectively getting multiple 2GBs of storage. This
may also be desirable to provide hot backup or high availability
solutions. There is some duplication of base costs for running multiple
z/VM systems as well as administration costs to be considered.