VM/ESA 2.1.0 Performance Changes
- Performance Improvements
- Performance Considerations
- Performance Management
Performance Improvements
In order to reduce overhead in the linkage of CP modules, changes were
made in three areas for some frequent linkage cases. The first was to
change some dynamic linkages to use fast dynamic linkage.
Fast dynamic linkage was first introduced in VM/ESA 1.1.1, and is a
more efficient method to do CP dynamic linkage.
The second area was to move some highly used modules from the pageable
list to the resident list to save overhead. This increases the size of
the resident nucleus slightly.
The last area was to avoid cutting trace entries for calls from
HCPALLVM. This module is passed the address of a routine and then
calls the passed routine for every VMDBK on the system. This is done
for such functions as CP monitor high frequency user state sampling.
By avoiding the trace entries, much overhead is saved and the trace
table is not cluttered with less useful trace entries.
The amount of processing time required to handle VMCF interrupts has
been reduced. The amount of improvement is directly related to the
number of pending VMCF interrupts. If there are only a few pending
interrupts, the effect is negligible. If there are hundreds of pending
interrupts, the processing time per interrupt is reduced by about a
factor of two. This improvement only applies to customers who are
migrating from VM/ESA 1.2.1 or VM/ESA 1.2.2.
The virtual channel to channel (VCTC) locking scheme has been
improved. The new scheme uses a separate lock word for each VCTC
instead of a global lock. This may improve VCTC throughput, especially
in cases where a virtual machine is doing a high rate of requests
through several VCTCs.
The global lock is still required in certain cases such as processing
the COUPLE command.
The default calculation for the size of the CP trace table has been
changed. This will ordinarily result in a smaller trace table for
systems that take the default. As before, the trace table size can
also be set explicitly.
In the past, the trace table size defaulted to 1/64th of available
storage for each processor. With VM/ESA 2.1.0, that calculation is done
for the master processor. However, if the result is greater than 100
pages, the master trace table size is set to 100. The trace table size
for each alternate processor is then set to 75% of the size of the
master processor trace table. For example, the savings for a 512MB 6-way
system are over 46MB.
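The arithmetic behind the 512MB 6-way figure can be checked with a short
sketch. It assumes that "available storage" equals the full 512MB, that
a page is 4KB, and that the 75% figure is truncated to whole pages; the
function names are illustrative only and are not CP interfaces.

```python
PAGE_BYTES = 4096  # 4KB pages (assumption stated in the text above)

def old_default_pages(storage_bytes, processors):
    """Earlier default: 1/64th of storage for each processor."""
    per_processor = storage_bytes // 64 // PAGE_BYTES
    return per_processor * processors

def new_default_pages(storage_bytes, processors, cap_pages=100):
    """VM/ESA 2.1.0 default: master capped at 100 pages, alternates at 75%."""
    master = min(storage_bytes // 64 // PAGE_BYTES, cap_pages)
    alternate = master * 75 // 100  # truncation is an assumption
    return master + alternate * (processors - 1)

storage = 512 * 1024 * 1024
old_pages = old_default_pages(storage, 6)
new_pages = new_default_pages(storage, 6)
saved_mb = (old_pages - new_pages) * PAGE_BYTES / (1024 * 1024)
print(old_pages, new_pages, round(saved_mb, 2))
```

Under these assumptions the old default is 12288 pages (48MB), the new
default is 475 pages, and the difference is a little over 46MB.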
Most people feel that the new calculation represents a much better
tradeoff between serviceability and real storage usage, especially
considering that VM/ESA reliability has improved dramatically over the
last several releases.
This improvement applies only to migrations from VM/ESA 1.2.2. It is also
available on VM/ESA 1.2.2 as APAR VM59590.
Minidisk caching (MDC) has a fair share algorithm to prevent any one
user from flooding the cache with data.
This can be disabled with the NOMDCFS directory option.
The fair share insert limit is dynamic but has a floor (minimum value).
Analysis of benchmarks and customer data on VM/ESA 1.2.2 systems showed that
the floor was too low and that this was degrading system performance.
Accordingly, the old floor of 8 inserts per minute was increased to 150
inserts per minute.
The I/Os excluded due to the fair share limit being exceeded and the
current fair share limit are reported by RTM (MDCACHE screen) and VMPRF
(MINIDISK_CACHE_BY_TIME report).
If, on your current system, this information shows that a significant
number of I/Os are being excluded from minidisk caching due to the fair
share limit being exceeded, the system may benefit from this change.
Prior to VM/ESA 2.1.0, VM/ESA supported guest-use of the Record Cache I
function in the 3990-6 storage control. This was for guest operating
systems that built their own channel programs. Record Cache I support
is now extended to users of DIAGNOSE X'A4', DIAGNOSE X'250', and the
block-I/O facility under certain conditions. When VM knows that data
being written meets the control unit's criteria for "regular data
format", VM sets a channel-program indicator to achieve improved
performance for I/O requests issued through these I/O-service
facilities. I/O requests must meet all of the following conditions:
- The request is to write data.
- The request is eligible to be stored in VM's minidisk cache.
- The request is for a track that was previously read and found to be
in standard format. (1)
The record-cache function requires the DASD Fast Write (DFW) function
to be enabled and adds to the performance benefits of this function.
With DFW, the control unit completes the I/O request almost
immediately. The host need not wait for the data to be written
(destaged) to the DASD volume, since the data is protected from loss by
residing in the control unit's nonvolatile storage (NVS) while waiting
to be destaged. However, if the record is not already in the control
unit's cache when an update is received from the host, the entire track
must be staged (read) into the cache before the I/O request can
complete. The additional performance advantage of the record-cache
function is that staging of the track containing the record can be
avoided.
Because VM tells the control unit that the record being written is in a
standard format, the control unit knows that the record will fit within
the existing format of the track when the record is ultimately destaged
from the NVS.
The LOCATEVM CP command (class G) can use a very large amount of
processor time when a large search range is given. Because of this,
use of this command can adversely impact system performance. As a
precaution, some installations have chosen to reassign this command to
a more restrictive class.
In VM/ESA 2.1.0, LOCATEVM processor requirements have been substantially
reduced (by 75% in one test). While LOCATEVM can still use a
significant amount of processor and paging resources, it is now less
risky to leave it available to class G users.
Most of the CMS REXX execs and XEDIT macros on the S-disk are now
shipped as compiled REXX files.
This includes all files (except SYSPROF EXEC) that are in the CMSINST
shared segment and a number of others. They make use of a subset REXX
run-time library that is shipped with VM/ESA 2.1.0.
Some of the CMS execs (most notably FILELIST, RDRLIST, and PEEK) are
written in EXEC2. They remain in EXEC2 and are not affected by this
change.
This change can significantly improve the performance of CMS intensive
workloads that use REXX-implemented CMS functions such as DIRLIST,
DISCARD, NAMES, NOTE, RECEIVE, SENDFILE, TELL, and VMLINK, as well as
Xedit macros such as ALL and SPLTJOIN. Processor capacity improvements
exceeding 6% have been observed.
The uncompiled source files are provided on the S-disk for customers
who wish to make modifications. Customers with the REXX compiler are
advised to recompile the updated files before placing them back onto
CMSINST so as to retain the performance advantages.
The CMS nucleus was restructured in VM/ESA 2.1.0. This improved
performance in a number of ways:
- More of the CMS code has been moved above the 16MB line.
This can improve performance by allowing more use of SAVEFD and by
allowing more shared segments to be created that require space below
the 16MB line.
- The CMS shared system now starts at X'F00000' instead of X'E00000'.
- NLS language repository segments can now reside above the 16MB line.
- In prior releases, the default installation procedure placed the
VMLIB, VMMTLIB, and PIPES segments below the 16MB line (even though
they can be run above the line) because of the possibility that CMS
could be run in a 370 mode machine. With VM/ESA 2.1.0, default
installation puts these segments above the 16MB line.
(VMMTLIB has been integrated into the portion of the CMS saved system
that is above the line.) This change frees storage that was previously
taken below the 16MB line.
- CMS now allocates some of its control blocks (such as the IUCV path
table) above the 16MB line when such space is available.
- The 370-mode code has been removed from the mainline paths in CMS.
- The fast path through the SVC interrupt handler (DMSITS) has been
further optimized.
- The following modules have been moved from the S-disk back into the
CMS nucleus:
- DMSQRC - query COMDIR
- DMSQRE - query ENROLL
- DMSQRF - query CMS (window manager)
- DMSQRG - query CMS (window manager)
- DMSQRH - query CMS (window manager)
- DMSQRN - query NAMEDEF
- DMSQRP - query FILEPOOL
- DMSQRQ - query LIMITS, FILEWAIT, RECALL
- DMSQRT - query AUTOREAD, CMSTYPE, and so forth
- DMSQRU - query FILEDEF, LABELDEF
- DMSQRV - query INPUT, OUTPUT, SYNONYM
- DMSQRW - query libraries (MACLIB, and so forth)
- DMSQRX - query DOS, DOSPART, UPSI, DLBL
- DMSSEC - set COMDIR
- DMSSEF - set CMS (window manager)
- DMSSML - set/query MACLSUBS
This can benefit the performance of workloads that use these functions
if they had not previously been used from a shared segment.
In VM/ESA 1.2.2, these modules resided in the CMSQRYL and CMSQRYH
logical segments. These segments no longer exist.
CMS Pipelines now provide assembler macros that perform basic pipeline
functions and are the building blocks for writing assembler stage
commands. User-written assembler stage commands provide increased
performance over similar stage commands written in REXX.
VM/ESA 2.1.0 includes data compression API support so vendors and customers
can more easily create applications that exploit the use of compression
services. Both a macro interface (CSRCMPSC) and a CSL interface
(DMSCPR) are provided. Use of this support can save DASD space, tape
storage space, and transmission line costs. The increase in processing
time associated with data compression and expansion is greatly reduced
on processors that have hardware compression (CMPSC instruction).
In addition, CMS and GCS support the VSE/VSAM Version 6 Release 1.0
interface for data compression. Using the COMPRESS parameter of the
DEFINE function will cause VSAM to automatically expand or compress
data during a VSAM read or write operation, respectively.
When available on the processor, the CMPSC instruction is used for
this purpose.
CMS and GCS system users can read and write to VSAM files that have
been compressed under the control of the VSE/VSAM program. No
application program changes are necessary.
This new XEDIT subcommand performs the same function as the existing
SET subcommand.
However, it can be used to set multiple options in one invocation.
The following CMS Productivity Aids were changed to use this
subcommand: FILELIST, RDRLIST, NOTE, SENDFILE, PEEK, DIRLIST, and the
EXECUTE XEDIT macro.
It can also be used to improve the performance of user-written
applications that include performance-sensitive XEDIT macros.
You can now make use of the CP PAGEX facility with GCS. PAGEX is
specified on a virtual machine basis. When PAGEX is ON and a given GCS
task takes a page fault, GCS will dispatch other active GCS tasks in
the virtual machine while waiting for that page fault to be resolved.
This can result in increased capacity for that virtual machine to do
work.
PAGEX is especially useful in cases where a virtual machine has a large
number of GCS tasks and these tasks are active on an intermittent
basis. A good example would be an RSCS machine with many line drivers.
If this is not the case, SET RESERVE remains the best method to
minimize the effects of paging. SET RESERVE works best when the
virtual machine's reference pattern has good locality of reference and
its working set size does not change much over time. In intermediate
cases, the best tuning solution might be to use a combination of PAGEX
ON and SET RESERVE. SET RESERVE would be used to protect the most
frequently used pages, while PAGEX ON would be used to keep those page
faults that do occur from serializing the whole virtual machine.
PAGEX is not recommended for the VTAM machine because most VTAM
execution is on one GCS task.
In prior releases, the GCS time slice was fixed at 300 milliseconds.
With VM/ESA 2.1.0, 300 milliseconds is retained as the default setting but
this can be altered for any given virtual machine in a GCS group by
using the new SET TSLICE GCS command.
A smaller time slice setting can be used to help avoid time-out
situations when multiple tasks are involved. You can estimate whether
the default time slice setting is likely to result in a time-out
situation. For example, if the QUERY TSLICE command shows 100 active
tasks, the maximum delay before a given task is run is 100 times 0.300,
or 30 seconds. If 30 seconds is the line time-out limit, you should
set the time slice lower.
Note: Setting the time slice lower than it needs to be will tend to
increase GCS dispatching overhead.
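The estimate described above can be written as a small calculation.
This is only a sketch of the arithmetic in the example; the function
names are illustrative and do not correspond to any GCS command or
interface.

```python
def worst_case_delay_seconds(active_tasks, tslice_ms=300):
    """Maximum delay before a given task runs: tasks times the time slice."""
    return active_tasks * tslice_ms / 1000.0

def tslice_upper_bound_ms(active_tasks, timeout_seconds):
    """Time slice at which the worst-case delay equals the time-out limit.

    A setting below this value is needed to stay under the limit.
    """
    return timeout_seconds * 1000.0 / active_tasks

delay = worst_case_delay_seconds(100)      # 100 tasks at the 300 ms default
bound = tslice_upper_bound_ms(100, 30)     # 30-second line time-out
print(delay, bound)
```

With 100 active tasks and a 30-second time-out, the default of 300
milliseconds sits exactly at the bound, which is why a lower setting is
suggested in that case.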
The performance of the VMFBLD function has been improved. Elapsed time
and processor time reductions exceeding 20% have been observed.
This improvement was first introduced through VM/ESA 1.2.2 APAR VM57938.
The performance of VMFCOPY has been improved by providing an SPRODID
option. In prior releases, all files that met the fn ft fm
criteria were copied regardless of what product they belonged to.
When the SPRODID option is specified, only those files that belong to
the specified product are copied.
Performance Considerations
A minidisk that is mapped to a VM data space should have minidisk cache
disabled. Support for improved data integrity was introduced in this
release for a unique scenario. If a minidisk is mapped to a VM data
space and is also eligible for minidisk cache, CP will now attempt to
flush minidisk cache whenever the Save function is invoked for the
mapped minidisk. This can result in significant CPU overhead. There
is typically no performance benefit from using minidisk cache for a
mapped minidisk.
The number of pages that are referenced during IPL CMS but are
(typically) unused thereafter has increased by about 12. This
increases DASD paging space requirements to some extent. Since these
referenced pages must ultimately be paged out, they can also reduce
performance in situations where large numbers of CMS users are logging
on over a short period of time.
Many additional virtual pages in the user's virtual machine are
referenced when the OpenExtensions environment is initialized. This
occurs implicitly when the first OpenExtensions request is made. Nearly
all of these additional pages are no longer referenced if there are no
subsequent OpenExtensions requests. However, these pages will add to the
number of occupied page slots on DASD. This leads to the following two
recommendations:
- If many users are (even occasionally) using the OpenExtensions
environment, take a look at whether the system's page space is still
sufficient.
- Do not put OpenExtensions-oriented commands such as OPENVM MOUNT in
your PROFILE EXEC unless you will normally be using OpenExtensions
functions subsequent to starting CMS.
CMS references more non-shared pages than it did in VM/ESA 1.2.2.
This will tend to increase paging, especially in storage-constrained
environments with large numbers of CMS users.
In VM/ESA 1.2.2, the CMS saved system occupied megabytes E, F, and 10.
In VM/ESA 2.1.0, it occupies megabytes F, 10, 11, and 12.
If your installation has defined any shared segments in megabytes 11 or
12, they will need to be moved in order to avoid overlapping CMS.
Performance Management
A number of new monitor records and fields have been added. Some of
the more significant changes are summarized below. For a complete list
of changes, see the MONITOR LIST1403 file for VM/ESA 2.1.0.
See for information about this file.
- User State Sampling
A number of changes were made to improve the usefulness of the user
state sampling data.
- Users doing diagnose I/O used to show up as being in simulation
wait. They now appear in I/O wait.
- Users in CP SLEEP or CP READ used to be shown as being in console
function mode wait. They now appear as idle.
- A new state, active page wait, has been added for virtual machines
that have a page request outstanding but can handle it with PAGEX or
asynchronous page fault handling.
- CP Configurability II
The CP Configurability II support allows I/O devices to be added or
removed from the I/O hardware configuration while VM/ESA is running.
In order to track these changes, several new event I/O domain records
were added, such as the delete device record (domain 6 record 15,
D6/R15).
A measurement block, sometimes referred to as a subchannel measurement
block, is a control block that is associated with a given device.
It contains measurement data for the device such as the I/O rate and
timing values for the various components of service time.
The hardware is responsible for updating this information.
From the measurement block information, performance products can
compute the device's service time, I/O rate, and utilization.
With the CP Configurability II support, it is now possible for a given
device to not have an associated measurement block.
Accordingly, information has been added to the monitor to indicate when
this is the case.
The new SET SCMEASURE command allows an administrator to enable or
disable the collection of subchannel measurement data for a specific
device or range of devices. An event record is created each time the
SET SCMEASURE command is used.
- SET THROTTLE
Monitor fields have been added in support of the new SET THROTTLE
command. These include:
- whether a device has been throttled
- the throttle rate for a device
- the number of times I/O was delayed on a given device
- the number of times a given user had I/O delayed due to throttle
- RAMAC Support
Monitor support for RAMAC is available for VM/ESA 1.2.1 and VM/ESA
1.2.2 through development APAR VM59200.
This support is integrated into VM/ESA 2.1.0.
Since RAMAC DASD appear to VM as either 3380s or 3390s, additional
fields have been added to the device configuration data record (D1/R6)
and the vary on device record (D6/R1) to indicate the actual DASD and
control unit type where possible.
Cache activity data records (D6/R4) have been made available for the
RAMAC subsystem.
- SFS APPLDATA
The APPLDATA domain monitor data contributed by SFS filepool servers
has been extended to include counts and timings that pertain to the
byte file system. These include byte file request counts for each type
of request, lock conflict counts for each type of byte file lock
conflict, and token callback information.
- CMS Multitasking APPLDATA
CMS multitasking can contribute application data to the monitor in the
APPLDATA (10) monitor domain.
This includes the following information:
- Thread creation and deletion counts and timings
- Thread switch rates
- Number of blocked threads
- Highest number of threads and POSIX processes in use.
The I/O service times and related information for a device are computed
from data found in its associated subchannel measurement block, which
the hardware is responsible for updating. With the new functions
provided by CP Config II, there can now be scenarios where there is not
a subchannel measurement block associated with a device. In such
cases, the service times and related data are not available and are
shown as zeros in the monitor data.
SET THROTTLE is a new CP command that can be used to set a maximum rate at
which the system's virtual machines, in aggregate, are permitted to
initiate I/Os to a given device. This limit does not apply to I/Os
initiated by CP. CP converts the specified rate into an interval
representing the minimum time that must pass after one I/O is started
before the next I/O to that device can start. If CP receives an I/O
request to a device that has been limited by SET THROTTLE, that I/O
request is delayed, if necessary, until the minimum time interval has
completed.
In multi-system configurations which have shared channels, control
units, or devices, SET THROTTLE can be used to help prevent any one
system from overutilizing the shared resources.
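The rate-to-interval behavior described for SET THROTTLE can be
illustrated with a minimal model. It is not CP's implementation: the
class name, the use of I/Os per second as the rate unit, and the
floating-point timestamps are all assumptions made for illustration.

```python
class ThrottleModel:
    """Toy model: enforce a minimum interval between I/O starts to one device."""

    def __init__(self, rate_per_second):
        # Convert the rate into the minimum time between successive starts.
        self.min_interval = 1.0 / rate_per_second
        self.next_allowed = 0.0

    def start_time(self, request_time):
        """Return when a request arriving at request_time may actually start."""
        start = max(request_time, self.next_allowed)  # delay only if necessary
        self.next_allowed = start + self.min_interval
        return start

throttle = ThrottleModel(rate_per_second=10)   # 0.1 second minimum interval
starts = [throttle.start_time(t) for t in (0.0, 0.0, 0.05, 1.0)]
print(starts)
```

In this model the second and third requests are delayed because they
arrive inside the minimum interval, while the request arriving at 1.0
second starts immediately.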
Information has been added to the QUERY FILEPOOL commands to provide
byte file system performance information. In particular, byte file
system counts and timings have been added to QUERY FILEPOOL REPORT and
its subset, QUERY FILEPOOL COUNTER.
The following list describes fields in the virtual machine resource
usage accounting record (type 01) that may be affected by performance
changes in VM/ESA 2.1.0.
The columns where the field is located are shown in parentheses.
- Milliseconds of processor time used (33-36)
- This is the total processor time charged to a user and includes
both CP and emulation time. For most workloads, this should not change
much as a result of the changes made in VM/ESA 2.1.0. Exception: CMS
intensive workloads that make significant use of DIRLIST, DISCARD,
NAMES, NOTE, RECEIVE, SENDFILE, TELL, and VMLINK, and/or Xedit macros
such as ALL and SPLTJOIN. Such workloads can experience a significant
reduction in total processor time arising from CMS's use of compiled
REXX. Most of this decrease will be virtual processor time.
- Milliseconds of virtual processor time (37-40)
- This is the virtual time charged to a user. See the above
discussion of total processor time.
- Requested virtual nonspooled I/O starts (49-52)
- This is a total count of requested starts.
Not every request necessarily completes.
The value of this field could change, depending on the system I/O
characteristics, because of two changes made to CP:
- In previous releases, this counter was incremented for each real
I/O done.
This included the scenario where CP splits a virtual I/O into a
separate real I/O for each cylinder involved.
In VM/ESA 2.1.0, this counter will be incremented only once per virtual I/O.
- In the past, virtual I/Os eligible for minidisk caching, but not
satisfied from the minidisk cache, were not always being counted.
This has been corrected.
- Completed virtual nonspooled I/O starts (73-76)
- This is a total count of completed requests.
The previous discussion of "requested virtual nonspooled I/O starts"
also applies to this field.
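For installations that post-process accounting data, the column
positions listed above can be turned into a simple extraction routine.
The sketch below assumes an 80-byte record in which each of these
fields is a 4-byte unsigned big-endian binary integer; that encoding,
and the field names used as dictionary keys, are assumptions and should
be checked against the accounting record layout for your release.

```python
# Column ranges (1-based, inclusive) taken from the list above.
FIELD_COLUMNS = {
    "total_processor_ms": (33, 36),
    "virtual_processor_ms": (37, 40),
    "requested_io_starts": (49, 52),
    "completed_io_starts": (73, 76),
}

def extract_fields(record):
    """Pull the four counters out of one type 01 accounting record."""
    values = {}
    for name, (first, last) in FIELD_COLUMNS.items():
        raw = record[first - 1:last]           # convert columns to 0-based slice
        values[name] = int.from_bytes(raw, "big")
    return values

# Build a synthetic record purely to exercise the routine.
sample = bytearray(80)
sample[32:36] = (1500).to_bytes(4, "big")
sample[36:40] = (1200).to_bytes(4, "big")
sample[48:52] = (42).to_bytes(4, "big")
sample[72:76] = (40).to_bytes(4, "big")
fields = extract_fields(bytes(sample))
print(fields)
```

Comparing these values before and after migration is one way to see
the counting changes described above, bearing in mind that the I/O
start counts are no longer directly comparable across releases.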
VM Performance Reporting Facility 1.2.1 (VMPRF) requires APAR VM59656
(PTF UM27312) to run on VM/ESA 2.1.0. VMPRF at this service level
includes the following functional enhancements:
- 3990-6 cache controller support
- RAMAC support
- LE/370 support
- page active and limit list data have been added to the state sample
reports
Realtime Monitor VM/ESA 1.5.2 (RTM/ESA) requires APAR GC05374 (PTF
UG03792) to run on VM/ESA 2.1.0. It can be run on CMS11 (or earlier)
in 370 mode or on CMS12 in XA mode with CMS370AC on. If it was built
on CMS12 and is run on CMS12, it will set CMS370AC on.
RTM/ESA at this service level can be built using HLASM Release 2.
Support for RAMAC DASD has been added.
Performance Analysis Facility/VM 1.1.3 (VMPAF) will run on VM/ESA 2.1.0
with the same support as VM/ESA 1.2.2.
Footnotes:
(1) Minidisk caching considers a track to be in standard format when it
meets certain criteria. For example, all DASD records on the track
must have the same length and an integral number of these records must
fit into a 4KB page. There are other criteria as well; for a complete
definition of standard format, see the minidisk caching chapter in
VM/ESA CP Diagnosis Reference.
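The two criteria named in this footnote can be expressed as a simple
check. This sketch covers only those two criteria; the other criteria
mentioned above are not modeled, so a True result here does not by
itself mean that a track is in standard format. The function name is
illustrative only.

```python
PAGE_BYTES = 4096  # 4KB page

def meets_stated_criteria(record_lengths):
    """Check the two criteria given in the footnote for one track's records."""
    if not record_lengths:
        return False
    first = record_lengths[0]
    if first <= 0:
        return False
    same_length = all(length == first for length in record_lengths)
    fits_page = PAGE_BYTES % first == 0   # an integral number fill one page
    return same_length and fits_page

print(meets_stated_criteria([4096] * 12))
print(meets_stated_criteria([800] * 5))
```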