Contents | Previous | Next

Support for Guests Greater than One Terabyte

Abstract

Clients have asked for support for guests larger than 1 TB. IBM has made that support available, subject to certain considerations. In testing the support, IBM checked several aspects. First, we checked that such a guest could instantiate more than 1 TB of memory: a memory stress inducer run in a 2 TB guest showed 2002 GB instantiated. Second, we checked that when memory was overcommitted, CP would page the guest: the stress inducer was shown to drive CP paging. Third, we checked that logoff of such a guest could complete in a practical time: with the PTF for APAR VM66673, Large Guest LOGOFF Reset Time Mitigation, the LOGOFF-to-LOGON time for the memory stress guest dropped from about ten minutes to about ten seconds.

Introduction and Objectives

In considering the request, IBM asked clients about their experiences running large guests. No clients reported instantiated counts or paging rates, but they did report that large guests could incur logoff delays of thirty-five minutes or more. Our study therefore had the following objectives:

  • Measure whether a guest greater than 1 TB can instantiate memory up to its defined limit.
  • Show that CP will page such a guest if needed.
  • Verify APAR VM66673 reduces LOGOFF-to-LOGON time.

Background

APAR VM66673 changes how logoff is done. Previously the guest retained its uppercase user ID until logoff finished, which prevented a new instance of the guest from logging on. With the change, CP puts the guest into a cleanup state known as LOGOFF or LOF state, changing its user ID to lowercase to mark the state. Once the guest is in LOF state, CP cleans up the guest's memory. In this way a new instance of the guest can log on even while CP is still cleaning up the previous instance.

Method

All runs were performed using a single logical partition on an IBM z16 3931-A01. No other partitions were active during collection. Runs used the items listed below and two storage hosts.

  • Dedicated LPAR with fifty processors.
  • Central storage varied from 2.5 TB down to 1.5 TB, decreasing 0.5 TB per scenario step.
  • z/VM 7.3 with and without Large Guest LOGOFF Reset Time Mitigation.
  • Non-SMT.
  • NVMe paging subsystem.
  • Four FCP devices on four FICON Express16SA adapters.

The guest was configured as follows.

  • 2.0 TB of virtual memory.
  • SLES 15-SP4 Linux.

PTOUCH

The guest ran a memory stressor application called PTOUCH. One instance of PTOUCH touches its 40,000-MB guest memory buffer serially, page by page, over and over. The single guest ran fifty PTOUCH instances concurrently, for (50 x 40,000) = 2,000,000 MB, or about 2 TB, of memory being touched by the PTOUCH applications. Each test was run twice: first for ten minutes, and second for long enough to instantiate guest memory after the guest had been logged on again following a logoff. The second run lets us measure a duration called LOGOFF-to-ACTIVE time, which is the time from issuing the LOGOFF command to the new instance of the guest being engaged in useful work.
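The serial page-touch pattern described above can be sketched as follows. This is a minimal illustration, not the PTOUCH source: the function name, the 4 KiB page size constant, and the scaled-down buffer (64 pages instead of PTOUCH's 40,000 MB) are all assumptions made for the sketch.

```python
# Sketch of a serial page-touch loop in the style of PTOUCH (hypothetical code,
# not the actual tool). Writing one byte per page forces each page to be
# instantiated, and repeated passes keep re-referencing the whole buffer.
PAGE = 4096  # assume 4 KiB pages

def touch_pages(buf: bytearray, passes: int) -> int:
    """Touch every page of buf serially, passes times; return the touch count."""
    touched = 0
    for _ in range(passes):
        for off in range(0, len(buf), PAGE):
            buf[off] ^= 1  # one write per page is enough to instantiate it
            touched += 1
    return touched

buf = bytearray(64 * PAGE)     # 64 pages here; the study used a 40,000-MB buffer
print(touch_pages(buf, 2))     # prints 128 (64 pages x 2 passes)
```

Run at full scale across fifty concurrent instances, this access pattern gives the memory manager no locality to exploit, which is what makes it a worst case once the guest no longer fits in central storage.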

There are six runs in total: three base runs and three experiment runs. The base runs use CP load module NVPG710, which does not contain APAR VM66673; the experiment runs use CP load module NVCPESS, which does.

Results and Discussion

Instantiation and Paging

Table 1 presents the run metrics.

Table 1. PTOUCH 2 TB at Various Sizes of Central Storage.
Run ID                       PNB23828    PRB23824     PN223828    PR223825     PNA23828    PRA23825
Run date                     2023/08/28  2023/08/24   2023/08/28  2023/08/25   2023/08/28  2023/08/25
Processor model              3931-A01    3931-A01     3931-A01    3931-A01     3931-A01    3931-A01
Processor serial             65E08       65E08        65E08       65E08        65E08       65E08
Partition name               APRF1       APRF1        APRF1       APRF1        APRF1       APRF1
Cores                        50          50           50          50           50          50
Logical CPUs                 50          50           50          50           50          50
Central memory, TB           2.5         2.5          2.0         2.0          1.5         1.5
CP load module               NVPG710     NVCPESS      NVPG710     NVCPESS      NVPG710     NVCPESS
LOGOFF PTF present           no          yes          no          yes          no          yes
ETR, Tx/sec                  2192.67     2176.65      2202.50     2178.08      1.28        1.35
ITR                          4376.6      4353.3       4431.6      4373.7       9.1         9.2
Guest instantiated, GB       2002.00     2002.00      2002.00     2002.00      2002.00     2002.00
Guest inst/central memory    0.78        0.78         0.98        0.98         1.30        1.30
LOGOFF-to-LOGON, sec         565         10           569         10           654         10
LOGOFF-to-ACTIVE, sec        751         222          754         239          839         204
AUX page rate, pages/sec     0.0         0.0          0.0         45.9         122000.0    138000.0
AUX pages/Tx                 0.000       0.000        0.000       0.021        95312.500   102222.222

Notes:
IBM z16 3931-A01. One dedicated LPAR, 50 IFL cores, non-SMT. A SAMSUNG PM1733 15.36 TB NVMe drive split into 14 EDEVs, 12 of which are used for paging. The Linux guest has access to two NVMe EDEVs. Internal development builds of z/VM 7.3 with and without Large Guest LOGOFF Reset Time Mitigation.

Paging

The 1.5 TB runs behaved as expected. Sequential touching of pages by a memory-thrashing program stresses the memory management algorithms when instantiated memory exceeds the dynamic paging area. This can be seen in the roughly 130,000/sec AUX paging rate for both 1.5 TB runs. Because at steady state each sequential page touch causes one page to be written and then one page to be read, and because a PTOUCH transaction is nothing but touching some number of pages sequentially, we can expect ETR to drop from memory speed to disk speed once central memory becomes too small to contain the whole guest.

Summary

When the guest fits into central memory, the PTOUCH workload runs without CP having to page the guest. The PTF reduced LOGOFF-to-LOGON time from about ten minutes to about ten seconds.

When the guest does not fit into central memory, we can expect CP to page the guest. This specific workload is the worst-case scenario because it walks memory sequentially. A client workload will not necessarily touch pages the way PTOUCH does and so will not necessarily perform the way PTOUCH did.

Other Thoughts

The PTF makes it possible for a new instance of a guest to log on while CP is still cleaning up the memory of the instance that has logged off. Depending upon the cleanup rate for the old instance and the instantiation rate for the new instance, memory pressure during the cleanup period might be transiently higher than it was before the logoff started or after the cleanup completed.

Client results will vary according to configuration and workload. Not every greater-than-one-terabyte guest will instantiate all of its virtual memory frames or attempt to reference them all frequently and repeatedly, as our PTOUCH workload did. Such workloads might perform acceptably well even when the instantiated-to-real (I:R) memory ratio is greater than 1.
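The I:R ratio mentioned above corresponds to the "Guest inst/central memory" row of Table 1. A quick check of those values, assuming binary TB-to-GB conversion (1 TB = 1024 GB):

```python
# Recompute the instantiated-to-real (I:R) ratio for each central-memory size
# in Table 1: instantiated guest memory divided by real (central) memory.
instantiated_gb = 2002.0

for central_tb in (2.5, 2.0, 1.5):
    ratio = instantiated_gb / (central_tb * 1024)
    print(f"{central_tb} TB central: I:R = {ratio:.2f}")
# 2.5 TB -> 0.78, 2.0 TB -> 0.98, 1.5 TB -> 1.30, matching Table 1
```

Only the 1.5 TB configuration is overcommitted (I:R > 1), which is exactly where Table 1 shows ETR collapsing and AUX paging beginning.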

The lowercase user IDs introduced by APAR VM66673 might impact z/VM performance analysis tools.

Recommendations

Clients should consult the list of considerations in Greater than One Terabyte Guest Support Considerations.

In deploying large guests in memory-overcommitted configurations, clients should pay careful attention to measures of application performance and confine overcommit ratios to those that deliver acceptable application performance.
