SPXTAPE Performance

Last updated 18 January 1999


The following is an excerpt from VM/ESA Release 2.2 Performance Report (GC24-5673-01).

The SPXTAPE CP command is provided in VM/ESA Release 2.2 as a high-performance alternative to the SPTAPE command. (Note that SPTAPE is no longer supported.) These commands are primarily used to dump and load spool files to and from tape. The techniques SPXTAPE uses to achieve this improved performance are discussed in the full report.

A set of measurements was collected in order to assess the performance of the SPXTAPE command. Most of these measurements were done in a dedicated environment (no other system activity) and were used to compare SPXTAPE to the SPTAPE command. The primary metric was the elapsed time required to process a given number of spool files.

One of the SPXTAPE DUMP cases was then repeated while the system was running the FS8F CMS-intensive workload at 90% processor utilization in order to observe the interactions between SPXTAPE and concurrent system activity.


Table of Contents

  Dedicated Measurements
  Interaction with Concurrent Activity


Dedicated Measurements

Workload: SPTAPE or SPXTAPE 

Hardware Configuration 

Processor model:             9121-742
Storage:
    Real:                    1024MB
    Expanded:                1024MB
Tapes:                       3490 (1, 2, or 4 used)
 
DASD:
 
Type of   Control  Number                  - Number of Volumes -
 DASD       Unit  of Paths PAGE    SPOOL    TDSK    User   Server   System
3390-2    3990-2     4       6        4               R                2 R
 
Note:  R or W next to the DASD counts means basic cache enabled or DASD
fast write (and basic cache) enabled, respectively.
 
 
Software Configuration
 
Virtual                           Machine
Machine     Number   Type         Size/Mode  SHARE   RESERVED  Other Options
OPERATOR        1    SPXTAPE      24MB/XA     100
SMART           1    RTM          16MB/370     3%       500    QUICKDSP ON
 
Note:  RTM was only active for certain measurements (see results).

Additional Information 

Spool files dumped/loaded:   34,000
Spool pages dumped/loaded:   353,000
Average pages per file:      10.4

Measurement Discussion  For the SPTAPE DUMP and SPXTAPE DUMP measurements, the system was prepared in advance by creating approximately 34,000 reader files on four 3390 spool volumes. The reader files ranged in size from 100 to 600,000 80-byte records and occupied, on average, 10.4 spool pages. The distribution of sizes was chosen so as to approximate the distribution of spool file sizes that was observed on a local CMS-intensive production system. The reader files were created concurrently by 8000 autologged users. SPTAPE or SPXTAPE was then used to dump these spool files to one or more 3490 tape drives.

For the SPTAPE and SPXTAPE LOAD measurements, the spool files were first purged from these same four spool volumes. The SPTAPE LOAD measurement was done by loading the tapes that had been created by SPTAPE DUMP. The SPXTAPE LOAD measurement was done by loading the tapes that had been created by the 1-tape SPXTAPE DUMP.

All required tape cartridges were preloaded into the 3490 cartridge stacker so that tape change time was as short and consistent as possible. Tape change time was observed to require about 1 minute and was mainly due to the time required to rewind the tape.

For each measurement, the timestamps provided in the summary log file created by SPTAPE or SPXTAPE were used to calculate the elapsed time. Additional timestamps in the summary log file were used to calculate the amount of time during which a tape drive was idle because a tape cartridge was being changed (Tape Change Time) and the amount of time this caused the system to be idle (System Idle Time).

RTM VM/ESA was active for some of the measurements. In those cases, the results table (Table 1) also includes processor time and processor utilization data.

Table 1. SPTAPE and SPXTAPE results -- dedicated 9121-742
 
Command                              SPTAPE  SPXTAPE  SPXTAPE  SPXTAPE   SPTAPE  SPXTAPE
Function                               DUMP     DUMP     DUMP     DUMP     LOAD     LOAD
Tape Drives                               1        1        2        4        1        1
 
Elapsed Time                        3:24:39    22:42    19:27    18:55  2:09:32  1:07:42
Elapsed Time (sec)                    12279     1362     1167     1135     7772     4062
Elapsed Time Ratio                    1.000    0.111    0.095    0.092    1.000     0.52
Rate Ratio                             1.00     9.02    10.52    10.82     1.00     1.91
 
Tape Change Time (sec)                  653      371      309      247      671      371
System Idle Time (sec)                  653      230        0        0      671        0
Overlapped Tape Change Time (sec)         0      141      309      247        0      371
Percentage Overlap                        0       38      100      100        0      100
 
Tapes                                    12        7        7        8       12        7
Percentage Reduction                               42       42       33                42
 
Processor time (sec)                                        79                       902
Processor utilization (%) (1)                              1.7                       5.6

Elapsed Time Ratio was calculated as SPXTAPE Elapsed Time divided by SPTAPE Elapsed Time. Rate Ratio was calculated as the reciprocal of Elapsed Time Ratio. Rate Ratio represents how many times faster SPXTAPE ran relative to the corresponding SPTAPE measurement.

Overlapped Tape Change Time was calculated by subtracting System Idle Time from Tape Change Time. Percentage Overlap was calculated as 100% * Overlapped Tape Change Time divided by Tape Change Time. This represents the percentage of time that changing tape cartridges was overlapped with other processing.
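
As a concrete check, the following Python sketch (illustrative, not part of the original report) recomputes these derived metrics from the raw Table 1 values for the 1-tape DUMP case:

  # Recompute Table 1's derived metrics for the 1-tape SPXTAPE DUMP.
  sptape_elapsed  = 12279    # SPTAPE DUMP elapsed time (sec)
  spxtape_elapsed = 1362     # SPXTAPE DUMP elapsed time (sec)
  tape_change     = 371      # Tape Change Time (sec)
  system_idle     = 230      # System Idle Time (sec)

  elapsed_time_ratio = spxtape_elapsed / sptape_elapsed       # 0.111
  rate_ratio         = 1 / elapsed_time_ratio                 # 9.02
  overlapped_change  = tape_change - system_idle              # 141
  pct_overlap        = 100 * overlapped_change / tape_change  # 38

  print(elapsed_time_ratio, rate_ratio, overlapped_change, pct_overlap)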

As shown in the above table, SPXTAPE was 9.0 to 10.8 times faster than SPTAPE for dumping spool files for the measured cases. The 2-tape and 4-tape cases were somewhat better than the 1-tape case because tape changes were completely overlapped with other processing. SPXTAPE was 1.9 times faster than SPTAPE for loading spool files for the measured 1-tape case.

The number of spool volumes is an important variable in determining how long it will take SPXTAPE to DUMP or LOAD a set of spool files. The more spool volumes there are, the faster SPXTAPE will tend to run because the spool I/Os are done in parallel across a larger number of devices. By contrast, SPTAPE only works with one spool volume at a time. Consequently, the performance relationship between SPTAPE and SPXTAPE will depend, among other things, upon the number of spool volumes. Four spool volumes were used in this test and resulted in roughly a 10:1 elapsed time improvement for dumping spool files. Cases where fewer spool volumes are involved will tend to see a smaller improvement than this, while cases with more spool volumes will tend to see a larger improvement.

SPXTAPE DUMP time is also sensitive to the distribution of spool file blocks across the spool volumes. This was illustrated by a 2-tape SPXTAPE DUMP of the same 34,000 files after they had been rebuilt on the 4 spool volumes by restoring from the SPTAPE dump tapes. Even though the files were the same, the elapsed time was 70% longer than the 1167 seconds shown in Table 1 because those files were less favorably distributed across the spool volumes.

One of the ways in which SPXTAPE provides better performance relative to SPTAPE is by reducing the amount of time the system is idle waiting for a tape to be changed. This is shown by the reduction in System Idle Time in the results table. Three factors contribute to this improvement:

  1. SPXTAPE uses fewer tapes, resulting in fewer tape changes.

  2. Unlike SPTAPE, SPXTAPE supports the use of multiple tape drives. When two or more tape drives are made available, SPXTAPE is able to continue processing with another tape while one tape is being changed.

  3. SPXTAPE uses available real memory to buffer data. For example, during an SPXTAPE DUMP to one tape drive, when that tape drive becomes temporarily unavailable due to a tape cartridge change, SPXTAPE continues to read spool files from DASD and stages them in real storage. This buffering continues until there is no more available real storage or the tape drive becomes ready again.

    SPXTAPE looks at system paging indicators to determine how much of the system's total real storage it can use. In these dedicated measurements, SPXTAPE was able to use large amounts of real storage for this buffering, resulting in substantial overlapping of processing with tape changes. For the 1-tape DUMP, it was able to overlap 38% of the tape change time. For the 1-tape LOAD, it overlapped all of the tape change time. The degree of overlap would be much less, of course, on systems with less real memory or in cases where there is concurrent activity that has significant memory requirements.
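
To make the buffering effect concrete, here is a minimal Python sketch. It is a simplified model, not SPXTAPE's actual algorithm: the buffer capacity is an assumed figure, and the read rate approximates the roughly 300 pages/second dedicated dump rate reported later in this document.

  # Simplified model of overlapping a tape change with spool reading.
  # Assumed figures: buffer_pages is hypothetical; read_rate approximates
  # the ~300 pages/sec dedicated dump rate observed in these measurements.
  read_rate    = 300      # spool pages/sec read from DASD
  change_time  = 60       # seconds to change a tape cartridge
  buffer_pages = 12000    # spool pages that can be staged in real storage

  # Reading continues during the change until the buffer fills.
  overlap = min(change_time, buffer_pages / read_rate)
  idle = change_time - overlap
  print(f"{overlap:.0f}s of a {change_time}s tape change overlapped; "
        f"{idle:.0f}s system idle")

With these assumed values, 40 of the 60 seconds are overlapped; a larger buffer or a slower effective read rate pushes the overlap toward 100%, which is what the 1-tape LOAD measurement achieved.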

The results show that SPXTAPE elapsed time was relatively insensitive to the number of tape drives that were used. Elapsed time decreased by 14% when going from 1 tape to 2 tapes. There was very little further elapsed time reduction when going from 2 tapes to 4 tapes. This is because, with only four spool volumes, the limiting factor tended to be the time required to transfer the spool files from/to DASD.

SPXTAPE DUMP and LOAD processing results in low processor utilization. Even though the I/O processing is heavily overlapped, the SPXTAPE DUMP and LOAD functions are I/O-bound.

The SPXTAPE LOAD measurement was run with default options in effect. SPXTAPE offers a NODUP option. If selected, SPXTAPE ensures that each loaded spool file is not a duplicate of any spool file that is already on the system. Use of this option can increase processing requirements significantly because each incoming spool file must be checked against all existing spool files.

SPTAPE writes a tape mark between backed-up spool files. The smaller the files, the more tape space is taken up by these tape marks. SPXTAPE instead writes the spool file data as one tape file consisting of 32KB blocks. This reduces the number of tape volumes required to hold the spool files. The smaller the average spool file size, the larger the reduction relative to SPTAPE. For the measured environment, the average spool file was 10.4 pages and a 42% reduction in tapes was observed (1-tape case).
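
For the 1-tape case, this is the reduction from the 12 tapes required by SPTAPE to the 7 required by SPXTAPE: (12 - 7) / 12 = 42%.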

One disadvantage of using multiple tape drives with SPXTAPE is that it can increase the number of tapes required. For example, the SPXTAPE DUMP to 1 tape drive required 7 tapes, while the SPXTAPE DUMP to 4 tape drives required 8 tapes. Using "n" tape drives means that there are "n" partially filled tapes when the dump has completed. Because of this, it is better to use no more tape drives than is necessary to keep DASD I/O as the limiting factor, with a minimum of two drives in order to get the tape change overlap benefits. For the measured case, 2 tape drives was a good number. For a case where there are far more spool volumes, using more than two tape drives can be beneficial. The suggested rule of thumb, sketched below, is one tape drive for every 4 to 6 spool volumes, with a minimum of two.
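
As a rough illustration (not from the report), the rule of thumb can be expressed as a small Python helper; the divisor of 5 is an assumed midpoint of the suggested 4-to-6 range:

  # Hypothetical helper encoding the rule of thumb above:
  # one tape drive per 4-6 spool volumes, with a minimum of two drives.
  import math

  def suggested_tape_drives(spool_volumes, volumes_per_drive=5):
      return max(2, math.ceil(spool_volumes / volumes_per_drive))

  print(suggested_tape_drives(4))    # 2 -- the measured 4-volume configuration
  print(suggested_tape_drives(24))   # 5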

SPXTAPE will tend to perform especially well relative to SPTAPE when one or more of the following conditions apply:

  • The spool files are spread across many spool volumes.

  • Two tape drives are used (or more, if appropriate).

  • Tape change times are long.

  • The spool files are small.

Interaction with Concurrent Activity

A measurement was done to explore the degree to which SPXTAPE DUMP can affect the response times of concurrent CMS interactive work and the degree to which that activity can affect the elapsed time required to complete the SPXTAPE DUMP function.

Workload: FS8F0R + SPXTAPE DUMP 

Hardware Configuration 

Processor model:            9121-742
Processors used:            4
Storage:
    Real:                   1024MB
    Expanded:               1024MB
Tapes:                      3480 (Monitor)
                            3490 (2 tapes, used by SPXTAPE)
 
DASD:
 
 
Type of   Control  Number                  - Number of Volumes -
 DASD       Unit  of Paths PAGE    SPOOL    TDSK    User   Server   System
3390-2    3990-3     4       6        7       7    32 R                2 R
3390-2    3990-2     4      16        6       6
 
Note:  R or W next to the DASD counts means basic cache enabled or DASD
fast write (and basic cache) enabled, respectively.
 
 
Note:  The spool files backed up by SPXTAPE were all contained on 4 of the
spool volumes behind a 3990-2 control unit.
 
 
Communications:
                                             Lines per
Control Unit                  Number      Control Unit               Speed
3088                              1                NA                4.5MB
 
Software Configuration
 
Driver:                     TPNS
Think time distribution:    Bactrian
CMS block size:             4KB
 
 
Virtual Machines:
 
Virtual                           Machine
Machine     Number   Type         Size/Mode  SHARE   RESERVED  Other Options
OPERATOR        1    SPXTAPE      24MB/XA     100
SMART           1    RTM          16MB/370     3%       500    QUICKDSP ON
VSCSn           3    VSCS         64MB/XA   10000      1200    QUICKDSP ON
VTAMXA          1    VTAM/VSCS    64MB/XA   10000       550    QUICKDSP ON
WRITER          1    CP monitor   2MB/XA      100              QUICKDSP ON
Unnnn        5500    Users        3MB/XC      100

Additional Information 

Spool files backed up:      26,000
Spool pages backed up:      264,000
Average pages per file:     10.2
Tapes required:             6

Measurement Discussion  The 34,000 spool files were created on the system in advance, as described in "Dedicated Measurements". The same four spool volumes were used. The system was then configured for a standard FS8F CMS-intensive workload measurement. This configuration had 9 additional spool volumes, for a total of 13 spool volumes. These additional spool volumes accommodate the spool activity generated by the FS8F workload.

The stabilization period for a 5500-user FS8F measurement was allowed to complete. An SPXTAPE DUMP to two 3490 tape drives was then started and remained active during the entire time that an FS8F measurement was obtained. A subset (26,000) of the 34,000 spool files was dumped. The FS8F measurement interval is usually 30 minutes for 9121-742 measurements. However, in this case, the measurement interval was ended when the SPXTAPE DUMP command completed, resulting in an FS8F measurement interval of about 15 minutes. (2)

Table 2 compares the results of this measurement to a corresponding measurement without SPXTAPE activity. The results show the impact of the SPXTAPE DUMP activity on the performance of the overall system. Average external response time (AVG LAST(T)) increased from 0.3 seconds to about 1.7 seconds when the SPXTAPE DUMP was active. This increase was mainly due to I/O contention on the spool volumes. SPXTAPE processor and real storage usage were not significant factors. In both cases, they were a small percentage of the total system resources.

The rate at which SPXTAPE was able to dump spool pages decreased from 303 pages/second in the dedicated case to 237 pages/second when the system was running at 90% processor utilization with the FS8F workload, a 22% decrease. This decrease is primarily due to contention with the CMS users for processor time and the spool volumes.
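
(These rates are consistent with the other figures reported: from Table 1, the dedicated 2-tape DUMP moved 353,000 pages in 1167 seconds, or 303 pages/second, and at 237 pages/second the 264,000-page dump under load took about 1114 seconds.)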

Table 2. Performance interactions between SPXTAPE DUMP and concurrent CMS users
 
SPXTAPE DUMP                      NO           YES
Release                   VM/ESA 2.2    VM/ESA 2.2
Run ID                      S47E5504      S47E5506    Difference   %Difference
 
Environment
  Real Storage                1024MB        1024MB
  Exp. Storage                1024MB        1024MB
  Users                         5500          5500
  VTAMs                            1             1
  VSCSs                            3             3
  Processors                       4             4
 
Response Time
  TRIV INT                     0.110         0.079        -0.031       -28.18%
  NONTRIV INT                  0.337         0.466         0.129        38.28%
  TOT INT                      0.250         0.292         0.042        16.80%
  TOT INT ADJ                  0.233         0.341         0.108        46.53%
  AVG FIRST (T)                0.231         0.531         0.300       130.11%
  AVG LAST (T)                 0.308         1.661         1.353       439.88%
 
Throughput
  AVG THINK (T)                26.07         21.57         -4.50       -17.28%
  ETR                         179.76        218.37         38.61        21.48%
  ETR (T)                     193.12        187.01         -6.12        -3.17%
  ETR RATIO                    0.931         1.168         0.237        25.45%
  ITR (H)                     218.42        213.07         -5.36        -2.45%
  ITR                          50.91         62.29         11.38        22.35%
  EMUL ITR                     76.29         95.86         19.56        25.64%
  ITRR (H)                     1.000         0.975        -0.025        -2.45%
  ITRR                         1.000         1.224         0.224        22.35%
 
Proc. Usage
  PBT/CMD (H)                 18.313        18.773         0.461         2.51%
  PBT/CMD                     18.279        18.769         0.491         2.69%
  CP/CMD (H)                   6.490         6.994         0.505         7.78%
  CP/CMD                       6.058         6.577         0.519         8.57%
  EMUL/CMD (H)                11.823        11.779        -0.044        -0.37%
  EMUL/CMD                    12.220        12.192        -0.028        -0.23%
 
Processor Util.
  TOTAL (H)                   353.66        351.08         -2.59        -0.73%
  TOTAL                       353.00        351.00         -2.00        -0.57%
  UTIL/PROC (H)                88.42         87.77         -0.65        -0.73%
  UTIL/PROC                    88.25         87.75         -0.50        -0.57%
  TOTAL EMUL (H)              228.33        220.28         -8.06        -3.53%
  TOTAL EMUL                  236.00        228.00         -8.00        -3.39%
  MASTER TOTAL (H)             90.85         89.80         -1.06        -1.16%
  MASTER TOTAL                 91.00         90.00         -1.00        -1.10%
  MASTER EMUL (H)              38.95         36.42         -2.53        -6.50%
  MASTER EMUL                  40.00         38.00         -2.00        -5.00%
  TVR (H)                       1.55          1.59          0.04         2.90%
  TVR                           1.50          1.54          0.04         2.92%
 
Storage
  NUCLEUS SIZE (V)            2540KB        2544KB           4KB         0.16%
  TRACE TABLE (V)              800KB         800KB           0KB         0.00%
  WKSET (V)                       72            72             0         0.00%
  PGBLPGS                       233K          230K           -3K        -1.29%
  PGBLPGS/USER                  42.4          41.8          -0.5        -1.29%
  FREEPGS                      15585         18180          2595        16.65%
  FREE UTIL                     0.92          0.90         -0.02        -2.03%
  SHRPGS                        1765          1605          -160        -9.07%
 
Paging
  READS/SEC                      623           502          -121       -19.42%
  WRITES/SEC                     455           453            -2        -0.44%
  PAGE/CMD                     5.582         5.107        -0.475        -8.51%
  PAGE IO RATE (V)           162.300       158.300        -4.000        -2.46%
  PAGE IO/CMD (V)              0.840         0.846         0.006         0.73%
  XSTOR IN/SEC                   828           797           -31        -3.74%
  XSTOR OUT/SEC                 1409          1302          -107        -7.59%
  XSTOR/CMD                   11.583        11.224        -0.359        -3.10%
  FAST CLR/CMD                 8.813         8.545        -0.268        -3.04%
 
Queues
  DISPATCH LIST               102.55        126.30         23.74        23.15%
  ELIGIBLE LIST                 0.02          0.00         -0.02      -100.00%
 
I/O
  VIO RATE                      1806          1772           -34        -1.88%
  VIO/CMD                      9.352         9.476         0.124         1.33%
  RIO RATE (V)                   547           793           246        44.97%
  RIO/CMD (V)                  2.832         4.240         1.408        49.71%
  NONPAGE RIO/CMD (V)          1.992         3.394         1.402        70.38%
  DASD RESP TIME (V)          19.900        18.400        -1.500        -7.54%
  MDC READS (I/Os)               557           545           -12        -2.15%
  MDC WRITES (I/Os)               26            26             0         0.00%
  MDC AVOID                      515           505           -10        -1.94%
  MDC HIT RATIO                 0.91          0.91          0.00         0.00%
 
PRIVOPs
  PRIVOP/CMD                  20.570        20.465        -0.104        -0.51%
  DIAG/CMD                    25.324        25.411         0.087         0.34%
  DIAG 04/CMD                  0.948         0.988         0.041         4.30%
  DIAG 08/CMD                  0.738         0.685        -0.053        -7.17%
  DIAG 0C/CMD                  1.125         1.106        -0.019        -1.68%
  DIAG 14/CMD                  0.025         0.025         0.000         0.99%
  DIAG 58/CMD                  1.249         1.261         0.012         0.92%
  DIAG 98/CMD                  0.345         0.355         0.010         2.96%
  DIAG A4/CMD                  3.592         3.648         0.056         1.56%
  DIAG A8/CMD                  2.824         2.910         0.086         3.06%
  DIAG 214/CMD                13.333        13.292        -0.041        -0.31%
  SIE/CMD                     56.959        58.821         1.863         3.27%
  SIE INTCPT/CMD              38.162        38.822         0.660         1.73%
  FREE TOTL/CMD               44.821        45.480         0.658         1.47%
 
VTAM Machines
  WKSET (V)                     4137          4159            22         0.53%
  TOT CPU/CMD (V)             2.7703        2.7883        0.0180         0.65%
  CP CPU/CMD (V)              1.2312        1.2668        0.0356         2.89%
  VIRT CPU/CMD (V)            1.5390        1.5215       -0.0175        -1.14%
  DIAG 98/CMD (V)              0.345         0.356         0.012         3.40%

Note: T=TPNS, V=VMPRF, H=Hardware Monitor, Unmarked=RTM

Footnotes:

(1) In the table, processor utilization is calculated as processor time divided by total processing capacity, where total processing capacity is elapsed time * 4. Because the 9121-742 has 4 processors, its total capacity is 4 processor-seconds per elapsed second.
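For example, the 1-tape SPXTAPE LOAD used 902 processor seconds over an elapsed time of 4062 seconds: 902 / (4062 * 4) = 5.6% utilization.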

(2) This shorter run interval is also the reason for the apparent decrease in think time between the two measurements.
