SPXTAPE Performance
Last updated 18 January 1999
The following is an excerpt from VM/ESA Release 2.2 Performance
Report (GC24-5673-01).
The SPXTAPE CP command is provided in VM/ESA Release 2.2 as a
high-performance alternative to the SPTAPE command. (Note that SPTAPE
is no longer supported.) These commands
are primarily used to dump and load spool files to/from tape. See
for a discussion of the techniques SPXTAPE uses
to achieve this improved performance.
A set of measurements was collected in order to assess the
performance of the SPXTAPE command. Most of these measurements were
done in a dedicated environment (no other system activity) and were
used to compare SPXTAPE to the SPTAPE command. The primary metric
was the elapsed time required to process a given number of
spool files.
One of the SPXTAPE DUMP cases was then repeated while the
system was running the FS8F CMS-intensive workload at 90%
processor utilization in order to observe the interactions between
SPXTAPE and concurrent system activity.
Workload: SPTAPE or SPXTAPE:
Hardware Configuration:
Processor model: 9121-742
Storage:
Real: 1024MB
Expanded: 1024MB
Tapes: 3490 (1, 2, or 4 used)
DASD:
 Type of   Control   Number            - Number of Volumes -
 DASD      Unit      of Paths   PAGE   SPOOL   TDSK   User   Server   System
 3390-2    3990-2    4          6      4 R                            2 R
Note: R or W next to the DASD counts means basic cache enabled or DASD
fast write (and basic cache) enabled, respectively.
Software Configuration
 Virtual                        Machine
 Machine    Number   Type       Size/Mode   SHARE   RESERVED   Other Options
 OPERATOR   1        SPXTAPE    24MB/XA     100
 SMART      1        RTM        16MB/370    3%      500        QUICKDSP ON
Note: RTM was only active for certain measurements (see results).
Additional Information:
Spool files dumped/loaded: 34,000
Spool pages dumped/loaded: 353,000
Average pages per file: 10.4
Measurement Discussion:
For the SPTAPE DUMP and SPXTAPE DUMP measurements, the system was
prepared in advance by creating (approximately) 34,000 reader files
on four 3390 spool volumes. The reader files ranged in size from
100 to 600,000 80-byte records and occupied, on average, 10.4 spool
pages. The distribution of sizes was chosen so as to approximate
the distribution of spool file sizes that was observed on a local
CMS-intensive production system. The reader files were created
concurrently by 8000 autologged users. SPTAPE or SPXTAPE was
then used to dump these spool files to one or more 3490 tape drives.
For the SPTAPE and SPXTAPE LOAD measurements, the spool files were
first purged from these same four spool volumes. The SPTAPE LOAD
measurement was done by loading the tapes that had been created
by SPTAPE DUMP. The SPXTAPE LOAD measurement was done by loading the
tapes that had been created by the 1-tape SPXTAPE DUMP.
All required tape cartridges were preloaded into the 3490 cartridge
stacker so that tape change time was as short and consistent as
possible. Tape change time was observed to require about 1 minute and
was mainly due to the time required to rewind the tape.
For each measurement, the timestamps provided in the summary log file
created by SPTAPE or SPXTAPE were used to calculate the elapsed
time. Additional timestamps in the summary log file were used to
calculate the amount of time during which a tape drive was idle because
a tape cartridge was being changed (Tape Change Time) and the amount
of time this caused the system to be idle (System Idle Time).
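The elapsed times in Table 1 appear both as clock-style durations and as seconds. As a minimal sketch of that conversion only (the layout of the summary log file is not shown here, and the function name `elapsed_seconds` is a hypothetical helper, not part of SPTAPE or SPXTAPE), the two forms can be related as follows:

```python
def elapsed_seconds(text: str) -> int:
    """Convert a duration written as h:mm:ss or mm:ss into seconds."""
    seconds = 0
    for part in text.split(":"):
        # Each additional field shifts the running total by a factor of 60.
        seconds = seconds * 60 + int(part)
    return seconds


# Durations as printed in Table 1.
for label, text in [
    ("SPTAPE DUMP, 1 tape drive", "3:24:39"),
    ("SPXTAPE DUMP, 1 tape drive", "22:42"),
]:
    print(f"{label}: {text} = {elapsed_seconds(text)} sec")
```

The same helper reproduces the other Elapsed Time (sec) entries in Table 1 from their clock-style forms.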
RTM VM/ESA was active for some of the measurements. In those
cases, the results table (Table 1) also includes
processor time and processor utilization data.
Table 1. SPTAPE and SPXTAPE results -- dedicated 9121-742

Command                               SPTAPE  SPXTAPE  SPXTAPE  SPXTAPE   SPTAPE  SPXTAPE
Function                                DUMP     DUMP     DUMP     DUMP     LOAD     LOAD
Tape Drives                                1        1        2        4        1        1

Elapsed Time                         3:24:39    22:42    19:27    18:55  2:09:32  1:07:42
Elapsed Time (sec)                     12279     1362     1167     1135     7772     4062
Elapsed Time Ratio                     1.000    0.111    0.095    0.092    1.000     0.52
Rate Ratio                              1.00     9.02    10.52    10.82     1.00     1.91

Tape Change Time (sec)                   653      371      309      247      671      371
System Idle Time (sec)                   653      230        0        0      671        0
Overlapped Tape Change Time (sec)          0      141      309      247        0      371
Percentage Overlap                         0       38      100      100        0      100

Tapes                                     12        7        7        8       12        7
Percentage Reduction                               42       42       33                42

Processor time (sec)                                        79                        902
Processor utilization (%) [1]                              1.7                        5.6

Elapsed Time Ratio was calculated as SPXTAPE
Elapsed Time divided by SPTAPE Elapsed Time.
Rate Ratio was calculated as the reciprocal of
Elapsed Time Ratio. Rate Ratio represents
how many times faster SPXTAPE ran relative to the corresponding
SPTAPE measurement.
Overlapped Tape Change Time was calculated by
subtracting System Idle Time from Tape Change
Time. Percentage Overlap was calculated as 100% *
Overlapped Tape Change Time divided by Tape Change
Time. This represents the percentage of time that changing
tape cartridges was overlapped with other processing.
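The derived rows of Table 1 follow directly from the definitions above. The sketch below recomputes them for the 1-drive SPXTAPE DUMP against the SPTAPE DUMP baseline; the class and function names are illustrative choices, and only the input numbers come from the table.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One column of Table 1 (all times in seconds)."""
    elapsed: int
    tape_change: int
    system_idle: int


def elapsed_time_ratio(spxtape: Measurement, sptape: Measurement) -> float:
    # SPXTAPE elapsed time divided by SPTAPE elapsed time.
    return spxtape.elapsed / sptape.elapsed


def rate_ratio(spxtape: Measurement, sptape: Measurement) -> float:
    # Reciprocal of the elapsed time ratio: how many times faster SPXTAPE ran.
    return 1.0 / elapsed_time_ratio(spxtape, sptape)


def overlapped_tape_change(m: Measurement) -> int:
    # Tape change time that did not leave the system idle.
    return m.tape_change - m.system_idle


def percentage_overlap(m: Measurement) -> float:
    return 100.0 * overlapped_tape_change(m) / m.tape_change


sptape_dump = Measurement(elapsed=12279, tape_change=653, system_idle=653)
spxtape_dump_1 = Measurement(elapsed=1362, tape_change=371, system_idle=230)

print(round(elapsed_time_ratio(spxtape_dump_1, sptape_dump), 3))
print(round(rate_ratio(spxtape_dump_1, sptape_dump), 2))
print(overlapped_tape_change(spxtape_dump_1))
print(round(percentage_overlap(spxtape_dump_1)))
```

Substituting the other columns of Table 1 for `spxtape_dump_1` gives the remaining ratio and overlap entries.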
As shown in the above table, SPXTAPE was 9.0 to 10.8 times
faster than SPTAPE for dumping spool files for the measured cases.
The 2-tape and 4-tape cases were somewhat better than the 1-tape case
because tape changes were completely overlapped with other processing.
SPXTAPE was 1.9 times faster than SPTAPE for loading spool files
for the measured 1-tape case.
The number of spool volumes is an important variable in
determining how long it will take SPXTAPE to DUMP or LOAD a set of
spool files. The more spool volumes there are, the faster SPXTAPE
will tend to run because the spool I/Os are done in parallel across a
larger number of devices. By contrast, SPTAPE only works with one
spool volume at a time. Consequently, the performance relationship
between SPTAPE and SPXTAPE will depend, among other things, upon
the number of spool volumes. Four spool volumes were used in this
test and resulted in roughly a 10:1 elapsed time improvement for
dumping spool files. Cases where fewer spool volumes are involved
will tend to see a smaller improvement than this, while cases with
more spool volumes will tend to see a larger improvement.
SPXTAPE DUMP time is also sensitive to the distribution of spool
file blocks across the spool volumes. This was illustrated by the
results obtained from a 2-tape SPXTAPE DUMP of the same 34,000 files
that had been built onto the 4 spool volumes by restoring from the
SPTAPE dump tapes. Even though the files were the same, the elapsed
time was 70% longer than the 1167 seconds shown in Table 1 because those files were less favorably distributed
across the spool volumes.
One of the ways in which SPXTAPE provides better performance
relative to SPTAPE is by reducing the amount of time it is idle
waiting for a tape to be changed. This is shown by the reduction in
System Idle Time in the results table. Three factors
contribute to this improvement:
- SPXTAPE uses fewer tapes, resulting in fewer tape changes.
- Unlike SPTAPE, SPXTAPE supports the use of multiple tape
drives. When two or more tape drives are made available, SPXTAPE
is able to continue processing with another tape while one tape is
being changed.
- SPXTAPE uses available real memory to buffer data. For example,
during a SPXTAPE DUMP to one tape drive, when that tape drive becomes
temporarily unavailable due to a tape cartridge change, SPXTAPE
continues to read spool files from DASD and stages them in real
storage. This buffering continues until there is no more available
real storage or the tape drive becomes ready again.
SPXTAPE looks at system paging indicators to determine how much
of the system's total real storage it can use. In these dedicated
measurements, SPXTAPE was able to use large amounts of real storage
for this buffering, resulting in substantial overlapping of processing
with tape changes. For the 1-tape DUMP, it was able to overlap 38% of
the tape change time. For the 1-tape LOAD, it overlapped all of the
tape change time. The degree of overlap would be much less, of course,
on systems with less real memory or in cases where there is
concurrent activity that has significant memory requirements.
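The dependence of overlap on available real storage can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not a description of SPXTAPE internals: it assumes a single tape drive, a constant DASD read rate, and a fixed buffer size, and the function name and all numbers are hypothetical apart from the roughly 1-minute tape change time noted earlier.

```python
def overlapped_change_time(change_sec: float,
                           buffer_pages: float,
                           dasd_pages_per_sec: float) -> float:
    """Seconds of one tape change during which DASD reads can continue.

    Toy model: reading proceeds at a constant rate until either the tape
    drive is ready again or the buffer in real storage is full.
    """
    if dasd_pages_per_sec <= 0:
        raise ValueError("DASD read rate must be positive")
    time_to_fill_buffer = buffer_pages / dasd_pages_per_sec
    return min(change_sec, time_to_fill_buffer)


# Hypothetical numbers: a 60-second tape change, 300 pages/second read rate.
for buffer_pages in (7_200, 36_000):
    overlap = overlapped_change_time(60, buffer_pages, 300)
    print(f"buffer of {buffer_pages} pages: {overlap:.0f} of 60 sec overlapped")
```

In this model a small buffer fills before the tape change completes, so only part of the change is overlapped, while a large enough buffer hides the change entirely. That is the qualitative behavior described above for systems with less or more available real storage.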
The results show that SPXTAPE elapsed time was relatively
insensitive to the number of tape drives that were used. Elapsed
time decreased by 14% when going from 1 tape drive to 2 tape drives.
There was very little further elapsed time reduction (about 3%) when
going from 2 tape drives to 4 tape drives. This is because, with only
four spool volumes, the limiting factor tended to be the time required
to transfer the spool files from/to DASD.
SPXTAPE DUMP and LOAD processing result in
low processor utilizations (1.7% for the 2-drive DUMP and 5.6% for the
LOAD in Table 1). Even though the I/O processing is heavily
overlapped, the SPXTAPE DUMP and LOAD functions are I/O-bound.
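The processor utilization figures in Table 1 can be reproduced from the definition given in footnote 1: processor time divided by total processing capacity, where capacity is elapsed time multiplied by the 4 processors of the 9121-742. The function name below is an illustrative choice; the inputs are the processor times and elapsed times reported for the 2-drive SPXTAPE DUMP and the SPXTAPE LOAD.

```python
PROCESSORS = 4  # the 9121-742 has 4 processors (footnote 1)


def processor_utilization(processor_sec: float,
                          elapsed_sec: float,
                          processors: int = PROCESSORS) -> float:
    """Processor time as a percentage of total processing capacity."""
    capacity_sec = elapsed_sec * processors
    return 100.0 * processor_sec / capacity_sec


# SPXTAPE DUMP to 2 tape drives: 79 processor-seconds over 1167 seconds.
print(round(processor_utilization(79, 1167), 1))
# SPXTAPE LOAD from 1 tape drive: 902 processor-seconds over 4062 seconds.
print(round(processor_utilization(902, 4062), 1))
```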
The SPXTAPE LOAD measurement was run with default options in effect.
SPXTAPE offers a NODUP option. If selected, SPXTAPE ensures that
each loaded spool file is not a duplicate of any spool file that
is already on the system. Use of this option can increase processing
requirements significantly because each incoming spool file must be
checked against all existing spool files.
SPTAPE writes a tape mark between each backed up spool file. The
smaller the files, the more tape space is taken up by these tape
marks. SPXTAPE writes the spool file data as one tape file
consisting of 32K blocks. This reduces the number of tape volumes
required to hold the spool files. The smaller the average spool
file size, the larger the reduction relative to SPTAPE. For the
measured environment, the average spool file was 10.4 pages and a 42%
reduction in tapes was observed (1-tape case).
One disadvantage of using multiple tape drives with SPXTAPE is
that it can increase the number of tapes required. For example, the
SPXTAPE DUMP to 1 tape drive required 7 tapes, while the SPXTAPE
DUMP to 4 tape drives required 8 tapes. Using "n" tape drives
means that there are "n" partially filled tapes when the dump
has completed. Because of this, it is better to use no more tape
drives than is necessary to keep DASD I/O as the limiting factor,
with a minimum of two tape drives in order to get the tape change
overlap benefits. For the measured case, 2 tape drives was a good number.
For a case where there are far more spool volumes, using more than
two tape drives can be beneficial. The suggested rule-of-thumb is
one tape drive for every 4 to 6 spool volumes, with a minimum of
two.
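The rule-of-thumb can be written as a small helper. This is a sketch only: the function name is hypothetical, and rounding fractional results up to a whole drive is an assumption made here, since the text does not say how to round.

```python
import math


def suggested_tape_drives(spool_volumes: int) -> tuple:
    """Return (low, high) tape drive counts for a number of spool volumes.

    Applies "one tape drive for every 4 to 6 spool volumes, with a
    minimum of two". Rounding up is an assumption of this sketch.
    """
    if spool_volumes < 1:
        raise ValueError("need at least one spool volume")
    low = max(2, math.ceil(spool_volumes / 6))   # one drive per 6 volumes
    high = max(2, math.ceil(spool_volumes / 4))  # one drive per 4 volumes
    return low, high


for volumes in (4, 13, 24):
    low, high = suggested_tape_drives(volumes)
    print(f"{volumes} spool volumes: {low} to {high} tape drives")
```

For the four spool volumes measured here, the helper returns two tape drives, which matches the observation that 2 tape drives was a good number for this case.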
SPXTAPE will tend to perform especially well relative to SPTAPE
when one or more of the following conditions apply:
- The spool files are spread across many spool volumes.
- Two tape drives are used (or more, if appropriate).
- Tape change times are long.
- The spool files are small.
A measurement was done to explore the degree to which SPXTAPE
DUMP can affect the response times of concurrent CMS interactive
work and the degree to which that activity can affect the elapsed
time required to complete the SPXTAPE DUMP function.
Workload: FS8F0R + SPXTAPE DUMP:
Hardware Configuration:
Processor model: 9121-742
Processors used: 4
Storage:
Real: 1024MB
Expanded: 1024MB
Tapes: 3480 (Monitor)
3490 (2 tapes, used by SPXTAPE)
DASD:
 Type of   Control   Number            - Number of Volumes -
 DASD      Unit      of Paths   PAGE   SPOOL   TDSK   User   Server   System
 3390-2    3990-3    4          6      7       7      32 R            2 R
 3390-2    3990-2    4          16     6       6
Note: R or W next to the DASD counts means basic cache enabled or DASD
fast write (and basic cache) enabled, respectively.
Note: The spool files backed up by SPXTAPE were all contained on 4 of the
spool volumes behind a 3990-2 control unit.
Communications:
 Control                Lines per
 Unit         Number    Control Unit    Speed
 3088         1         NA              4.5MB
Software Configuration
Driver: TPNS
Think time distribution: Bactrian
CMS block size: 4KB
Virtual Machines:
 Virtual                         Machine
 Machine    Number   Type        Size/Mode   SHARE   RESERVED   Other Options
 OPERATOR   1        SPXTAPE     24MB/XA     100
 SMART      1        RTM         16MB/370    3%      500        QUICKDSP ON
 VSCSn      3        VSCS        64MB/XA     10000   1200       QUICKDSP ON
 VTAMXA     1        VTAM/VSCS   64MB/XA     10000   550        QUICKDSP ON
 WRITER     1        CP monitor  2MB/XA      100                QUICKDSP ON
 Unnnn      5500     Users       3MB/XC      100
Additional Information:
Spool files backed up: 26,000
Spool pages backed up: 264,000
Average pages per file: 10.2
Tapes required: 6
Measurement Discussion:
The 34,000 spool files were created on the system in advance, as
described in "Dedicated Measurements". The same four spool volumes were
used. The system was then configured for a standard FS8F
CMS-intensive workload measurement. This configuration had 9
additional spool volumes, for a total of 13 spool volumes. These
additional spool volumes accommodate the spool activity
generated by the FS8F workload.
The stabilization period for a 5500-user FS8F measurement was
allowed to complete. An SPXTAPE DUMP to two 3490 tape drives was then
started and remained active during the entire time that an FS8F
measurement was obtained. A subset (26,000) of the 34,000 spool
files was dumped. The FS8F measurement interval is usually 30
minutes for 9121-742 measurements. However, in this case, the
measurement interval was ended when the SPXTAPE DUMP command
completed, resulting in an FS8F measurement interval of about 15
minutes [2].
Table 2 compares the results of this measurement to
a corresponding measurement without SPXTAPE activity. The results
show the impact of the SPXTAPE DUMP activity on the performance of
the overall system. Average external response time (AVG LAST(T))
increased from 0.3 seconds to about 1.7 seconds when the SPXTAPE
DUMP was active. This increase was mainly due to I/O
contention on the spool volumes. SPXTAPE processor and real storage
usage were not significant factors. In both cases, they were a
small percentage of the total system resources.
The rate at which SPXTAPE was able to dump spool pages decreased
from 303 pages/second in the dedicated case to 237 pages/second when
the system was running at 90% processor utilization with the FS8F
workload, a 22% decrease. This decrease is primarily due to contention
with the CMS users for processor time and the spool volumes.
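The dump rates quoted above can be checked with simple arithmetic. In the sketch below, the dedicated rate is recomputed from the rounded page count (353,000) and the 2-drive elapsed time (1167 seconds) in the dedicated measurements, so it comes out slightly below the reported 303 pages/second; the function names are illustrative.

```python
def pages_per_second(pages: int, elapsed_sec: float) -> float:
    """Average spool pages dumped per second of elapsed time."""
    return pages / elapsed_sec


def percent_decrease(before: float, after: float) -> float:
    return 100.0 * (before - after) / before


# Dedicated 2-drive SPXTAPE DUMP: about 353,000 pages in 1167 seconds.
dedicated_rate = pages_per_second(353_000, 1167)
print(f"dedicated rate: about {dedicated_rate:.1f} pages/second")

# Reported rates: 303 pages/second dedicated, 237 with the FS8F workload.
print(f"decrease: about {percent_decrease(303, 237):.0f}%")
```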
Table 2. Performance interactions between SPXTAPE DUMP and concurrent CMS users

SPXTAPE DUMP                      NO         YES  Difference %Difference
Release                   VM/ESA 2.2  VM/ESA 2.2
Run ID                      S47E5504    S47E5506

Environment
  Real Storage                1024MB      1024MB
  Exp. Storage                1024MB      1024MB
  Users                         5500        5500
  VTAMs                            1           1
  VSCSs                            3           3
  Processors                       4           4

Response Time
  TRIV INT                     0.110       0.079      -0.031     -28.18%
  NONTRIV INT                  0.337       0.466       0.129      38.28%
  TOT INT                      0.250       0.292       0.042      16.80%
  TOT INT ADJ                  0.233       0.341       0.108      46.53%
  AVG FIRST (T)                0.231       0.531       0.300     130.11%
  AVG LAST (T)                 0.308       1.661       1.353     439.88%

Throughput
  AVG THINK (T)                26.07       21.57       -4.50     -17.28%
  ETR                         179.76      218.37       38.61      21.48%
  ETR (T)                     193.12      187.01       -6.12      -3.17%
  ETR RATIO                    0.931       1.168       0.237      25.45%
  ITR (H)                     218.42      213.07       -5.36      -2.45%
  ITR                          50.91       62.29       11.38      22.35%
  EMUL ITR                     76.29       95.86       19.56      25.64%
  ITRR (H)                     1.000       0.975      -0.025      -2.45%
  ITRR                         1.000       1.224       0.224      22.35%

Proc. Usage
  PBT/CMD (H)                 18.313      18.773       0.461       2.51%
  PBT/CMD                     18.279      18.769       0.491       2.69%
  CP/CMD (H)                   6.490       6.994       0.505       7.78%
  CP/CMD                       6.058       6.577       0.519       8.57%
  EMUL/CMD (H)                11.823      11.779      -0.044      -0.37%
  EMUL/CMD                    12.220      12.192      -0.028      -0.23%

Processor Util.
  TOTAL (H)                   353.66      351.08       -2.59      -0.73%
  TOTAL                       353.00      351.00       -2.00      -0.57%
  UTIL/PROC (H)                88.42       87.77       -0.65      -0.73%
  UTIL/PROC                    88.25       87.75       -0.50      -0.57%
  TOTAL EMUL (H)              228.33      220.28       -8.06      -3.53%
  TOTAL EMUL                  236.00      228.00       -8.00      -3.39%
  MASTER TOTAL (H)             90.85       89.80       -1.06      -1.16%
  MASTER TOTAL                 91.00       90.00       -1.00      -1.10%
  MASTER EMUL (H)              38.95       36.42       -2.53      -6.50%
  MASTER EMUL                  40.00       38.00       -2.00      -5.00%
  TVR(H)                        1.55        1.59        0.04       2.90%
  TVR                           1.50        1.54        0.04       2.92%

Storage
  NUCLEUS SIZE (V)            2540KB      2544KB         4KB       0.16%
  TRACE TABLE (V)              800KB       800KB         0KB       0.00%
  WKSET (V)                       72          72           0       0.00%
  PGBLPGS                       233K        230K         -3K      -1.29%
  PGBLPGS/USER                  42.4        41.8        -0.5      -1.29%
  FREEPGS                      15585       18180        2595      16.65%
  FREE UTIL                     0.92        0.90       -0.02      -2.03%
  SHRPGS                        1765        1605        -160      -9.07%

Paging
  READS/SEC                      623         502        -121     -19.42%
  WRITES/SEC                     455         453          -2      -0.44%
  PAGE/CMD                     5.582       5.107      -0.475      -8.51%
  PAGE IO RATE (V)           162.300     158.300      -4.000      -2.46%
  PAGE IO/CMD (V)              0.840       0.846       0.006       0.73%
  XSTOR IN/SEC                   828         797         -31      -3.74%
  XSTOR OUT/SEC                 1409        1302        -107      -7.59%
  XSTOR/CMD                   11.583      11.224      -0.359      -3.10%
  FAST CLR/CMD                 8.813       8.545      -0.268      -3.04%

Queues
  DISPATCH LIST               102.55      126.30       23.74      23.15%
  ELIGIBLE LIST                 0.02        0.00       -0.02    -100.00%

I/O
  VIO RATE                      1806        1772         -34      -1.88%
  VIO/CMD                      9.352       9.476       0.124       1.33%
  RIO RATE (V)                   547         793         246      44.97%
  RIO/CMD (V)                  2.832       4.240       1.408      49.71%
  NONPAGE RIO/CMD (V)          1.992       3.394       1.402      70.38%
  DASD RESP TIME (V)          19.900      18.400      -1.500      -7.54%
  MDC READS (I/Os)               557         545         -12      -2.15%
  MDC WRITES (I/Os)               26          26           0       0.00%
  MDC AVOID                      515         505         -10      -1.94%
  MDC HIT RATIO                 0.91        0.91        0.00       0.00%

PRIVOPs
  PRIVOP/CMD                  20.570      20.465      -0.104      -0.51%
  DIAG/CMD                    25.324      25.411       0.087       0.34%
  DIAG 04/CMD                  0.948       0.988       0.041       4.30%
  DIAG 08/CMD                  0.738       0.685      -0.053      -7.17%
  DIAG 0C/CMD                  1.125       1.106      -0.019      -1.68%
  DIAG 14/CMD                  0.025       0.025       0.000       0.99%
  DIAG 58/CMD                  1.249       1.261       0.012       0.92%
  DIAG 98/CMD                  0.345       0.355       0.010       2.96%
  DIAG A4/CMD                  3.592       3.648       0.056       1.56%
  DIAG A8/CMD                  2.824       2.910       0.086       3.06%
  DIAG 214/CMD                13.333      13.292      -0.041      -0.31%
  SIE/CMD                     56.959      58.821       1.863       3.27%
  SIE INTCPT/CMD              38.162      38.822       0.660       1.73%
  FREE TOTL/CMD               44.821      45.480       0.658       1.47%

VTAM Machines
  WKSET (V)                     4137        4159          22       0.53%
  TOT CPU/CMD (V)             2.7703      2.7883      0.0180       0.65%
  CP CPU/CMD (V)              1.2312      1.2668      0.0356       2.89%
  VIRT CPU/CMD (V)            1.5390      1.5215     -0.0175      -1.14%
  DIAG 98/CMD (V)              0.345       0.356       0.012       3.40%

Note: T=TPNS, V=VMPRF, H=Hardware Monitor, Unmarked=RTM

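The report does not define the Difference and %Difference columns of Table 2, but the values are consistent with the measurement with SPXTAPE DUMP active minus the measurement without it, expressed in absolute terms and as a percentage of the measurement without it. The sketch below works under that assumption; the function names are illustrative. Some table entries differ from this calculation in the last digits, presumably because they were computed from unrounded data.

```python
def difference(without_dump: float, with_dump: float) -> float:
    """Absolute change when SPXTAPE DUMP is active (assumed definition)."""
    return with_dump - without_dump


def percent_difference(without_dump: float, with_dump: float) -> float:
    """Change as a percentage of the value without SPXTAPE DUMP (assumed)."""
    return 100.0 * difference(without_dump, with_dump) / without_dump


# TRIV INT response time: 0.110 without the dump, 0.079 with it.
print(round(difference(0.110, 0.079), 3))
print(round(percent_difference(0.110, 0.079), 2))

# RIO RATE (V): 547 without the dump, 793 with it.
print(difference(547, 793))
print(round(percent_difference(547, 793), 2))
```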
Footnotes:
[1] In the table, processor utilization is calculated as processor time
    divided by total processing capacity, where total processing
    capacity is elapsed time * 4. Because the 9121-742 has 4
    processors, its total capacity is 4 processor-seconds per elapsed
    second.
[2] This shorter run interval is also the reason for the apparent
    decrease in think time between the two measurements.