VM Data Spaces
The following article was written by Kris Buelens and Guy De Ceulaer
for their VM Newsletter. Their customers are fortunate to have such
talented and dedicated support personnel.
So in order to allow the rest of the world to share in the benefits,
the article has been incorporated here into the VM Home Page. If you
see any formatting errors, let
me know, since they are most likely mine.
The term Data In Memory is often abbreviated to
DIM.
VM provides some "Data In Memory Techniques" (such as
Minidisk Cache and Virtual Disks in Storage),
and exploits some others (such as
Dataspaces and DASD controller caching).
Problems can be:
- Which DIM technique to use?
- You can find information to help with this decision
in VM/ESA Data in Memory Techniques
and one of Bill Bitner's presentations
VM/ESA Data in Memory
Techniques
- How to interpret "page rate" when using VM dataspaces?
- Simply looking at the page rate returned by CP INDICATE LOAD is completely
misleading. This topic explains why.
To understand why pagerates can be misleading, we must first cover:
- What are VM Dataspaces ?
- Primary Address Space
- Data Spaces
- IO Performed to Read CMS Files
- Reading CMS Files Stored on Minidisks
- Reading a Normal SFS File
- Reading an SFS File Mapped in a Dataspace
- DataSpace Usage by DB2/VM(1)
- Page Rate Reported for DB2/VM
- Performance monitors
A dataspace is similar to the primary address space of any VM user:
it is virtual storage that can be directly addressed by assembler
programs.
Like any virtual storage, it can be paged out by CP to
its page datasets on DASD. Whereas a primary address space can contain
executable programs and data, a dataspace only contains data.
With CP Q SPACES PERMITTED <userid> one can query which
address spaces one has access to.
When a user logs on, CP creates its primary address space, and it
even assigns it a data space name,
namely userid:BASE.
The size of the primary address space is defined in the CP directory
and can be
- queried with Query Virtual STORage
or with INDicate USER
- modified with DEFine STORage xxxM
Authorized users can create dataspaces. These dataspaces can be
shared or private, and
mapped or unmapped.
Dataspaces also get a name, for example VMSYSU:0000000016SFSUSR
- Shared
- A shared dataspace can be directly addressed by users that have
permission. Users can even tell CP that they want to make their
primary address space shareable. So there is no need to use APPC (or
any other protocol) to access the data that a server maintains in a
shared dataspace.
VM's Shared File System (SFS) uses shared dataspaces for
directories indicated by the SFS administrator. This gives a good
performance advantage for read-only data.
- Private
- A Private dataspace is simply extra virtual storage for the user
that creates it.
DB2/VM (formerly named SQL/DS) uses VM dataspaces in this way. SQL
end-users cannot directly access SQL data; they still use APPC to send
their request to the SQL server. (It is impossible for an SQL system
to work with shared dataspaces the way SFS does, as is explained below.)
- Mapped
- Each virtual storage page of a mapped dataspace is mapped to a 4K
block on a minidisk of the dataspace owner. When a pagefault
occurs, CP will read the data from the appropriate minidisk. Mapped
dataspaces are used by SFS and DB2/VM.
See the detailed SFS example below.
- Unmapped
- This is more like primary address spaces:
when a pagefault occurs,
CP will get the data from its paging datasets. Pageouts go to CP's
paging datasets as well.
Unmapped dataspaces may be used by DB2/VM for workspace (the so called
"Internal DBspaces").
In this section we want to explain where the I/O instructions are
counted when CMS files are read
(Access to SQL tables is mentioned later). The explanation is however
simplified a bit.
To manage its file system, CMS uses two important control blocks:
- Active Disk Table or ADT
This table has an entry for each letter of the alphabet. It contains
more or less the information displayed by Q DISK and Q ACCESSED. In
other words, it allows CMS to know that disk D, for example, corresponds
to minidisk 192, and disk Z with SFS directory
VMSYSU:HTTPD.WEBSHARE.KRIS.
- File Status Table or FST
This table has an entry for each file of a minidisk/directory. It
contains more or less the information displayed by FILELIST, and also a
pointer to where the file starts on the minidisk (in practice, the
pointer may point to an index of the file's records, but we will ignore
this here).
The ADT and FST are most used and affected by the ACCESS and RELEASE
commands.
When a CMS user starts a program that reads a file (for example
XEDIT MYFILE ONE D), CMS will consult its ADT to find on which
minidisk the file is stored and then the FST (File Status Table) to find
where the file lives on the minidisk. Then CMS asks CP to read the
appropriate file blocks from that minidisk.
CP places the data of the file in buffers provided by CMS
or by the program (XEDIT in our example).
Counters: The I/Os involved are issued by the end-user's
virtual machine and
are simply counted as I/O instructions in all performance data collectors,
for example in CP IND USER, RTM's DISPLAY USER,
or PerfKit's UPAGE subcommand.
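The lookup described above can be sketched as a toy model. This is a simplified Python illustration, not CMS code: the class and field names (`VirtualMachine`, `adt`, `fst`, `io_count`), the userid, the block numbers and the block contents are all assumptions made for the example. Only the pairing of disk D with minidisk 192 comes from the text, and the sketch counts one I/O per block, which real CMS need not do.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    """Toy end-user virtual machine; io_count stands in for CP's I/O counter."""
    userid: str
    io_count: int = 0
    # ADT: file mode letter -> minidisk address (hypothetical contents).
    adt: dict = field(default_factory=dict)
    # FST: (minidisk, file name, file type) -> list of block numbers.
    fst: dict = field(default_factory=dict)

    def read_file(self, fname, ftype, fmode, minidisks):
        """Resolve the file through ADT and FST, then read its blocks."""
        mdisk = self.adt[fmode]                   # which minidisk is disk D?
        blocks = self.fst[(mdisk, fname, ftype)]  # where does the file live?
        data = []
        for block in blocks:
            # Each block read is an I/O issued by this virtual machine,
            # so it is counted against this user.
            self.io_count += 1
            data.append(minidisks[mdisk][block])
        return b"".join(data)


# Hypothetical minidisk 192 holding a two-block file.
minidisks = {"192": {7: b"first block ", 8: b"second block"}}
user = VirtualMachine("KRIS")
user.adt["D"] = "192"
user.fst[("192", "MYFILE", "ONE")] = [7, 8]

content = user.read_file("MYFILE", "ONE", "D", minidisks)
print(content.decode(), "| I/Os counted for", user.userid, "=", user.io_count)
```

The point of the sketch is only where the counter lives: the virtual machine that issues the I/O is the one whose I/O count goes up.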
When CMS finds in the user's ADT that the file is
a "normal" SFS file, it will use APPC to send a ReadFile request to
the SFS server. The SFS server consults its catalog to find where the
file resides, and performs one or more I/O requests (using *BLOCKIO) to
read the file data from its minidisk(s) (a single SFS file can span
minidisks).
It then uses APPC to send the data back to the end-user, where CMS
receives the data in the buffers provided by the program (XEDIT in
our example).
Counters: Now it is no longer the end-user that performs the
I/O, but the SFS service machine. Hence the I/O counts for the end-user
are not incremented. CP counts the I/O's with the SFS server (CP cannot
know which end-user request resulted in the I/O instructions issued by
the SFS server).
Reading an SFS file residing in a dataspace is a bit more complex, but
faster. First we must explain how a dataspace is created for the
directory.
- Mapping
- When some user ACCESSes a directory, the SFS server checks if it is
eligible for a dataspace. When the user is the first one to access
that directory, SFS will define a new dataspace (for example
VMSERRUN:0000000016SFSRUN) and tell CP how it must be mapped. In
human words this means that, for each file in the directory, the SFS
server will assign a place in the dataspace, and tell CP where the data
of the files reside on its minidisks.
A basic, simplified, example:
- suppose the directory contains only two files. File 1 (MYFILE ONE)
has two datablocks, datablock 1 resides on block 102 of minidisk 423
and datablock 2 on block 326 of minidisk 525. File 2 (MYFILE TWO) has
one datablock, residing on block 103 of minidisk 423.
- SFS could assign dataspace pages 1 and 2 to the datablocks of
MYFILE ONE, and page 3 to MYFILE TWO. So it will tell CP that the
contents for page 1 of the dataspace are located on block 102 of
minidisk 423, page 2 is on block 326 of minidisk 525 and page 3 is on
block 103 of minidisk 423.
- SFS will also create the FST's corresponding to the 2 files in the
dataspace. The FST entry
for MYFILE ONE will point to pages 1 and 2 of the dataspace, the entry
for MYFILE TWO will point to page 3.
When all this is built, the SFS server tells the end-user accessing
this directory in which dataspace it can find everything. Obviously,
when a second user accesses the same directory, the mapping is already
done and SFS can simply tell the user which dataspace it can use.
Here ends the processing for the ACCESS command of a "dataspaced"
directory.
Note:
At this stage, the data for the files are not yet read into
central storage; they remain on the SFS minidisks until some user needs
them.
Note:
This also means that keeping at least one user ACCESSed to the
directory may improve performance. You could for example keep
the access in a service machine like VMUTIL.
- Using the files
- We saw that the SFS server constructed a dataspace to be used by the
accessors of the directory. Reading a file becomes simple.
Suppose again the user issues XEDIT MYFILE ONE ...
- XEDIT allocates buffers to work with the file, and asks CMS to read
the file.
- CMS finds out that MYFILE ONE is located in pages 1 and 2 of the
dataspace and simply executes a MOVE instruction to copy pages 1 and 2
in the buffer provided by XEDIT.
- If this is the first user to refer to file MYFILE ONE, a pagefault
for the dataspace pages will be the result when CMS starts to
copy page 1 (we mentioned above that when the directory is mapped,
the data for the files are not yet read into storage). To resolve
this page fault, CP's paging routines will see that page 1 is mapped to
minidisk 423 of the SFS server and so CP will pagein from the minidisk
instead of from its page datasets.
As CP uses a Block Paging algorithm, CP will probably not pagein
just a single page. If plenty of real storage is available, CP might
pagein many pages with one operation. The pages may or may not belong
to the same file. As the dataspace is shared
amongst all users of the directory, other users referring to the same
file may profit from the pagein caused by the first user.
Counters: It should be clear that neither the SFS server,
nor the end-user issues an I/O instruction. But, CP's paging
routines are called for the user that encountered the page-fault.
Hence here the
end-user's pageread numbers are incremented (e.g. CP IND USER)
as well as
the system-wide pagerate.
Note: To build the data space, the SFS server also performs
some I/O to read its catalog.
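The mapping and the counting described in this section can be put together in a small simulation. It is a minimal Python sketch under stated assumptions, not a model of CP internals: the class `MappedDataspace`, the userids `USERA`, `USERB` and `SFSSERV`, and the block contents are invented for the example, and block paging is left out (each pagefault brings in exactly one page). The page-to-block mapping is the one from the two-file example above.

```python
class MappedDataspace:
    """Toy model of a mapped dataspace shared by all accessors of a directory."""

    def __init__(self, mapping, minidisks):
        self.mapping = mapping      # page number -> (minidisk, block)
        self.minidisks = minidisks  # minidisk -> {block: data}
        self.resident = {}          # pages currently in storage

    def read_page(self, page, counters, userid):
        """Return a page; on a pagefault, 'CP' pages it in from the minidisk."""
        if page not in self.resident:
            mdisk, block = self.mapping[page]
            self.resident[page] = self.minidisks[mdisk][block]
            # Counted as a page read for the faulting user, not as an I/O.
            counters[userid]["page_reads"] += 1
        return self.resident[page]


# Mapping from the example in the text.
mapping = {1: ("423", 102), 2: ("525", 326), 3: ("423", 103)}
minidisks = {
    "423": {102: "ONE-1", 103: "TWO-1"},   # made-up block contents
    "525": {326: "ONE-2"},
}
fst = {"MYFILE ONE": [1, 2], "MYFILE TWO": [3]}  # file -> dataspace pages

space = MappedDataspace(mapping, minidisks)
counters = {u: {"io": 0, "page_reads": 0} for u in ("USERA", "USERB", "SFSSERV")}

# Two users read the same file, one after the other.
for userid in ("USERA", "USERB"):
    data = [space.read_page(p, counters, userid) for p in fst["MYFILE ONE"]]
    print(userid, data, counters[userid])
```

In this sketch the first reader takes the pagefaults and the second reader finds the pages already in storage. Nobody's I/O counter moves, which is why the file activity only shows up in the page read numbers.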
Above we mentioned that DB2/VM uses only private dataspaces. Indeed
requests to SFS files are simple :
"give me one or more records of a file".
SQL
requests are complex, and SQL has finer security (some users cannot see
all columns of a table for example). This makes the use of shared
dataspaces impossible or unacceptable.
We won't explain the mechanisms DB2/VM uses to optimize its
performance (such as page fault handshaking with CP). But, after having
read the section on SFS, it should be clear that when DB2/VM uses
dataspaces, the I/O counts for the DB2/VM server
will be much lower, and the reported page rate much higher.
A high pagerate now means either that the server works a lot with its
database, or that the system is too short on storage.
| DB2/VM Pagerate |
|---|
The page rate reported for DB2/VM can be split in a few classes:
- Page reads to the primary address space: This happens when a
page fault occurs in the buffers or the SQL code. During the
pagefault, the server is forced to wait. High numbers indicate a
problem. This can be detected by looking at the %page wait for
the server (as reported by VMPRF).
- Page writes from the primary address space: This happens when
CP decides that the server uses too much storage.
During the pageout, the
server is not forced to wait.
High numbers will probably result in high page read rates, and later in
high %page wait.
- Page reads to a dataspace: This happens when a page fault
occurs in the dataspaces of SQL; it corresponds to the I/O SQL would do
to its database without dataspaces.
During the pagefault, the server is not forced to wait; it may work on
other tasks.
- Page writes from a dataspace: This happens either when
SQL asks CP to commit changes on disk, or when CP needs room and
decides to pageout a changed page of SQL. The server is not forced to
wait.
|
The performance monitors cannot differentiate between page
rate to primary address spaces and to dataspaces.
We
have to repeat over and over again that interpreting the system's
"pagerate", as reported by CP INDICATE, RTM/ESA, or PerfKit,
must be done with care,
as it is no longer "true" paging: it includes database I/O and some
SFS I/O.
The VM lab is aware of this problem and may provide more detailed
reporting tools in the future.
RTM/ESA (known as SMART) gives one report (SYSDASD)
and Performance Toolkit (known as PerfKit)
gives a report (STORAGE)
that helps a bit.
On the other hand, CP IND SPACES USER xxx
can be issued to get actual counters (totals for page reads and writes
of each address space a user has, but that doesn't give a rate/second).
In the RTM/ESA SYSDASD report, data is only available for the last measurement
interval(2)
and the average since the last reset (by default:
8:00, 16:30 and 24:00). There is no time-based log available.
The report shows data about DASD access by CP, and distinguishes:
- I/O to CP's PAGE areas
- I/O to the SPOOL areas
- I/O to mapped minidisks
Here is a sample:
>>>> 16:40:35 D SYSDASD INT
<>VM/ESA CPU9672 SERIAL 017239 212M DATE 06/06/97 START 16:38:52 END 16:39:52<>
<--- DEVICE ----> <-PAGE -> <-SPOOL-> <--- RATE ---> USER <----- AVERAGE ------>
S DEV TYPE VOLSER RSEC WSEC RSEC WSEC TSEC NOIO ACIO INTF QLEN SERV MLOD BK %ALL
402E 3390 VCTPG2 5.45 2.98 0 0 8.43 0 3.55 0 0 2.06 2.06 10 50.0
502E 3390 VCTPG1 2.98 2.85 0 0 5.83 0 2.07 0 0 1.79 1.79 10 57.1
SUM/AVG-PAGE 2 8.43 5.83 0 0 14.3 0 5.62 0 0 1.93 1.93 10 53.8
501C 3390 VCTRES 0.02 0.15 0.07 0.27 0.50 0 1.62 2 0 8.34 8.34 0 100
502E 3390 VCTPG1 2.98 2.85 0 0 5.83 0 2.07 0 0 1.79 1.79 10 57.1
SUM/AVG-SPOOL 2 3.00 3.00 0.07 0.27 6.33 0 3.68 2 0 5.06 5.06 10 66.7
5020 3390 VCT003 1.62 2.83 0 0 4.45 0 25.7 136 8 27.0 13.5 0 0
5021 3390 VCT004 3.75 3.08 0 0 6.83 0 6.58 0 8 18.5 13.5 0 0
5022 3390 VCT005 1.17 1.97 0 0 3.13 0 24.2 117 0 22.4 13.5 0 0
5023 3390 VCT006 4.15 3.00 0 0 7.15 0 6.97 0 6 16.6 13.5 0 0
5024 3390 VCT007 3.50 3.37 0 0 6.87 0 6.10 0 18 24.0 13.5 0 0
5025 3390 VCT008 3.32 2.68 0 0 6.00 0 5.75 0 9 20.8 13.5 0 0
SUM/AVG-MDSK 6 17.5 16.9 0 0 34.4 0 75.3 253 4.90 21.6 13.5 0 0
SUM/AVG-ALL 9 26.0 22.9 0.07 0.27 49.2 0 82.6 255 4.90 15.7 10.3 10 60.0
The SUM/AVG-xxxx lines are important to give you an idea of
how the pagerate reported by, for example, INDICATE is to be split up.
The paging activity can indeed be separated in the 'pure CP' part
(SUM/AVG-PAGE) and the page I/O on mapped minidisks (SUM/AVG-MDSK).
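As a worked example of that split, the figures from the sample can be fed into a few lines of arithmetic. This is a sketch in Python; the dictionary layout and the helper name `share` are assumptions for illustration, while the numbers are the RSEC and WSEC values under the PAGE heading on the SUM/AVG lines above. Because volume 502E appears in both the PAGE and the SPOOL group, the SUM/AVG-ALL line is used as the total rather than a sum of the group lines.

```python
# Page rates (per second) copied from the SUM/AVG lines of the sample above.
page_rates = {
    "PAGE": {"reads": 8.43, "writes": 5.83},   # CP's own paging areas
    "MDSK": {"reads": 17.5, "writes": 16.9},   # mapped minidisks
    "ALL":  {"reads": 26.0, "writes": 22.9},   # everything CP paged
}


def share(group, kind):
    """Fraction of all page reads or writes that belongs to one group."""
    return page_rates[group][kind] / page_rates["ALL"][kind]


total_all = sum(page_rates["ALL"].values())
total_mdsk = sum(page_rates["MDSK"].values())

for kind in ("reads", "writes"):
    print(f"mapped-minidisk share of page {kind}: {share('MDSK', kind):.1%}")
print(f"mapped-minidisk share of all paging: {total_mdsk / total_all:.1%}")
```

For this interval, roughly 70% of what would be reported as paging is I/O to mapped minidisks, and only the remainder is paging to CP's own page areas.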
In the display sampled below (FCX109), PerfKit shows the paging rate for the last interval
on a device basis.
Here is a sample:
FCX109 Data for YYYY/MM/DD Interval HH:MM:SS - HH:MM:SS Monitor
Page / SPOOL Allocation Summary
PAGE slots available 1220016 SPOOL slots available 1220016
PAGE slot utilization 0% SPOOL slot utilization 47%
T-Disk cylinders avail. ....... DUMP slots available 0
T-Disk space utilization ...% DUMP slot utilization ..%
____ . . . . . . . .
< Device Descr. -> <------------- Rate/s ------------->
Volume Area Area Used <--Page---> <--Spool--> SSCH
Addr Devtyp Serial Type Extent % P-Rds P-Wrt S-Rds S-Wrt Total +RSCH
3000 3390-3 PERF1 PAGE 549- 648 0 .0 .0 ... ... ... ...
SPOOL 649- 748 91 ... ... .0 .0 .0 .1
3008 3390-3 PG401 PAGE 0-3338 0 .0 .0 ... ... .0 .1
3009 3390-3 PG402 PAGE 0-3338 0 .0 .0 ... ... .0 .1
300B 3390-3 SP401 SPOOL 0-3338 80 .0 .0 .0 .1 .1 .1
300C 3390-3 SP402 SPOOL 0-3338 12 .0 .0 .0 .0 .0 .1
In this display PerfKit shows the paging rate for shared dataspaces
only.
Here is a sample:
FCX134 Data for YYYY/MM/DD Interval HH:MM:SS - HH:MM:SS Monitor
______ . . . . . .
<--------- Rate per Sec. -------
Owning Users
Userid Data Space Name Permt Pgstl Pgrds Pgwrt X-rds X-wrt X-
>System< -------- 0 .000 .000 .000 .000 .000 .
SYSTEM FULL$TRACK$CACHE$1 0 .000 .000 .000 .000 .000 .
SYSTEM ISFCDATASPACE 0 .000 .000 .000 .000 .000 .
SYSTEM PTRM0000 0 .000 .000 .000 .000 .000 .
SYSTEM REAL 0 .000 .000 .000 .000 .000 .
SYSTEM SYSTEM 0 .000 .000 .000 .000 .000 .
SYSTEM VIRTUAL$FREE$STORAGE 0 .000 .000 .000 .000 .000 .
This report does not show private dataspaces.
One can indeed write an EXEC that issues an INDICATE SPACES USER
xyz every so often and calculates a rate per second.
The QDBPAG EXEC listed here does this for you.
| QDBPAG EXEC |
|---|
/* This exec is a sample to tailor to your needs.
(look for /*>>TAILOR<<*/ flags in the code).
It collects data to distinguish dataspace paging from other paging.
Very important to check on SQL/DS when it uses mostly dataspaces, as
then the database IO's are reported as paging.
We analyze INDICATE SPACES to find the real counts, and subtract the
counts of two executions of IND SPACES.
Note that we add Page Write and Page Migrations (the first is from
an address space to DASD, the second is from Xstore to DASD)
And, as we are counting these pagings, we also collect IO and %CPU
+-----------------------------------------------------------+
| format: | QDBPAG <SEND> |
+-----------------------------------------------------------+
Notes:
1. It is supposed that this exec runs for example in VMUTIL, where
it could be started every 5 minutes for example. You can freely
choose any interval you like.
The intermediate paging counters are stored in GLOBALV.
2. It is no problem for the exec that an SQL database would be restarted
from time to time. The exec will detect this and ignore the interval
in which a restart occurs. Similarly, the very first time this exec
runs, it cannot calculate any page rate per second as it misses data
to compare with. So, the first time the exec encounters a given SQL
database, it simply stores the paging counters in GLOBALV.
3. Also, every day the collected file should be sent to some other user
and VMUTIL would start a new file. This is achieved with the SEND
parameter.
The two lines for VMUTIL's WAKEUP PARMS file could be:
M-F +5 10:00:16 CMS EXEC QDBPAG
M-F 23:55:00 09/02/97 CMS EXEC QDBPAG SEND
Written by: Kris Buelens IBM Belgium; 31 Jul 1996*/
parse upper source . . myname mytype . syn addr .
if addr=? then signal SubPipe
address command
parse upper arg send .
If send='SEND' then signal send
signal on syntax
/*>>TAILOR<<*/
/* Find what userids run an SQL database */
'PIPE CP Q RESOURCE',
'| LOCATE /SQL/',
'| SPEC W7 1',
'| STEM USER.'
/*>>end TAILOR<<*/
sayit=(linesize()<>0) /* If running connected: show on console */
If Sayit then
Say left('',8) 'Secs PagRd DatRd PagWr DatWr IO CPU'
/*user.0=1 ;user.1='SQLIBS'*/
do i=1 to user.0
user=user.i
'PIPE (end ?)',
' CP IND SPACES USER' user|| '15'x || 'IND USER' user 'EXP',
'|T: Tolabel Userid='||, /* Send INDICATE data down */
'| BETWEEN /Spaceid=/ 5',
'| XLATE = 40',
'| SPECS W2 1', /* Dataspace id */
'read read read w7 Nw', /* Xstore migrates */
'read w3 Nw W5 Nw', /* Dasd Read & Writes */
'|O: Fanout', /* Copies to allow SUMming */
'| SPECS W1 1|Not Chop' length(user)+1,
'| JOIN * $/$', /* Keep list of space ids .. */
'| CRC crc16i|SPEC 1-2 C2X 1', /* replace by a CRC as list */
'| BUFFER', /* can be too big for GLOBALV*/
'|S: SPECS /"' time('S') '/ 1 W1 Next /"/ Next W2 Nw',
' Select 1 W1-* NextWord', /* The reads */
' Select 2 W1-* NextWord', /* The Writes */
' Select 3 W1-* NextWord', /* The Migrates */
' /0 0 0/ NextWord', /* For user without DataSp*/
'|REXX('myname mytype')', /* Sum them for DataSpaces*/
'|SPECS W1-8 1', /* Remove 0 0 0 if not needed */
'|F: Fanin|Join 1 / /', /* Add IO and CPU */
'| Var PagIO.'user,
'?O:|SPECS W3|PAD 10|JOIN * $+$|XLATE 11 + 40|S:', /* Keep Reads */
'?O:|SPECS W4|PAD 10|JOIN * $+$|XLATE 11 + 40|S:', /* Keep Writes*/
'?O:|SPECS W2|PAD 10|JOIN * $+$|XLATE 11 + 40|S:', /* Keep Migr. */
'?T:',
'| FROMLABEL CPU '||, /* Anal INDICATE data */
'| XLATE : 40 = 40',
'| SPECS W14 Nw /*24*3600 + / Next',/* TotCpu days */
'W15 N /*3600 + / Next',/* TotCpu Hours */
'W16 N /*60 + / Next',/* TotCpu Minutes */
'W17 N ', /* TotCpu Seconds */
'Read W8 Nw', /* IO count */
'|REXX('myname mytype')', /* Calc CPU seconds*/
'|F:' /* Pass to main stream*/
parse value value('PagIO.'user,PagIO.user,'GLOBAL QIO'),
with otime oids oPagR oDatR oPagW oDatW oPagM oDatM oCPU oIO .
parse var pagio.user ,
time ids PagR DatR PagW DatW PagM DatM CPU IO .
if ids<>oids |, /* Dataspaces changed: ignore */
time<otime |, /* Date changed: ignore */
cpu <ocpu then iterate /* User must have been logged off */
secs=time-otime
/* We add the Xstore Migrates to the Writes */
Parse value PagW+PagM oPagW+oPagM DatW+DatM oDatW+oDatM,
with PagW oPagW DatW oDatW
data= format((PagR -oPagR ) /secs,5,1),
format((DatR -oDatR ) /secs,5,1),
format((PagW -oPagW ) /secs,5,1),
format((DatW -oDatW ) /secs,5,1),
format((IO -oIO ) /secs,5,1),
format((CPU -oCPU ) /secs*100,3,0)
If Sayit then say left(user,8) right(secs,4) data
'EXECIO 1 DISKW' user 'PERFLOG A (FINIS STRING' left(time(),5) data
end i
exit
SEND:
/*>>TAILOR<<*/
'PIPE COMMAND LISTFILE SQL* PERFLOG A|STEM FILE.',
'|SPEC /ERASE/ 1 W1 NW /PERFLOG- A/ NW|COMMAND'
do i=1 to file.0
'EXEC SENDFILE' file.i 'TO MAINT(NOLOG'
if rc=0 then 'RENAME' file.i '= PERFLOG- ='
'RENAME' file.i '= PERFLOG- A'
/* Start a new file */
'EXECIO 1 DISKW' file.i '(FINIS STRING',
'Time PagRd DatRd PagWr DatWr IO CPU'
end i
/*>>end TAILOR<<*/
exit rc
/*******************************************************************/
SYNTAX: /* we come here when SIGNAL ON SYNTAX traps an error */
/*******************************************************************/
parse upper source . how myname mytype . syn .
call errexit rc,'REXX problem in' myname mytype 'line' sigl':' ,
'ERRORTEXT'(rc), sigl':'||'SOURCELINE'(sigl)
/*******************************************************************/
ERREXIT: /* exit with retcode & errormsg */
/*******************************************************************/
do i=2 to 'ARG'()
say 'ARG'(i)
end
'EXEC REXXVARS' /* show value of all REXX variables */
exit arg(1)
/*******************************************************************/
SubPipe:
/*******************************************************************/
Signal on Error
do forever
'PEEKTO Data'
interpret "'OUTPUT'" data
'READTO'
end
Error:
exit rc*(rc<>12)
|
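The core calculation of the exec above, stripped of the pipeline plumbing, is a subtraction of two counter snapshots divided by the elapsed seconds. The following Python sketch restates that logic for readers less familiar with REXX and CMS Pipelines. The function name `rates_between`, the snapshot field names and the sample numbers are assumptions made for the illustration. Unlike the exec, the sketch also skips an interval of zero seconds, to avoid dividing by zero.

```python
def rates_between(old, new):
    """Per-second rates between two counter snapshots, or None if unusable.

    Each snapshot is a dict with 'time' (seconds since midnight), 'ids'
    (a token identifying the set of dataspaces) and cumulative counters.
    """
    if (new["ids"] != old["ids"]            # dataspaces changed
            or new["time"] <= old["time"]   # date changed, or no time elapsed
            or new["cpu"] < old["cpu"]):    # user logged off and on again
        return None
    secs = new["time"] - old["time"]
    # As in the exec, Xstore migrations are added to the page writes.
    old_w = old["writes"] + old["migrates"]
    new_w = new["writes"] + new["migrates"]
    return {
        "reads": round((new["reads"] - old["reads"]) / secs, 1),
        "writes": round((new_w - old_w) / secs, 1),
    }


# Made-up snapshots taken 300 seconds apart.
first = {"time": 36000, "ids": "A1B2", "cpu": 120.0,
         "reads": 50000, "writes": 20000, "migrates": 1000}
second = {"time": 36300, "ids": "A1B2", "cpu": 131.5,
          "reads": 53000, "writes": 20900, "migrates": 1300}

print(rates_between(first, second))
print(rates_between(second, first))   # time went backwards: interval ignored
```

Returning `None` for an unusable interval mirrors the `iterate` in the exec: the new counters are still stored, but no rate is logged for that interval.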
Footnotes:
(1)
As you could read in the first article, SQL/DS was renamed to
IBM DB2 Server for VSE & VM. We use the short notation DB2/VM here.
(2)
The default interval is 30 seconds. You can change this default interval
or run another RTM/ESA virtual machine with a much higher interval to
get smoother figures.