SFS Performance Management Part II: Mission Possible
SHARE Winter 1995
(C) Copyright IBM Corporation 1995, 1997 - All Rights Reserved
DISCLAIMER
The information contained in this document has not been submitted to any formal IBM test and is distributed on an "As is" basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environment do so at their own risk.
In this document, any references made to an IBM licensed program are not intended to state or imply that only IBM's licensed program may be used; any functionally equivalent program may be used instead. Any performance data contained in this document was determined in a controlled environment and, therefore, the results which may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.
It is possible that this material may contain reference to, or information about, IBM products (machines and programs), programming, or services that are not announced in your country or not yet announced by IBM. Such references or information must not be construed to mean that IBM intends to announce such IBM products, programming, or services.
Permission is hereby granted to SHARE to publish an exact copy of this paper in the SHARE proceedings. IBM retains the title to the copyright in this paper, as well as the copyright in all underlying works. IBM retains the right to make derivative works and to republish and distribute this paper to whomever it chooses in any way it chooses.
Should the speaker start getting too silly, IBM will deny any knowledge of his association with the corporation.
Trademarks
Introduction
This presentation assumes the attendee has seen the Introduction to SFS Performance Management or has equivalent experience. A little time will be spent in review, but knowledge of the basics is assumed.
Last update February 3, 1997
The speaker notes were added to enhance this presentation; however, they were not created by a professional writer, so please excuse the grammar and typos. Any suggestions or corrections are appreciated. I'd also like to acknowledge several people who helped pull this presentation together. Many sections in this presentation are based on questions that have come up on various on-line forums. The experts who have helped answer those questions are the following:
A Quick Review
While many of us resist this approach, it really is very important to read all the directions before putting your SFS filepool together. In few areas of VM/ESA have I seen preventative tuning be as valuable. In particular, DASD placement is key. CP tuning is the traditional tuning that is done for server virtual machines, such as SET SHARE, SET QUICKDSP, SET RESERVE, and so forth. We do not often think of tuning CMS, but there are key start-up and configuration parameters that can be of interest. These include the SFS file buffer, the USERS parameter, and the use of shared segments. Two other areas of particular interest to investigate are whether multiple file pools are required for performance or other reasons and whether the use of VM data spaces is feasible. As was stressed in the SFS introduction presentation, a lot of good information exists in the VM/ESA library. The key ones are listed above; for a complete listing, see the reference list at the end of this presentation.
SFS Strengths
Before entering the debate of minidisk versus SFS performance, I thought it would be of value to take a look at some of the SFS strengths. Most are functional strengths, but performance is definitely a part of many of these items. "File-level sharing" is what SFS provides (hence the name). The ability to manage space in pools makes for more efficient and easier DASD space management. When SFS is used in conjunction with DFSMS, you have a very powerful facility. As will be discussed later, 'access' of a directory is performance sensitive. The various features of SFS file referencing can be used to minimize the number of files on a directory that are accessed or to avoid accessing them altogether. Just as there is file-level sharing, security also exists at that same level. The callable services library (CSL) provides various services to work with SFS, including the ability to make most of the calls asynchronously. The concept of workunits will also be covered in more detail later. Logical units of work allow one to process work and have it committed or rolled back as appropriate. SFS participates in CRR (coordinated resource recovery), which provides synchronization of committing work over multiple resources.
SFS vs. Minidisk
One has to be careful in making comparisons between minidisk and SFS. They each have strengths. With that said, let's look at the key system resources. Because of the added function and structure of SFS, there is an increase in processor requirements for doing file functions with SFS instead of minidisk. As we'll see later on, data spaces can help minimize this and other resource requirements. The increase in processor requirements is proportional to the file activity. For our CMS interactive workload, processor time per command increases by 16% when going from a minidisk to an SFS environment. The SFS environment contains both file control and directory control directories. There can also be an increase in real storage requirements. Storage is required to run the server machines and for data areas in the end-user virtual machines. Unlike processor requirements, the storage increase is related to the number of SFS users and is not as sensitive to file activity. Overall I/O requirements are similar. One beneficial characteristic of SFS is that control data and content data are kept separate. That can provide benefits in terms of caching by CP.
Estimated Processor Requirements
Because the processor requirements are proportional to file I/O activity, you can use existing data to estimate the impact. For the IBM CMS interactive workload FS8F, there are approximately 13.0 mdisk I/Os per million instructions executed on the system. When 3.7 of those I/Os are moved to SFS file control directories, processor usage increases by about 20%. The mdisk I/O rate can be determined from the diagnose A4 and A8 rate in RTM/ESA or from the virtual DASD I/O rate in monitor data. Processor requirements for dircontrol directories using data spaces are similar to those for minidisks. The rule of thumb comes down to a 6% increase for each minidisk I/O per million instructions moved to a filecontrol directory. Recall that rules of thumb are usually just starting points and seldom accurate enough to write performance guarantees against.
Estimated Processor Requirements ...
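The rule of thumb above can be sketched as a small calculation. This is only an illustration under the assumptions stated in the notes (about 6% more processor time per command for each minidisk I/O per million instructions moved to a filecontrol directory); the function and constant names are made up for this sketch and do not come from any IBM tool.

```python
# Rule-of-thumb sketch only. All names here are hypothetical.

# From the notes: roughly 6% more processor time per command for each
# minidisk I/O per million instructions moved to a filecontrol directory.
PCT_PER_IO_PER_MI = 6.0


def mdisk_ios_per_million_instructions(ios_per_second, mips, utilization):
    """Minidisk I/Os per million instructions actually executed.

    ios_per_second: diagnose x'A4' plus x'A8' rate (or virtual DASD I/O rate)
    mips:           rated processor speed in millions of instructions/second
    utilization:    processor utilization as a fraction (0.75 for 75%)
    """
    executed_mips = mips * utilization
    if executed_mips <= 0:
        raise ValueError("mips and utilization must be positive")
    return ios_per_second / executed_mips


def estimated_cpu_increase_pct(ios_moved_per_mi):
    """Estimated percent increase in processor time per command."""
    return ios_moved_per_mi * PCT_PER_IO_PER_MI


if __name__ == "__main__":
    # FS8F figure from the notes: 3.7 I/Os per million instructions moved.
    # The rule yields a value in the low twenties, against a measured
    # increase of about 20%, which shows how rough the rule is.
    print(round(estimated_cpu_increase_pct(3.7), 1))
```

Keeping the 6% factor as a named constant makes it easy to substitute a value measured on your own system.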
For an example, assume a 10 MIPS processor is running at 75% utilization and does a total of 50 x'A4' and x'A8' diagnoses a second. (I know we hate talking about MIPS, but this is just a rough estimate, okay?) This example system would do roughly 6.7 mdisk I/Os per million system instructions.

   50 I/Os per second      50
   ------------------  =  ---  =  6.7 I/Os per Million Instructions
     10 MIPS * .75        7.5

So in this example, if 2 of those mdisk I/Os per million instructions are moved to filecontrol directories, the processor increase per command would be about 2 * 6% = 12%. It is interesting to note that even when we move as much as we can to SFS, the FS8F workload still does a great deal of minidisk I/O. This remaining minidisk I/O is from the S-disk, Y-disk, temporary disks, and virtual disks in storage. To determine how much I/O is moved, look at the I/O counts to the volumes being moved. Do not forget to include I/Os satisfied from minidisk cache.
Estimated Real Storage Requirements
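As a rough illustration of the rule of thumb for this foil (1800 pages plus 4 pages per user connected to a file pool), the sketch below turns a user count into a page and megabyte estimate. The function names and the 500-user example are assumptions made for illustration; the 4 KB page size is the standard VM/ESA page size.

```python
# Rule-of-thumb sketch only. Function names are hypothetical.

PAGE_BYTES = 4096  # one page is 4 KB


def estimated_sfs_pages(connected_users, base_pages=1800, pages_per_user=4):
    """Estimated additional real storage, in pages, for SFS."""
    if connected_users < 0:
        raise ValueError("connected_users must not be negative")
    return base_pages + pages_per_user * connected_users


def pages_to_megabytes(pages):
    """Convert a page count to megabytes."""
    return pages * PAGE_BYTES / (1024 * 1024)


if __name__ == "__main__":
    pages = estimated_sfs_pages(500)  # hypothetical 500 connected users
    print(pages, "pages, about", round(pages_to_megabytes(pages), 1), "MB")
```

The defaults are parameters rather than constants because tuning (SFS file cache size, saved segments, buffer settings) can move both numbers in either direction.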
The rule of thumb is 1800 pages plus 4 pages per user connected to a file pool. There are many factors that come into play in looking at storage requirements. The biggest may be how FSTs are handled. We will look at FSTs in detail on a later foil. Storage for the filepool servers really depends on file activity, number of users, start-up parameters, and other factors. The CRR recovery server is fairly passive in environments that do not require two-phase syncpoints (multiple file pools being updated inside a single work unit). For each SFS user, plan on an additional 4 pages. This is made up of storage for the work unit structures, additional APPC/VM usage, the SFS file cache, and additional server control blocks. This number can be impacted by the size of the SFS file cache and the number of active workunits per user. Tuning can affect this for better or worse. Consider things like the SFS file cache, the saved segment for VMLIB, the CMSFILES segment for servers, or the xxxBUFFERS settings in the server machine.
Access Performance - File Control
Performance of the ACCESS command can be important. The first access of a given directory by any user on the system is normally the slowest one. That access causes the applicable catalog data to be cached in the server's catalog buffer pool (as much as will fit; the size is governed by the CATBUFFERS start-up parameter). The presence of minidisk caching can also contribute to faster access of that directory by non-first users. Access is faster when the files on the directory are ones that you own or that are public. For others, the access time is related to the number and types of authorizations present. Only on first access by a given user is there a trip to the server to get the file information. Once that first access is done, the results are cached in the accessing user's virtual address space. This is retained even if that directory is subsequently released and reaccessed. The cached information is given up if the user virtual machine gets short on storage or if it is reset (e.g. IPL CMS). Access time is proportional to the number of files on the directory being accessed. This is due to having to obtain the data from the file pool server to build the in-storage information (FST) for each file. The file referencing features of SFS can avoid accessing altogether (direct referencing) or minimize the number of files accessed (aliases and hierarchical directories). Requests for updates to directory information occur when (1) CMS explicitly sends a request to the server asking for updates, which is done when CMS thinks it needs to ask for updates (see next question), or (2) the server sends updates along with the response to some action that caused a trip to the server. Starting with CMS 7, support was added to CMS to have the file pool server asynchronously notify CMS when there are changes to the accessed directory. Certain places in CMS then notice that there has been a change and ask the server for the updates before processing the current request.
Note that the above is for file control directories. Directory control directories have a consistent view from access to release, and therefore the directory update process does not apply. The following foils describe the difference for directory control (dircontrol) directories where VM data spaces are exploited.
VM data spaces
VM data spaces...Performance advantages
The server (logically) puts the directory in a VM data space, and the user virtual machine takes it from the VM data space. The benefit of data spaces is based on the degree of sharing. They provide a great benefit in user virtual storage, because the FSTs are shared among the accessing users, and in I/Os, because the data is moved from the data space without a trip to the server. Grouping updates will minimize the likelihood of having multiple versions in data spaces. (discuss ACCESS to RELEASE consistency here). Having users run in XC mode is how the previously stated benefits are achieved. Remote users obviously do not have access to the data spaces. The file pool server can use the data spaces on behalf of the remote user, but network performance tends to be the significant player in remote performance. Separate servers are recommended for read-only data because 1) a R/O file pool needs less scheduled down time and 2) the multiple user rules (discussed later) do not apply. Not only will exploitation of VM data spaces minimize expensive server requests, but it will allow a single copy of data to be shared among several users. This can be a significant boost for storage constrained systems. Performance is similar to that of read-mostly minidisks in minidisk cache. There are measurements that show both ends of the spectrum. It is dependent on workload and storage constraint.
Access and Storage
When a minidisk is accessed, the FSTs require 64 bytes per file. SAVEFD can be used to put these in a saved segment which can be shared. For minidisks that are always accessed by every user on the system, it is simple to see that SAVEFD should be used for the read-only disks. An SFS FST is slightly larger than a minidisk FST; how much larger depends on whether it is a dircontrol or filecontrol directory. For dircontrol directories, using a data space makes a lot of sense. The management of it is simple, and you can save storage similar to SAVEFD with minidisks. In addition, the SFS FSTs in this case would be in a separate data space instead of the primary address space of the end-user virtual machines. This leaves more virtual storage below the 16 meg line for user applications or other saved segments.
Agents
The first time I ever mentioned the term "agents" in describing SFS performance, I got this funny look like I was talking about something mysterious or even sinister. Well, these agents are agents of good. Requests made of the file pool server will become associated with an agent, and the dispatcher within the file pool server will dispatch these agents. The number of existing agents is determined by the USERS start-up parameter, which is in the DMSPARMS file. The chief use of the USERS value is for computing the number of agents, but it is also used to determine other values (such as CATBUFFERS) that are not explicitly set. The formula for the number of agents is:
agents = 4 + TRUNCATE(USERS / 8 )
The USERS value should be the number of logged-on SFS users expected
during peak system activity.
Peak activity can change over time so this should be monitored.
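The agent formula above is simple enough to check by hand, but a small sketch makes it easy to see how a change in USERS moves the agent count. The function name is made up for illustration, and TRUNCATE is assumed to mean dropping the fractional part.

```python
def agents_for_users(users):
    """Number of agents for a given USERS start-up value.

    Implements agents = 4 + TRUNCATE(USERS / 8), with TRUNCATE taken
    to mean integer division (fraction dropped).
    """
    if users < 0:
        raise ValueError("USERS must not be negative")
    return 4 + users // 8


if __name__ == "__main__":
    for users in (50, 200, 500):  # hypothetical peak logged-on SFS users
        print(users, "USERS ->", agents_for_users(users), "agents")
```

Because of the truncation, the agent count only changes at multiples of 8, so small adjustments to USERS may have no effect on it.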
Counter Refresher
/* QREFRESH Query SFS Refresh directory req rate */
/* note Refresh Directory Request is 61st line */
'PIPE cms q filepool counter | STEM counter1.'
JUNK = TIME('R')
'CP SLEEP 60 SEC'
'PIPE cms q filepool counter | STEM counter2.'
elapsed = TIME('R')
rrate = (WORD(counter2.61,1) - WORD(counter1.61,1) ) / elapsed
say rrate 'dir refresh requests per second'
exit
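The two-snapshot technique used by the exec above can also be sketched in a general-purpose language, for example to post-process saved QUERY FILEPOOL output. This is a minimal sketch: it assumes each counter line starts with the numeric value followed by a description (which is what the exec's use of WORD(...,1) implies), and the function names and sample lines are invented for illustration.

```python
def parse_counters(lines):
    """Map counter description -> integer value.

    Lines whose first word is not a number (headers, separators) are skipped.
    """
    counters = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            counters[parts[1].strip()] = int(parts[0])
    return counters


def counter_rates(before, after, elapsed_seconds):
    """Per-second rate for every counter present in both snapshots."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        name: (after[name] - before[name]) / elapsed_seconds
        for name in before
        if name in after
    }


if __name__ == "__main__":
    # Made-up sample lines, not real QUERY FILEPOOL output.
    first = parse_counters(["      1200 Refresh Directory Requests"])
    second = parse_counters(["      1500 Refresh Directory Requests"])
    print(counter_rates(first, second, 60.0))
```

Keying on the description rather than the line position avoids depending on the counter staying on the 61st line.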
The SFS counter information is available either from QUERY FILEPOOL commands or Domain 10 (APPLDATA) monitor data. Most of this data consists of accumulating counters of requests, time, or I/Os. Looking at the counters at a single point in time is not very meaningful. Therefore, we typically take two snapshots, compute the delta between times for various counters, and analyze that data. Various performance products do this work for you to differing degrees. We will refer to this method several times throughout the rest of this presentation.
Monitoring Agents
The SFS server has these agents to get the work done. When a logical unit of work is started, an agent becomes associated with this unit of work. It is "held" by this unit of work until the logical unit of work is either committed or rolled back. An active agent is an agent busy doing work in the server on behalf of a file pool request. Typically an increase in agents in use (held) or busy agents (active) is an indication that SFS work is increasing or that it is taking longer to complete the work. Note these are two different cause/effect pairs. An analogy can be made to VM systems in general. For a given VM system, you have a number of users logged on and a number of users active. If you add users (and work) to a system, you tend to increase the logged-on and active user counts. The same effect can occur if you change the system in a different manner. If you move the system to a slower processor, it will take longer to process commands and transactions. Therefore you are likely to increase the active and logged-on user counts this way also. While agents are typically associated with user work, there are internal functions in the file pool server that run as agents as well.
Detailed Agent Information
SERVER8 File Pool Agents
Start-up Date 02/18/95 Query Date 02/18/95
Start-up Time 06:17:34 Query Time 13:52:21
========================================================================
AGENT INFORMATION
66 Total Number of Agents
11 Active Agents Highest Value
4 Current Number of Agents
Userid    Type   Status  Agent Number  Wait           Uncommitted Blks
CHECKPT   Chkpt  Inact              2  I/O                           0
BITNER    User   Read               4  None                          0
DEVO1     User   Read              10  Communication                 0
BITMAN    User   Read              14  I/O                           0
The detailed information on agents can be useful when doing problem determination, but is seldom reviewed for normal monitoring activity. The 'Type' column will indicate 'User' unless the agent is in use for some file pool server specific task. For 'User' agents, the 'Status' is usually 'Read' or 'Write', which indicates whether any log information has been written. In the 'Wait' column, the most common values are 'I/O', 'Communication', and 'None'. 'Communication' could mean the agent is held and that the server is waiting for the next request from the user. Large numbers of agents in less common wait values (such as ESM_Wait) should be investigated. The 'Uncommitted Blks' column is the number of SFS file blocks used by an agent that have not yet been committed or rolled back. Very high numbers here are an indication that an application is running away.
File Pool Requests
PRF083 Run 02/17/95 18:43:14 SFS_BY_TIME
SFS Activity by time
From 02/17/95 08:02:08
To 02/17/95 16:57:08
For 32100 Secs 08:54:59 Bill Bitner looking at GDLVM7
                                        <-----Time Per File Pool Request---->
From   To                  FPR     FPR                       Block
Time   Time   Userid     Count    Rate  Total    CPU   Lock    I/O  ESM  Other
08:02  16:57  CALSERV    93863   2.924  0.044  0.011  0.000  0.035    0  0.001
08:02  16:57  EDLSFS    516041  16.076  0.012  0.002  0.000  0.010    0  0.000
08:02  16:57  EDLSFS1   354268  11.036  0.009  0.002  0.000  0.006    0  0.001
08:02  16:57  EDLSFS2   261827   8.157  0.076  0.042  0.031  0.006    0  0.003
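The FPR Rate column in the report above is the request count divided by the interval length, and the per-request times come from dividing accumulated time by the request count. The sketch below shows that normalization; the function names are hypothetical, and the accumulated-time inputs would come from counter deltas rather than from this report.

```python
def fpr_rate(request_count, interval_seconds):
    """File pool requests per second over the reporting interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return request_count / interval_seconds


def time_per_request(accumulated_seconds, request_count):
    """Average seconds of one component (CPU, lock, block I/O, ...) per request."""
    if request_count == 0:
        return 0.0  # an idle server has no per-request time to report
    return accumulated_seconds / request_count


if __name__ == "__main__":
    # CALSERV row: 93863 requests over the 32100 second interval.
    print(round(fpr_rate(93863, 32100), 3))
```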
File pool requests are a good unit to use for SFS throughput or as a transaction rate. File pool requests use various resources in the server and incur different delays. By normalizing the time for various functions to the number of file pool requests, you can get a breakdown of the components making up the service time of an SFS request. Note that the 'Other' bucket is the delta from the 'Total' column. The VMPRF PRF083 SFS_BY_TIME report shown above illustrates how this data can be viewed. FCON/ESA also provides this type of breakdown in its 'Shared File System Server Screen'.
Server Utilization
PRF083 Run 02/17/95 18:43:14 SFS_BY_TIME
SFS Activity by time
From 02/17/95 08:02:08
To 02/17/95 16:57:08
For 32100 Secs 08:54:59 Bill Bitner looking at GDLVM7
                                        <-------Server Utilization------->
From   To                  FPR     FPR                 Page  Check-
Time   Time   Userid     Count    Rate  Total    CPU   Read   point  QSAM
08:02  16:57  CALSERV    93863   2.924    3.4    3.3    0.0     0.0     0
08:02  16:57  EDLSFS    516041  16.076    3.2    2.9    0.2     0.1     0
08:02  16:57  EDLSFS1   354268  11.036    3.5    2.6    0.2     0.1   0.5
08:02  16:57  EDLSFS2   261827   8.157   34.4   34.3    0.1     0.0     0
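One way to read a utilization report like the one above is to check the total against a guideline and then look at the largest component. The sketch below does that, assuming the 50% guideline used in these notes; the function names and the dictionary layout are made up for illustration.

```python
GUIDELINE_PCT = 50.0  # utilization below this is treated as not a concern


def exceeds_guideline(total_pct, guideline_pct=GUIDELINE_PCT):
    """True when total server utilization reaches the guideline."""
    return total_pct >= guideline_pct


def dominant_component(components):
    """Name of the largest utilization component (raises on an empty dict)."""
    return max(components, key=components.get)


if __name__ == "__main__":
    # EDLSFS2 row from the report above.
    edlsfs2 = {"CPU": 34.3, "Page Read": 0.1, "Checkpoint": 0.0, "QSAM": 0.0}
    print(exceeds_guideline(34.4), dominant_component(edlsfs2))
```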
Previously we looked at the breakdown of file pool request service time. The file pool server can be processing several file pool requests at any given time because of its exploitation of asynchronous I/O and communication. However, there are some tasks that serialize a file pool server, and it is important to understand these. By utilizing SFS counters and other monitor data, we can create a breakdown of the server's time. Like any other resource, SFS server utilization cannot exceed 100%. The higher the utilization, the more contention there will be between different SFS agents. Therefore, it can be valuable to monitor the SFS server utilization. At any given time, the server can be running on a processor for file pool requests, waiting for page fault resolution, performing checkpoint processing, or waiting for QSAM (backup I/O). Checkpoint and control data backup (except when done to another SFS file pool) are functions that serialize the server. SFS is also serialized by page fault resolution. As a guideline, server utilization of less than 50% should not be a concern. When attacking utilization problems, the largest component is often where the most improvement can be found. The Performance manual and the SFS Introduction presentation show techniques for this approach. It is possible that high file pool server utilization is an indication that the file pool should be split into two file pools. The VMPRF PRF083 SFS_BY_TIME report does this with its Server Utilization section. FCON/ESA also provides this type of information in the SFS Server Details screen.
Checkpoint Processing
Checkpoint processing is an internal SFS file pool server operation during which the changes recorded on the log minidisks are permanently made to the filepool. I think of it like balancing the checkbook. By doing checkpoint processing, if SFS is asked to recover changes, it only has to go back to the last checkpoint on the log. Checkpoint processing is started after a certain number of log blocks have been written. Checkpoint processing serializes the server and can impact response time. Since the resources used during checkpoint processing are relatively low, most checkpoint problems affect response time instead of resource usage. From the QUERY FILEPOOL or monitor data, one can calculate the checkpoint processing time. Factors affecting checkpoint processing time include the number of control data buffers, I/O performance, and the number of changed catalog buffers. Having sufficient control data buffers helps checkpoint processing. Insufficient control data buffers are the most common reason for long checkpoint times. Since checkpoint processing involves significant I/O to the control minidisks and the storage group 1 (catalog) minidisks, a poorly performing I/O configuration will affect checkpoint time. The more catalog buffers that have been changed, the more information needs to be written to disk. After changes in VM/ESA R1.1, the number of modified catalog buffers is not as significant due to pre-flushing.
Catalogs
Good catalog performance is necessary since this is where information on authorization, directory structure, aliases, etc. is kept. The tuning knob you have available is the CATBUFFERS setting, which controls the number of catalog buffers. If allowed to default, the value is computed based on the USERS start-up parameter. An increase in catalog I/Os can be caused by fragmentation of index information. A symptom is an increase in catalog blocks read per file pool request. If this occurs, one should plan to reorganize the catalogs using the FILESERV REORG command. The catalog buffer setting (CATBUFFERS) presents a performance trade-off. If set too low, more catalog I/O will be required, which could result in high block I/O time. If set too high, paging could result from the additional storage requirements. Therefore, this value should be set with consideration of system constraints. Note that for special processing, such as restoring control data, you might want to temporarily increase the CATBUFFERS value significantly.
Data Space Usage
Since a R/O file pool will have less need for planned down time and less cause for unplanned down time, a separate server machine is recommended. Also, the capacity limits associated with R/W file pools do not apply here. A R/O file pool using data spaces should have little file pool activity. Therefore, if a significant number of file pool requests are being made, then something is wrong. Scenarios that would cause normal file pool requests to be made include: the directory being configured as file control instead of dircontrol, users accessing the directory as R/W, use of CSL and direct file referencing without accessing the directory, access from remote users, or the server not being in XC mode. The other main cause of not using data spaces is that the server has exhausted the number of available data spaces. This is set by the CP directory statement
XCONFIG ADDRSPACE MAXNUMBER nnnnn TOTSIZE nnnnG SHARE
Most people set the TOTSIZE to 8192G (the maximum) and control the usage by the MAXNUMBER value. This is the maximum number of data spaces the server can define. Monitor data or the CP INDICATE USER EXP command will show the number of data spaces the server has. To determine what those spaces are and who has access to the various levels of the directories, use the SFS QUERY ACCESSORS command with the DATASPACE option.
Additional Pearls
The overhead and serialization associated with control data backups can seriously impact performance. Therefore, scheduling control data backups during off-peak hours can be useful. If this is not possible, directing the backups to another filepool allows the I/O to be done asynchronously and minimizes the serialization. Very small log disks may result in control data backups being kicked off more frequently than is acceptable. While there is a guideline in the Filepool Admin for the log disk size, your mileage may vary. There is really no major downside to having the log disks too big (other than wasted DASD space and perhaps slightly longer start-up times). SFS is smart enough to recognize copyfile requests between two files in the same file pool. In this case, all the work is done on the server side instead of moving all the data between the server and end user and then back again. This pearl was missing from the Application Development Guide in past releases; we will see that it gets into the VM/ESA 2.1.0 book.
SFS Progress
VM/ESA 1.1.0 - The key improvements in this release were to allow SFS to run on larger systems.
VM/ESA 1.1.1 - Checkpoint processing occurs less frequently, thus avoiding serialization. This improved response time but has little impact on resource utilization. The asynchronous file functions added in release 1.0 were miscellaneous ones. This release added the real file functions such as read/write.
VM/ESA 1.2.0 - Pre-flushing buffers and exploiting multi-block I/O improved checkpoint performance. The new catalog insert algorithm improved performance for catalog inserts in terms of CPU and, in some cases, I/O. The log manager I/O was also changed to use multi-block I/O, and the SFS file cache default was changed to 20KB.
VM/ESA 1.2.1 - The improvement to SFS control data backup reduces backup time and space required, and allows for more accurate planning. The SFS thread blocking I/O changes make it more practical for CMS multitasking applications to use the SFS functions asynchronously.
VM/ESA 1.2.2 - By improving the handling of released file blocks, a scenario that could cause instances of high processor utilization was corrected. The revoke performance changes addressed problems with the overhead of revoking authority from an SFS file or directory.
The performance of APPC/VM is key to the performance of SFS file pool requests. The various improvements here have helped SFS over the past several releases. There were no direct, major performance enhancements for SFS in releases 2.1.0 or 2.2.0.
References
Acronyms...