z/VM Single System Image Overview

z/VM V6.2 introduced the Single System Image (SSI) as a priced feature. With z/VM 7.1, SSI is included as part of the base. SSI enhances the z/VM systems management, communications, disk management, device mapping, virtual machine definition management, installation, and service functions to enable multiple z/VM systems to share and coordinate resources within a Single System Image structure. This combination of enhanced functions provides the foundation for Live Guest Relocation (LGR), the ability to move a running Linux guest from one z/VM system to another within the SSI cluster.

Planning for a Single System Image and Live Guest Relocations

There are several reasons why you might need to relocate a virtual server while keeping the server available. You might need to relocate a virtual server for workload rebalancing, or to do software maintenance or hardware maintenance. Before you relocate a guest, there are architectural, disk, memory, and networking requirements plus guidelines you must understand. Below are some hints to help with installation of the SSI feature and tips to get you started relocating a Linux guest.

Hints and Tips:

  • z/VM SSI Installation

    Even if you have previous experience with installation and service of z/VM, it is important that you read the installation instructions, whether or not you intend to use SSI. To plan and prepare, we encourage you to familiarize yourself with the following publications: z/VM: Getting Started with Linux on z Systems and z/VM: CP Planning and Administration.

  • ISFC Set-up

    An SSI cluster must have direct logical links between all systems. All SSI clusters use ISFC for intra-cluster communication and live guest relocation. ISFC uses CTC devices. For maximum throughput, when you are setting up your network, follow the guidelines for planning your network in an SSI cluster located in the z/VM: Getting Started with Linux on System z, chapter 2. Other things being equal, faster CTC speeds increase throughput and result in shorter relocations.
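
The requirement for direct logical links between all systems means that every pair of members needs its own link. The sketch below simply enumerates those pairs so that none is missed during planning. The member names are hypothetical placeholders, and the sketch does not define or configure any CTC devices.

```python
from itertools import combinations


def required_isfc_links(members):
    """Return every unordered pair of members that needs a direct logical link."""
    return list(combinations(sorted(members), 2))


if __name__ == "__main__":
    # Hypothetical member names, used for illustration only.
    members = ["MEMBER1", "MEMBER2", "MEMBER3", "MEMBER4"]
    links = required_isfc_links(members)
    for first, second in links:
        print(f"{first} <-> {second}")
    print(f"{len(links)} links needed for {len(members)} members")
```

For n members this yields n(n-1)/2 pairs, so the number of links grows faster than the number of members.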

Factors that can affect relocation are:

  • Virtual Machine Memory

    The size and use of the virtual machine's memory can affect relocation performance. Parts of the processing for relocation are proportional to the size of the virtual machine. The cost of this processing increases with larger virtual machines. Relocation performance is also impacted by the frequency and amount of memory being changed in the virtual machine.

  • Matching Virtual Machine Configurations

    To prepare for live guest relocation, you need to ensure that the virtual machine has a relocatable configuration and that a matching configuration can be set up on the destination system. There are configuration attributes the guest must not have, because they cannot be relocated, and there are also characteristics the destination system must have in order to provide an identical virtual machine configuration. For information on configuration requirements and on verifying a virtual machine's eligibility to relocate, please refer to the z/VM: CP Planning and Administration, chapter 27.

  • CPU Utilization

    SSI synchronizes all the members in the cluster. You must ensure that you have allotted enough system resources for the necessary synchronization and communication among members. Independent systems that are not formally clustered do not incur this synchronization overhead. After initialization, the synchronization overhead is relatively low. Communication between members does increase during negotiations for access to devices and other resources, as well as during Live Guest Relocation. For example, two independent systems that run fine today at peak utilization (close to 100%) may have performance problems when joined in a cluster.

    z/VM members that run as second-level z/VM systems should not wait for CPU more than 10% of the time. For additional details, refer to the Resource Limit Conditions section of the z/VM: CP Planning and Administration, chapter 27.

  • Paging and Other System Resources

    To prepare for Live Guest Relocation, the target system must have enough system resources during and after the relocation. You will need to ensure that your paging space is adequate. To be safe, there should be twice as much paging space available as the total virtual memory that can be defined on the system. The easiest way to check this aspect of system resources is to issue the CP QUERY ALLOC PAGE command, which shows the percent used, the slots available, and the slots in use. If adding the size of the virtual machine(s) being relocated (a 4KB page = a 4KB slot) to the slots in use brings the in-use percentage over 50%, system performance may suffer. Remember that this query command provides only a snapshot in time.
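
The arithmetic behind the 50% guideline can be sketched as follows. This is a minimal illustration that assumes the total slot count and the slots in use have been read manually from the QUERY ALLOC PAGE output; it does not parse the command output, and the function names and sample figures are hypothetical.

```python
PAGE_SIZE_BYTES = 4096  # one 4 KB page occupies one 4 KB paging slot


def projected_paging_percent(total_slots, slots_in_use, guest_memory_bytes):
    """Percent of paging slots in use if the guest's memory were fully paged out."""
    # Round up so that a partial page still counts as a whole slot.
    guest_slots = -(-guest_memory_bytes // PAGE_SIZE_BYTES)
    return 100.0 * (slots_in_use + guest_slots) / total_slots


def exceeds_guideline(total_slots, slots_in_use, guest_memory_bytes,
                      limit_percent=50.0):
    """True if the projected in-use percentage is above the guideline."""
    projected = projected_paging_percent(
        total_slots, slots_in_use, guest_memory_bytes)
    return projected > limit_percent


if __name__ == "__main__":
    # Hypothetical figures: 10,000,000 slots defined, 3,000,000 in use,
    # and an incoming guest with 8 GiB of virtual memory.
    total, in_use = 10_000_000, 3_000_000
    guest_bytes = 8 * 1024**3
    percent = projected_paging_percent(total, in_use, guest_bytes)
    print(f"Projected paging use: {percent:.1f}%")
    print("Over 50% guideline:", exceeds_guideline(total, in_use, guest_bytes))
```

Because the query output is only a snapshot, a check like this is best repeated close to the time of the planned relocation.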

  • Real Memory

    Real memory resources are important for both the source and the destination systems for relocations. You will need enough real memory 1) to hold buffers during the relocation on both systems, and 2) to accommodate the incoming guest's working set afterward on the target system.

    Relocation performance will also be affected by the level of overall resource constraint for both the source and destination systems.

Keep in mind that every system is unique, with different hardware and software levels, Linux levels, networks, and configurations. The best practice is to verify a relocation between members by testing it before a planned relocation.

For more information, see Chapter 29, "Preparing for Guest Relocations in a z/VM SSI Cluster," in z/VM: CP Planning and Administration. For more information about networking devices, see Chapter 6, "Live Guest Relocation Networking Considerations," in z/VM: Connectivity. Additional information is available in the virtual networking hints and tips.