
Linux Guest IUCV Driver

Executive Summary

We used a CMS test program to measure the best data rate one could expect between two virtual machines connected by IUCV. We then measured the data rate experienced by two Linux guests connected via the Linux IUCV line driver. We found that the Linux machines could drive the IUCV link to about 30% of its capacity.

We also conducted a head-to-head data rate comparison of the Linux IUCV and Linux CTC line drivers. We found that the data rate with the CTC driver was at best 73% of the IUCV driver's data rate, and the larger the MTU size, the larger the gap.

We did not measure the latency of the IUCV or CTC line drivers. Nor did we measure the effects of larger n-way configurations or other scaling scenarios.

Procedure

To measure the practical upper limit on the data rate, we prepared a pair of CMS programs that exchange data using APPC/VM (also known as "synchronous IUCV"). We chose APPC/VM because with WAIT=YES, fewer interrupts are delivered to the guests, which raises the achievable data rate.

The processing performed by the two CMS programs looks like this:

Requester                             Server
---------                             ------
                                      0.  Identify resource manager
1.  Allocate conversation
                                      2.  Accept conversation
3.  Do "I" times:

    4.  Sample TOD clock
    5.  Do "J" times:
        6.  Transmit a value "N"
                                      7.  Receive value of "N"
                                      8.  Transmit "N" bytes
        9.  Receive "N" bytes
    10. Sample TOD clock again
    11. Subtract TOD clock samples
    12. Print microseconds used

13. Deallocate conversation
                                      14. Wait for another round

We chose I=20 and J=1000.

We ran this pair of programs for increasing values of N (1, 100, 200, 400, 800, 1000, 2000, ..., 8000000) and recorded the value of N at which the curve flattened out, that is, beyond which a larger data rate was not achieved.
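To make the requester's timing loop concrete, here is a minimal C sketch of steps 4 through 12. It is not the program we measured with: the actual programs used APPC/VM verbs under CMS, and this sketch substitutes simple send_all() and recv_all() helpers of our own over a connected descriptor fd, with gettimeofday() standing in for sampling the TOD clock.

#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

/* Our stand-ins for the APPC/VM send/receive verbs: move exactly len
 * bytes over a connected descriptor, e.g. a socket. */
static int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

static int recv_all(int fd, char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = read(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Steps 4-12 for one value of N: time J request/response rounds and
 * print the result.  buf must hold at least N bytes.  Note that bytes
 * per microsecond equals MB (10**6 bytes) per second. */
static double measure(int fd, long N, int J, char *buf)
{
    struct timeval t0, t1;
    double usec;
    int j;

    gettimeofday(&t0, NULL);                       /* step 4  */
    for (j = 0; j < J; j++) {
        send_all(fd, (const char *)&N, sizeof N);  /* step 6  */
        recv_all(fd, buf, (size_t)N);              /* step 9  */
    }
    gettimeofday(&t1, NULL);                       /* step 10 */
    usec = (t1.tv_sec - t0.tv_sec) * 1e6
         + (t1.tv_usec - t0.tv_usec);              /* step 11 */
    printf("N=%ld: %.0f usec, %.3f MB/sec\n",
           N, usec, (double)N * J / usec);         /* step 12 */
    return (double)N * J / usec;
}

The outer loop (step 3) simply calls measure() I times for each value of N in the sweep.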

For each value of N, we used CP QUERY TIME to record the virtual time and CP time for the requester and the server. We added the two machines' virtual and CP times together so that we could see the distribution of total processor time between CP and the two guests.
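Each derived column in the tables that follow comes from four raw numbers per run: the MB transferred, the elapsed wall-clock seconds, and the two machines' summed virtual and CP CPU seconds from CP QUERY TIME. The arithmetic, as a small C sketch (the variable names are ours):

#include <stdio.h>

/* Derive the table columns from the raw measurements of one run.
 * mb       : megabytes transferred
 * elapsed  : wall-clock seconds for the run
 * virt, cp : virtual and CP CPU seconds, each summed over both guests
 *            (from CP QUERY TIME before and after the run) */
static void report(double mb, double elapsed, double virt, double cp)
{
    double cpu = virt + cp;                         /* total CPU used */
    printf("MB/sec           %9.3f\n", mb / elapsed);
    printf("MB/CPU-sec       %9.3f\n", mb / cpu);
    printf("CPU-sec/sec      %9.3f\n", cpu / elapsed);
    printf("CP CPU-sec/MB    %9.6f\n", cp / mb);
    printf("%%CP              %9.3f\n", 100.0 * cp / cpu);
    printf("Virt CPU-sec/MB  %9.6f\n", virt / mb);  /* used in Table 6 */
}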

To measure the data rate between Linux guests, we set up two Linux 2.2.16 systems and connected them via the IUCV line driver (GA-equivalent level) with various MTU sizes [1]. We then ran an IBM internal network performance measurement tool in stream put mode: transfer size 20,000,000 bytes, 10 samples of 100 repetitions each, with API crossing sizes equal to the MTU size. We used CP QUERY TIME to record the virtual time and CP time used by each Linux machine during the run, and we added the two machines' virtual and CP times together so that we could see the distribution of total processor time between CP and the two guests.
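The measurement tool is IBM-internal, but its stream put pattern amounts to pushing the whole transfer through the connection in fixed-size writes (the "API crossing size"). A rough C stand-in, assuming a plain connected socket rather than the tool's actual transport:

#include <stdlib.h>
#include <unistd.h>

/* Push `total` bytes through connected descriptor `fd` in `chunk`-byte
 * write() calls, so that each API crossing moves one MTU's worth of
 * data; e.g. stream_put(fd, 20000000L, 1500) for the MTU-1500 run. */
static int stream_put(int fd, long total, size_t chunk)
{
    char *buf = calloc(1, chunk);
    long left;

    if (buf == NULL)
        return -1;
    for (left = total; left > 0; ) {
        size_t want = left < (long)chunk ? (size_t)left : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0) {
            free(buf);
            return -1;
        }
        left -= n;
    }
    free(buf);
    return 0;
}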

We repeated the aforementioned Linux experiment, using the CTC driver and a VCTC connection instead of the IUCV driver, and took the same measurements during the run.

Hardware Used

9672-XZ7, two-processor LPAR with both processors dedicated. The LPAR had 2 GB of main storage and 2 GB of expanded storage. z/VM V3.1.0.

Results

Here are the results we observed for our CMS test program.

Table 1. CMS IUCV Data Rates

Transfer Size
      (bytes)    MB/sec   MB/CPU-sec   CPU-sec/sec   CP CPU-sec/MB      %CP
            1     0.017        0.018         0.944       46.137344   84.620
          100     1.723        1.870         0.921        0.450888   84.310
         1000    17.174       18.165         0.945        0.046662   84.760
         1500    25.871       27.777         0.931        0.030409   84.470
         2000    34.091       36.680         0.929        0.023331   85.580
         4000    65.477       72.661         0.901        0.011665   84.760
         8000   120.576      125.072         0.964        0.006947   86.890
         9000   121.666      126.222         0.964        0.006991   88.240
        10000   133.723      142.339         0.939        0.006134   87.310
        20000   219.809      227.065         0.968        0.003985   90.480
        32764   286.070      294.775         0.970        0.003121   91.980
        40000   306.755      313.967         0.977        0.002976   93.420
        80000   380.396      386.298         0.985        0.002470   95.440
       100000   401.215      404.957         0.991        0.002380   96.390
       200000   452.134      455.758         0.992        0.002144   97.730
       400000   480.947      482.873         0.996        0.002045   98.730
       800000   494.869      496.221         0.997        0.002000   99.220
      1000000   496.635      497.612         0.998        0.001996   99.320
      2000000   499.014      499.698         0.999        0.001990   99.460
      4000000   499.976      500.485         0.999        0.001989   99.570
      8000000   501.732      502.066         0.999        0.001985   99.660
Note: 9672-XZ7, 2 GB main, 2 GB expanded. Two-processor LPAR, both dedicated. z/VM V3.1.0. Guests 128 MB.

Here are the results for our Linux IUCV line driver experiments.

Table 2. Linux IUCV Data Rates

MTU Size
 (bytes)   MB/sec   MB/CPU-sec   CPU-sec/sec   CP CPU-sec/MB     %CP
    1500     7.84         8.12         0.966        0.053393   43.33
    9000    33.33        34.17         0.975        0.012367   42.26
   32764    73.09        74.13         0.986        0.005608   41.57
Note: 9672-XZ7, 2 GB main, 2 GB expanded. Two-processor LPAR, both dedicated. z/VM V3.1.0. Guests 128 MB.

Here are the results for our Linux CTC line driver experiments.

Table 3. Linux CTC Data Rates

MTU Size
 (bytes)   MB/sec   MB/CPU-sec   CPU-sec/sec   CP CPU-sec/MB     %CP
    1500     5.95         5.95         1.000        0.053472   31.97
    9000    17.33        17.84         0.971        0.017920   31.97
   32764    29.71        30.91         0.961        0.010346   31.98
Note: 9672-XZ7, 2 GB main, 2 GB expanded. Two-processor LPAR, both dedicated. z/VM V3.1.0. Guests 128 MB.

Analysis

First let's compare some data rates, at a selection of transfer sizes (which, in the Linux runs, are the MTU sizes).

Table 4. Data Rate (MB/CPU-sec) Comparisons

MTU Size
 (bytes)   CMS/IUCV   Linux/IUCV      Linux/CTC
    1500       27.8    8.12 (0.292)    5.95 (0.214) [0.733]
    9000      126.2   34.17 (0.271)   17.84 (0.141) [0.522]
   32764      294.8   74.13 (0.251)   30.91 (0.105) [0.417]
Note: 9672-XZ7, 2 GB main, 2 GB expanded. Two-processor LPAR, both dedicated. z/VM V3.1.0. Guests 128 MB. In the Linux/IUCV and Linux/CTC columns, a number in parentheses is the cell value's fraction of the CMS/IUCV value in the same row. In the Linux/CTC column, a number in brackets is the cell value's fraction of the Linux/IUCV value in the same row.

These numbers illustrate Linux's ability to utilize the IUCV pipe. Utilization at MTU 1500 runs at about 29%. As the frames get larger, IUCV utilization declines, to about 25% at MTU 32764.

We see also that the Linux IUCV line driver is a better data rate performer than the Linux CTC line driver, at each MTU size we measured.

Next we examine CP CPU time per MB transferred, for a selection of MTU sizes.

Table 5. CP CPU-sec/MB Comparisons

MTU Size
 (bytes)   CMS/IUCV   Linux/IUCV         Linux/CTC
    1500   0.030409   0.053393 (1.756)   0.053472 (1.758) [1.001]
    9000   0.006991   0.012367 (1.770)   0.017920 (2.563) [1.449]
   32764   0.003121   0.005608 (1.796)   0.010346 (3.315) [1.845]
Note: 9672-XZ7, 2 GB main, 2 GB expanded. Two-processor LPAR, both dedicated. z/VM V3.1.0. Guests 128 MB. In the Linux/IUCV and Linux/CTC columns, a number in parentheses is the cell value's fraction of the CMS/IUCV value in the same row. In the Linux/CTC column, a number in brackets is the cell value's fraction of the Linux/IUCV value in the same row.

We see here that the Linux/IUCV cases use about 1.8 times as much CP CPU time per MB as the CMS/IUCV case. This is likely due to the additional CP work of delivering IUCV interrupts to the Linux guest, interrupts the CMS programs largely avoided by using WAIT=YES, though other Linux overhead (e.g., timer tick processing) also contributes.

We also see that in the Linux/CTC case, CP CPU time is greater than in the Linux/IUCV case, and the gap widens as the MTU size grows. Apparently CP's CTC processing costs more per unit of data moved than its IUCV processing, so a larger MTU does not drive the per-MB cost down as quickly.

Now we examine virtual CPU time per MB transferred, for a selection of MTU sizes.

Table 6. Virtual CPU-sec/MB Comparisons

MTU Size
 (bytes)   CMS/IUCV   Linux/IUCV        Linux/CTC
    1500   0.005592   0.069819 (12.5)   0.114499 (20.5) [1.64]
    9000   0.000932   0.016896 (18.1)   0.038124 (40.9) [2.26]
   32764   0.000272   0.007882 (29.0)   0.022001 (80.9) [2.79]
Note: 9672-XZ7, 2 GB main, 2 GB expanded. Two-processor LPAR, both dedicated. z/VM V3.1.0. Guests 128 MB. In the Linux/IUCV and Linux/CTC columns, a number in parentheses is the cell value's fraction of the CMS/IUCV value in the same row. In the Linux/CTC column, a number in brackets is the cell value's fraction of the Linux/IUCV value in the same row.
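Note that each virtual CPU-sec/MB figure can be derived from the earlier tables: it is total CPU-sec/MB (the reciprocal of MB/CPU-sec) minus CP CPU-sec/MB. For the CMS/IUCV 1500-byte row, for example, 1/27.777 - 0.030409 = 0.005592.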

These numbers illustrate the cost of the TCP/IP layers in the Linux kernel and its line drivers. This cost is the biggest reason why Linux is unable to drive the IUCV connection to its capacity: CPU time is spent running the guest instead of running CP, where the data movement actually takes place.

These numbers also show us that the Linux guest consumes more CPU in the CTC case than in the IUCV case. Apparently the Linux CTC line driver uses more CPU per MB transferred than the Linux IUCV line driver does.

Conclusions

  • The Linux IUCV driver is able to drive the IUCV pipe to at best about 30% of its capacity. As the MTU size grows, the percentage drops.

  • The Linux IUCV line driver provides a greater data rate than the Linux CTC line driver over a virtual CTC for the streaming workload measured.

  • CP CTC support uses more CPU per MB transferred than CP IUCV support, at a given MTU size.

  • For large transfers between Linux guests, the data rate would be improved if the IUCV line driver were modified to support MTU sizes beyond 32764. Based on our measurements, supporting a maximum MTU size as high as 2 MB would probably be sufficient (in Table 1, the data rate reaches about 99% of its maximum by a transfer size of 2,000,000 bytes).

    However, use of a large MTU size to increase the data rate between two Linux images must be balanced against fragmentation concerns. As long as the traffic on the IUCV link is destined to remain within the VM image, very large MTU sizes are probably fine. However, if the packets will eventually find their way onto real hardware, they will be fragmented down to the smallest MTU encountered on the way to their destination. Depending on network configuration, that might be as small as 576 bytes, in which case a single 32764-byte datagram becomes roughly 60 fragments. The system programmer needs to take this into account when deciding whether to use a very large MTU size on an IUCV link.

  • IUCV and virtual CTC are memory-to-memory data transfer technologies. Their speeds will improve as processor speed improves and as memory access speed improves.

Footnotes:

[1] uname -a reported: Linux prf3linux 2.2.16 #1 SMP Mon Nov 13 09:51:30 EST 2000 s390 unknown.
