Virtual CPU Time: Raw Time, MT-1 Equivalent Time, and Prorated Core Time

(Last revised: 2018-03-19, BKW)

In a recent PMR a customer using z/VM in SMT-2 mode asked about the D4 R3 VMDTTIME and VMDVTIME fields. The former represents total CPU time accrued. The latter represents virtual (emulation) CPU time accrued. He thought he could compute induced CP time accrued -- he called it "VMDCTIME" -- by calculating the difference VMDTTIME - VMDVTIME. In doing his computation he found the delta to be negative. He opened a PMR and asked how this could be so.

In his PMR the customer also asked about prorated core time. He wanted to know how this could be used for accounting if it accrued at such a slow rate.

In preparing my response I realized we did not have a web article to which I could point him. So below I have included my response, which I hope will serve for future explanations.

Response

Hello Xxxxx,

Xxxx Xxxx, of z/VM Level 2, Endicott, asked me to help with your PMR xxxxx. He gave me the PMR text and also gave me a MONWRITE file you sent.

Here are my comments.

All six of those fields you mentioned -- that is, VMDVTIME, VMDTTIME, VMDVTIME_MT1, VMDTTIME_MT1, VMAVTIME_PRO, and VMATTIME_PRO -- are stored in complement form. In other words, accrual is reflected by the numeric value of the field decreasing. So,

  1. Let's say that for a given virtual CPU, the most-recent D4 R3 record has subscript 0 and the next-most-recent D4 R3 record has subscript 1.
  2. The amount of VMDVTIME used in the most-recent monitor interval is given by this formula: delta_VMDVTIME = VMDVTIME(1) - VMDVTIME(0). Yes, this is counterintuitive.
  3. It's the same for the other five fields. The subtraction to perform is always "next-most-recent value minus most-recent value".
  4. Once you have calculated the six deltas as I have stated above, you can then calculate the three synthetic delta_VMDCTIME_x values in the way you stated. For example, delta_VMDCTIME = delta_VMDTTIME - delta_VMDVTIME.
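
The four steps above can be sketched in Python. This is only an illustration, not code from z/VM or any IBM tool: the dictionary layout, the function names, the output key names, and the sample values are all assumptions made up for the example. It also assumes the field values are already available as plain integers and did not wrap during the interval.

```python
# Hypothetical sketch: each record is a dict of field name -> integer value in
# TOD units, taken from two consecutive D4 R3 records for one virtual CPU.
FIELDS = (
    "VMDVTIME", "VMDTTIME",
    "VMDVTIME_MT1", "VMDTTIME_MT1",
    "VMAVTIME_PRO", "VMATTIME_PRO",
)

def interval_deltas(next_most_recent, most_recent):
    """Return the time accrued per field over one monitor interval.

    The fields are stored in complement form (they decrease as time accrues),
    so the subtraction is always next-most-recent minus most-recent.
    Assumes the values did not wrap during the interval.
    """
    return {f: next_most_recent[f] - most_recent[f] for f in FIELDS}

def induced_cp_deltas(deltas):
    """Synthetic CP time for each flavor: total delta minus virtual delta.

    The output key names are illustrative; they are not monitor fields.
    """
    return {
        "raw": deltas["VMDTTIME"] - deltas["VMDVTIME"],
        "mt1": deltas["VMDTTIME_MT1"] - deltas["VMDVTIME_MT1"],
        "prorated": deltas["VMATTIME_PRO"] - deltas["VMAVTIME_PRO"],
    }

# Made-up sample values, chosen only to show the direction of the subtraction.
older = {"VMDVTIME": 9000, "VMDTTIME": 8000,
         "VMDVTIME_MT1": 9500, "VMDTTIME_MT1": 8600,
         "VMAVTIME_PRO": 9700, "VMATTIME_PRO": 9000}
newer = {"VMDVTIME": 8600, "VMDTTIME": 7500,
         "VMDVTIME_MT1": 9200, "VMDTTIME_MT1": 8220,
         "VMAVTIME_PRO": 9500, "VMATTIME_PRO": 8750}

deltas = interval_deltas(older, newer)
cp = induced_cp_deltas(deltas)
print(deltas["VMDTTIME"], deltas["VMDVTIME"], cp["raw"])
```

Subtracting the raw field values directly (total minus virtual within one record) skips the delta step, which is how a negative result can appear.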

Attached find a report I built from your MONWRITE file called PMxxxxx MONDATA:

  1. The data was collected in partition XXXX, March 6, 19:52 to 20:51 system time.
  2. I generated the report for your guest XXXXXXXX, for its virtual CPU 0 and for its virtual CPU 1.

(report file)

The first two blocks of the report just dump out the six D4 R3 fields. For readability I converted each value from TOD units to seconds by dividing by 4096000000. You can see that as time-of-day advances, all six fields decrease. Yes, these six fields record accrual by decreasing. You can also see that in a given D4 R3 record, the "V" fields are larger in magnitude than are the corresponding "T" fields. This too is correct. The "V" fields accrue (decrease) more slowly than the "T" fields accrue (decrease), because, as you stated, "T" time accrues more rapidly than "V" time does.

The second two blocks of the report show percent-busy for all six fields, where percent-busy = 100 * (delta_VMDxxxx_field) / (delta_TOD_clock).
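
The unit conversion and the percent-busy formula can be sketched as follows. The function names and the sample interval are assumptions for illustration; only the divisor 4096000000 and the formula itself come from the text above.

```python
TOD_UNITS_PER_SECOND = 4096000000  # divisor used to convert TOD units to seconds

def tod_to_seconds(tod_units):
    """Convert a value in TOD units to seconds."""
    return tod_units / TOD_UNITS_PER_SECOND

def percent_busy(delta_field, delta_tod_clock):
    """percent-busy = 100 * delta of the field / delta of the TOD clock.

    Both arguments must be in the same units (TOD units here).
    """
    return 100 * delta_field / delta_tod_clock

# Made-up example: a 60-second monitor interval in which a field accrued 15 seconds.
interval = 60 * TOD_UNITS_PER_SECOND
used = 15 * TOD_UNITS_PER_SECOND

print(tod_to_seconds(used))
print(percent_busy(used, interval))
```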

You asked about prorated core time. In SMT-2 mode, each logical IFL core embodies two logical IFL processors. Each logical IFL processor can run one virtual CPU at a time. The rules for a virtual CPU accruing prorated core time are these:

  1. When a virtual CPU runs alone on a core, prorated core time for that virtual CPU runs at the same speed as raw time. If a virtual CPU runs alone on a core for one second, it accrues one second of raw time and one second of prorated core time.
  2. When a virtual CPU runs while sharing a core with another virtual CPU, prorated core time for that virtual CPU runs at half the rate of raw time. If a virtual CPU runs on a core for one second, but for that entire second there is another virtual CPU running on the other logical processor of the core, the virtual CPU will accrue 0.5 second of prorated core time.
  3. Another way to think of rules 1 and 2 together is that a logical core never gives out more than one second of prorated core time in an elapsed second.

Your guest XXXXXXXX has two virtual CPUs. In 1.0 elapsed second, the most prorated core time this two-virtual-CPU guest could ever accrue is 2.0 seconds. That would happen only if each of the two virtual CPUs were dispatched for the entire elapsed second, concurrently, with each virtual CPU running alone on its own core. By contrast, if the two virtual CPUs were dispatched concurrently for the whole elapsed second but they ran together on the same core, the guest would accrue 1.0 second of prorated core time.
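
The accrual rules and the two-virtual-CPU example can be checked with a small Python sketch. This is a simplified model for SMT-2 only, not how z/VM computes the field: the function name is hypothetical, and it assumes you already know how long each virtual CPU ran alone on a core and how long it ran while sharing one, which the monitor fields themselves do not report separately.

```python
def prorated_core_seconds(alone_seconds, shared_seconds):
    """Prorated core time for one virtual CPU under the SMT-2 rules above.

    Time spent alone on a core counts at the full rate; time spent sharing
    a core with another virtual CPU counts at half the rate.
    """
    return alone_seconds + 0.5 * shared_seconds

# Two virtual CPUs, each dispatched for one whole elapsed second.

# Case 1: each virtual CPU runs alone on its own core.
separate_cores = prorated_core_seconds(1.0, 0.0) + prorated_core_seconds(1.0, 0.0)

# Case 2: both virtual CPUs run together on the same core.
same_core = prorated_core_seconds(0.0, 1.0) + prorated_core_seconds(0.0, 1.0)

print(separate_cores, same_core)
```

In both cases the guest accrues 2.0 seconds of raw time; only the prorated figure changes with how the core was shared.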

You also asked about how to report CPU time consistently between SMT-2 and non-SMT modes of operation. The answer to that question hinges entirely upon your definition of "consistently". One answer is this: raw time is raw time. VMDTTIME and VMDVTIME count raw CPU-seconds consumed, no matter whether the system is running non-SMT, or SMT-1, or SMT-2.

However, we recognize that in an SMT-2 environment, a raw CPU-second isn't necessarily worth as much computation power when you're sharing the core as it is when you run alone on the core. This is why we invented MT-1 equivalent time. MT-1 equivalent time is meant to be a normalized statement of how much "CPU power" you got for your raw CPU-second. When the system is configured as non-SMT or as SMT-1, MT-1 equivalent time will accrue at the same rate as raw time. When the system is configured as SMT-2, MT-1 equivalent time will tend to accrue more slowly than raw time, because when the virtual CPU shares a core, a CPU-second is not necessarily worth as much "bang" as it is when the virtual CPU runs alone on the core. Mostly what we had in mind for MT-1 equivalent time was billing and chargeback. We wanted MT-1 equivalent time to be an expression of how much power the virtual CPU got, not an expression of how long it ran.

I hope this helps.

Thank you,
Brian