LE-BASED APPLICATION HANGS AFTER RECOVERABLE ABEND.
APAR Identifier ...... VM63795      Last Changed ........ 06/03/21
LE-BASED APPLICATION HANGS AFTER RECOVERABLE ABEND.

Symptom ...... IN INCORROUT         Status ........... CLOSED  PER
Severity ................... 1      Date Closed ......... 05/08/02
Component .......... 568411220      Duplicate of ........
Reported Release ......... 440      Fixed Release ............ 999
Component Name VM LE                Special Notice
Current Target Date ..05/08/21      Flags
SCP ...................
Platform ............

Status Detail: SHIPMENT - Packaged solution is available for
               shipment.

PE PTF List:

PTF List:
Release 440   : UM31480 available 05/08/04 (0601 )

Parent APAR:
Child APAR list:

ERROR DESCRIPTION:
User's MPROUTE virtual machine would occasionally enter a wait
state, be unresponsive to external commands, and leave the
network in a static state. Looking at the state of the MPROUTE
virtual machine shows the application waiting on the CMS
multitasking null thread, with all other threads either blocked
or suspended.

LOCAL FIX:

PROBLEM SUMMARY:
****************************************************************
* USERS AFFECTED: All users of z/VM 4.4.0 Language             *
*                 Environment.                                 *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
****************************************************************
* RECOMMENDATION: APPLY PTF                                    *
****************************************************************
The reported problem is that the MPROUTE virtual machine becomes
unresponsive to external commands and ceases to perform its
normal routing assignments. The virtual machine is left waiting
for something to happen and the PSW is sitting in the CMS
multitasking null thread. The virtual machine must be re-IPLed
and MPROUTE restarted to continue processing.

PROBLEM CONCLUSION:
The problem is that internally LE can and does encounter various
program interrupts (e.g. addressing exceptions and specification
exceptions).
Many of these program interrupts are recoverable, but
occasionally, when running a POSIX(ON) application, the recovery
process gets lost and winds up in the CMS multitasking null
thread. This is the problem cited in this APAR.

The problem has been tracked to the way the VM LE exception
handler processes the CMS multitasking VMERROR event. LE's
exception handler initialization creates an event monitor for
the VMERROR event and sets up an event trap pointing to a
VM-specific LE routine (CEEBVTRP). When the first program
interrupt occurs, everything works as it should. The VMERROR
event monitor "trips" and CEEBVTRP gets control, processes the
VMERROR event into either STAE or SPIE data, and lets the LE
exception handler run its course.

The problem occurs when the NEXT program interrupt is
encountered. The VMERROR event monitor no longer knows what
routine to call because the name of the routine has been
cleared by the event trap cleanup that runs during LE exception
handler termination, so the trap routine is never called.

The fix for the problem is to move the event trap cleanup out of
the LE exception handler termination and into the overall LE
termination.

TEMPORARY FIX:

COMMENTS:

MODULES/MACROS: CEEHTRM CEEPLPKA CEEZDSEX

SRLS: NONE

RTN CODES:

CIRCUMVENTION:

MESSAGE TO SUBMITTER:
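
ILLUSTRATION (not part of the shipped fix):
The sequence described under PROBLEM CONCLUSION can be modeled
in a few lines. The Python sketch below is a toy model only, not
LE or CMS code. The names EventMonitor, Runtime and trap_routine
are hypothetical, and the string "hang" simply stands in for the
virtual machine ending up in the null thread. The flag
cleanup_in_handler_termination is an assumed switch used to
contrast the behavior before and after the fix.

```python
class EventMonitor:
    """Toy stand-in for an event monitor with a single trap slot."""

    def __init__(self):
        self.trap = None  # routine to call when the event is signaled

    def set_trap(self, routine):
        self.trap = routine

    def clear_trap(self):
        self.trap = None

    def signal(self, interrupt):
        # With no trap routine registered, nothing can take control.
        if self.trap is None:
            return "hang"
        return self.trap(interrupt)


def trap_routine(interrupt):
    """Hypothetical trap routine: pretend the interrupt was recovered."""
    return "recovered:" + interrupt


class Runtime:
    """Toy runtime; the flag selects pre-fix or post-fix cleanup placement."""

    def __init__(self, cleanup_in_handler_termination):
        self.cleanup_in_handler_termination = cleanup_in_handler_termination
        self.monitor = EventMonitor()
        # Exception handler initialization sets up the event trap once.
        self.monitor.set_trap(trap_routine)

    def program_interrupt(self, interrupt):
        result = self.monitor.signal(interrupt)
        self._exception_handler_termination()
        return result

    def _exception_handler_termination(self):
        # Pre-fix behavior: the trap is cleared after every interrupt.
        if self.cleanup_in_handler_termination:
            self.monitor.clear_trap()

    def terminate(self):
        # Post-fix behavior: the trap is cleared only at overall termination.
        self.monitor.clear_trap()


if __name__ == "__main__":
    before = Runtime(cleanup_in_handler_termination=True)
    print(before.program_interrupt("addressing"))
    print(before.program_interrupt("specification"))

    after = Runtime(cleanup_in_handler_termination=False)
    print(after.program_interrupt("addressing"))
    print(after.program_interrupt("specification"))
```

In this model the pre-fix runtime handles the first interrupt
and loses the second, while the post-fix runtime keeps its trap
until terminate() is called.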