LE-BASED APPLICATION HANGS AFTER RECOVERABLE ABEND.


 
 APAR Identifier ...... VM63795      Last Changed ........ 06/03/21
 
 Symptom ...... IN INCORROUT         Status ........... CLOSED  PER
 Severity ................... 1      Date Closed ......... 05/08/02
 Component .......... 568411220      Duplicate of ........
 Reported Release ......... 440      Fixed Release ............ 999
 Component Name VM LE                Special Notice
 Current Target Date ..05/08/21      Flags
 SCP ...................
 Platform ............
 
 Status Detail: SHIPMENT - Packaged solution is available for
                           shipment.
 
 PE PTF List:
 
 PTF List:
 Release 440   : UM31480 available 05/08/04 (0601 )
 
 Parent APAR:
 Child APAR list:
 
 ERROR DESCRIPTION:
 The user's MPROUTE virtual machine would occasionally enter a
 wait state, become unresponsive to external commands, and leave
 the network in a static state.  Examining the MPROUTE virtual
 machine shows the application waiting in the CMS multitasking
 null thread, with all other threads either blocked or
 suspended.
 
 LOCAL FIX:
 
 PROBLEM SUMMARY:
 ****************************************************************
 * USERS AFFECTED: All users of z/VM 4.4.0 Language             *
 *                 Environment.                                 *
 ****************************************************************
 * PROBLEM DESCRIPTION:                                         *
 ****************************************************************
 * RECOMMENDATION: APPLY PTF                                    *
 ****************************************************************
 The reported problem is that the MPROUTE virtual machine
 becomes unresponsive to external commands and stops performing
 its normal routing assignments.  The virtual machine is left
 in a wait state with the PSW pointing into the CMS multitasking
 null thread.  The virtual machine must be re-IPLed and MPROUTE
 restarted to continue processing.
 
 PROBLEM CONCLUSION:
 Internally, LE does encounter various program interrupts (e.g.
 addressing exceptions and specification exceptions). Many of
 these program interrupts are recoverable, but occasionally,
 when running a POSIX(ON) application, the recovery process
 loses control and execution ends up in the CMS multitasking
 null thread. This is the problem cited in this APAR.
 
 The problem has been tracked to the way the VM LE exception
 handler processes the CMS multitasking VMERROR event. LE's
 exception handler initialization creates an event monitor for
 the VMERROR event and sets up an event trap pointing to a
 VM-specific LE routine (CEEBVTRP). When the first program
 interrupt occurs, everything works as it should. The VMERROR
 event monitor "trips" and CEEBVTRP gets control, processes the
 VMERROR event into either STAE or SPIE data and lets the LE
 exception handler run its course. The problem occurs when the
 NEXT program interrupt is encountered. The VMERROR event
 monitor no longer knows what routine to call because the name
 of the routine has been cleared by the event trap clean up
 performed during LE exception handler termination, so the trap
 routine is never called.
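 
 The failure sequence can be pictured with a small toy model.
 This is only an illustrative sketch in Python, not LE or CMS
 code. The names EventMonitor, trap_routine, handler_init,
 handler_term and program_interrupt are hypothetical and were
 invented for this example, and the sketch assumes that the trap
 clean up runs when the exception handler finishes with each
 interrupt.

```python
class EventMonitor:
    """Toy stand-in for the VMERROR event monitor (hypothetical)."""

    def __init__(self):
        self.trap_routine = None  # routine to call when the event trips

    def trip(self, interrupt):
        if self.trap_routine is None:
            # Nothing to call: in the real system this is where control
            # ends up in the CMS multitasking null thread.
            return "hang: no trap routine to call"
        return self.trap_routine(interrupt)


def ceebvtrp(interrupt):
    # Stand-in for CEEBVTRP: convert the event to STAE/SPIE-style data.
    return f"recovered from {interrupt}"


def handler_init(monitor):
    # Exception handler initialization sets the event trap.
    monitor.trap_routine = ceebvtrp


def handler_term(monitor):
    # Assumed defect: trap clean up runs here, at handler termination.
    monitor.trap_routine = None


def program_interrupt(monitor, interrupt):
    result = monitor.trip(interrupt)
    handler_term(monitor)
    return result


monitor = EventMonitor()
handler_init(monitor)
first = program_interrupt(monitor, "addressing exception")
second = program_interrupt(monitor, "specification exception")
print(first)
print(second)
```

 In this model the first interrupt is recovered and the second
 one finds no trap routine, which mirrors the hang described in
 this APAR.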
 
 The fix for the problem is to move the event trap clean up out
 of the LE exception handler termination and into the overall LE
 termination.
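 
 The effect of moving the clean up can be sketched the same way.
 The Python below is a toy model under stated assumptions, not
 the actual change made to the LE modules. The class ToyLE and
 its methods are hypothetical names, and the model assumes that
 exception handler termination runs after each handled
 interrupt while overall LE termination runs once at the end.

```python
class ToyLE:
    """Toy model of LE init/termination scopes (hypothetical names)."""

    def __init__(self):
        self.trap_routine = None

    def exception_handler_init(self):
        # Set the event trap to a stand-in for CEEBVTRP.
        self.trap_routine = lambda interrupt: f"recovered from {interrupt}"

    def exception_handler_term(self):
        # After the fix: the trap is left in place here.
        pass

    def le_term(self):
        # After the fix: trap clean up happens at overall LE termination.
        self.trap_routine = None

    def program_interrupt(self, interrupt):
        if self.trap_routine is None:
            result = "hang: no trap routine to call"
        else:
            result = self.trap_routine(interrupt)
        self.exception_handler_term()
        return result


le = ToyLE()
le.exception_handler_init()
results = [le.program_interrupt(kind)
           for kind in ("addressing exception", "specification exception")]
le.le_term()
print(results)
```

 With the clean up tied to overall termination, every interrupt
 in the model still finds the trap routine, and the trap is
 cleared only once the environment itself ends.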
 
 TEMPORARY FIX:
 
 COMMENTS:
 
 MODULES/MACROS:   CEEHTRM  CEEPLPKA CEEZDSEX
 
 SRLS:      NONE
 
 RTN CODES:
 
 CIRCUMVENTION:
 
 MESSAGE TO SUBMITTER: