Automatic Application Hang Detection is an InSight feature which,
when enabled, attempts to detect certain application hang conditions.
Once such a condition is detected, a ‘possible hang’ snapshot dump is
generated. The ‘possible hang’ snapshot dump must then be either
confirmed or unconfirmed. If confirmed, it is renamed to an actual
snapshot dump. If unconfirmed, the temporary ‘possible hang’ snapshot
dump is deleted.

Hang detection is based on monitoring calls from the application to
the system. Normally, the application makes calls to the operating
system on a very regular basis. Even when idle and waiting for input,
nearly all applications wake up at least once per second and perform
some work before going back into a wait. The hang detection logic
simply says that, if no system calls are made by the application
within some amount of time (3 seconds by default), there is
potentially a hang. If the hang lasts 10 times this detection limit,
an actual hang is very likely.
The feature needed to distinguish between a ‘possible’ and a ‘likely’
hang because operators can be quick to restart the terminal, by
powering it off and then on, if they think it is hung.
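
The detection rule described above can be sketched in a few lines of
Python. This is an illustration only, not InSight's implementation:
the function name `classify_idle`, the constant names, and the state
labels are hypothetical, and treating an idle time exactly equal to
the limit as a hang is an assumption. Only the 3-second default and
the factor of 10 come from the text.

```python
# Illustrative sketch only; all names here are hypothetical.
DEFAULT_TIMEOUT_SECS = 3.0   # default detection limit stated in the text
CONFIRM_MULTIPLIER = 10      # a hang lasting 10x the limit is "likely"


def classify_idle(idle_secs, timeout_secs=DEFAULT_TIMEOUT_SECS):
    """Classify the time elapsed since the application's last system call.

    Assumption: an idle time equal to the limit already counts as a hang.
    """
    if idle_secs >= timeout_secs * CONFIRM_MULTIPLIER:
        return "likely hang"
    if idle_secs >= timeout_secs:
        return "possible hang"
    return "running"


if __name__ == "__main__":
    for idle in (0.5, 3.0, 12.0, 30.0):
        print(f"{idle:5.1f}s idle -> {classify_idle(idle)}")
```

With the default limit, an application that has been silent for 12
seconds is only a ‘possible’ hang; it becomes a ‘likely’ hang once the
silence reaches 30 seconds.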

Application hang detection was initially designed to detect hangs
caused by hardware or hardware driver issues. Applications were
making calls to the printer and hanging indefinitely. Snapshot dump
data was used to understand specifically what happened just prior to
the hang.

Confirmation occurs if the terminal is restarted and, during the next
IPL, the terminal agent finds an existing unconfirmed snapshot dump
file. Confirmation also occurs if the application remains hung for 10
timeout periods. Unconfirmation happens if the application resumes
making calls before 10 timeout periods have elapsed.
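
The confirmation rules can be pictured as a small state machine. The
sketch below is a hypothetical model for a single hang episode, not
the terminal agent's actual code: the class `HangDumpTracker`, its
method names, and the `DumpState` labels are assumptions made for
illustration. The 3-second default and the 10 timeout periods are
taken from the text.

```python
# Illustrative sketch only; class, method, and state names are hypothetical.
from enum import Enum


class DumpState(Enum):
    NONE = "no dump"
    POSSIBLE = "possible hang dump (unconfirmed)"
    CONFIRMED = "renamed to actual snapshot dump"
    DELETED = "possible hang dump deleted"


class HangDumpTracker:
    """Tracks one 'possible hang' dump through confirmation or unconfirmation."""

    def __init__(self, timeout_secs=3.0, confirm_periods=10):
        self.timeout_secs = timeout_secs
        self.confirm_periods = confirm_periods
        self.state = DumpState.NONE

    def on_idle(self, idle_secs):
        """Called periodically with the time since the last system call."""
        if self.state is DumpState.NONE and idle_secs >= self.timeout_secs:
            self.state = DumpState.POSSIBLE      # 'possible hang' dump generated
        if (self.state is DumpState.POSSIBLE
                and idle_secs >= self.timeout_secs * self.confirm_periods):
            self.state = DumpState.CONFIRMED     # still hung after 10 periods

    def on_system_call(self):
        """The application resumed making calls before confirmation."""
        if self.state is DumpState.POSSIBLE:
            self.state = DumpState.DELETED       # unconfirmed: dump is deleted

    def on_ipl(self):
        """Terminal restarted; the agent finds an unconfirmed dump file."""
        if self.state is DumpState.POSSIBLE:
            self.state = DumpState.CONFIRMED


if __name__ == "__main__":
    tracker = HangDumpTracker()
    tracker.on_idle(4.0)      # past the 3-second limit
    tracker.on_ipl()          # operator power-cycles the terminal
    print(tracker.state.value)
```

In this model a restart while a dump is still unconfirmed leads to the
same outcome as waiting out the full 10 periods, which mirrors the two
confirmation paths described above.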

Setting the timeout parameter is, by its nature, a balancing act
between the risk of generating meaningless snapshot dumps and the
chance of capturing the data needed to resolve the issue.

The snapshot dump feature also has a related, more or less built-in,
automatic problem detection capability. This is because sequential
clear keys are used to force a snapshot dump, and clear is often hit
repeatedly by operators when something is wrong.

The option is enabled via CDI parameters and is disabled by default.
Required CDI keyword: tagentApplHangDetectList.
Note that full documentation for CDI keywords is available in
c:\qsa\cdi\insight.all as part of controller installation.