my_system_just_crashed_--_now_what

My VMS system is crashing right now! What, exactly, should I do?

If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too;…
If you can meet with Triumph and Disaster
And treat those two impostors just the same;…
If you can fill the unforgiving minute
With sixty seconds’ worth of distance run,
Yours is the Earth and everything that’s in it,
And—which is more—you’ll be a Man, my son!

If― Rudyard Kipling (1910)

1. First of all, relax! Don’t panic. VMS has already done that for you (remember, some operating systems call this event a “system panic” or “kernel panic”).

2. Deal with your users. Communicate with them: let them know that you know the system has crashed, and that you’re responding to it. A calm, confident demeanor helps now.

3. If possible, get to your VMS system’s console terminal, or connect remotely to your Integrity’s iLO if available, as soon as possible. During a VMS crash, that’s where the action is.

4. Once you’re at (or connected to) the console, don’t interrupt what it’s doing, and don’t power-cycle the system. Let it finish writing the contents of RAM to the dump file. If you interrupt the dump, it will be incomplete, possibly corrupt, and certainly not of much use for post-crash analysis. Let it finish.

5. Once the dump has finished writing RAM contents to disk, one of two things will happen, depending upon how your console/system is configured:

  • The system will reboot automatically, or
  • The system will halt. At that point, it’s up to you to decide when to reboot it manually.
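
One setting that influences which of these happens is the BUGREBOOT system parameter, which enables or disables automatic rebooting after a fatal bugcheck. The following is a minimal sketch of how you might check it on a running system before a crash ever occurs. It assumes a suitably privileged account, and it does not cover console firmware settings, which vary by platform and also play a part:

$ ! Display the current BUGREBOOT value (nonzero = reboot automatically after a bugcheck)
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> SHOW BUGREBOOT
SYSGEN> EXIT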

6. As the system reboots, the output of the startup command files either scrolls on the console terminal’s display or is written to the startup log file SYS$SYSTEM:STARTUP.LOG. Which of these occurs depends on the following SYSMAN setting:

SYSMAN> STARTUP SHOW OPTIONS
Current startup options on node CLASS8:
    DCL verification mode is: OFF
    Startup log will be written to
    console terminal
    Checkpointing messages are disabled 

Or…

SYSMAN> STARTUP SHOW OPTIONS
Current startup options on node CLASS8:
    DCL verification mode is: PARTIAL
    Startup log will be written to
    SYS$SYSTEM:STARTUP.LOG
    Checkpointing messages are enabled

For nearly all systems and situations, we recommend logging the output of the startup command files to SYS$SYSTEM:STARTUP.LOG. This lets you review the full results of all startup command files after a reboot; the alternative is to stand at the console terminal and watch the lengthy output as it scrolls off the display.

To set your system to use the log-file for reboot output:
SYSMAN> STARTUP SET OPTIONS /OUTPUT=FILE /VERIFY=PARTIAL /CHECKPOINT
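
As a rough sketch, a complete session around that command might look like the following. It assumes you are logged in to a suitably privileged account; the SEARCH strings are only an illustration of one way to pick out error-severity and fatal-severity messages from the log after the next reboot:

$ ! Enter SYSMAN, change the startup options, and confirm the new settings
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> STARTUP SET OPTIONS /OUTPUT=FILE /VERIFY=PARTIAL /CHECKPOINT
SYSMAN> STARTUP SHOW OPTIONS
SYSMAN> EXIT
$ ! After the next reboot, page through the log or search it for error messages
$ TYPE /PAGE SYS$SYSTEM:STARTUP.LOG
$ SEARCH SYS$SYSTEM:STARTUP.LOG "-E-","-F-"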

7. After a crash dump has been written, verify that the dump file’s contents have been copied from the active dump file (SYS$SYSTEM:SYSDUMP.DMP) to a separate, safe copy, for example with the SDA command COPY DUMPFILE_COPY.DMP. Because the next crash overwrites the active dump file, this copy preserves the dump for this crash event (especially important if you’re experiencing a cycle of crashes) and gives you a file to send along to your crash dump analysis experts.
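
If the copy has not been made yet, a minimal sketch of doing it by hand follows. The device and directory DKA100:[DUMPS] are hypothetical; substitute a location on your own system with enough free space to hold the dump:

$ ! DKA100:[DUMPS] is a hypothetical destination, not a standard location
$ ANALYZE /CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP
SDA> SHOW CRASH
SDA> COPY DKA100:[DUMPS]DUMPFILE_COPY.DMP
SDA> EXIT
$ ! Confirm that the copy exists and note its size and date
$ DIRECTORY /SIZE /DATE DKA100:[DUMPS]DUMPFILE_COPY.DMP

SHOW CRASH is included here only so that you can confirm you’re looking at the dump from the crash you expect before copying it.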

8. See the pages “How To Capture a VMS Crash Dump” and “How To Send a VMS Crash Dump to PARSEC for Analysis” for more information.

my_system_just_crashed_--_now_what.txt · Last modified: 2018/09/10 17:44 by lricker