how_to_capture_a_vms_crash_dump

Differences

This shows you the differences between two versions of the page.

how_to_capture_a_vms_crash_dump [2018/09/07 19:47]
lricker
how_to_capture_a_vms_crash_dump [2018/09/10 21:40]
lricker
Line 1: Line 1:
-(__Under Construction!__)
-
-===== HOW TO CAPTURE A CRASH DUMP =====
 +===== How to Capture a VMS Crash Dump =====
 +==== or: What should be done to preserve the System Dump File after VMS reboots? ====
  
 Sometimes a VMS problem manifests itself as a system crash. A VMS system crash is not an accident or a mistake -- it is an intentional response by the operating system (actual program code written by the VMS Engineering
Line 13: Line 12:
 As a VMS sys-admin, you should know precisely how your VMS system(s) are configured to handle your SYSDUMP.DMP file upon reboot after a crash. PARSEC can assist you with information on determining your systems' configuration, through this wiki, through an MEP white-paper (available through your PARSEC Account Representative), or both.
  
 +==== Your Action Items ====
 +
 +Whether you configure VMS for “normal” on-system-disk dumps, dumps into the Page File, DOSD, or other more exotic configurations (don’t!), there’s one last configuration step that you must do to ensure that, after every VMS crash, you preserve this hard-won, valuable crash dump information for subsequent analysis.
 +
 +Remember that each system crash overwrites the previous contents of the System Dump File. Especially if your system is crashing repeatedly for some hardware-related reason, it’s essential that, upon reboot, your system's //site-specific Startup Command File copies the contents of the Dump File to another, alternative file//. This alternative file can be stored on any other disk (preferably neither the VMS system disk nor the DOSD disk) and can be named anything you’d like.
 +
 +For example, let’s assume that CLASS8’s DSA2: (shadow-set) disk has “plenty of free space”, and that we can copy several versions of our System Dump File to it before we have to worry about clean-ups or purges. ​ Create a working directory on that drive:
 +
 +  $ CREATE /DIRECTORY /OWNER=PARENT /PROT=(S:RWE,O:RWE,G,W) DSA2:[LRICKER.CRASH_ANALYSIS]
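 +
 +Before relying on that directory, it may be worth confirming that “plenty of free space” really covers at least one full copy of the dump. The following is only a sketch: it assumes the dump is written to SYS$SYSTEM:SYSDUMP.DMP (not to the Page File or a DOSD disk), it uses the example disk DSA2: from above, and the symbol names are arbitrary. Because SDA COPY copies only the blocks actually occupied by the dump, the allocated size of the dump file is a conservative upper bound.
 +
 +  $ ! Compare free blocks on the target disk with the allocated size of the dump file
 +  $ FREE_BLOCKS = F$GETDVI("DSA2:", "FREEBLOCKS")
 +  $ DUMP_BLOCKS = F$FILE_ATTRIBUTES("SYS$SYSTEM:SYSDUMP.DMP", "ALQ")
 +  $ WRITE SYS$OUTPUT "Free blocks on DSA2: ''FREE_BLOCKS'"
 +  $ WRITE SYS$OUTPUT "Dump file blocks:    ''DUMP_BLOCKS'"
 +  $ IF FREE_BLOCKS .LT. DUMP_BLOCKS THEN WRITE SYS$OUTPUT "Not enough room for a full dump copy"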
 +
 +Now this directory can become the catch-point for any subsequent crash-dump copy:
 +
 +''$ **ANALYZE /CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP**''\\
 +''SDA> **copy /collect /log DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP**''
 +  Copying: Headers...
 +  Copying: PT space - 47 blocks...
 +  Copying: S0/S1 space - 53922 blocks...
 +  Copying: S2 space - 37568 blocks...
 +  Copying: Page tables of key process "​SWAPPER"​ - 3 blocks...
 +  Copying: Memory of key process "​SWAPPER"​ - 3 blocks...
 +  Copying: Page tables of process "​SYSINIT"​ - 8 blocks...
 +  Copying: Memory of process "​SYSINIT"​ - 1478 blocks...
 +  Copying: Page tables of process "​STACONFIG"​ - 8 blocks...
 +  Copying: Memory of process "​STACONFIG"​ - 1721 blocks...
 +  %SDA-I-COLLECTING,​ collecting file and/or unwind data
 +  Scanning: Process "​SWAPPER"​ (PCB 8435CF48)...
 +  Scanning: Process "​SYSINIT"​ (PCB 85216740)...
 +  Scanning: Process "​STACONFIG"​ (PCB 85218D80)...
 +  Scanning: Page and swap files...
 +  %SDA-W-NOCOLLECT,​ no file and/or unwind data collected
 +  Rewriting: Headers...
 +''​SDA>​ **exit**''​
 +
 +With this in mind, there are several points to consider regarding VMS system startup crash-dump/​SDA processing:
 +
 +**On Alpha and I64 systems**: SDA is invoked by default during startup and generates a CLUE list file from a fixed sequence of commands. This CLUE list file contains only an overview of the crash and might not provide enough information to determine its cause.
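 +
 +To see what that startup-time CLUE processing produced, you can list the files it wrote. This sketch assumes that the CLUE$COLLECT logical name is defined on your system, which is where CLUE list files are normally written; the individual file names vary by node name and crash date.
 +
 +  $ ! Show where CLUE list files are written, then list them
 +  $ SHOW LOGICAL CLUE$COLLECT
 +  $ DIRECTORY /DATE /SIZE CLUE$COLLECT:CLUE$*.LIS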
 +
 +Always copy the system dump file to its alternative destination directory/​file.
 +
 +  * Although you could use the DCL command COPY to copy the dump file, don’t. SDA’s internal COPY command is preferable because it copies only the blocks occupied by the dump and then marks the Dump File as copied.
 +  * The SDA COPY command is also preferable when the dump was written into the primary Page File, SYS$SYSTEM:​PAGEFILE.SYS,​ because SDA COPY releases the dump page-blocks back to the pager after they’re copied.
 +  * Because a System Dump File can contain privileged and/or private information,​ always protect copies of dump files from world read access.
 +  * System Dump Files have the NOBACKUP attribute, so the BACKUP utility does not copy them unless you use the qualifier /​IGNORE=NOBACKUP. ​ When you use SDA COPY to copy the System Dump File to another file, the operating system does not automatically set the new file to NOBACKUP. ​ If you want to set the NOBACKUP attribute on the copy, use SET FILE /NOBACKUP on the copied file(s).
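 +
 +The protection point above can be applied to an existing copy along these lines. This is a sketch using the example file name from this page; the protection mask shown (no group or world access) is one reasonable choice, not a requirement.
 +
 +  $ ! Remove group and world access from every version of the copied dump
 +  $ SET SECURITY /PROTECTION=(S:RWED,O:RWED,G,W) DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP;*
 +  $ ! Check the resulting protection and file attributes
 +  $ DIRECTORY /FULL DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP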
 +
 +The recommended method for SDA processing at system startup on Integrity and AlphaServer systems is:
 +
 +  * Create a /SYSTEM /EXECUTIVE_MODE logical name, CLUE$SITE_PROC, in the SYS$STARTUP:SYLOGICALS.COM command file. This logical name points to your own site-specific “save-the-dump” file, here SAVEDUMP.SDA. The logical name must be (re)created each time your system reboots, so add this line to your SYLOGICALS.COM file:
 +
 +  $ DEFINE /SYSTEM /EXEC CLUE$SITE_PROC SYS$STARTUP:SAVEDUMP.SDA
 +
 +  * Here’s an example of the contents of this site-specific command file, SYS$STARTUP:SAVEDUMP.SDA. Cut and paste these lines to create this file on your own system:
 +
 +  ! SAVEDUMP.SDA --
 +  ! SDA command file, executed as part of the system reboot.
 +  ! Used to save the dump file after a system bugcheck, and
 +  ! to execute any additional SDA commands.
 +  !
 +  READ /EXEC  ! Read in the executive images' symbol tables
 +  SHOW STACK  ! Display the stack
 +  COPY /COLLECT DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP  ! Copy/save system dump file
 +  !
 +
 +  * Of course, you must replace DSA2:​[...]DUMPFILE_COPY.DMP above with your own site-specific disk, directory and filename.
 +
 +  * You should also include commands, perhaps in SYLOGICALS.COM or in SYSTARTUP_VMS.COM, to ''SET FILE /NOBACKUP'' and ''PURGE /KEEP=n'' those copied Dump Files. For example:
 +
 +  $ SET FILE /NOBACKUP DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP
 +  $ PURGE /KEEP=3 DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP
 +
 +  * These commands will be executed well after VMS executes the SDA commands in SAVEDUMP.SDA (above) following any actual system crash, and it won’t matter much if these are re-executed on normal reboots as well.
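 +
 +As a hedged sketch of how the pieces above might be checked and tidied up: the first command confirms interactively that CLUE$SITE_PROC is defined in the system table, and the rest is a startup fragment that skips the clean-up when no copied dump exists yet (so a normal reboot does not produce file-not-found messages). The symbol name DUMP_COPY is arbitrary, and the file name is the example one used on this page.
 +
 +  $ ! Confirm the logical name is defined in the system table (shows its access mode)
 +  $ SHOW LOGICAL /SYSTEM /FULL CLUE$SITE_PROC
 +  $ !
 +  $ ! Startup fragment: only tidy up if at least one copied dump is present
 +  $ DUMP_COPY = "DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP"
 +  $ IF F$SEARCH(DUMP_COPY) .NES. ""
 +  $ THEN
 +  $     SET FILE /NOBACKUP 'DUMP_COPY';*
 +  $     PURGE /KEEP=3 'DUMP_COPY'
 +  $ ENDIF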
 +
 +**On VAX/VMS systems**: SDA crash-dump copy processing is (was) likely embedded in the SYS$STARTUP:​SYSTARTUP_VMS.COM command procedure. ​ If you look, you’ll likely see commands similar to this:
 +
 +  $ ANALYZE/CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP
 +  COPY DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP
 +  EXIT
 +  $ SET FILE /NOBACKUP DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP
 +  $ PURGE /KEEP=2 DSA2:[LRICKER.CRASH_ANALYSIS]DUMPFILE_COPY.DMP
 +
 +Of course, your system will have a different disk device, directory and filename for what’s shown as DSA2:[...]DUMPFILE_COPY.DMP above.
 +
 +Note that, on VAX/VMS, these commands simply appear (are edited) in-line in the SYSTARTUP_VMS.COM command procedure, and are executed when encountered.
how_to_capture_a_vms_crash_dump.txt · Last modified: 2018/09/10 21:40 by lricker