how_to_clone_tru64_and_digital_unix

Differences

This shows you the differences between two versions of the page.

how_to_clone_tru64_and_digital_unix [2018/12/11 16:02] – [Fix the rc.config] sgriggs
how_to_clone_tru64_and_digital_unix [2018/12/11 17:02] – [Fix the Sysconfigtab] sgriggs
Line 1: Line 1:
-== Cloning Digital Unix and Tru64 ==
+=== Cloning Digital Unix and Tru64 ===
  
 There are various places on the net where you can find out how to clone a Tru64 system, but most of them are just discussions and hand waving. The complete process is documented here.
  
-=== Process Overview ===
+==== Process Overview ====
  
 Here is a quick and dirty view of the overall process.
Line 26: Line 26:
   - Do not plan on using the **dd** method unless you have completely identical disks. Otherwise it has several major drawbacks. The worst include performance problems (due to bad geometry alignment in the disklabel) and loss of space if the target disk is larger (or loss of __data__ if the target disk is smaller).
  
-=== Take Inventory ===
+==== Take Inventory ====
  
 Cloning can be complicated because of the different filesystems and storage layouts on your system. Before you begin, you need to know what kind of storage you are using and how it's laid out. We need to answer the following questions:
Line 63: Line 63:
  
  
-=== About LSM ===
+==== About LSM ====
  
 LSM is a volume management scheme which is pretty much identical to Veritas Volume Manager (VxVM). This is because, at the time, DEC was able to secure a one-shot licensed copy of VxVM but agreed to change its name. So, if you know VxVM, all you do is replace the string "vx" with the string "vol" in all the commands and they will work fine. For example, instead of using **vxprint** you use **volprint** and so forth. Don't get me wrong, VxVM is a great volume manager. It's just that when they glued it onto Tru64, they made root disk encapsulation (putting your boot drives into VxVM and getting RAID-1 working so you can boot off both sides of the mirror) a HUGE pain in the neck to fix or maintain. It's pretty easy to install with it, but you end up with a giant white elephant. Most folks are just as likely to delete their surviving mirror as they are to resilver and fix their systems when using LSM.
Line 69: Line 69:
 Systems using LSM are extremely hard to clone unless you can use dd alone to clone a single disk. This would not be a very common LSM configuration (what would be the point of a single-disk LSM config?). So, my advice on cloning LSM-based systems is "do not try". Instead, recreate the LSM RAID layout via a "shell" Tru64 installation and then use **vdump** and **vrestore** to get the data back over there.
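
The "vx" to "vol" renaming rule described above can be sketched as a tiny shell helper. This is only an illustration: the function name `vx_to_vol` is made up for this example, and it assumes, as the text claims, that each VxVM command has an LSM counterpart with the prefix swapped. Confirm that the resulting command actually exists on your system before relying on it.

```shell
# Hypothetical helper: translate a VxVM command name to its LSM
# equivalent by swapping a leading "vx" for "vol".
vx_to_vol() {
    printf '%s\n' "$1" | sed 's/^vx/vol/'
}

# Example from the text: vxprint becomes volprint.
vx_to_vol vxprint
```

Names that do not start with "vx" pass through unchanged, so the helper is safe to use on a mixed list of commands.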
  
-=== The Boot Sector ===
+==== The Boot Sector ====
  
 In Tru64 and Digital Unix you need to make sure the disk has the proper boot blocks at the front of the disk. This is done using **disklabel** and it's not super-intuitive. You absolutely must use both the **-rw** and the **-t** flags when installing boot blocks. Without both sets of flags the procedure will fail. Also, the tool sometimes behaves oddly, and the boot blocks don't get properly installed or get clobbered later on. So, when starting a clone, I'd suggest zeroing out the disklabel and re-installing it from scratch using exactly the right syntax, then doing only edits (using the **-e** flag to **disklabel**) after that point. Here are a couple of examples. I'll use a disk name of **rz0** in this case, but you should alter that to fit your system. Also, don't do all the steps, just the ones that correspond to your file system type. Boot blocks have to be customized for the file system that you are using.
Line 104: Line 104:
 </code> </code>
  
-=== Editing the Disklabel ===
+==== Editing the Disklabel ====
  
 Tru64 and Digital Unix have a strong connection to BSD Unix. This is because OSF/1, the predecessor to Digital Unix (i.e. the 1.x - 3.x versions of the OS were called OSF/1), used BSD for the majority of its user space programs. Why re-invent all that good stuff when BSD set the de facto standard everyone was following for TCP/IP programs? Yes, the kernel is still mostly a microkernel and is just DEC's own thing (but resembles the Carnegie Mellon [[https://en.wikipedia.org/wiki/Mach_(kernel)|Mach]] operating system kernel a bit). DEC, IBM, and HP partnered to create OSF/1, but DEC was the only one who didn't eventually walk away from the partnership.
Line 149: Line 149:
 In general, if you are cloning a UFS-based system, be very careful that your disklabel gives you enough space for the **/** and **/usr** file systems. If you are using AdvFS, make sure that the slices you set aside add up to the sizes you need (i.e. remember that AdvFS can do concatenation, mirroring, and striping between disk/block devices). You need to make this effort before you start copying files over. Once the copy is running it could be too late to correct a size mismatch, and you'll simply find out when the destination file system or file set fills up before your copy/sync operation completes.
  
-=== File Copy Steps ===
+==== File Copy Steps ====
    
 The UFS file system is BSD's native file system. It's very reliable and tough, but it lacks features such as journaling, logging, and some other more esoteric stuff. Its maximums are also much lower than those of AdvFS. The upside of UFS is that it's extremely reliable and stable, gives you reasonably high performance, and has a lot of user-space and third-party tools that do things like recovery. It's also free-as-in-beer in that DEC/Compaq/HP don't charge you any extra $$$ for what you do with it, unlike AdvFS, which costs money to do anything but host a basic OS installation.
Line 212: Line 212:
  
 Another file you //might// have to alter is your **/etc/sysconfigtab**. This isn't always needed. I believe it's a difference between Tru64 and Digital Unix. Some versions of the startup scripts refer to this file for a swap device. It would be present in the section called **vm:**. If you see a swap device listed in that section, alter it to point to the new disk or remove it.
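
To see whether your copy of the file mentions a swap device, something like the following could be run against the copied file on the target disk. It is a sketch, not a tested Tru64 procedure: the function name `show_vm_swap` and the `/mnt` mount point are assumptions, and it assumes the usual stanza layout in which an unindented name followed by a colon (such as **vm:**) starts a section and its attributes are listed beneath it.

```shell
# Hypothetical helper: print every line mentioning "swap" inside the
# vm: stanza of a sysconfigtab-style file.
show_vm_swap() {
    awk '
        # An unindented "name:" line starts a new stanza.
        /^[A-Za-z0-9_-]+:/ { in_vm = ($1 == "vm:"); next }
        # Inside vm:, report anything that mentions swap.
        in_vm && /swap/ { print }
    ' "$1"
}

# Example (assumes the target root file system is mounted on /mnt):
# show_vm_swap /mnt/etc/sysconfigtab
```

If this prints nothing, the **vm:** section has no swap entry and there is nothing to change in this file.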
 +
 +
 +=== Final Steps ===
 +
 +Ensure that you have completed these steps.
 +
 +  - Install the boot blocks using **disklabel**
 +  - Edit the disklabel on your target disk
 +  - Re-create UFS or AdvFS file systems
 +  - Copy files over from the original
 +  - Fix the **/etc/rc.config**, **/etc/sysconfigtab**, and of course, the **/etc/fstab**
 +
 +You should have done all of these steps before you attempt to boot the new disk.
 +
 +==== Final Boot ====
 +
 +Now the system is ready to reboot. You will want to understand a bit about interacting with what we call the SRM console. The main things to know are the following.
 +
 +  - **show dev** This will show you all the devices (NICs, HBAs, and of course disks). You need to know which disk is your source and which is your target. The device list should have clues like the manufacturer name and the device model.
 +  - The **boot** command takes the disk name as an argument. For example: "boot dka0" or "boot dqa0" would boot each of those disks respectively. Also, if you'd like to try single user mode, add the "-fl s" argument (if you do, remember to use the **bcheckrc** command to make single user mode usable).
 +  - The **show** command is the complement of the **set** command. These allow you to view and alter the values of SRM variables, which change boot and system behavior.
 +  - Understand the variables that matter most, like **BOOTDEF_DEV**, which points to the default boot device on the system. Another you might want to understand is **AUTO_ACTION**, which governs whether the system automatically tries to boot or halts at the SRM chevron prompt. The action names are **boot** or **halt**.
 +
 +So, what do you normally need to do? Try booting the clone, but don't change the default boot device until you are ready to completely switch over to the clone.
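
Put together, a cautious first boot of the clone might look like the session below. Treat it as a sketch: dka0 is only the example device name used above, so substitute whatever **show dev** reports for your target disk. Command output is omitted because it varies from machine to machine.

```
>>> show dev
>>> show bootdef_dev
>>> show auto_action
>>> boot dka0
```

To try single user mode instead, the last command would become "boot -fl s dka0", followed by **bcheckrc** at the shell prompt. Only once the clone has proven itself would you make it the default with "set bootdef_dev dka0".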
 +
 +==== Troubleshooting  ====
 +
 +Cloning was something that DEC intended folks to use **sysman** for. Unfortunately, their process is too inflexible for most uses, so this more manual method is needed. It is, unfortunately, a fault-prone process. Here are some of the common issues.
 +
 +===== The Drive will not Boot =====
 +
 +If you issue the **boot** command from the SRM console but never see the kernel line saying "UNIX Boot", then you probably have a problem with the boot sector. Do the following.
 +
 +  - Re-mount the target disk and make doubly sure that you have the kernel on the root file system. It takes the form of two files named **vmunix** and **genvmunix**. Without a kernel, you can't boot the system. They should be there as a result of your file copy effort.
 +  - Unfortunately, the most likely cause is that you didn't do the disklabel steps in the proper order. Zero the disklabel with the **-z** flag and start over. Do it in the proper order and you'll have better luck.
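
The start-over sequence can be sketched as a dry run that only prints the commands, so you can review them before typing anything against a real disk. The function name `relabel_plan` is invented for this sketch, rz0 is the example disk name used earlier on this page, and the exact **disklabel** argument order (including whether your OS version also wants a disk type after the disk name) is an assumption to check against the disklabel man page.

```shell
# Hypothetical dry-run helper: print, but do not execute, the
# disklabel sequence for one disk and one file system type.
relabel_plan() {
    disk=$1      # e.g. rz0
    fstype=$2    # ufs or advfs
    case "$fstype" in
        ufs|advfs) ;;
        *) echo "unknown file system type: $fstype" >&2; return 1 ;;
    esac
    echo "disklabel -z $disk"                 # zero the old label
    echo "disklabel -rw -t $fstype $disk"     # new label plus boot blocks
    echo "disklabel -e $disk"                 # edit partitions afterwards
}

relabel_plan rz0 ufs
```

Printing instead of executing is deliberate: zeroing a label on the wrong disk is not something you can undo, so the plan is meant to be read first and typed by hand.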
 +
 +===== It Hangs During Boot =====
 +
 +It depends on where and why it hangs. The most common issues are these.
 +
 +  - You forgot to edit out some kind of reference to the swap device. Check the post-copy steps again. One of the startup scripts probably tried to activate swap on a device that won't work.
 +  - You are using UFS and you forgot to fix the entry in **/etc/fstab** for one of the file systems. You might also have to edit any reference to swap, especially on Digital Unix 4.x. Also pay attention to any other filesystems that might have changed or gone away.
 +  - Make sure your copy method preserved all the permissions, especially on **/sbin** and the scripts in **/sbin/init.d**, which are critical.
 +  - Do NOT try to eliminate one of the AdvFS file domains. As mentioned earlier, the startup scripts reference both **root_domain** and **usr_domain**, and if you change their names or eliminate one of them the startup scripts will fail.
 +
 +If you have problems beyond the ones documented, then consider contacting PARSEC for some consulting work to help you!
 +
 +
  
  
how_to_clone_tru64_and_digital_unix.txt · Last modified: 2023/09/08 23:04 by sgriggs
