This post covers a common task: replacing a faulty VxVM disk on a UNIX server.
Every UNIX system administrator will at some point have to remove or replace a faulty VxVM disk. Here we walk through the procedure using the "vxdiskadm" utility; the same can also be done directly from the command line, which is the preferred approach.
Assume that disk group "unixrockdg" has one faulty disk, unixrockdg04 (c3t9d0s2), which needs to be replaced. Let's lay out a high-level plan before doing the replacement.
HIGH LEVEL PLAN:
Step1: Take a backup of the system and disk configuration (cfg2html or explorer is recommended).
Step2: Remove the disk at the VxVM level using the vxdiskadm utility.
Step3: Unconfigure the disk at the OS level using the cfgadm command.
Step4: Request replacement of the faulty disk.
Step5: Configure the disk at the OS level using the cfgadm and devfsadm commands.
Step6: Replace the disk at the VxVM level using the vxdiskadm utility.
Step7: Start the volume and mount it.
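As mentioned at the start, the same plan can be carried out without the vxdiskadm menus. The sketch below only prints a candidate command sequence and runs nothing. The vxdg -k rmdisk, vxdisksetup -i and vxdg -k adddisk lines are my assumption of the command-line equivalents of the vxdiskadm remove and replace steps; they are not taken from the session in this post, so check them against the man pages of your VxVM release. print_replace_plan is a made-up helper name.

```shell
#!/bin/sh
# print_replace_plan: print (do NOT execute) a command-line version of the plan.
# Arguments: disk group, VxVM disk name, device without slice, volume name.
# Hypothetical helper; the command sequence is an assumption, not the post's session.
print_replace_plan() {
    dg=$1; dm=$2; dev=$3; vol=$4
    ctrl=${dev%%t*}                      # c3t9d0 -> c3
    cat <<EOF
vxdg -g ${dg} -k rmdisk ${dm}
cfgadm -c unconfigure ${ctrl}::dsk/${dev}
# (replace the disk physically here)
cfgadm -c configure ${ctrl}::dsk/${dev}
devfsadm -c disk
vxdctl enable
/etc/vx/bin/vxdisksetup -i ${dev}
vxdg -g ${dg} -k adddisk ${dm}=${dev}
vxrecover -g ${dg} -s ${vol}
EOF
}

# Values from this post.
print_replace_plan unixrockdg unixrockdg04 c3t9d0 unixrockvol
```

Printing the commands instead of running them keeps the sketch safe to try on any machine and makes it easy to review the sequence before touching a production disk group.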
Let's start the activity after taking a valid configuration backup. The output below confirms that unixrockdg04 (c3t9d0s2) is in failed status.
root@unixrock # vxdisk list
DEVICE       TYPE      DISK          GROUP        STATUS
c1t0d0s2     sliced    rootdisk      rootdg       online
c1t1d0s2     sliced    rootmirr      rootdg       online
c3t8d0s2     sliced    unixrockdg03  unixrockdg   online
c3t10d0s2    sliced    -             -            online
c3t11d0s2    sliced    unixrockdg01  unixrockdg   online
c3t12d0s2    sliced    unixrockdg02  unixrockdg   online
-            -         unixrockdg04  unixrockdg   failed was:c3t9d0s2
root@unixrock #

Remove the failed disk from VxVM using the vxdiskadm command.
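Before removing anything, it is worth double-checking which disk VxVM reports as failed. As a minimal sketch (failed_disks is a made-up helper name, not part of VxVM, and it assumes the column layout shown in the listing above), the output can be filtered with awk:

```shell
#!/bin/sh
# failed_disks: read "vxdisk list" output on stdin and print
# "<disk name> <disk group> <former device>" for each failed disk.
# Assumes the 5-column layout from this post, with "was:<device>" as field 6.
failed_disks() {
    awk 'NR > 1 && $5 == "failed" {
        # substr skips the 4-character "was:" prefix
        print $3, $4, substr($6, 5)
    }'
}

# Demo on lines taken from the listing in this post (no VxVM needed).
failed_disks <<'EOF'
DEVICE       TYPE      DISK          GROUP        STATUS
c3t8d0s2     sliced    unixrockdg03  unixrockdg   online
c3t10d0s2    sliced    -             -            online
-            -         unixrockdg04  unixrockdg   failed was:c3t9d0s2
EOF
# prints: unixrockdg04 unixrockdg c3t9d0s2
```

On a live system the function would be fed with "vxdisk list | failed_disks". The interactive vxdiskadm removal follows.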
root@unixrock # vxdiskadm

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 3      Remove a disk
 4      Remove a disk for replacement
 5      Replace a failed or removed disk
 6      Mirror volumes on a disk
 7      Move volumes from a disk
 8      Enable access to (import) a disk group
 9      Remove access to (deport) a disk group
 10     Enable (online) a disk device
 11     Disable (offline) a disk device
 12     Mark a disk as a spare for a disk group
 13     Turn off the spare flag on a disk
 14     Unrelocate subdisks back to a disk
 15     Exclude a disk from hot-relocation use
 16     Make a disk available for hot-relocation use
 17     Prevent multipathing/Suppress devices from VxVM's view
 18     Allow multipathing/Unsuppress devices from VxVM's view
 19     List currently suppressed/non-multipathed devices
 20     Change the disk naming scheme
 21     Get the newly connected/zoned disks in VxVM view
 list   List disk information

 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: 4

Remove a disk for replacement
Menu: VolumeManager/Disk/RemoveForReplace

  Use this menu operation to remove a physical disk from a disk
  group, while retaining the disk name. This changes the state
  for the disk name to a "removed" disk. If there are any
  initialized disks that are not part of a disk group, you will be
  given the option of using one of these disks as a replacement.

Enter disk name [<disk>,list,q,?] list

Disk group: rootdg

DM NAME          DEVICE       TYPE      PRIVLEN  PUBLEN     STATE
dm rootdisk      c1t0d0s2     sliced    10175    143339136  -
dm rootmirr      c1t1d0s2     sliced    10175    143339136  -

Disk group: unixrockdg

DM NAME          DEVICE       TYPE      PRIVLEN  PUBLEN     STATE
dm unixrockdg01  c3t11d0s2    sliced    9919     143328960  -
dm unixrockdg02  c3t12d0s2    sliced    9919     143328960  -
dm unixrockdg03  c3t8d0s2     sliced    9919     143328960  -
dm unixrockdg04  -            -         -        -          NODEVICE

Enter disk name [<disk>,list,q,?] unixrockdg04

  The following volumes will be disabled as a result of this
  operation:

        unixrockvol

  These volumes will require restoration from backup.

Are you sure you want do do this? [y,n,q,?] (default: n) y

  The requested operation is to remove disk unixrockdg04 from disk group
  unixrockdg. The disk name will be kept, along with any volumes using
  the disk, allowing replacement of the disk.

  Select "Replace a failed or removed disk" from the main menu
  when you wish to replace the disk.

Continue with operation? [y,n,q,?] (default: y) y

  Removal of disk unixrockdg04 completed successfully.

Remove another disk? [y,n,q,?] (default: n) n

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 3      Remove a disk
 4      Remove a disk for replacement
 5      Replace a failed or removed disk
 6      Mirror volumes on a disk
 7      Move volumes from a disk
 8      Enable access to (import) a disk group
 9      Remove access to (deport) a disk group
 10     Enable (online) a disk device
 11     Disable (offline) a disk device
 12     Mark a disk as a spare for a disk group
 13     Turn off the spare flag on a disk
 14     Unrelocate subdisks back to a disk
 15     Exclude a disk from hot-relocation use
 16     Make a disk available for hot-relocation use
 17     Prevent multipathing/Suppress devices from VxVM's view
 18     Allow multipathing/Unsuppress devices from VxVM's view
 19     List currently suppressed/non-multipathed devices
 20     Change the disk naming scheme
 21     Get the newly connected/zoned disks in VxVM view
 list   List disk information

 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: q

Goodbye.

Now we can see the failed disk status as "removed".
root@unixrock # vxdisk list
DEVICE       TYPE      DISK          GROUP        STATUS
c1t0d0s2     sliced    rootdisk      rootdg       online
c1t1d0s2     sliced    rootmirr      rootdg       online
c3t8d0s2     sliced    unixrockdg03  unixrockdg   online
c3t9d0s2     sliced    -             -            error
c3t10d0s2    sliced    -             -            online
c3t11d0s2    sliced    unixrockdg01  unixrockdg   online
c3t12d0s2    sliced    unixrockdg02  unixrockdg   online
-            -         unixrockdg04  unixrockdg   removed was:c3t9d0s2

Once the disk has been removed at the VxVM level, we have to remove the faulty disk at the OS level using cfgadm -c unconfigure.
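The attachment point ID passed to cfgadm here is the controller name plus the disk name without its slice. A small sketch of that mapping, assuming the cN::dsk/cNtNdN form used in this post (to_ap_id is a hypothetical helper; on a real system, confirm the ID against the output of cfgadm -al before unconfiguring anything):

```shell
#!/bin/sh
# to_ap_id: turn a device name such as c3t9d0s2 or c3t9d0 into the
# cfgadm attachment point ID form used in this post, e.g. c3::dsk/c3t9d0.
# Hypothetical helper; assumes the cNtNdN[sN] naming scheme.
to_ap_id() {
    dev=${1%s[0-9]*}     # drop a trailing slice such as s2, if present
    ctrl=${dev%%t*}      # controller part: c3t9d0 -> c3
    printf '%s::dsk/%s\n' "$ctrl" "$dev"
}

to_ap_id c3t9d0s2    # prints: c3::dsk/c3t9d0
```

The unconfigure command used for this disk follows.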
root@unixrock # cfgadm -c unconfigure c3::dsk/c3t9d0
root@unixrock #

Once that is done, we have to replace the faulty disk physically and then configure the new disk at the OS level.
root@unixrock # cfgadm -c configure c3::dsk/c3t9d0
root@unixrock #
root@unixrock # devfsadm -c disk
root@unixrock # echo|format|grep -i c3t9d0
       3. c3t9d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
root@unixrock #

Now that the disk is available at the OS level, we have to bring it under VxVM control.
root@unixrock # vxdctl enable
root@unixrock # vxdiskadm

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 3      Remove a disk
 4      Remove a disk for replacement
 5      Replace a failed or removed disk
 6      Mirror volumes on a disk
 7      Move volumes from a disk
 8      Enable access to (import) a disk group
 9      Remove access to (deport) a disk group
 10     Enable (online) a disk device
 11     Disable (offline) a disk device
 12     Mark a disk as a spare for a disk group
 13     Turn off the spare flag on a disk
 14     Unrelocate subdisks back to a disk
 15     Exclude a disk from hot-relocation use
 16     Make a disk available for hot-relocation use
 17     Prevent multipathing/Suppress devices from VxVM's view
 18     Allow multipathing/Unsuppress devices from VxVM's view
 19     List currently suppressed/non-multipathed devices
 20     Change the disk naming scheme
 21     Get the newly connected/zoned disks in VxVM view
 list   List disk information

 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: 5

Replace a failed or removed disk
Menu: VolumeManager/Disk/ReplaceDisk

  Use this menu operation to specify a replacement disk for a disk
  that you removed with the "Remove a disk for replacement" menu
  operation, or that failed during use. You will be prompted for
  a disk name to replace and a disk device to use as a replacement.
  You can choose an uninitialized disk, in which case the disk will
  be initialized, or you can choose a disk that you have already
  initialized using the Add or initialize a disk menu operation.

Select a removed or failed disk [<disk>,list,q,?] list

Disk group: rootdg

DM NAME          DEVICE       TYPE      PRIVLEN  PUBLEN     STATE

Disk group: unixrockdg

DM NAME          DEVICE       TYPE      PRIVLEN  PUBLEN     STATE
dm unixrockdg04  -            -         -        -          REMOVED

Select a removed or failed disk [<disk>,list,q,?] unixrockdg04

Select disk device to initialize [<address>,list,q,?] list

DEVICE       DISK          GROUP        STATUS
c1t0d0       rootdisk      rootdg       online
c1t1d0       rootmirr      rootdg       online
c3t8d0       unixrockdg03  unixrockdg   online
c3t9d0       -             -            error
c3t10d0      -             -            online
c3t11d0      unixrockdg01  unixrockdg   online
c3t12d0      unixrockdg02  unixrockdg   online

Select disk device to initialize [<address>,list,q,?] c3t9d0

  The following disk device has a valid VTOC, but does not appear to have
  been initialized for the Volume Manager. If there is data on the disk
  that should NOT be destroyed you should encapsulate the existing disk
  partitions as volumes instead of adding the disk as a new disk.
  Output format: [Device_Name]

  c3t9d0

Encapsulate this device? [y,n,q,?] (default: y) n

  c3t9d0

Instead of encapsulating, initialize? [y,n,q,?] (default: n) y

  The requested operation is to initialize disk device c3t9d0 and
  to then use that device to replace the removed or failed disk
  unixrockdg04 in disk group unixrockdg.

Continue with operation? [y,n,q,?] (default: y)

Use a default private region length for the disk?
[y,n,q,?] (default: y)

  Replacement of disk unixrockdg04 in group unixrockdg with disk device
  c3t9d0 completed successfully.

Replace another disk? [y,n,q,?] (default: n)

Checking the status:
root@unixrock # vxdisk list
DEVICE       TYPE      DISK          GROUP        STATUS
c1t0d0s2     sliced    rootdisk      rootdg       online
c1t1d0s2     sliced    rootmirr      rootdg       online
c3t8d0s2     sliced    unixrockdg03  unixrockdg   online
c3t9d0s2     sliced    unixrockdg04  unixrockdg   online
c3t10d0s2    sliced    -             -            online
c3t11d0s2    sliced    unixrockdg01  unixrockdg   online
c3t12d0s2    sliced    unixrockdg02  unixrockdg   online
root@unixrock #

We have successfully replaced the faulty disk. However, we still have to check the volume status. The output below shows that "unixrockvol" is in disabled status.
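To pick disabled volumes out of vxprint -hvt output in a script, a minimal sketch could look like this. disabled_volumes is a hypothetical helper name; it only reads text on standard input and assumes the record layout printed in this post, where volume records start with "v" and KSTATE is the fourth field.

```shell
#!/bin/sh
# disabled_volumes: read "vxprint -hvt" output on stdin and print the
# name of every volume record whose KSTATE column is DISABLED.
# Hypothetical helper; plex (pl) and subdisk (sd) records are ignored.
disabled_volumes() {
    awk '$1 == "v" && $4 == "DISABLED" { print $2 }'
}

# Demo on records taken from this post (no VxVM needed).
disabled_volumes <<'EOF'
V  NAME         RVG         KSTATE   STATE   LENGTH    READPOL PREFPLEX UTYPE
v  unixrockvol  -           DISABLED ACTIVE  573313024 SELECT  -        fsgen
pl unixrock-01  unixrockvol DISABLED RECOVER 573315840 CONCAT  -        RW
EOF
# prints: unixrockvol
```

On a live system this would be used as "vxprint -hvtg unixrockdg | disabled_volumes". The full vxprint output for this disk group follows.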
root@unixrock # vxprint -hvtg unixrockdg
V  NAME            RVG          KSTATE       STATE    LENGTH    READPOL   PREFPLEX UTYPE
PL NAME            VOLUME       KSTATE       STATE    LENGTH    LAYOUT    NCOL/WID MODE
SD NAME            PLEX         DISK         DISKOFFS LENGTH    [COL/]OFF DEVICE   MODE
SV NAME            PLEX         VOLNAME      NVOLLAYR LENGTH    [COL/]OFF AM/NM    MODE
DC NAME            PARENTVOL    LOGVOL
SP NAME            SNAPVOL      DCO

v  unixrockvol     -            DISABLED     ACTIVE   573313024 SELECT    -        fsgen
pl unixrock-01     unixrockvol  DISABLED     RECOVER  573315840 CONCAT    -        RW
sd unixrockdg03-01 unixrock-01  unixrockdg03 0        143328960 0         c3t8d0   ENA
sd unixrockdg04-01 unixrock-01  unixrockdg04 0        143328960 143328960 c3t9d0   ENA
sd unixrockdg01-01 unixrock-01  unixrockdg01 0        143328960 286657920 c3t11d0  ENA
sd unixrockdg02-01 unixrock-01  unixrockdg02 0        143328960 429986880 c3t12d0  ENA
root@unixrock #

I first tried the steps below to bring the volume to active status.
root@unixrock # vxrecover -s unixrockvol
root@unixrock # vxprint -hvtg unixrockdg
V  NAME            RVG          KSTATE       STATE    LENGTH    READPOL   PREFPLEX UTYPE
PL NAME            VOLUME       KSTATE       STATE    LENGTH    LAYOUT    NCOL/WID MODE
SD NAME            PLEX         DISK         DISKOFFS LENGTH    [COL/]OFF DEVICE   MODE
SV NAME            PLEX         VOLNAME      NVOLLAYR LENGTH    [COL/]OFF AM/NM    MODE
DC NAME            PARENTVOL    LOGVOL
SP NAME            SNAPVOL      DCO

v  unixrockvol     -            DISABLED     ACTIVE   573313024 SELECT    -        fsgen
pl unixrock-01     unixrockvol  DISABLED     RECOVER  573315840 CONCAT    -        RW
sd unixrockdg03-01 unixrock-01  unixrockdg03 0        143328960 0         c3t8d0   ENA
sd unixrockdg04-01 unixrock-01  unixrockdg04 0        143328960 143328960 c3t9d0   ENA
sd unixrockdg01-01 unixrock-01  unixrockdg01 0        143328960 286657920 c3t11d0  ENA
sd unixrockdg02-01 unixrock-01  unixrockdg02 0        143328960 429986880 c3t12d0  ENA
root@unixrock # vxtask list
TASKID  PTID TYPE/STATE    PCT   PROGRESS
root@unixrock # vxvol -g unixrockdg startall
vxvm:vxvol: ERROR: Volume unixrockvol has no CLEAN or non-volatile ACTIVE plexes
root@unixrock #

The volume has a single concatenated plex and that plex is in RECOVER state, so there is no good copy for vxrecover to start from (this is why vxdiskadm warned earlier that the volume would need restoration from backup). I then followed the steps below to reset the plex state and bring the volume to active status.
root@unixrock # vxmend -g unixrockdg fix stale unixrock-01
root@unixrock # vxmend -g unixrockdg fix clean unixrock-01
root@unixrock # vxvol -g unixrockdg start unixrockvol
root@unixrock #
root@unixrock # vxprint -hvtg unixrockdg
V  NAME            RVG          KSTATE       STATE    LENGTH    READPOL   PREFPLEX UTYPE
PL NAME            VOLUME       KSTATE       STATE    LENGTH    LAYOUT    NCOL/WID MODE
SD NAME            PLEX         DISK         DISKOFFS LENGTH    [COL/]OFF DEVICE   MODE
SV NAME            PLEX         VOLNAME      NVOLLAYR LENGTH    [COL/]OFF AM/NM    MODE
DC NAME            PARENTVOL    LOGVOL
SP NAME            SNAPVOL      DCO

v  unixrockvol     -            ENABLED      ACTIVE   573313024 SELECT    -        fsgen
pl unixrock-01     unixrockvol  ENABLED      ACTIVE   573315840 CONCAT    -        RW
sd unixrockdg03-01 unixrock-01  unixrockdg03 0        143328960 0         c3t8d0   ENA
sd unixrockdg04-01 unixrock-01  unixrockdg04 0        143328960 143328960 c3t9d0   ENA
sd unixrockdg01-01 unixrock-01  unixrockdg01 0        143328960 286657920 c3t11d0  ENA
sd unixrockdg02-01 unixrock-01  unixrockdg02 0        143328960 429986880 c3t12d0  ENA
root@unixrock #

Then I tried to mount the volume, but got the error below.
root@unixrock # mount /unixrock
mount: /dev/vx/dsk/unixrockdg/unixrockvol is already mounted, /unixrock is busy,
        or the allowable number of mount points has been exceeded
root@unixrock # mount -v|grep -i /unixrock
root@unixrock #

I then did a quick break-fix in order to mount the volume: I moved the old mount point directory aside and recreated it.
root@unixrock # mv /unixrock /unixrock_old
root@unixrock # mkdir /unixrock
root@unixrock # mount /unixrock
root@unixrock # df -k|grep -i unixrock
Filesystem                            kbytes     used      avail     capacity  Mounted on
/dev/vx/dsk/unixrockdg/unixrockvol    282176390  29435764  249918863 11%       /unixrock
root@unixrock #