
Failed temporary mount of ZFS FileSystem while BMR restore

eomaber
Level 4

Hi Everyone,

When trying to restore with BMR we faced the following:

1) After we did the prepare-to-restore in BMR and network-booted the Solaris client, the restore failed with the following error:

+ /usr/sbin/zfs mount rpool/var

cannot mount 'rpool/var': 'canmount' property is set to 'off'

+ RC=1

+ (( 1 ))

+ echo ERROR: failed temporary mount of ZFS FileSystem rpool/var at /tmp/mnt/var.

ERROR: failed temporary mount of ZFS FileSystem rpool/var at /tmp/mnt/var.

The solution was to bring the client back up, set the canmount property to on, and take a new backup.
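
For reference, this is roughly what we ran on the live client before taking the new backup (a minimal sketch, using the dataset name from the error above):

/usr/sbin/zfs set canmount=on rpool/var      # allow BMR's temporary mount of the dataset to succeed
/usr/sbin/zfs get canmount rpool/var         # verify the property now reads "on"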

2) After fixing the first issue, we retried the restore. This time it failed with the following error:

+ print Creating ZFS Storage Pool rpool

+ 1>> /dev/console

+ /usr/sbin/zpool create -f -m /tmp/mnt/rpool rpool c1t0d0s0 spare c1t2d0s0 c1t3d0s0

cannot open '/dev/dsk/c1t0d0s0': I/O error

 

We had to restore from local tape to bring the client back. Note that the disks do not have any hardware problems, so we suspected a mismatch between the system versions used:

We are using an SRT built from Solaris Update 9, and the node we want to restore is at a different kernel patch level (Solaris with Generic_147440-12). I have tried to add all the missing patches to the SRT, but the installation of some patches hangs and the SRT becomes invalid.

Could anyone help with how to add the following patches to the SRT:

142933-03

144526-01

147061-01

125555-11

144500-19

142933-04

147440-12

Whether the installation of each of the above patches succeeds is very random: a patch may install successfully on one attempt and fail on the next.
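
For what it's worth, this is roughly how I have been checking whether an individual patch actually made it into the SRT (a rough sketch only: the SRT root path and patch directory below are placeholders, and I am not certain this is the supported way to patch an SRT, as opposed to the bmrsrtadm menu):

SRT_ROOT=/export/srt/sol10u9/root            # placeholder: path to the SRT's install root
patchadd -R $SRT_ROOT -p | grep 147440-12    # list patches recorded in that alternate root
patchadd -R $SRT_ROOT /var/tmp/147440-12     # retry applying a patch into the alternate root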

Thanks

8 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

Is your configuration definitely supported for BMR?

There are a few limitations, as listed in the BMR Admin Guide.

Does it all look right in the disk configuration section of the client's configuration within BMR itself?

Mark_Solutions
Level 6
Partner Accredited Certified


A different file system on the ZFS volumes comes to mind from what the error says.

I am not a Solaris expert, but this sounds like a possibility.

Nathan_Kippen
Level 6
Certified

In the 7.5.0.4 release notes...

 

pg 26:

During a Bare Metal Restore, the Zeta file system (ZFS) temporary mount fails.
This issue occurs if any ZFS is not mounted or the canmount value is set to OFF
during a backup.
 

To restrict the disk or the disk pool, edit the Bare Metal Restore configurations.
The edits ensure that the disk is not overwritten and the data that it contains
is not erased during the restore process.
 

For more information on how to edit the configurations, refer to the following
sections of the Bare Metal Restore Administrator's Guide:
 

■ Managing Clients and Configurations
■ Client Configuration Properties

eomaber
Level 4

I think this command

+ /usr/sbin/zpool create -f -m /tmp/mnt/rpool rpool c1t0d0s0 spare c1t2d0s0 c1t3d0s0

could not be executed because the ZFS version included in the SRT is at a lower level than the one on the Solaris client.

The SRT is at Generic-142909-17 and the client is at Generic-147440-12.

The first uses ZFS version 23 and the second uses version 29.
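
For reference, this is how I compared the two (standard Solaris commands, run once inside the SRT environment and once on the client):

/usr/sbin/zpool upgrade            # prints the ZFS pool version this kernel is running
/usr/sbin/zfs upgrade              # prints the ZFS file-system version this kernel is running
/usr/sbin/zpool get version rpool  # version the client's existing pool is formatted with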

I tried to add the patches to the SRT but failed to add all the necessary ones.

Even if adding the patches worked, do you think it would help BMR create the ZFS pool?

Thanks

katlamudi
Level 4
Employee

As per your comments:

The SRT is at patch level Generic-142909-17, which means the SRT is Solaris 10 Update 9.

The client is at Generic-147440-12, which means the client should be Solaris 10 Update 10 (refer to "Solaris 10 Kernel Patch Lineage", https://blogs.oracle.com/patch/entry/solaris_10_kernel_patchid_progression).

Please confirm whether your client is Solaris 10 Update 10. If it is, please use an SRT created from Solaris 10 Update 10.
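
A quick way to confirm the update level on the client (standard Solaris commands; the strings in the comments are only examples):

cat /etc/release    # Update 10 media shows a release string such as "Oracle Solaris 10 8/11 s10s_u10wos_17b"
uname -v            # shows the running kernel patch, e.g. Generic_147440-12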

thanks

ttouati
Level 2

Hi,

Just to update this thread: I worked on this issue along with "eomaber", and it seems that the I/O error occurred because the disks weren't partitioned; all slices had 0 cylinders assigned to them, as you can see:

partition> print
Current partition table (unnamed):
Total disk cylinders available: 14087 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 14086      136.71GB    (14087/0/0) 286698624
  1 unassigned    wu       0                0         (0/0/0)             0
  2     backup    wu       0 - 14086      136.71GB    (14087/0/0) 286698624
  3 unassigned    wu       0                0         (0/0/0)             0
  4 unassigned    wu       0                0         (0/0/0)             0
  5 unassigned    wu       0                0         (0/0/0)             0
  6 unassigned    wu       0                0         (0/0/0)             0
  7 unassigned    wu       0                0         (0/0/0)             0

partition>

We manually partitioned all the disks and were able to create the ZFS pool after that, as you can see in these logs:

# prtvtoc /dev/rdsk/c1t0d0s2
* /dev/rdsk/c1t0d0s2 partition map
*
* Dimensions:
*     512 bytes/sector
*     848 sectors/track
*      24 tracks/cylinder
*   20352 sectors/cylinder
*   14089 cylinders
*   14087 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0 286698624 286698623
       2      5    01          0 286698624 286698623
#
#
#
#
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard:  New volume table of contents now in place.
#
#
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t2d0s2
fmthard:  New volume table of contents now in place.
#
#
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t3d0s2
fmthard:  New volume table of contents now in place.
#
#
#
# /usr/sbin/zpool create -f -m /tmp/mnt/rpool rpool mirror c1t0d0s0 c1t1d0s0 spare c1t2d0s0 c1t3d0s0
libshare SMF initialization problem: entity not found
#
#
# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  85.5K   134G    21K  /tmp/mnt/rpool
#
#
# zpool list
NAME    SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
rpool   136G    90K   136G     0%  ONLINE  -
#
#
# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0
        spares
          c1t2d0s0    AVAIL   
          c1t3d0s0    AVAIL   

errors: No known data errors
#

 

This is where I'm confused, because according to the NetBackup logs the partitioning does occur, as you can see in the attached file!

 

Any suggestions?

 

 

 

katlamudi
Level 4
Employee
Hi ttouati, can you please confirm which SRT you used for the restore? Was it prepared using Solaris 10 Update 10 or not? If not, please try the restore with a Solaris 10 Update 10 SRT. If you still see issues, please log a support case along with the restore log. Thanks.

ttouati
Level 2

Salam katlamudi,

 

Thanks for your reply and concern,

 

We are using Solaris 10 U9 for the SRT, but I don't think that is the issue. Here is what we are doing and what we are getting:

 

Policy: a Standard policy with BMR enabled. We left the policy with its default options and chose ALL_LOCAL_DRIVES as the backup selection.

File system:

We have a ZFS file system:

 

bash-3.00# zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
rpool              46.5G  87.4G    98K  /rpool
rpool/ROOT         13.6G  87.4G    21K  /rpool/ROOT
rpool/ROOT/SDP5    13.6G  87.4G  13.6G  /
rpool/dump         2.00G  87.4G  2.00G  -
rpool/swap         16.5G   104G    16K  -
rpool/var          14.4G  87.4G  2.11G  /var
rpool/var/opt      12.3G  87.4G  90.3M  /var/opt
rpool/var/opt/fds  12.2G  87.4G  12.2G  /var/opt/fds

 

The rpool/var and rpool/var/opt datasets are not mountable; they are just containers for the rpool/var/opt/fds file system, so you can't see them in the output of the df -h command:

 

bash-3.00# zfs get canmount
NAME               PROPERTY  VALUE     SOURCE
rpool              canmount  on        local
rpool/ROOT         canmount  on        local
rpool/ROOT/SDP5    canmount  noauto    local
rpool/dump         canmount  -         -
rpool/swap         canmount  -         -
rpool/var          canmount  off       local
rpool/var/opt      canmount  off       local
rpool/var/opt/fds  canmount  on        local

 

 

bash-3.00# df -h
Filesystem             size   used  avail capacity  Mounted on
rpool/ROOT/SDP5        134G    16G    85G    16%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    26G   480K    26G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
/platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr/libc_psr_hwcap2.so.1
                       101G    16G    85G    16%    /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,SPARC-Enterprise-T5220/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
                       101G    16G    85G    16%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                    26G    72K    26G     1%    /tmp
swap                    26G    64K    26G     1%    /var/run
rpool                  134G    98K    85G     1%    /rpool
rpool/ROOT             134G    21K    85G     1%    /rpool/ROOT
rpool/var/opt/fds      134G    12G    85G    13%    /var/opt/fds
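
For context, a layout like this is normally built with canmount=off on the intermediate datasets, roughly as in the sketch below (dataset names taken from the listing above; this is just an illustration of the layout, not the exact commands we originally used):

zfs create -o mountpoint=/var -o canmount=off rpool/var   # container only, never mounted itself
zfs create -o canmount=off rpool/var/opt                  # inherits /var/opt, also never mounted
zfs create rpool/var/opt/fds                              # inherits /var/opt/fds and is the only one that mounts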

 

When we perform a backup, everything is present in the backup image, but when we perform a restore we get errors:

 

1- First, the error that "eomaber" mentioned about the restore script being unable to mount rpool/var. We got around this error by editing the configuration file and setting canmount on rpool/var and rpool/var/opt to "on".

 

2- The second issue was that after the restore completed, rpool/var and rpool/var/opt were mountable. This caused many important services to go into maintenance state, especially svc:/system/filesystem/minimal:default.

This happened because the system was unable to mount the /var/run file system.

 

When we entered single-user mode we found rpool/var, rpool/var/opt, and rpool/var/opt/fds not mounted, so we had two options, which we tried out:

 

First, we tried leaving the canmount options as they were and mounting all three file systems (zfs mount -a).

This brought the system up to run level 3 after we cleared the svc:/system/filesystem/minimal:default service, but we were still getting many errors and the system seemed unstable.
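
For completeness, clearing the service from the console looks like this (svcadm clear is what we ran; svcs -x is just the usual way to see why a service is in maintenance):

svcs -x                                               # show which services are in maintenance and why
svcadm clear svc:/system/filesystem/minimal:default   # retry the service once the /var mounts are sorted out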

 

The second option was to set the canmount options back to their original values. In this case only rpool/var/opt/fds is mounted, but /var and /var/opt are empty, which causes the system to go to run level 3 with many errors on the console. Luckily, we were able to restore the /var and /var/opt content again, and the errors stopped. This solution worked better than the first: the system was stable and we were able to run all our applications without problems.
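
Concretely, setting the properties back looks like this (a short sketch, with the names and values taken from the zfs get canmount listing earlier in this post):

zfs set canmount=off rpool/var
zfs set canmount=off rpool/var/opt
zfs mount -a          # after this, only rpool/var/opt/fds comes back under /var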

 

Finally, I think that NetBackup is misunderstanding our file system configuration, perhaps because the ZFS file system was migrated from a UFS file system, which according to Symantec is not supported, although they didn't give a reason.

 

Anyway, we figured out how to perform a restore, but a more straightforward one would be appreciated, so if anyone has better ideas we would be grateful.

 

Salam