VxFS file system doomed?

LUCS
Level 3

Had a major power supply failure on a Linux server managing various VxVM file systems on
a DAS RAID. The RAID itself wasn't affected, and after Linux had rebooted I was
able to run fsck and replay the intent log of each file system.

 There is one file system though on which the 'vxdump' command now hangs at the
point of mapping directories.

Tried to find structural errors by using 'fsck -n', but that also hangs, at the point of checking directory linkages.

However, I can still list the entire directory tree, and 'vxstat' reports no I/O errors
on the underlying volume.

But if a file system can't be backed up then it's of no use in our environment.

What do people think? Is this a failure mode anyone recognises?
1 ACCEPTED SOLUTION

avsrini
Level 4
Employee Accredited Certified
Hi Lucs

From the symptoms, this looks like a metadata issue.
But could you confirm whether vxdump or fsck -n is in fact hanging, or just
waiting for a task to complete? The time fsck takes to finish depends on the file system size
and the number of files in it.

Run the following as root to check whether the process is hung or still progressing:

# strace -fitT fsck -o full -n /dev/vx/rdsk/dg/vol

Regards
Srini
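
If attaching strace is not convenient, a rough complementary check on Linux is to sample the kernel's per-process I/O counters in /proc/<pid>/io and see whether they move between samples. The sketch below is only an illustration of that idea, not a Veritas tool: the function names (parse_proc_io, io_delta, watch) and the default interval are assumptions made up for this example, and reading another process's counters normally requires root.

```python
import time
from pathlib import Path


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def io_delta(before, after):
    """Return only the counters that changed between two samples."""
    return {
        key: after[key] - before[key]
        for key in after
        if key in before and after[key] != before[key]
    }


def watch(pid, interval=30, samples=10):
    """Print the counter deltas for `pid` every `interval` seconds (hypothetical helper)."""
    path = Path(f"/proc/{pid}/io")
    previous = parse_proc_io(path.read_text())
    for _ in range(samples):
        time.sleep(interval)
        current = parse_proc_io(path.read_text())
        delta = io_delta(previous, current)
        print(delta if delta else "no I/O since last sample")
        previous = current
```

Calling watch() with the PID of the running fsck prints what changed in each interval. Counters that keep rising suggest the check is still working through the file system. Counters that stay flat are not proof of a hang on their own, since the process may be busy on the CPU with data it has already read, so it is worth looking at its CPU time and state as well.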

4 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified
Have you tried to fsck with '-o full,nolog'? This will perform a full filesystem check and not merely replay the intent log.

LUCS
Level 3

Well Srini, it's a 120GB file system that contains about 40GB worth of files.
The fsck -n was "stuck" on pass2 when I killed it off after 35 minutes. Unfortunately
I don't have an strace of that run and can't reproduce those circumstances.

Something has changed: 'vxdump' now completes within its usual time frame.
A full 'fsck' took 1 hour 24 minutes and did not reveal any structural faults.

So whatever the issue was, it seems to have righted itself.

Gaurav_S
Moderator
   VIP    Certified

Unfortunately we missed the chance to capture data, but it's good to know the issue resolved itself. I agree with Srini, however: if you see a similar issue in future, I would recommend waiting and checking with strace/dtrace whether fsck is still progressing or is really hung. I too have seen cases where fsck takes a very long time but eventually completes.

Gaurav
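
For anyone who does capture a trace next time: with the -T option strace appends the time spent in each system call as "<seconds>" at the end of the line, so a saved trace can be scanned for calls that blocked for a long time. The sketch below assumes the trace was written to a file with strace's -o option; the function name slow_calls and the one-second threshold are made up for illustration.

```python
import re

# With -T, strace ends each completed call with its duration, e.g. "<0.000041>".
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")


def slow_calls(lines, threshold=1.0):
    """Return (seconds, line) pairs for calls that took at least `threshold` seconds."""
    hits = []
    for line in lines:
        match = DURATION.search(line)
        if match and float(match.group(1)) >= threshold:
            hits.append((float(match.group(1)), line.rstrip()))
    return hits
```

Passing it the lines of the saved trace, for example slow_calls(open("fsck.trace")) with a hypothetical file name, lists the calls that took longest. A trace full of short calls that keeps growing points to a slow but progressing fsck, while a trace that stops at a single unfinished call points to a genuine hang.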