As part of NDMP restore validation, we found an unusual issue.
The details are given below:
NetBackup master and media servers: all on version 7.0.1
Policy Type: NDMP
Ontap: 8.1.1 7-mode
We took a full backup of a volume, which completed successfully with status 0. When we compared the actual data with the backed-up data, we found that a number of files were missing from the backup, and that the backup contained additional folders that are not part of the actual data.
A total of 51 files were missed, 49 of which belong to a single folder.
That folder's name is 30 characters long and contains a complex combination of special characters.
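The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea: it assumes you already have two plain lists of paths, one from the filer and one from the backup image. The function names and the sample paths are hypothetical and are not part of NetBackup.

```python
import posixpath
from collections import Counter


def compare_listings(source_paths, backup_paths):
    """Return (missed, extra) for two iterables of file paths."""
    source = set(source_paths)
    backup = set(backup_paths)
    missed = sorted(source - backup)  # on the volume, absent from the backup
    extra = sorted(backup - source)   # in the backup, absent from the volume
    return missed, extra


def missed_by_folder(missed):
    """Count missed files per parent folder to spot a single bad folder."""
    return Counter(posixpath.dirname(path) for path in missed)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    on_volume = ["/vol/data/a.txt", "/vol/data/odd name/b.txt"]
    in_backup = ["/vol/data/a.txt", "/vol/data/:60RFN00/b.txt"]
    missed, extra = compare_listings(on_volume, in_backup)
    print("missed:", missed)
    print("extra:", extra)
    print("missed per folder:", dict(missed_by_folder(missed)))
```

Grouping the missed files by parent folder makes a pattern like "49 of 51 files in one folder" stand out immediately.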
1) Validated for open files: On NetApp, a volume snapshot is taken at the beginning of the NDMP backup session and the data is read from that snapshot, so the “open file” concept does not apply to the missed files in this scenario.
2) Did the missed files/folders really exist during the backup? Yes, the user confirmed that these files were present and that the data was static during the backup.
3) What are the extra files that got backed up, and why do they look so strange?
During the integrity checks we found extra files in the backup that were never part of the actual data. Their names look strange, following a convention starting with -:60RFN00, and this folder contains all of the missed files of the one folder. Our observation is that the actual complex folder name was renamed to :60RFN00
Further validation is needed to determine whether the folder was renamed during the backup or whether NetApp is a factor in this change.
4) Is it because of share-name vs actual name conflict in the filer?
No, no conflict was observed; the NAS team confirmed that they used only share names, not folder names.
When we tried to back up this missed folder on its own, the backup failed with the generic error code 99.
1/24/2014 7:50:28 AM - begin writing
1/24/2014 7:50:29 AM - Error ndmpagent(pid=2556) NDMP_LOG_ERROR 1 DATA: Backup terminated: backup configuration failure (2) on RPC error: ERROR: BAD ENV OP (for PATH)
1/24/2014 7:50:29 AM - Error ndmpagent(pid=2556) NDMP backup failed, path =PATH
1/24/2014 7:50:29 AM - Error bptm(pid=12244) none of the NDMP backups for client CLIENTNAME completed successfully
1/24/2014 7:50:34 AM - end writing; write time: 00:00:06
NDMP backup failure(99)
1) Error ndmpagent(pid=5916) MOVER_HALTED unexpected reason = 3 (NDMP_MOVER_HALT_INTERNAL_ERROR)
1/21/2014 3:17:36 PM - Error ndmpagent(pid=5916) NDMP backup failed, path = /vol/Prestwick/.snapshot/nightly.0
1/21/2014 3:17:37 PM - Error bptm(pid=2500) io_ioctl_ndmp (MTBSF) failed on media id 0082L5, drive index 0, return code 10 (NDMP_NO_TAPE_LOADED_ERR) (bptm.c.8569)
1/21/2014 3:17:43 PM - end writing; write time: 00:29:13
NDMP backup failure(99)
2) Error bptm(pid=8496) io_ioctl_ndmp (MTFSF) failed on media id 0027L5, drive index 1, return code 18 (NDMP_XDR_DECODE_ERR) (bptm.c.6912)
1/21/2014 3:13:03 PM - Error ndmpagent(pid=10144) connection 0000000001A5BAB0 ndmp_message_process_one failed, status = 18 (NDMP_XDR_DECODE_ERR)
1/21/2014 3:13:04 PM - Error ndmpagent(pid=10144) NDMP backup failed, path = UNKNOWN
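When several jobs fail with status 99, it can help to pull the underlying NDMP codes out of the job details to see which ones repeat. The sketch below assumes the job details have been saved as plain text lines in the format shown above. The regular expression and function names are my own and not part of any NetBackup tool.

```python
import re
from collections import Counter

# Matches a numeric code followed by an NDMP constant in parentheses,
# for example "return code 18 (NDMP_XDR_DECODE_ERR)".
NDMP_CODE = re.compile(r"(\d+)\s*\((NDMP_[A-Z_]+)\)")


def extract_ndmp_codes(lines):
    """Return a list of (code, name) tuples found in the given log lines."""
    found = []
    for line in lines:
        for number, name in NDMP_CODE.findall(line):
            found.append((int(number), name))
    return found


def summarize(lines):
    """Count how often each NDMP error name appears."""
    return Counter(name for _, name in extract_ndmp_codes(lines))


if __name__ == "__main__":
    sample = [
        "Error bptm(pid=2500) io_ioctl_ndmp (MTBSF) failed on media id 0082L5, "
        "drive index 0, return code 10 (NDMP_NO_TAPE_LOADED_ERR) (bptm.c.8569)",
        "Error ndmpagent(pid=10144) connection 0000000001A5BAB0 "
        "ndmp_message_process_one failed, status = 18 (NDMP_XDR_DECODE_ERR)",
    ]
    for name, count in summarize(sample).items():
        print(name, count)
```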
The ndmpd and ndmpagent logs offered few clues, so I tried increasing the unified logging levels but encountered the error below.
Running vxlogcfg -a -p 51216 -o 117 -s DiagnosticLevel=5 -s DebugLevel=5 gives an error message.
Error:V-1-8-0 invalid setting / value passed
Cause: NetBackup processes appear to be hung
1: Stop the NetBackup processes on the master server
2: Run vxlogcfg -a -p 51216 -o 117 -s DiagnosticLevel=5 -s DebugLevel=5
3: Start the NetBackup processes
4: Run vxlogcfg -a -p 51216 -o 117 -s DiagnosticLevel=0 -s DebugLevel=0
Ref TN: http://www.symantec.com/business/support/index?page=content&id=TECH157017
To fix problem 1, I ran the tests below:
1) Tried backing up the folder with the strange name (:60RFN00), which was never part of the actual volume.
Result: The backup completed successfully.
2) Asked the NetApp admin to take a snapshot of this volume, to verify whether NetApp had anything to do with the folder name changing.
Verified the snapshot and found that the folder name had not changed, but when we backed up this snapshot, the complex folder name again changed to :60RFN00 in NBU.
3) Tried the backup after removing the spaces between the special characters in the folder name.
This finally worked. The strange name did not appear in the backed-up data this time.
Removing the spaces between the special characters in the complex folder name resolved the issue. This problem and resolution apply to NBU version 7.0.1 (not verified for other versions of NBU).
1) Upgrade NetBackup to version 7.5/7.6, the latest available.
2) Advise users to avoid creating complex folder/file names on NAS volumes.
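To find other folders that might hit the same problem before the next backup, a rough scan can flag directory names where a space sits directly next to a special character. The rule below is only a heuristic I am assuming from the behaviour seen in this case. It is not a documented NetBackup or ONTAP limitation, and the function names are hypothetical.

```python
import os
import re

# Assumed heuristic: a space immediately before or after a character that
# is not a letter, digit, underscore, or whitespace.
_SPACE_NEXT_TO_SPECIAL = re.compile(r"[^\w\s] | [^\w\s]")


def is_risky_name(name):
    """Return True if the name has a space adjacent to a special character."""
    return bool(_SPACE_NEXT_TO_SPECIAL.search(name))


def find_risky_dirs(root):
    """Yield full paths of directories under root whose names look risky."""
    for dirpath, dirnames, _filenames in os.walk(root):
        for dirname in dirnames:
            if is_risky_name(dirname):
                yield os.path.join(dirpath, dirname)


if __name__ == "__main__":
    # Scans the current directory; point this at a mounted share instead.
    for path in find_risky_dirs("."):
        print(path)
```

Running this against a mounted copy of the share gives a list of folders to review or rename; names with special characters but no adjacent spaces are deliberately not flagged, since that form backed up correctly in test 3.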