Netbackup with Deduplication Option

ericmagallanes
Level 3
Hi All, I am getting a Media Write error on my NBU MSDP running on 7.5.0.4. It has happened many times already, and Symantec support has not been able to tell me what the issue is. Apparently the storage.log was showing a "disk space issue", yet the MSDP usage is less than 30% of the total space. I was given a command to rebuild the db ("crchk.pl --rebuild-crdb"), but they still cannot find the root cause of the issue. Below is the log I found:

--------------------
May 17 03:14:27 ERR [2942]: 4: pqObjects2ChildCopyIndex: Copy command failed. Reason: ERROR: could not extend relation 1663/16387/33696: No space left on device HINT: Check free disk space. CONTEXT: COPY objects2_0, line 3698945
May 17 03:14:27 ERR [2942]: 4: database query failed. Query : CREATE UNIQUE INDEX idx_objects2_0_1368731667211169_764315522 ON objects2_0(key); Called by : pqObjects2ChildCopyIndex Reason : ERROR: current transaction is aborted, commands ignored until end of transaction block
May 17 03:20:35 ERR [3199]: 4: pqObjects2ChildCopyIndex: Copy command failed. Reason: ERROR: could not extend relation 1663/16387/33702: No space left on device HINT: Check free disk space. CONTEXT: COPY objects2_1, line 3700840
May 17 03:20:35 ERR [3199]: 4: database query failed. Query : CREATE UNIQUE INDEX idx_objects2_1_1368732035941078_3761745002 ON objects2_1(key); Called by : pqObjects2ChildCopyIndex Reason : ERROR: current transaction is aborted, commands ignored until end of transaction block
May 17 03:26:15 ERR [3456]: 4: pqObjects2ChildCopyIndex: Copy command failed. Reason: ERROR: could not extend relation 1663/16387/33708: No space left on device HINT: Check free disk space. CONTEXT: COPY objects2_2, line 3700674
May 17 03:26:15 ERR [3456]: 4: database query failed. Query : CREATE UNIQUE INDEX idx_objects2_2_1368732375305858_937651469 ON objects2_2(key); Called by : pqObjects2ChildCopyIndex Reason : ERROR: current transaction is aborted, commands ignored until end of transaction block
May 17 03:28:03 ERR [3713]: 4: pqObjects2ChildCopyIndex: Copy command failed. Reason: ERROR: could not extend relation 1663/16387/33714: No space left on device HINT: Check free disk space. CONTEXT: COPY objects2_3, line 3700176
May 17 03:28:03 ERR [3713]: 4: database query failed. Query : CREATE UNIQUE INDEX idx_objects2_3_1368732483560508_1665227463 ON objects2_3(key); Called by : pqObjects2ChildCopyIndex Reason : ERROR: current transaction is aborted, commands ignored until end of transaction block
May 17 03:28:03 ERR [4113]: 25004: TlogProcessLog: Commit to storage database failed: unknown error
May 17 03:28:10 WARNING [4113]: 25000: Transaction logs from /dedupe/AYMSDP/queue/partsorted-19363-19522-0.tlog to /dedupe/AYMSDP/queue/partsorted-19363-19522-5.tlog failed: TlogProcessLog: Commit to storage database failed: unknown error Transaction will be retried.
May 17 03:28:10 INFO [4113]: WSRequestExt: submitting &request=4&login=agent_3_24432&passwd=********************************&action=newevent&data=EVENT%7Bversion%3A1%3Btype%3A1%3Bid%3A0%3Bdate%3A1368732490%3B%7BLEGACYEVENT%7Bpayload%3Asev%3D4%3Btype%3D1037%3Bmsg%3DTransaction%20logs%20from%20%2Fdedupe%2FAYMSDP%2Fqueue%2Fpartsorted-19363-19522-0.tlog%20to%20%2Fdedupe%2FAYMSDP%2Fqueue%2Fpartsorted-19363-19522-5.tlog%20failed%3A%20TlogProcessLog%3A%20Commit%20to%20storage%20database%20failed%3A%20unknown%20error%0A%20Transaction%20will%20be%20retried.%0A%3B%7D%7D
May 17 03:28:11 ERR [4113]: 25004: Queue processing failed five times in a row. Queue processing will be disabled and the CR will no longer accept new backup data. Please contact support immediately!
May 17 03:28:11 INFO [4113]: WSRequestExt: submitting &request=4&login=agent_3_24432&passwd=********************************&action=process&agentId=3&severity=6&errornumber=2000&application=spoold&source=Storage%20Manager&date=1368732491&description=Queue%20processing%20failed%20five%20times%20in%20a%20row.%20Queue%20processing%20will%20be%20disabled%20and%20the%20CR%20will%20no%20longer%20accept%20new%20backup%20data.%20Please%20contact%20support%20immediately%21%20%0A
May 17 03:28:11 INFO [4113]: 25004: CR is changing mode to: PUT=No DEREF=No SYSTEM=Yes STORAGED=No REROUTE=No COMPACTD=No RECOVERCRDB=No
May 17 03:28:11 INFO [4113]: Entered mode 'STORAGED=no'.
May 17 03:28:11 INFO [4113]: Task Manager : Initiate mode switch Store -> stopped
May 17 03:28:11 INFO [4113]: Task Manager : Initiate mode switch System -> normal
May 17 03:28:11 INFO [4113]: Task Manager : Initiate mode switch Dereference -> stopped
May 17 03:28:11 INFO [4113]: 25004: CR mode changed. After CRQP failure issue is fixed, CR mode could be changed back to normal manually by crcontrol or CR restart.
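
For reference, the last line of the log says the CR mode can be switched back to normal with crcontrol (or a CR restart) once the underlying failure is fixed. As I understand it from the dedup guide, that check would look roughly like the lines below; the pdde path and exact switches may differ on your version, so treat this only as a sketch:

/usr/openv/pdde/pdcr/bin/crcontrol --getmode            (show the current CR mode)
/usr/openv/pdde/pdcr/bin/crcontrol --processqueueinfo   (show whether queue processing is enabled/busy)
/usr/openv/pdde/pdcr/bin/crcontrol --processqueue       (manually start a queue processing run)

If crcontrol alone does not bring it back, restarting the NetBackup services on the media server (bp.kill_all followed by bp.start_all) restarts the CR as well.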

user022013
Level 4
Partner Accredited Certified

We had an issue with our MSDP storage filling up even though plenty of space appeared to be available from within NetBackup.

NetBackup does not reflect the "actual" disk space available. If you browse the volume using Explorer, you can check for log files that may be filling it up.
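
On a Linux media server, a quick equivalent of that check is du against the storage path to see which subdirectory is actually eating the space; the mount point below is only an example, substitute your own:

du -sh /dedupe/*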

However,  run this command if you haven't already:

\installpath\veritas\pdde\crcontrol --dsstat

Look for the space that requires compacting. This can grow very large and can fill up all available space without showing up anywhere else. It should look something like the output below (see the "Space needs compaction" line). Ours blew out to 10TB.

D:\Program Files\Veritas\pdde>crcontrol --dsstat

************ Data Store statistics ************
Data storage      Raw    Size   Used   Avail  Use%
                  23.0T  22.1T   9.3T  12.7T  43%

Number of containers             : 94206
Average container size           : 108911664 bytes (103.87MB)
Space allocated for containers   : 10260132285650 bytes (9.33TB)
Space used within containers     : 9337391012298 bytes (8.49TB)
Space available within containers: 922741273352 bytes (859.37GB)
Space needs compaction           : 311206942925 bytes (289.83GB)
Reserved space                   : 1016116240384 bytes (946.33GB)
Reserved space percentage        : 4.0%
Records marked for compaction    : 10152429
Active records                   : 162195158
Total records                    : 172347587

Use "--dsstat 1" to get more accurate statistics

I hope this helps.

user022013
Level 4
Partner Accredited Certified

Also,  we experience media write errors when compaction is running with the 100 switch.

x:\program files\veritas\pdde\crcontrol --compactstart 100 0 1

The MSDP is too busy to write to the disk.
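
If you do start compaction manually, it is worth running it outside the backup window so it is not competing with backup writes. On a Linux media server that could be as simple as a cron entry like the one below; the time and the pdde path are only placeholders:

0 10 * * * /usr/openv/pdde/pdcr/bin/crcontrol --compactstart 100 0 1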

ericmagallanes
Level 3

Hi guys,

I'm still stuck with this issue; below is the dsstat info. Can anybody let me know why it is asking to free up space when used space is only 14%?

************ Data Store statistics ************
Data storage      Raw    Size   Used   Avail  Use%
                   3.5T   3.3T 458.2G   2.9T  14%

Number of containers             : 4999
Average container size           : 99598809 bytes (94.98MB)
Space allocated for containers   : 497894450231 bytes (463.70GB)
Space used within containers     : 444114041837 bytes (413.61GB)
Space available within containers: 53780408394 bytes (50.09GB)
Space needs compaction           : 481168640 bytes (458.88MB)
Reserved space                   : 158329901056 bytes (147.46GB)
Reserved space percentage        : 4.1%
Records marked for compaction    : 22756
Active records                   : 14875321
Total records                    : 14898077
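
Could the problem be the filesystem that holds the MSDP database and queue rather than the data store itself? The errors in storage.log are "No space left on device" from the database. I guess the check would be plain df against that mount, including inodes (mount point below is taken from the log paths):

df -h /dedupe
df -i /dedupe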

huanglao2002
Level 6

MSDP is a more complex component than the others; if you encounter this issue, you can escalate to backline support.

watsons
Level 6

This error message indicates that queue processing has failed too many times in a row:

" Queue processing failed five times in a row. Queue processing will be disabled"

If you have run crchk before, what did support say about the crchk result? Push them for an answer if you don't get one.

There could be some fingerprint files missing or corrupted, and that would need a crchk to fix. So the crchk result is the essential piece of information here.
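
To be concrete about what I mean by the result: capture the full output of the crchk run and attach it to your case so support can see exactly what it reported. Reusing the command from your first post, something like the line below would do (the script path is a placeholder, run it from wherever support placed crchk.pl):

perl crchk.pl --rebuild-crdb > /tmp/crchk_rebuild.log 2>&1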

user022013
Level 4
Partner Accredited Certified

We had issues with queue processing too. Backline support searched the log file, removed the offending corrupted file, and queue processing resumed.

Definitely go to backline. They were very good at helping us out with this.

Sachin Chavan was the support engineer who worked with me. If you can't get to him directly, ask the support person you are working with to check with him. He seems to be the MSDP expert.