Netbackip 8 and tape HPE StoreEver MSL6480

Dmitry_K · ‎08-28-2017

I have a unusual problem with Netbackip 8 and tape HPE StoreEver MSL6480

Configuration:

Master server-01 (VM)

Media server-02, MS server 2016 (physical)

HPE StoreEver MSL6480 , connected (FC) to Media server-02

Situation:

MSL6480 detected and added on Master server-01, duplication-on-tape policy start task, MSL6480 move tape in drive and after that ... nothing happens, no any write process

Error at the end -

Error bpduplicate (pid=10492) Status = no images were successfully processed.

C:\Program Files\Veritas\NetBackup\logs\bptm - log -

-----------

11:34:34.193 [8144.7252] <2> JobInst::sendIrmMsg: starting

11:34:34.193 [8144.7252] <2> packageBptmResourceDoneMsg: msg (MEDIA_DONE 0 743099 0 AFNNNNN 4000043 180 {94E334AC-DDF0-47F1-B48D-9121EF862263})

11:34:34.193 [8144.7252] <2> packageBptmResourceDoneMsg: keyword MEDIA_DONE version 0 jobid 743099 copyNum 0 mediaId AFNNNNN mediaKey 4000043 unloadDelay 180 allocId {94E334AC-DDF0-47F1-B48D-9121EF862263}

11:34:34.193 [8144.7252] <2> packageBptmResourceDoneMsg: returns 0

11:34:34.209 [8144.7252] <2> JobInst::sendIrmMsg: returning

11:34:34.240 [8144.7252] <2> main: Got bpduplicate acknowledgement status: 0, err: 0

11:34:34.240 [8144.7252] <2> bptm: EXITING with status 84 <----------

---

What could be first point to troubleshoot, where problem is?

RamNagalla · ‎08-28-2017

when you say "MSL6480 detected and added on Master server-01" i would assume you have added the tape drives to media server and configured storate unit for media server.

basic question:- are you able to run the backup fine on tapes ? ( i would suggest to test this first)

if backup also failing we would like to see the below details

1)vmoprcmd -d output from media server

2)tpconfig -d output from media server

3) failed job details status

4)bptm log for failed job process

Marianne · ‎08-29-2017

We need a bit more info please.

Firstly we need to be sure that the MSL6480 was detected on server-02 and added on Master server-01 for server-02.
Sorry, we cannot just assume...

If all is well and devices configured for media server, then please get us output of all the commands that @RamNagalla asked for.

I also agree with Ram - first get a small backup going before you try duplication.

For additional troubleshooting, add VERBOSE entry in media server's vm.conf.
Additional hardware-related info and errors will be logged in Event Viewer System and Application logs.

Next, increase bptm logging level to 3.
After next failure, copy bptm log to bptm.txt and upload here. (We need to see everything that happened from the start of bptm process - not just the error at the end.)
Text export of System and Appliacation logs will also help.

Handy NetBackup Links

Dmitry_K · ‎08-29-2017

>> i would assume you have added the tape drives to media server and configured storate unit for media server.

-

Yes. I see MSL6480 as robot and 2 drives on Media server in windows device manager, and as robot-tld-** in "storage units" on Master. I see tapes in slots and robot stat in NBU device monitor., and it's same tapes in slots as from MSL6480 web console.

Test policy for duplication - small (40Gb) Vm, not all disk/storage.

vmoprcmd -d
                               PENDING REQUESTS

                                     <NONE>

                                  DRIVE STATUS

Drv Type   Control User      Label RecMID ExtMID Ready   Wr.Enbl. ReqId
0 hcart    TLD                -                     No       -         0
0 hcart    TLD                -                     No       -         0
1 hcart    TLD                -                     No       -         0
1 hcart    TLD                -                     No       -         0

                             ADDITIONAL DRIVE STATUS

Drv DriveName            Shared    Assigned        Comment
0 HP.ULTRIUM7-SCSI.000 No       -
0 HP.ULTRIUM7-SCSI.000 No       -
1 HP.ULTRIUM7-SCSI.001 No       -
1 HP.ULTRIUM7-SCSI.001 No       -
----------------------------------------------
tpconfig –d
Id DriveName           Type   Residence
      SCSI coordinates/Path                                            Status
****************************************************************************
0   HP.ULTRIUM7-SCSI.000 hcart TLD(1) DRIVE=1
      {2,0,7,0}                                                        UP
      {1,0,6,0}                                                        UP
1   HP.ULTRIUM7-SCSI.001 hcart TLD(1) DRIVE=2
      {2,0,6,0}                                                        UP
      {1,0,7,0}                                                        UP

Currently defined robotics are:
TLD(1)     SCSI coordinates = {1,0,6,1}

EMM Server =My_servername

Dmitry_K · ‎08-29-2017

3) failed job details status

--

1: (196) client backup was not attempted because backup window closed-

4)bptm log for failed job process

- attached

Dmitry_K · ‎08-29-2017

>>Text export of System and Appliacation logs will also help.
-
I can't find any record from NBU or from MSL6480 in this log.

Nicolai · ‎08-29-2017

The log show a backup to disk not tape :

write_backup: begin writing backup id vm01_1503954160, copy 1, fragment 1, destination path F:\NBU_07\

Are you using the basic disk storage unit with stagening to tape ?

Marianne · ‎08-29-2017

You need to add VERBOSE to vm.conf on media server and restart NBU Device Management service.

After this, hardware info will be logged to Event Viewer logs.

Please run a small backup to the tape STU.
If that fails, then collect bptm log and post here.

Handy NetBackup Links

Dmitry_K · ‎08-29-2017

>>Are you using the basic disk storage unit with stagening to tape ?

-

Well, perhaps I made mistake and took another part from 2+Mb logfile for examlpe.

Sorry.

Dmitry_K · ‎08-29-2017

>>You need to add VERBOSE to vm.conf on media server and restart NBU Device Management service.

After this, hardware info will be logged to Event Viewer logs.

Please run a small backup to the tape STU.

-

ok. I add VERBOSE in C:\Program Files\Veritas\Volmgr - to vm.conf on media server , restart NetBackup Device Manager / "C:\Program Files\Veritas\Volmgr\bin\ltid.exe" , change policy storage in test policy for testVM to robot-tld, and change log-level in "host properties - media server - properties - logging - BPTM logging level" - from 5 to 3.

And I just wait 15 minutes, while previous task will be canceled\ended. It's strange, but NBU task did't see that I reboot media server

Marianne · ‎08-29-2017

If a backup to tape fails, you will see bptm PID in Activity Monitor. Probably a child bptm PID as well.

Extract all entries for the particular PID(s) and paste in bptm.txt.

Handy NetBackup Links

Dmitry_K · ‎08-29-2017

Win app log:

29.08.2017 13:20:21 starting tldcd
29.08.2017 13:20:23 TLD(1) Creating Process for INITIALIZE: "C:\Program Files\Veritas\Volmgr\bin\tldd.exe" -v -child -ro 9 -rn 1
../tldcd.c.3034, process_request(), received command=9, from peername=myMEDIA.mydomain.dom, version 50
TLD(1) [4656] opening robotic path {2,0,7,1} (bus -1, target -1, lun -1)
inquiry() function processing library HP MSL6480 5.20:
TLD(1) NumDrives = 2, FirstDriveAddr = 1
TLD(1) NumSlots = 80, FirstSlotAddr = 1001
TLD(1) NumIE = 0, InportAddr = 101, OutportAddr = 101
DecodeQuery() Actual status: STATUS_SUCCESS
TLD(1) going to UP state
TLD(1) DismountTape ****** from drive 1
TLD(1) Creating Process for DISMOUNT: "C:\Program Files\Veritas\Volmgr\bin\tldd.exe" -v -child -ro 3 -ltidev 0 -ltix 1 -rn 1 -slot -1 -lf 2 -flgs 0 -rs 65536 -pd 180 -sd 15 -path *NULL* -nh *NULL* -vsn ******
...
29.08.2017 13:22:37 TLD(1) [7248] opening robotic path {2,0,7,1} (bus -1, target -1, lun -1)
29.08.2017 13:22:55 TLD(1) initiating MOVE_MEDIUM from addr 1 to addr 1028
29.08.2017 13:23:27 TLD(1) closing/unlocking robotic path
29.08.2017 13:25:12 TLD(1) MountTape NNN6 on drive 2, from slot 29
TLD(1) Creating Process for MOUNT: "C:\Program Files\Veritas\Volmgr\bin\tldd.exe" -v -child -ro 1 -ltidev 1 -ltix 1 -rn 1 -slot 29 -lf 0 -flgs 0 -rs 65536 -pd 180 -sd 15 -path {1,0,7,0} -vsn NNN6 -b NNN6

29.08.2017 13:25:22 TLD(1) [372] opening robotic path {2,0,7,1} (bus -1, target -1, lun -1)

C:\Program Files\Veritas\NetBackup\logs\bptm - ALL_ADMINS.082917_00001 - attached

Dmitry_K · ‎08-29-2017

After all black magic and reboot - direct write to tape started.

But it's so slooow (for FC 8G LTO7) - see attach

Nicolai · ‎08-29-2017

set NUMBER_DATA_BUFFERS and NUMBER_DATA_BUFFERS_RESTORE to 512.

https://www.veritas.com/support/en_US/article.TECH18422

NUMBER_DATA_BUFFERS

To change the NUMBER_DATA_BUFFERS, create the <INSTALL_PATH>\NetBackup\db\config\NUMBER_DATA_BUFFERS file. It should contain the number of buffers to be created at the start of backup. If the file is not present, the default of 16 will be used. Ensure the file name is not created with a suffix such as '.txt'.

Dmitry_K · ‎08-30-2017

ok. I made C:\Program Files\Veritas\NetBackup\db\config dir and 2 files (without extension) -

- NUMBER_DATA_BUFFERS
- NUMBER_DATA_BUFFERS_RESTORE

with clear text 512 inside

Result: no difference -

Detailed status - begin writing and .. nothing more.

30.08.2017 11:39:26 - Info bptm (pid=5864) start

30.08.2017 11:39:36 - Info bptm (pid=5864) using 65536 data buffer size

30.08.2017 11:39:36 - Info bptm (pid=5864) setting receive network buffer to 263168 bytes

30.08.2017 11:39:36 - Info bptm (pid=5864) using 512 data buffers

30.08.2017 11:39:37 - Info bptm (pid=5864) start backup

30.08.2017 11:39:37 - Info bptm (pid=5864) Waiting for mount of media id NNN (copy 1) on server ***

30.08.2017 11:39:37 - mounting NNN

30.08.2017 11:43:34 - Info bptm (pid=5864) media id NNN mounted on drive index 1, drivepath {1,0,7,0}, drivename HP.ULTRIUM7-SCSI.001, copy 1

30.08.2017 11:43:35 - mounted NNN ; mount time: 0:03:58

30.08.2017 11:43:40 - positioning NNN to file 1

30.08.2017 11:44:05 - positioned NNN ; position time: 0:00:25

30.08.2017 11:44:05 - begin writing

Marianne · ‎08-30-2017

Is this a VMware backup?

If so, are you using VMware Intelligent Policies with Resource Limits?

Please show us the last 10-15 lines for PID 5864 in bptm file to see the effect of buffer size and number.

Oh - the screenshot and your post above show different values for number of data buffers.
Why is that?

Handy NetBackup Links

Dmitry_K · ‎08-30-2017

>>Is this a VMware backup?

yes, but today I made new task for file (windows) backup.

>>show different values for number of data buffers.

I change this value from def to 512.

Result now: process stop on line:

15:05:06.282 [5580.6928] <2> write_data: first write, twin_index: 0 cindex: 0 dont_process: 1 wrote_backup_hdr: 1 finished_buff: 0
15:05:06.282 [5580.6928] <2> write_data: received first buffer (65536 bytes), begin writing data

(and after 5 minutes I stop task) -

15:09:13.665 [1436.6960] <2> initialize: fd values STDOUTSOCK=520 STDERRSOCK=120
15:09:13.665 [1436.6960] <2> bptm: instance - 1398296
15:09:13.681 [1436.6960] <2> bptm: INITIATING (VERBOSE = 5): -rptdrv -jobid -1504093765 -jm
15:09:13.681 [1436.6960] <2> bptm: Event Global\NetBackup Terminate Event, pid: 1436 created.
15:09:13.681 [1436.6960] <2> bptm: PORT_STATUS = 0x00000000

Marianne · ‎08-30-2017

There is no way that we can comment on performance or tell you where to look if we cannot see the end of a successful job.

Handy NetBackup Links

Nicolai · ‎08-30-2017

Set SIZE_DATA_BUFFERS to 262144, you are using a 64K block size (using 65536 data buffer size)

If data comming in too slow, buffer tuning won't help. But setting the buffers is always step number 1.

Dmitry_K · ‎09-04-2017

>>if we cannot see the end of a successful job.

I don't have any successul job, even if I try write small 2-Gb file from local drive.

Perhaps, best way is try another solution - veeam or bareos

VOX

Netbackip 8 and tape HPE StoreEver MSL6480