SAP backup fails with status code 41/40

elaguni
Level 4
Partner Accredited

Hi Team,

I hope someone can spare some time to point me in the right direction. As the subject suggests, our SAP on AIX backup fails with status code 41 / 40. The backup starts OK and looks like it will run just fine, but then it fails later on.

I've attached a few files that could help us further understand what's happening. I've tried adjusting the client read timeout (on both the client and the media server), but the issue persists.

8 REPLIES

Nicolai
Moderator
Partner    VIP   

Can you please share the backint log from the client as well?

Is there any firewall between the client and the NetBackup master/media server?

Do you have client_read_timeout set to 1800 or a larger value?

See: http://www.symantec.com/docs/HOWTO33199

Marianne
Level 6
Partner    VIP    Accredited Certified

Please help us understand where the log files were taken from.

log.011214 looks like a bpcd log file taken from a media server and contains no error.

The bptm log says 'clientbptmlog' and contains no error either.
Is the SAP client also a media server backing up itself?

media011214.. is also a bpcd log. From which server or client?
This log contains lots of errors, e.g.

 <16> bpcd main: couldn't write cmd status to output socket h_errno = 10053 - An established connection was aborted by the software in your host machine. 
<4> bpcd valid_server: bsdmb1.smrtnet.ads is not a master server
....
<16> bpcd valid_server: bsdmb1.smrtnet.ads is not a media server either

 

But there is no indication whether these errors are in any way related to the failed SAP backup.

Please give all relevant info: 
1. All text in Details tab of the failed job
2. Info about master, media server and client - OS, NBU version on each.
SAP for Oracle on Client or some other database? Database and SAP versions?
3. Backup/Policy type - if SAP policy type, we need backint log on client. If SAP on Oracle with RMAN, we need dbclient log as well.

One more thing - I saw reference to NBU 6.0 in one of the logs. 
Are you aware that support for ALL 6.x versions ended in Oct 2012?
 

elaguni
Level 4
Partner Accredited

Hi Nicolai,

  • The backint log is empty.
  • No firewall in between client and master/media server
  • Client read timeout was set to 9999 on both the client and the master/media server

Hi Marianne,

I apologize for not mentioning which log is from where. Attached is the joboutput for your reference.

The master/media server is Win2k3 with NBU 6.0. The client is AIX RS6000 with no NBU SAP agent. The dbclient folder is also empty. Here's some info about the policy:

 
Policy Name:       AIXBK_FS_SAP_FLASH
Options:           0x0
template:          FALSE
c_unused1:         ?
Names:             (none)
Policy Type:       Standard (0)
Active:            no
Effective date:    03/31/2005 14:28:24
Client Compress:   no
Follow NFS Mnts:   no
Cross Mnt Points:  yes
Collect TIR info:      no
Block Incremental: no
Mult. Data Stream: no
Perform Snapshot Backup:   no
Snapshot Method:           (none)
Snapshot Method Arguments: (none)
Perform Offhost Backup:    no
Backup Copy:               0
Use Data Mover:            no
Data Mover Type:           2
Use Alternate Client:      no
Alternate Client Name:     (none)
Enable Instant Recovery:   no
Policy Priority:   99
Max Jobs/Policy:   30
Disaster Recovery: 0
Collect BMR Info:  no
Keyword:           (none specified)
Client Encrypt:    no
Checkpoint:        yes
      Interval:    30
Residence:         aixbk-hcart-robot-tld-0
Volume Pool:       NetBackup
Client/HW/OS/Pri:  AIXBK RS6000 AIX5 0 0 0 0 ?
Include:           /SAP
Exclude:           (none defined)
Schedule:          Daily
  Type:            FULL (0)
  Calendar sched: Enabled
.
.
.
  Maximum MPX:     30
  Synthetic:       0
  PFI Recovery:    0
  Retention Level: 2 (4 weeks)
  u-wind/o/d:      0 0
  Incr Type:       DELTA (0)
  Alt Read Host:   (none defined)
  Max Frag Size:   0 MB
  Number Copies:   1
  Fail on Error:   0
  Residence:       (specific storage unit not required)
  Volume Pool:     (same as policy volume pool)
 

Nicolai
Moderator
Partner    VIP   

So it is really not a SAP backup but a regular file system backup. In this case it is the bpbkar log on the client that we need.

NetBackup 6.0 is out of support. It is time to upgrade.

Please also consider this T/N to narrow down the issue: http://www.symantec.com/docs/TECH31513

Marianne
Level 6
Partner    VIP    Accredited Certified

So, this is NOT a SAP backup.

It is a filesystem backup of /SAP directory/mount point with Cross Mount points selected...

Policy Type:       Standard (0)
Active:            no
Effective date:    03/31/2005 14:28:24
Client Compress:   no
Follow NFS Mnts:   no
Cross Mnt Points:  yes
Perform Snapshot Backup:   no
Snapshot Method:           (none)
 
Has this backup worked previously?
Are you trying to back up SAP databases while the DBs are online/active?
If so, this will not work. Data will not be backed up in a consistent state.
Even if the backup goes through successfully, the DBs won't start up after a restore.
The policy name contains the word 'FLASH', but this does not seem to be a FlashBackup or snapshot policy.
How are you ensuring the DBs are in a consistent state?
 
It seems that the AIX server is backing up itself, but there is a mix of uppercase and lowercase names:
AIXBK and aixbk.
Why is that? 
NBU and AIX are case sensitive.
 
Where was the 'joboutput' file taken from?
This looks like an unformatted error log, all in one line.
Please copy the text in Details tab of a failed job. This is all we need for now.
 
For filesystem backup, you need the following logs to troubleshoot:
 
On the media server (same as client if backing up itself):
bptm and bpbrm
 
On the client: 
bpcd and bpbkar.
 
Please collect the logs that correspond to the date and time of the failed job and rename them to reflect the process names: bptm.txt bpbrm.txt bpcd.txt bpbkar.txt.
 
If you do not have all of these log folders, please create them and start another backup. 
(Don't try to back up SAP files while the DBs are online....)
After failure, collect all logs and rename them, then upload as File attachments.
Copy all text in Details tab and post here.
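
The collect-and-rename step above could be scripted along these lines. This is only a sketch under assumptions: it supposes the legacy logs sit in one folder per process with files named log.<mmddyy> (the attached log.011214 suggests this naming on the UNIX side), and the function name collect_logs and its arguments are made up for illustration. Adjust the root folder to wherever the log folders actually live on your server.

```python
import shutil
from pathlib import Path

# Process names requested in the post above.
PROCESSES = ("bptm", "bpbrm", "bpcd", "bpbkar")


def collect_logs(log_root, stamp, dest, processes=PROCESSES):
    """Copy <log_root>/<process>/log.<stamp> to <dest>/<process>.txt.

    Assumption: one folder per process, daily files named log.<mmddyy>.
    Returns (copied, missing) as two lists of process names.
    """
    log_root, dest = Path(log_root), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for proc in processes:
        src = log_root / proc / f"log.{stamp}"
        if src.is_file():
            # Rename on copy so each upload reflects the process name.
            shutil.copyfile(src, dest / f"{proc}.txt")
            copied.append(proc)
        else:
            # Folder or file not there: logging was probably not enabled.
            missing.append(proc)
    return copied, missing
```

A call such as collect_logs("/usr/openv/netbackup/logs", "011214", "/tmp/nbu_logs") would then leave the renamed copies in /tmp/nbu_logs (both paths are examples). Anything reported as missing points to a log folder that still has to be created before the next test backup.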
 
PS: Any reason why you are still on NBU 6.0??

elaguni
Level 4
Partner Accredited

Thank you very much for your time and assistance, Nicolai and Marianne. I'm sorry for not being able to answer your questions the way you probably expect me to. My background is mostly VM and storage, and I am only now starting to cover backup.

As far as I know, the db is not online during the backup window, and this policy has worked before. We have a few backup domains, and this one is yet to be upgraded. Attached are a few more logs; hope they help.

Marianne
Level 6
Partner    VIP    Accredited Certified

We still need to see all the text in the Details tab of the failed job.

It seems as if bpbkar on client went into a 'hang' state at about 05:38.

The last file entry was for /SAP/sapmnt/PRD/global/100BDCLG/

The next entry is at 08:28 when it seems that another backup was started for the same filesystem.

05:38:03.339 [393322] <4> bpbkar PrintFile: /SAP/sapmnt/PRD/global/000JOBLG/0001X23492701X49641
05:38:03.344 [393322] <4> bpbkar PrintFile: /SAP/sapmnt/PRD/global/100BDCLG/
08:28:01.871 [475290] <4> bpbkar main: real locales <en_US en_US C en_US en_US en_US>
08:28:01.871 [475290] <4> bpbkar main: standardized locales - lc_messages <en_US> lc_ctype <en_US> lc_time <en_US> lc_collate <en_US> lc_numeric <en_US>
08:28:01.882 [475290] <2> logparams: bpbkar -r 2419200 -ru root -dt 0 -to 0 -clnt AIXBK -class AIXBK_FS_SAP_FLASH -sched Daily -st FULL -bpstart_to 8800 -bpend_to 8000 -read_to 9999 -ckpt_time 1800 -blks_per_buffer 511 -use_otm -use_ofb -
 
Is it possible to list contents of /SAP/sapmnt/PRD/global/100BDCLG/ folder?
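
A silent period like the one between 05:38 and 08:28 can also be found without reading the whole log by eye. The following is a rough sketch, assuming every entry starts with an HH:MM:SS.mmm timestamp followed by [pid] and <severity> as in the excerpt above, and that the file covers a single day. The name find_gaps and the 30-minute threshold are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches entries such as:
# 05:38:03.344 [393322] <4> bpbkar PrintFile: /SAP/...
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (.*)$")


def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (gap, line_before, line_after) for each silent period.

    A silent period is any pair of consecutive timestamped entries
    that are further apart than `threshold`. Lines without a leading
    timestamp are skipped.
    """
    gaps = []
    prev_time = prev_line = None
    for raw in lines:
        line = raw.strip()
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if prev_time is not None and stamp - prev_time > threshold:
            gaps.append((stamp - prev_time, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


# The three entries quoted above, used as sample input.
sample = [
    "05:38:03.339 [393322] <4> bpbkar PrintFile: /SAP/sapmnt/PRD/global/000JOBLG/0001X23492701X49641",
    "05:38:03.344 [393322] <4> bpbkar PrintFile: /SAP/sapmnt/PRD/global/100BDCLG/",
    "08:28:01.871 [475290] <4> bpbkar main: real locales <en_US en_US C en_US en_US en_US>",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap} of silence after: {before}")
```

For a real log, pass the open file object instead of the sample list. The last entry before each gap shows where bpbkar was working when it went quiet.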
 
 

Nicolai
Moderator
Partner    VIP   

This is likely the issue:

09:01:20.400 [606460] <4> bpbkar PrintFile: /road1/ora817/app/oracle/product/817/rdbms/audit/
09:03:58.700 [393322] <16> bpbkar copy_to_shm: ERR - bpbkar exiting because backup is aborting
09:03:58.755 [393322] <16> bpbkar sighandler: ERR - bpbkar killed by SIGPIPE
09:03:58.766 [393322] <2> bpbkar sighandler: INF - ignoring additional SIGPIPE signals
09:03:58.787 [393322] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 40: network connection broken
09:03:58.795 [393322] <4> bpbkar Exit: INF - EXIT STATUS 40: network connection broken
09:03:58.857 [393322] <4> bpbkar Exit: INF - setenv FINISHED=0
 
I have seen file system backups fail in the Oracle audit area many times before because of the number of files in this directory (many thousands of files). Oracle writes audit log files for user sessions in this location.
 
Ask the DBA to clean up.
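
To confirm which directories hold that many files before asking for a clean-up, something like the following could be run on the client. It is a minimal sketch, assuming Python is available on the AIX host. The function name largest_directories is made up for illustration, and walking a very large tree takes a while.

```python
import os


def largest_directories(root, top=10):
    """Return the `top` (file_count, directory) pairs under `root`.

    Only files sitting directly in each directory are counted.
    os.walk does not follow symbolic links by default and silently
    skips directories it cannot read.
    """
    counts = []
    for dirpath, _dirnames, filenames in os.walk(root):
        counts.append((len(filenames), dirpath))
    counts.sort(reverse=True)  # largest count first
    return counts[:top]
```

Running largest_directories("/road1/ora817/app/oracle/product/817/rdbms/audit") against the path from the log above, or against /SAP itself, should show whether the audit area or the 100BDCLG folder really contains thousands of entries.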