Solved: Oracle Archive_log backup job failing Error 6

Nayabsk · ‎10-13-2014

Hi All,

My Oracle backups are failing with the error below. Please help me with a workaround.

Starting Control File and SPFILE Autobackup at 14.10.2014 11:37:22
released channel: ch00
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of Control File and SPFILE Autobackup command on ch00 channel at 10/14/2014 11:38:36
ORA-19506: failed to create sequential file, name="c-535381093-20141014-05", parms=""
ORA-27027: sbtremove2 returned error
ORA-19511: Error received from media manager layer, error text:
Failed to remove, c-535381093-20141014-05, from image catalog.

I have already checked the permissions of the bp.conf file; everything seems to be in place.

Thanks,

Nayab

Marianne · ‎10-21-2014

IMHO - the issue seems to be with this error seen in dbclient log:

10:36:35.174 [28727] <2> bprd_connect:    errno = 98 - Address already in use
10:36:35.174 [28727] <16> dbc_GetMediaListByName: Can't connect to host zsswmasb: cannot connect on socket (25)

I found this TN that seems to be caused by a lack of available sockets (OS issue):

http://www.symantec.com/docs/TECH52587
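
To get a quick picture of socket usage on the Linux client, a minimal Python sketch along these lines could help. It is not taken from the tech note; it only counts sockets per TCP state from the text of /proc/net/tcp and reads the size of the ephemeral port range from /proc/sys/net/ipv4/ip_local_port_range. The function names are made up for illustration, and only IPv4 sockets are covered.

```python
from collections import Counter

# Hex state codes used in the "st" column of Linux /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def ephemeral_port_count(port_range_text):
    """Number of ports in the range read from ip_local_port_range."""
    low, high = (int(value) for value in port_range_text.split())
    return high - low + 1


# Usage on the client (hypothetical, run as a script):
#   with open("/proc/net/tcp") as f:
#       print(dict(count_tcp_states(f.read())))
#   with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
#       print(ephemeral_port_count(f.read()))
```

A TIME_WAIT or CLOSE_WAIT count that is large relative to the ephemeral range would be consistent with the "Address already in use" errors, although it does not prove the cause on its own.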


Michael_G_Ander · ‎10-14-2014

It could be a timeout, but it is impossible to say without more information. Please post the job details.

As always when troubleshooting database backups, create the bphdb and dbclient log folders to get more information.

The standard questions: Have you checked: 1) What has changed. 2) The manual 3) If there are any tech notes or VOX posts regarding the issue

Nayabsk · ‎10-14-2014

Hey Michael,

Please find the dbclient log folder attached. Please let me know where I can find bphdb.

Thanks,

Nayab

Will_Restore · ‎10-14-2014

Network problems (configuration not hardware).

Verify hostname lookup. Make sure your client and server can communicate bi-directionally.

13:58:06.055 [14563] <16> bsa_bplist: Can't connect to host zsswmasb: cannot connect on socket (25)

13:46:27.382 [14563] <8> dbc_GetServerClientConfig: WARNING - NBU's client name= <bsswmppp1> differs from gethostname()= <zsswmppp1>
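
The lookup check can also be scripted. The sketch below is only an illustration using Python's standard socket module: check_name is a hypothetical helper that resolves a name, compares the result with the IP you expect, and then checks that the reverse lookup leads back to the same name. The resolver functions are parameters so the logic can be exercised with fake data; by default it uses whatever the host's resolver (hosts file, DNS) returns.

```python
import socket


def check_name(name, expected_ip,
               forward=socket.gethostbyname,
               reverse=socket.gethostbyaddr):
    """Return a list of problems found for one hostname/IP pair."""
    problems = []
    try:
        ip = forward(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return [f"{name}: forward lookup failed ({exc})"]
    if ip != expected_ip:
        problems.append(f"{name}: resolves to {ip}, expected {expected_ip}")
    try:
        primary, aliases, _ = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        problems.append(f"{ip}: reverse lookup failed ({exc})")
    else:
        names = {primary.lower(), *(alias.lower() for alias in aliases)}
        short_names = {n.split(".")[0] for n in names}
        if name.lower() not in names | short_names:
            problems.append(
                f"{ip}: reverse lookup gives {primary}, not {name}")
    return problems


# Usage (hypothetical): run on the client for the server name and on the
# master/media server for the client name, for example
#   print(check_name("zsswmasb", "172.29.48.83"))
#   print(check_name("bsswmppp1", "172.29.48.54"))
```

An empty list means forward and reverse lookups agree for that name on the host where the script runs; it says nothing about the other direction until it is run there too.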

Michael_G_Ander · ‎10-14-2014

If the bphdb folder does not exist, you have to create it under netbackup/logs on the client.

I can see from the dbclient log that there seems to be a connection issue:

10:36:35.174 [28727] <2> bprd_connect: Cannot connect to server zsswmasb
10:36:35.174 [28727] <2> bprd_connect: errno = 98 - Address already in use
10:36:35.174 [28727] <16> dbc_GetMediaListByName: Can't connect to host zsswmasb: cannot connect on socket (25)
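
When the dbclient log is large, a small filter can pull out lines like these. The following Python sketch assumes the line layout seen in the excerpts in this thread (time, [pid], <severity>, function name, message) and treats severity 16 and above as errors, which matches the failing lines quoted here. parse_line and errors are hypothetical helper names, not NetBackup tools.

```python
import re

# Layout assumed from the excerpts: "HH:MM:SS.mmm [pid] <sev> func: message"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] "
    r"<(?P<sev>\d+)> (?P<func>\w+): ?(?P<msg>.*)$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["pid"] = int(entry["pid"])
    entry["sev"] = int(entry["sev"])
    entry["msg"] = entry["msg"].strip()
    return entry


def errors(lines, min_sev=16):
    """Return parsed entries whose severity is at least min_sev."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["sev"] >= min_sev]


# Usage (hypothetical log file name):
#   with open("/usr/openv/netbackup/logs/dbclient/log.101414") as f:
#       for entry in errors(f):
#           print(entry["time"], entry["func"], entry["msg"])
```

Continuation lines that do not start with a timestamp, such as "Server Status: cannot connect on socket", are skipped by this sketch.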

What does bptestbpcd -client bsswmppp1 -debug from the master and media server show?

Also, what do bpclntcmd -pn and bpclntcmd -self on the client show?


rookie11 · ‎10-14-2014

@nayabsk: I think this is the root of the problem.

12:43:27.748 [5558] <2> bprd_connect: Cannot connect to server zsswmasb automatically: 21
12:43:27.748 [5558] <2> bprd_connect: Cannot connect to server zsswmasb
12:43:27.748 [5558] <2> bprd_connect: errno = 98 - Address already in use
12:43:27.748 [5558] <16> bsa_bplist: Can't connect to host zsswmasb: cannot connect on socket (25)
12:43:27.748 [5558] <16> VxBSAQueryObject: ERR - bsa_bplist() failed 25
12:43:27.748 [5558] <16> xbsa_QueryObject: ERR - VxBSAQueryObject: Failed with error:
Server Status: cannot connect on socket
12:43:27.748 [5558] <16> int_RemoveImage: ERR - Failed to remove, c-535381093-20141014-06, from image catalog.

12:43:34.656 [5558] <4> sbtend: INF - --- END of SESSION ---
13:46:27.382 [14563] <8> dbc_GetServerClientConfig: WARNING - NBU's client name= <bsswmppp1> differs from gethostname()= <zsswmppp1>

System name: Linux
Node name: zsswmppp1 (I think it's a cluster)
Client host: bsswmppp1 (virtual server on the cluster)

Please confirm the above info, and do share your script as well, along with the bpclntcmd output.

Nayabsk · ‎10-14-2014

@rookie, please find the bpclntcmd output and backup script attached. In between, I see that a few archive log jobs are successful but a few are failing, which is strange.

ZSSWMPPP1 is clustered with ZSSWMPPP2; bsswmppp1 and bsswmppp2 are the hostnames through which NBU communicates with these hosts (the backup LAN).

zsswmppp1:/usr/openv/netbackup/bin # ./bpclntcmd -pn
expecting response from server zsswmasb
bsswmppp1 bsswmppp1 172.29.48.54 59665

zsswmppp1:/usr/openv/netbackup/bin # ./bpclntcmd -self
current domain =
NIS does not seem to be running: (1) Request arguments bad
gethostname() returned: zsswmppp1
host zsswmppp1: zsswmppp1.ssw.corp at 172.29.26.25 (0x191a1dac)
aliases: zsswmppp1

zsswmppp1:/usr/openv/netbackup/bin # ./bpclntcmd -hn zsswmasb2
host zsswmasb2: zsswmasb2 at 172.29.48.82 (0x52301dac)
aliases:

zsswmppp1:/usr/openv/netbackup/bin # ./bpclntcmd -hn zsswmasb
host zsswmasb: zsswmasb at 172.29.48.83 (0x53301dac)
aliases:

zsswmppp1:/usr/openv/netbackup/bin # ./bpclntcmd -ip 172.29.48.54
checkhaddr: host : bsswmppp1: bsswmppp1 at 172.29.48.54 (0x36301dac)
checkhaddr: aliases:

Nayabsk · ‎10-14-2014

@Michael, please note that my client itself acts as a media server and backs itself up.

Please find the requested outputs attached; you can refer to the bpclntcmd output in my reply to rookie :)

Nayabsk · ‎10-14-2014

Hi All,

Now my backups are failing with error 11 (system call failed):

11 system call failed bsswmppp_sp25_oracle_daily bsswmppp1
11 system call failed bsswmppp_sp27_oracle_daily bsswmppp2

Thanks,

Nayab

Michael_G_Ander · ‎10-15-2014

Have you changed anything ?

Which process is failing ?


Nayabsk · ‎10-15-2014

I made an entry in this location on my master server for my DB servers MPPP1 and MPPP2, as below:

zsswmasb2:/usr/openv/netbackup/db/altnames

echo echo "bsswmppp1" >> bsswmppp1.aaa.corp

Apart from this I haven't changed anything.

rookie11 · ‎10-15-2014

Error code 11 can also occur if disk space is low.

Have you updated the hosts file entries on the master and on the media server/client? They should look like this (IP address first, then the names):

<ipaddress> ZSSWMPPP1 bsswmppp1

<ipaddress> ZSSWMPPP2 bsswmppp2

rookie11 · ‎10-15-2014

I would suggest making changes to the script as well.

SEND 'NB_ORA_CLIENT=$CLIENTHOST, NB_ORA_POLICY=$NB_ORA_POLICY';

Set the client name to bsswmppp1 and the policy name to the policy as it appears in the NBU GUI.

Nayabsk · ‎10-15-2014

Hey Rookie,

Let me make some things clear about my servers and the instances running on them:

ZSSWMPPP1 (instance names 20, 27) is clustered with server ZSSWMPPP2 (instance name 25).

The status of my backups is now as below:

Instance 20 - Both archive log and DB backups are successful

Instance 27 - Only archive log backups are successful; DB backups have been failing continuously for 2 days

Instance 25 - Archive log backups fail sometimes; DB backups have been failing continuously for 2 days

Thanks,

Nayab

Michael_G_Ander · ‎10-16-2014

It seems to me that something changed 2 days ago, and I am pretty sure there is something in the network setup/NetBackup configuration that does not quite match.

Also, as you are using a backup network, are you sure all the backup-related traffic goes through it and not some of it through the production network?

What are the logical names/VIPs on the cluster, and what are the names/IPs on the physical nodes?

Sorry, but I am now very confused about what your setup actually is.


nathanmike · ‎10-16-2014

@Nayabsk,

I have some questions for you before going further in troubleshooting:

Are you an Oracle DBA, or do you have experience with RDBMS?

On which platform are you running Oracle: Linux? Unix? Windows?

Was the RMAN backup working before? If so, has anything changed since the last backup?

=> If the backup was running fine before, you can run the following command from the RMAN prompt to see the list of all previous backups:

list backup summary ;

Can you run the show all command from the RMAN prompt to see your RMAN configuration and make sure CONTROLFILE AUTOBACKUP is properly configured?

Can you paste the complete RMAN backup script or the command that you run for backup?

Are you backing up to DISK or TAPE (SBT_TAPE param)?

Nayabsk · ‎10-20-2014

Hey Nathanmike,

Answers to your questions:

Are you an Oracle DBA, or do you have experience with RDBMS?

No, I am a Storage and Backup Admin.

On which platform are you running Oracle: Linux? Unix? Windows?

Linux

Was the RMAN backup working before? If so, has anything changed since the last backup?

According to my sysadmin and DBA nothing has changed, and I also see that backups of other instances on the same server are working fine.

=> If the backup was running fine before, you can run the following command from the RMAN prompt to see the list of all previous backups:

list backup summary ;

Can you run the show all command from the RMAN prompt to see your RMAN configuration and make sure CONTROLFILE AUTOBACKUP is properly configured?

Yes, it is properly configured; I have checked with my DBA.

Can you paste the complete RMAN backup script or the command that you run for backup?

Are you backing up to DISK or TAPE (SBT_TAPE param)?

The backup goes to SBT_TAPE. Please find the script attached.

Nathan Mike

Marianne · ‎10-20-2014

Cannot connect to server zsswmasb

I remember from one of your other posts (Understanding the Connectivity of Tape Library with my Media Servers) that your master server is clustered, right?
CLUSTER NAME IS ZSSWMASB

Check /etc/hosts on the client to confirm that the correct IP address is entered for the virtual name.
Maybe 172.29.48.83 is the IP address for zsswmasb1?
If so, backups will work while zsswmasb1 is the active node but fail when zsswmasb2 is the active node.
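
As a quick way to see what the client's hosts file actually maps the virtual name to, something like the following Python sketch could be used. hosts_entries is a hypothetical helper that works on the text of a hosts file in the usual "IP name [aliases...]" layout; it does not consult DNS, so it only shows what /etc/hosts says.

```python
def hosts_entries(hosts_text, name):
    """Return every IP address that the hosts file text maps to `name`."""
    matches = []
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        if name.lower() in (n.lower() for n in names):
            matches.append(ip)
    return matches


# Usage on the client (hypothetical):
#   with open("/etc/hosts") as f:
#       print(hosts_entries(f.read(), "zsswmasb"))
```

If the address returned for the virtual name turns out to belong to one physical node rather than to the cluster's virtual IP, that would fit the pattern of backups working only while that node is active.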

When the backup fails, check bprd on the active node to see if a connection request was received from the client.


nathanmike · ‎10-21-2014

@Nayabsk,

I faced this same issue a few weeks ago:

Starting Control File and SPFILE Autobackup at 23-MAY-14
released channel: ch00
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of Control File and SPFILE Autobackup command on ch00 channel at 05/23/2014 12:29:24
ORA-19506: failed to create sequential file, name="c-4185468684-20140523-03", parms=""
ORA-27027: sbtremove2 returned error
ORA-19511: Error received from media manager layer, error text:
Failed to remove, c-4185468684-20140523-03, from image catalog.

I solved this issue as follows:

1) If it is a clustered environment, make sure all clients (nodes and virtual host) are properly created in the NetBackup Admin Console

2) Make sure the client attributes on the master server are properly created for each node and the virtual host.

3) Make sure you have created the following file, "/usr/openv/netbackup/db/altnames/No.Restrictions", on the master server to avoid client access conflicts, especially if you have a multihomed system:

[root@nbumas01 ~]# touch /usr/openv/netbackup/db/altnames/No.Restrictions
[root@nbumas01 ~]# ls -lrt /usr/openv/netbackup/db/altnames/No.Restrictions
-rw-r--r-- 1 root root 0 May 23 11:12 /usr/openv/netbackup/db/altnames/No.Restrictions

http://www.symantec.com/business/support//index?page=content&pmv=print&impressions=&viewlocale=&id=HOWTO86696

4) Make sure forward and reverse lookups work for all of your clients (nodes and virtual host) from both the master server and the client

http://www.symantec.com/business/support/index?page=content&id=TECH135349

5) Make sure RMAN is properly configured for CONTROLFILE AUTOBACKUP, ARCHIVELOG, DATAFILE AND DEVICE TYPE

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE SBT_TAPE TO '%F';

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE' TO '%F';
new RMAN configuration parameters are successfully stored
starting full resync of recovery catalog
full resync complete

RMAN> CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE SBT_TAPE TO 1
2> ;

new RMAN configuration parameters:
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE 'SBT_TAPE' TO 1;
new RMAN configuration parameters are successfully stored
starting full resync of recovery catalog
full resync complete

RMAN> CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE SBT_TAPE TO 1;

new RMAN configuration parameters:
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE 'SBT_TAPE' TO 1;
new RMAN configuration parameters are successfully stored
starting full resync of recovery catalog
full resync complete

RMAN> CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' FORMAT 'ora_df%t_s%s_s%p';

new RMAN configuration parameters:
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' FORMAT 'ora_df%t_s%s_s%p';
new RMAN configuration parameters are successfully stored
starting full resync of recovery catalog
full resync complete

RMAN> show all;

RMAN configuration parameters for database with db_unique_name NBUORA are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 3;
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/backup/rman/ora_cf%F';
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE' TO '%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 2 BACKUP TYPE TO BACKUPSET;
CONFIGURE DEVICE TYPE SBT_TAPE PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE 'SBT_TAPE' TO 1;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE 'SBT_TAPE' TO 1;
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/backup/rman/ora_df%t_s%s_s%p';
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS 'SBT_LIBRARY=/usr/openv/netbackup/bin/libobk.so64';
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.1.0/dbs/snapcf_nbuora.f'; # default

RMAN> exit

6) Make sure your RMAN script is properly configured with correct entries for backup. Please find below an example:

[oracle@rhel6 scripts]$ cat fullrmancmd
set echo on;
run {
allocate channel ch00
    device type sbt_tape
      PARMS='ENV=( NB_ORA_CLIENT=nbuora.nbulabs.be,
            NB_ORA_POLICY=nbu-oracle-pol,
            NB_ORA_SCHED=weekly_full_oracle
          )'
;

backup
    format='%U'
    incremental level 0
    database
      include current controlfile
      plus archivelog
;

release channel ch00;
}
list backup summary device type sbt;
exit;

7) Run your script and see if it is running fine:

[oracle@rhel6 scripts]$ ./nbuorafull.sh

Recovery Manager: Release 11.2.0.1.0 - Production on Fri May 23 16:27:16 2014

connected to target database: NBUORA (DBID=4185468684)
connected to recovery catalog database

RMAN> set echo on;
2> run {
3>   allocate channel ch00
4>     device type sbt_tape
5>       PARMS='ENV=( NB_ORA_CLIENT=nbuora.nbulabs.be,
6>             NB_ORA_POLICY=nbu-oracle-pol,
7>             NB_ORA_SCHED=weekly_full_oracle
8>           )'
9>   ;
10>
11>   backup
12>     format='%U'
13>     incremental level 0
14>     database
15>       include current controlfile
16>       plus archivelog
17>   ;
18>
19>   release channel ch00;
20> }
21> list backup summary device type sbt;
22> exit;
echo set on

allocated channel: ch00
channel ch00: SID=64 device type=SBT_TAPE
channel ch00: Veritas NetBackup for Oracle - Release 7.5 (2013061020)

Starting backup at 23-MAY-14
current log archived
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_22/o1_mf_1_21_9qwb9d94_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_22/o1_mf_1_22_9qwbc8nv_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_23_9qwwzzww_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_24_9qwx0nb1_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_25_9qxzcmgq_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_26_9qxzyc65_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_27_9qy0gbqt_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_28_9qy3l3h3_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_29_9qy48llt_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_30_9qy82dg3_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_31_9qy8o2jh_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_32_9qybofpw_.arc; already backed up 1 time(s)
skipping archived log file /u01/app/oracle/fast_recovery_area/NBUORA/archivelog/2014_05_23/o1_mf_1_33_9qyc6v58_.arc; already backed up 1 time(s)
channel ch00: starting archived log backup set
channel ch00: specifying archived log(s) in backup set
input archived log thread=1 sequence=34 RECID=28 STAMP=848334441
channel ch00: starting piece 1 at 23-MAY-14
channel ch00: finished piece 1 at 23-MAY-14
piece handle=26p913je_1_1 tag=TAG20140523T162726 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch00: backup set complete, elapsed time: 00:03:35
Finished backup at 23-MAY-14

Starting backup at 23-MAY-14
channel ch00: starting incremental level 0 datafile backup set
channel ch00: specifying datafile(s) in backup set
input datafile file number=00004 name=/oradata/users/nbuora/users01.dbf
input datafile file number=00005 name=/oradata/data01/nbuora/data01.dbf
input datafile file number=00006 name=/oradata/data02/nbuora/data02.dbf
input datafile file number=00007 name=/oradata/data03/nbuora/data03.dbf
input datafile file number=00008 name=/oradata/index01/nbuora/index01.dbf
input datafile file number=00009 name=/oradata/index02/nbuora/index02.dbf
input datafile file number=00010 name=/oradata/index03/nbuora/index03.dbf
input datafile file number=00001 name=/oradata/system/nbuora/system01.dbf
input datafile file number=00002 name=/oradata/system/nbuora/sysaux01.dbf
input datafile file number=00003 name=/oradata/undo01/nbuora/undo01.dbf
input datafile file number=00011 name=/oradata/rman/rman01.dbf
channel ch00: starting piece 1 at 23-MAY-14
channel ch00: finished piece 1 at 23-MAY-14
piece handle=27p913q5_1_1 tag=TAG20140523T163101 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch00: backup set complete, elapsed time: 00:08:27
channel ch00: starting incremental level 0 datafile backup set
channel ch00: specifying datafile(s) in backup set
including current control file in backup set
channel ch00: starting piece 1 at 23-MAY-14
channel ch00: finished piece 1 at 23-MAY-14
piece handle=28p914a0_1_1 tag=TAG20140523T163101 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch00: backup set complete, elapsed time: 00:03:06
Finished backup at 23-MAY-14

Starting backup at 23-MAY-14
current log archived
channel ch00: starting archived log backup set
channel ch00: specifying archived log(s) in backup set
input archived log thread=1 sequence=35 RECID=29 STAMP=848335355
channel ch00: starting piece 1 at 23-MAY-14
channel ch00: finished piece 1 at 23-MAY-14
piece handle=29p914ft_1_1 tag=TAG20140523T164237 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch00: backup set complete, elapsed time: 00:03:35
Finished backup at 23-MAY-14

Starting Control File and SPFILE Autobackup at 23-MAY-14
piece handle=ora_cfc-4185468684-20140523-05 comment=API Version 2.0,MMS Version 5.0.0.0
Finished Control File and SPFILE Autobackup at 23-MAY-14

released channel: ch00
