03-09-2015 05:23 AM
I've got several instances in an OIP policy that backs up redo logs every few hours.
It has worked flawlessly until now.
I notice that for one instance the backup failed.
Then, when the window opens next time around, NetBackup/OIP doesn't even bother to execute the backup for that instance. All the other ones in the policy continue as normal. And again for the next iteration: it's totally absent.
As an experiment I've removed the instance and re-added it to the policy to see if it runs next time around.
Thoughts or similar observations?
It's all Solaris, client and server.
Thanks in advance, Jim
03-09-2015 06:01 AM
The instance isn't deactivated, right?
If you take a look in \netbackup\logs\bpdbsbora on the client, are there any signs of the Oracle instance being scheduled (scripts being generated)?
http://www.mass.dk/netbackup/quick-hints/101-rman-input-a-output-using-intelligent-polices-in-netbackup-76.html
03-09-2015 07:33 AM
Thanks for the reply, Nicolai. I will investigate your idea.
Update: removing the instance and adding it back into the policy makes no difference; it still doesn't initiate a backup. And yes, the instance is active. Jim
03-09-2015 07:46 AM
On the client in the bpdbsbora log there is no sign of the backup being initiated, just the error that triggered earlier in the day:
08:00:27.160 [1410] <16> CFECallback::reportStatus: ERR - Unable to connect to <instance>. Error Code = 12,537: ORA-12537: TNS:connection closed
08:05:37.088 [7576] <16> CFECallback::reportStatus: ERR - Unable to connect to <instance>. Error Code = 12,537: ORA-12537: TNS:connection closed
Looking like a bug to me at this point. Jim
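For anyone wanting to check their own bpdbsbora log for the same thing, here is a minimal sketch that pulls the timestamp, process ID, severity field and ORA code out of lines shaped like the two above. The regular expression is inferred only from those two lines, so treat the field layout as an assumption rather than a documented log format; parse_line is a hypothetical helper name.

```python
import re

# Layout assumed from the two log lines quoted in this post:
#   HH:MM:SS.mmm [pid] <severity> message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] <(?P<sev>\d+)> (?P<msg>.*)$"
)
ORA_RE = re.compile(r"ORA-\d{5}")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ora = ORA_RE.search(m.group("msg"))
    return {
        "time": m.group("time"),
        "pid": int(m.group("pid")),
        "severity": int(m.group("sev")),
        "ora_code": ora.group(0) if ora else None,
    }


SAMPLE = [
    "08:00:27.160 [1410] <16> CFECallback::reportStatus: ERR - Unable to connect "
    "to <instance>. Error Code = 12,537: ORA-12537: TNS:connection closed",
    "08:05:37.088 [7576] <16> CFECallback::reportStatus: ERR - Unable to connect "
    "to <instance>. Error Code = 12,537: ORA-12537: TNS:connection closed",
]

for entry in map(parse_line, SAMPLE):
    print(entry)
```

Run against a full log, the same loop would show whether any later attempt was logged for the instance at all, which is the question here.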
03-09-2015 08:08 AM
Interesting... if I create a new policy to cover just the redo logs in my problem instance, the GUI throws an error:
"Cannot do manual backup of policy. The policy does not have a list of files to back up."
Most intriguing. Jim
03-09-2015 08:11 AM
Ask the DBA if he/she can connect to the database. I suspect this is an Oracle listener issue (but I may be wrong). If NetBackup is not able to talk to the instance, it won't be able to perform the backup.
There are plenty of hits on the ORA-12537 TNS error, e.g.
http://www.oradev.com/ORA-12537_TNS_connection_closed.jsp
03-09-2015 08:18 AM
Spot the error? I can't see it yet...
Policy Name: adhoc_Oracle_redo
Options: 0x0
template: FALSE
audit_reason: ?
Names: (none)
Policy Type: Oracle (4)
Active: yes
Effective date: 04/08/2008 11:37:41
Block Incremental: no
Mult. Data Stream: no
Perform Snapshot Backup: no
Snapshot Method: (none)
Snapshot Method Arguments: (none)
Perform Offhost Backup: no
Backup Copy: 0
Use Data Mover: no
Data Mover Type: -1
Use Alternate Client: no
Alternate Client Name: (none)
Use Virtual Machine: 0
Hyper-V Server Name: (none)
Enable Instant Recovery: no
Policy Priority: 100
Max Jobs/Policy: Unlimited
Disaster Recovery: 0
Collect BMR Info: no
Keyword: To run OIP backup of redos.
Data Classification: -
Residence is Storage Lifecycle Policy: no
Client Encrypt: no
Checkpoint: no
Residence: Master-SL500-LTO4
Volume Pool: ENCR_Testing
Server Group: *ANY*
Granular Restore Info: no
Exchange Source attributes: no
Exchange DAG Preferred Server: (none defined)
Application Discovery: no
Discovery Lifetime: 0 seconds
ASC Application and attributes: (none defined)
Generation: 19
Ignore Client Direct: no
Enable Metadata Indexing: no
Index server name: NULL
Use Accelerator: no
Client List Type: 1
Selection List Type: 2
Oracle Backup Data File Name Format: NULL
Oracle Backup Archived Redo Log File Name Format: NULL
Oracle Backup Control File Name Format: NULL
Oracle Backup Fast Recovery Area File Name Format: NULL
Oracle Backup Set ID: NULL
Oracle Backup Data File Arguments: SPECIFY_MAX_LIMITS=0,NUM_STREAMS=1,SKIP_READ_ONLY=0,SKIP_OFFLINE=0,OFFLINE=0
Oracle Backup Archived Redo Log Arguments: SPECIFY_MAX_LIMITS=0,NUM_STREAMS=1,INCLUDE_ARCH_LOGS=1
Instance Name/Client/Pri/DMI/CIT: PROPPROD oracle-propprod 0 0 1 0 ?
Include: (none defined)
Schedule: Redos_only
Type: TLOG Archived Redo Log Backup (5)
Calendar sched: Enabled
Included Dates-----------
No days of week entered
Excluded Dates----------
No specific exclude dates entered
No exclude days of week entered
Retention Level: 8 (1 day)
u-wind/o/d: 0 0
Incr Type: DELTA (0)
Alt Read Host: (none defined)
Max Frag Size: 0 MB
PFI Recovery: 0
Maximum MPX: 1
Number Copies: 1
Fail on Error: 0
Residence: (specific storage unit not required)
Volume Pool: (same as policy volume pool)
Server Group: (same as specified for policy)
Residence is Storage Lifecycle Policy: 0
Schedule indexing: 0
Daily Windows:
Day Open Close W-Open W-Close
Sunday 000:00:00 000:00:00
Monday 000:00:00 000:00:00
Tuesday 000:00:00 000:00:00
Wednesday 000:00:00 000:00:00
Thursday 012:00:00 014:00:00 108:00:00 110:00:00
Friday 000:00:00 000:00:00
Saturday 000:00:00 000:00:00
Jim
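A side note on reading the Daily Windows table in the listing above: the W-Open and W-Close values look like the same open and close times expressed as hours since the start of the week. The sketch below checks that reading against the Thursday row. It assumes the week starts on Sunday at 00:00, which is inferred from this one listing and not from documentation; week_offset is a hypothetical helper name.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def week_offset(day, hms):
    """Convert a day name plus HHH:MM:SS into HHH:MM:SS since Sunday 00:00.

    Assumption: the week starts on Sunday, as the listing's values suggest.
    """
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    total = DAYS.index(day) * 24 * 3600 + hours * 3600 + minutes * 60 + seconds
    hh, rem = divmod(total, 3600)
    mm, ss = divmod(rem, 60)
    return f"{hh:03d}:{mm:02d}:{ss:02d}"


# Thursday row from the listing: opens 012:00:00, closes 014:00:00
print(week_offset("Thursday", "012:00:00"))
print(week_offset("Thursday", "014:00:00"))
```

Thursday is the fifth day of a Sunday-based week, so 4 x 24 + 12 = 108 and 4 x 24 + 14 = 110, matching the 108:00:00 and 110:00:00 shown in that row.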
03-09-2015 08:21 AM
The db is up and running fine; I have connected to it with no problem. There would be mayhem if it were truly not running. Even if it were down, I would probably expect NetBackup to trigger a backup, even if it fails.
Jim.
03-09-2015 08:45 AM
If I copy the functioning policy and remove all the instances apart from one that does work, the GUI throws the same error. Something is amiss... if I leave it to run under the scheduler, it works perfectly. All very interesting.
03-09-2015 08:49 AM
I recall a previous Connect thread about this, but I can't find it.
03-09-2015 09:02 AM
And so if I put my "problem" instance to run in a separate policy via scheduler, it also runs fine. That confirms the instance and everything associated with it is fine. And yet the official policy simply ignores it still. Jim
03-09-2015 09:25 AM
Hmm, could it be that the original policy is damaged?
03-09-2015 09:26 AM
Long shot, but could it be a clean-up issue?
http://www.symantec.com/docs/TECH215948
03-10-2015 02:23 AM
Update: without any further intervention on the original policy, my ignored instance is no longer ignored and ran successfully at 0800.
Jim
03-11-2015 11:34 PM
I still think Nicolai was right in his observation about a listener issue. The db can be up/active while the listener is not, or the listener may be configured incorrectly in a cluster.
The original error should give a clue of what was wrong.
03-12-2015 05:03 AM
Sorry, but it's nothing to do with the listener. Without the listener, we'd have our clients banging on the door, and we'd know, as it would show up on the cluster too. And besides, why did the instance backup spring back into life the following day? Jim
03-12-2015 05:51 AM
Thanks for sharing your experience.
One possibility that comes to mind: the global setting on the master server for # tries per ## hours.
The default is 2 tries per 12 hours.
I have seen users changing it to 1 try per 24 hours.
For some reason, this plays havoc with backups scheduled multiple times per day when a failure occurs.
All runs well as per the schedule, but when there is a failure, this setting in NBU causes backup to be overlooked until the number of hours in the global setting have passed.
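To make the idea concrete, here is a rough sketch of how a "tries per period" limit could cause a frequency-based backup to be skipped after failures. This is a simplified model of the behaviour described above, not NetBackup's actual scheduler logic; attempt_allowed and its parameters are hypothetical names, and the two failure times are simply the ones from the bpdbsbora log earlier in this thread.

```python
from datetime import datetime, timedelta


def attempt_allowed(failures, now, max_tries=2, period_hours=12):
    """Simplified model: allow a new attempt only if fewer than max_tries
    failed attempts fall inside the trailing period_hours window.

    Assumption: this is an illustration of the setting, not the real algorithm.
    """
    window_start = now - timedelta(hours=period_hours)
    recent = [t for t in failures if window_start < t <= now]
    return len(recent) < max_tries


# Failure times taken from the client log quoted earlier in the thread
failures = [
    datetime(2015, 3, 9, 8, 0, 27),
    datetime(2015, 3, 9, 8, 5, 37),
]

for when in (datetime(2015, 3, 9, 12, 0), datetime(2015, 3, 10, 8, 0)):
    print(when, attempt_allowed(failures, when))
```

In this model, two failures a few minutes apart use up both tries, so a window opening a few hours later is skipped for that instance while the others in the policy are unaffected. Once the failures age out of the period, the instance is picked up again without any intervention.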
03-12-2015 08:29 AM
Now that does sound like my issue, Marianne. Perhaps it needs some logic to sort out this notion of a day, as it doesn't apply to frequency-based backups that are supposed to run every few hours. My setting is the default. I'm pretty sure that's the cause. Awaiting confirmation from Symantec, but I reckon we are there. Jim