10-09-2019 03:44 PM
Does anyone know what the requirements are for the OS user for PostgreSQL backups and restores to the same client?
Client: RHEL 7.6, NetBackup 8.1.2
We're able to run backups but not restores, so we can't verify that the backups are complete.
Backups: the only confusing information is at the beginning of the backup:
nbpgsql.log:
Err: failed to enumerate data directory
Debugs: Fallback to Non-LVM approach
Then the backup runs, backs up files (?), and completes as successful.
Restores: client log:
nbpgsql.log:
ERR: Failed to get multiple objects - 26
ERR: Error occured while performing restore
exten_client:
....serverResponse: read comm file:<...Found (14,000) files in (1) image for Restore Job ID 2343423.xxx
readCommMessages: Entering readCommMessages
serverResponse: read comm file: <09:00:00 INF - Server status =42>
server response: ERR - server exited with status 42: network read failed
RestoreFileObjects: Err - serverResponse() failed
closeApi: entering closeApi.
closeApi: INF - EXIT STATUS 5: the restore failed to recover the requested files
VxBSAGetMultipleObjects: WRN - bsa_RestoreMultipleFileObjects was not able to get the objects. Status1
VxBSATeminate: INF - entering VXBSAEndTxn.
VxBSATeminate: INF - Transaction being ABORTED
VxBSAGetEnv: INF - entering GetEnv - NBBSA_LOG_DIRECTORY
VxBSAGetEnv: INF - returning -
VxBSAGetTxn: INF - Cleaning directory: </usr/openv/netbackup/logs/exten_client
delete_old_files: entering delete_old_files
bsa_CloseDebugFD: DBG - bCloseDebug flag is set.
Please note we were able to get it running on RHEL 7.3 (if that helps).
We get error 42 (network read failed) after the requested restore image has been read from the master server.
10-10-2019 12:25 AM - edited 10-10-2019 12:29 AM
AFAIK PostgreSQL backups are a relatively new thing, so maybe not many sites are using them yet. I'm sure there are people out there doing PostgreSQL backups and restores, but they're probably not checking this forum regularly for PostgreSQL questions to answer, so your best bet may well be to raise a proper support call. :( Sorry I can't personally help.
I would have expected the manual to list all required pre-configuration steps:
https://www.veritas.com/support/en_US/doc/129277259-129955710-0/index
10-18-2019 08:27 AM
Hi Peter,
There are some problems with PostgreSQL restore and it depends on how many files (tables) you have.
Found (14,000) files in (1) image for Restore Job ID 2343423.xxx
Usually, problems start when you need to back up or restore more than 8000 files.
Did you try to restore exactly the same backup on RHEL 7.3?
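To see which side of that rough 8000-file mark a database sits on, a minimal sketch like the one below could count the files under the PostgreSQL data directory. The 8000 figure is only the rule of thumb mentioned in this thread, not a documented limit, and the function names and the example path are hypothetical.

```python
import os

# Rule of thumb quoted in this thread; NOT a documented NetBackup limit.
FILE_COUNT_WARNING = 8000


def count_files(root):
    """Count non-directory entries under root, recursively."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def exceeds_threshold(root, threshold=FILE_COUNT_WARNING):
    """Return True when the tree under root holds more files than threshold."""
    return count_files(root) > threshold


# Example call (the path is an assumption, adjust to your data directory):
# print(count_files("/var/lib/pgsql/data"))
```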
1) Try to run:
nbpgsql -o query -C <client_name>
As a result, you'll get a table of all available backups. It might take several hours, depending on the file count. If you can get a list of backup IDs, you can continue. If it shows nothing and hangs, let me know and I'll show you the workaround.
2) Check that the PGSQL_TARGET_DIRECTORY specified in nbpgsql.conf exists. "Err: failed to enumerate data directory" might be a sign that it's not specified or doesn't exist.
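The check in step 2 could be scripted roughly as follows. This is only a sketch: it assumes nbpgsql.conf holds plain KEY=VALUE lines with optional # comments, which should be confirmed against the actual file, and the helper names are made up for illustration.

```python
import os


def read_conf_value(conf_text, key):
    """Return the value for key from KEY=VALUE text, or None if absent.

    Assumes a simple KEY=VALUE layout; the real nbpgsql.conf may differ.
    """
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def target_directory_status(conf_text):
    """Return 'not set', 'missing', or 'ok' for PGSQL_TARGET_DIRECTORY."""
    path = read_conf_value(conf_text, "PGSQL_TARGET_DIRECTORY")
    if not path:
        return "not set"
    return "ok" if os.path.isdir(path) else "missing"
```

Reading the file and passing its contents to `target_directory_status` separates the "not specified" case from the "does not exist" case.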
10-21-2019 06:19 PM
Hi Mike,
We have about 13,000 files.
1. Tried to restore the same backup to RHEL 7.3 - failed with the same error.
I'm able to get the backup ID by running a query.
2. PGSQL_TARGET_DIRECTORY - I understand that's required for restores only. We're getting that error (failed to enumerate data directory) during the backup. That's why I'm trying to confirm whether the backups completed or not.
Cheers
10-22-2019 07:13 AM
As @sdo said - The NBU agent for PostgreSQL is still very new - so, not a lot of experience in this forum.
If you were my customer and I had to assist, I would start from scratch and check the config and requirements as per the manual.
Looking at your error message, I am not convinced that the backup was successful:
Backups: only information which is confusing is at the beginning of the backup:
nbpgsql.log:
Err: failed to enumerate data directory
Debugs: Fallback to Non-LVM approach
Why are we seeing this message? Fallback to Non-LVM approach
Does this mean that the LVM snapshot was not successful?
Did you verify that there is sufficient space for the snapshot?
Any possibility that you could share the entire log?
I would personally not attempt a restore until we are sure that the backup is 100% successful.
10-23-2019 02:08 PM
10-27-2019 01:00 PM
Peter,
If you've already increased all timeouts and your Job Detail looks like:
14.10.2019 22:46:58 - begin Restore
14.10.2019 22:46:59 - Info bprd (pid=145675) Found (195,000) files in (1) images for Restore Job ID 20315.xxx
14.10.2019 22:47:03 - Info bprd (pid=145675) Estimated time to assemble file list: (0) days, (0) hours, (0) min, (4) sec
14.10.2019 22:48:23 - Info bprd (pid=145675) Searched (250,000) files of (195,000) files for Restore Job ID 20315.xxx
15.10.2019 5:28:44 - Info bprd (pid=145675) Searched (500,000) files of (195,000) files for Restore Job ID 20315.xxx
15.10.2019 5:34:16 - end Restore; elapsed time 6:47:18
network read failed (42)
then you've hit the problem that I was talking about. You can create the <NetBackup_install_path>/netbackup/logs/exten_client folder on the client and check that the client receives the whole file list and that the timeout occurs after that. If that's the case, then you have two options:
1) Switch from agent backup to old-school script backup
2) Review DB architecture and get rid of thousands of DB tables/files.
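To compare your own Job Detail against the pattern above, a small sketch like this could pull out the file count and the time between "begin Restore" and "end Restore". It assumes one entry per line and the day.month.year timestamp style shown in the sample; other locales will print dates differently, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Assumed entry layout: "<dd.mm.yyyy h:mm:ss> - <message>", as in the sample.
LINE_RE = re.compile(r"^(\d{1,2}\.\d{1,2}\.\d{4} \d{1,2}:\d{2}:\d{2}) - (.*)$")
FOUND_RE = re.compile(r"Found \(([\d,]+)\) files")


def summarize_job_detail(text):
    """Return the reported file count and begin-to-end elapsed time."""
    begin = end = found = None
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. the trailing status line has no timestamp
        stamp = datetime.strptime(match.group(1), "%d.%m.%Y %H:%M:%S")
        message = match.group(2)
        if message.startswith("begin Restore"):
            begin = stamp
        elif message.startswith("end Restore"):
            end = stamp
        hit = FOUND_RE.search(message)
        if hit:
            found = int(hit.group(1).replace(",", ""))
    elapsed = end - begin if begin and end else None
    return {"files_found": found, "elapsed": elapsed}
```

A large file count together with hours of elapsed time before status 42 would match the behavior described here.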
10-30-2019 06:58 PM
Thanks Mike. Yes, the timeout happens after the whole list of files to be restored has been read by the master server.
10-30-2019 11:53 PM
Veritas was going to publish a TechNote that describes this behavior, but it's still in progress.
11-05-2019 02:06 PM
Thanks Mike