PostgreSQL restore

Peter351
Level 3

Does anyone know what the OS user requirements are for PostgreSQL backups/restores to the same client?

Client: RHEL 7.6, NetBackup 8.1.2

We're able to run backups but not restores, so we can't verify that the backups are complete.

Backups: the only confusing information appears at the beginning of the backup:

     nbpgsql.log:

     Err: failed to enumerate data directory

     Debugs: Fallback to Non-LVM approach

Then the backup runs, backs up files (?), and completes as successful.

 

Restores: client log:

nbpgsql.log:

ERR: Failed to get multiple objects - 26

ERR: Error occurred while performing restore

exten_client:

....serverResponse: read comm file:<...Found (14,000) files in (1) image for Restore Job ID 2343423.xxx

readCommMessages: Entering readCommMessages

serverResponse: read comm file: <09:00:00 INF - Server status =42>

server response: ERR - server exited with status 42: network read failed

RestoreFileObjects: Err - serverResponse() failed

closeApi: entering closeApi.

closeApi: INF - EXIT STATUS 5: the restore failed to recover the requested files

VxBSAGetMultipleObjects: WRN - bsa_RestoreMultipleFileObjects was not able to get the objects. Status1

VxBSATeminate: INF - entering VXBSAEndTxn.

VxBSATeminate: INF - Transaction being ABORTED

VxBSAGetEnv:    INF - entering GetEnv - NBBSA_LOG_DIRECTORY

VxBSAGetEnv:    INF - returning -

VxBSAGetTxn:   INF - Cleaning directory: </usr/openv/netbackup/logs/exten_client

delete_old_files: entering delete_old_files

bsa_CloseDebugFD: DBG - bCloseDebug flag is set.

 

Please note we were able to get it running on RHEL 7.3 (if that helps).

 

We get error 42 (network read failed) after the requested restore image has been read from the master server.
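When a log like the one above runs to thousands of lines, it can help to pull out only the lines that carry a status code. The following is a minimal sketch, not a NetBackup tool: the function name `find_status_lines` and the regular expression are assumptions based only on the line shapes quoted in this post ("Server status =42", "exited with status 42", "EXIT STATUS 5").

```python
import re

# Assumed pattern: the word "status", an optional "=", then a number.
# It is derived from the log excerpts in this thread, not from any official format.
STATUS_RE = re.compile(r"status\s*=?\s*(\d+)", re.IGNORECASE)


def find_status_lines(lines):
    """Return (line_number, status_code, text) for each line mentioning a status code."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = STATUS_RE.search(line)
        if match:
            hits.append((number, int(match.group(1)), line.strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "readCommMessages: Entering readCommMessages",
        "serverResponse: read comm file: <09:00:00 INF - Server status =42>",
        "closeApi: INF - EXIT STATUS 5: the restore failed to recover the requested files",
    ]
    for number, code, text in find_status_lines(sample):
        print(f"line {number}: status {code}: {text}")
```

To run it against a real log, read the file with `open(path).readlines()` and pass the result to `find_status_lines`.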

 

 


sdo
Moderator
Partner    VIP    Certified

AFAIK postgres backups are a relatively newish thing... so maybe not many sites are using it yet, and I'm sure there are people in the world doing postgres backups and restores, but the thing is... they're probably not checking in to this forum every day/week/month/year to search for postgres questions to answer... and so... your best bet may well be to raise a proper support call.  :(  Sorry I can't personally help.

I would have expected the manual to list all required pre-configuration steps:

https://www.veritas.com/support/en_US/doc/129277259-129955710-0/index

Mike_Gavrilov
Moderator
Partner    VIP    Accredited Certified

Hi Peter, 

There are some problems with PostgreSQL restores, and they depend on how many files (tables) you have.

 

Found (14,000) files in (1) image for Restore Job ID 2343423.xxx

Usually, problems start when you need to back up or restore more than 8000 files.
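A quick way to see how a database compares with that figure is to count the files under its data directory. This is only a rough sketch: `count_files` is a hypothetical helper, the directory you pass in is your own, and the number of files on disk will not necessarily match the file count NetBackup reports for the image.

```python
import os


def count_files(root):
    """Count regular directory entries reported as files anywhere under root."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    import sys

    # Usage: python count_files.py /path/to/postgres/data
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{count_files(root)} files under {root}")
```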

Did you try to restore exactly the same backup on the RHEL 7.3?

1) Try to run:

nbpgsql -o query -C <client_name>

As a result, you'll get a table of all available backups. It might take several hours, depending on the file count. If you can get a list of backup IDs, you can continue. If it shows nothing and hangs, let me know and I'll show you the workaround.

2) Check that the PGSQL_TARGET_DIRECTORY specified in nbpgsql.conf exists. "Err: failed to enumerate data directory" might be a sign that it's not specified or does not exist.
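The check in step 2 can be sketched in a few lines of Python. This assumes that nbpgsql.conf holds plain `KEY=value` lines with `#` comments, which this thread does not confirm; the helper names `read_conf_value` and `check_target_directory` are made up for illustration, and you need to supply the path to your own nbpgsql.conf.

```python
from pathlib import Path


def read_conf_value(conf_text, key):
    """Return the value for key from KEY=value style text, or None if it is absent."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (assumed comment syntax)
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip().strip('"')
    return None


def check_target_directory(conf_text):
    """Report whether PGSQL_TARGET_DIRECTORY is set and points at an existing directory."""
    target = read_conf_value(conf_text, "PGSQL_TARGET_DIRECTORY")
    if not target:
        return "PGSQL_TARGET_DIRECTORY is not set"
    if not Path(target).is_dir():
        return f"PGSQL_TARGET_DIRECTORY does not exist: {target}"
    return f"PGSQL_TARGET_DIRECTORY looks usable: {target}"


if __name__ == "__main__":
    import sys

    # Usage: python check_conf.py /path/to/nbpgsql.conf
    print(check_target_directory(Path(sys.argv[1]).read_text()))
```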

 

 

 

 

Hi Mike,

We have about 13,000 files.

1. We tried to restore the same backup to RHEL 7.3 and it failed with the same error.

I'm able to get the backup ID by running a query.

2. PGSQL_TARGET_DIRECTORY: I understand that's required for restores only. We're getting that error ("failed to enumerate data directory") during the backup. That's why I'm trying to confirm whether the backups completed or not.

Cheers

 

Marianne
Moderator
Partner    VIP    Accredited Certified

@Peter351 

As @sdo said - The NBU agent for PostgreSQL is still very new - so, not a lot of experience in this forum.

If you were my customer and I had to assist, I would start from scratch and check the config and requirements as per the manual. 

Looking at your error message, I am not convinced that the backup was successful:

Backups: only information which is confusing is at the beginning of the backup:

     nbpgsql.log:

     Err: failed to enumerate data directory

     Debugs: Fallback to Non-LVM approach

Why are we seeing this message? Fallback to Non-LVM approach

Does this mean that the LVM snapshot was not successful? 
Did you verify that there is sufficient space for the snapshot? 

Any possibility that you could share the entire log? 

I would personally not attempt a restore until we are sure that the backup is 100% successful. 

Hi Marianne,

"The NBU agent for PostgreSQL is still very new - so, not a lot of experience in this forum."
Understood. We have already had a support case open for about 3-4 weeks with no luck, which is why I'm trying the online forum.

"Looking at your error message, I am not convinced that the backup was successful:"
Yes, that's what I'm trying to verify. To add a bit more confusion: we ran a test backup/restore on another machine with a tiny dummy DB. The backup completed both with and without that "Fallback to Non-LVM approach" debug message, and the restores completed in both cases.

"Does this mean that the LVM snapshot was not successful?"
The LVM snapshot was indeed created.

"Did you verify that there is sufficient space for the snapshot?"
Yes, there was. One thing, however: the data and the logs are on separate LVM volumes. The admin guide does not say whether they must be on the same LV.

"Any possibility that you could share the entire log?"
Sorry, I can't provide that in full.

Mike_Gavrilov
Moderator
Partner    VIP    Accredited Certified

Peter,

If you've already increased all timeouts and your Job Detail looks like:

 

14.10.2019 22:46:58 - begin Restore
14.10.2019 22:46:59 - Info bprd (pid=145675) Found (195,000) files in (1) images for Restore Job ID 20315.xxx
14.10.2019 22:47:03 - Info bprd (pid=145675) Estimated time to assemble file list: (0) days, (0) hours, (0) min, (4) sec
14.10.2019 22:48:23 - Info bprd (pid=145675) Searched (250,000) files of (195,000) files for Restore Job ID 20315.xxx
15.10.2019 5:28:44 - Info bprd (pid=145675) Searched (500,000) files of (195,000) files for Restore Job ID 20315.xxx
15.10.2019 5:34:16 - end Restore; elapsed time 6:47:18
network read failed  (42)

then you've run into the problem I was talking about. You can create the <NetBackup_install_path>/netbackup/logs/exten_client folder on the client and check whether the client receives the whole file list and the timeout occurs only after that. If so, you have two options:

1) Switch from agent backup to old-school script backup

2) Review DB architecture and get rid of thousands of DB tables/files. 
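To see where the time goes in a Job Detail like the one above, you can parse the timestamps and look for the longest pause between consecutive entries. This is a minimal sketch that assumes the `DD.MM.YYYY H:MM:SS - message` layout shown in the excerpt; `parse_job_detail` and `longest_gap` are hypothetical helper names, and other locales will print dates differently.

```python
from datetime import datetime


def parse_job_detail(lines):
    """Return (timestamp, message) pairs for lines shaped like 'DD.MM.YYYY H:MM:SS - message'."""
    events = []
    for line in lines:
        stamp, sep, message = line.partition(" - ")
        if not sep:
            continue  # no timestamp separator, e.g. the final status line
        try:
            when = datetime.strptime(stamp.strip(), "%d.%m.%Y %H:%M:%S")
        except ValueError:
            continue  # the text before " - " was not a timestamp
        events.append((when, message.strip()))
    return events


def longest_gap(events):
    """Return (gap, message_before, message_after) for the largest pause, or None."""
    best = None
    for (t0, m0), (t1, m1) in zip(events, events[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


if __name__ == "__main__":
    import sys

    # Usage: python job_gap.py job_detail.txt
    with open(sys.argv[1]) as handle:
        result = longest_gap(parse_job_detail(handle))
    if result:
        gap, before, after = result
        print(f"Longest pause: {gap}\n  after:  {before}\n  before: {after}")
```

For the excerpt above, the longest pause is 6:40:21, between the two "Searched" lines, which matches the point where the restore stalls before ending with status 42.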

 

 

 

 

Thanks Mike. Yes, the timeout happens after the full list of files to be restored has been read by the master server.

Mike_Gavrilov
Moderator
Partner    VIP    Accredited Certified

Veritas was going to publish a TechNote that describes this behavior, but it's still in progress.

Thanks Mike