V-Ray backups are impacting SAP jobs

Hi friends,

 

Has anyone experienced network connectivity issues during V-Ray backups? We have found that SAP jobs fail during the snapshot quiesce window. That is, whenever a V-Ray backup is triggered, the SAP application jobs fail.

 

Regards,

Rajesh

Accepted Solution!

Let's rule out NetBackup...

 

Take a snapshot of the VMs in question from the vSphere Client.

Wait 10-15 minutes so that some data changes in the VM and gets written to the delta disks.

Then consolidate the snapshot with Delete All.

Monitor SAP on the server during all of this.

When do the jobs fail? Is it during the snapshot removal? If so, the VM could be getting stunned.

Also check the Event Viewer on the Windows box.

If the failure also happens with a snapshot taken from the vSphere Client, it is not a NetBackup issue.
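
To pin down exactly when the VM stops responding during this test, a small probe can be run from another machine while the snapshot is created and removed. The following is only a rough sketch: the host name and port in the usage comment are placeholders and not taken from this thread, so substitute whatever port your SAP instance actually listens on.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outages(samples):
    """Collapse (timestamp, ok) samples into (first_failure, last_failure) windows."""
    windows = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last = ts
        elif start is not None:
            windows.append((start, last))
            start = last = None
    if start is not None:
        # The run ended while the host was still unreachable.
        windows.append((start, last))
    return windows


def monitor(host, port, duration_s, interval_s=1.0):
    """Probe host:port every interval_s seconds for duration_s seconds."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append((datetime.now(), probe(host, port)))
        time.sleep(interval_s)
    return outages(samples)


# Example usage (placeholder host and port, assumed for illustration only):
#   for start, end in monitor("sap-app01.example.com", 3200, duration_s=1800):
#       print(f"unreachable from {start} to {end}")
```

Compare the reported windows with the times at which you created and deleted the snapshot, and with the times of the failed SAP jobs.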

 

Interesting Note in VMware KB: A snapshot removal can stop a virtual machine for long time (1002836)

Note: Beginning in ESXi 5.0, the snapshot stun times are logged. Each virtual machine's log file (vmware.log) will contain messages similar to:

2013-03-23T17:40:02.544Z| vcpu-0| Checkpoint_Unstun: vm stopped for 403475568 us

In this example, the virtual machine was stunned for 403475568 microseconds (1 second = 1 million microseconds).
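
To scan a whole vmware.log for these entries, something like the sketch below could be used. It assumes only the message format quoted above; the log path in the usage comment is a placeholder. For the example line it reports roughly 403 seconds, i.e. a stun of well over six minutes.

```python
import re

# Matches the stun message quoted from the KB article.
STUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def parse_stuns(lines):
    """Yield (timestamp, stun_seconds) for each Checkpoint_Unstun line."""
    for line in lines:
        match = STUN_RE.search(line)
        if match:
            # The timestamp is the text before the first '|' separator.
            timestamp = line.split("|", 1)[0].strip()
            yield timestamp, int(match.group(1)) / 1_000_000


sample = [
    "2013-03-23T17:40:02.544Z| vcpu-0| Checkpoint_Unstun: vm stopped for 403475568 us",
]
stuns = list(parse_stuns(sample))

# Example usage against a real log (placeholder path):
#   with open("vmware.log", errors="replace") as fh:
#       for ts, seconds in parse_stuns(fh):
#           print(ts, f"{seconds:.1f} s")
```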

-----------------

Some other considerations

1. Make sure the VM does not have any other snapshots (including hidden ones).
2. Increase CPU reservations in the VM settings.
3. Move snapshot location to a different datastore (via workingDir parameter), preferably backed by faster storage (for example, SSD disk).

 

It is strange, since this is an app server and not the DB, so I would not expect high I/O.


4 Replies

You should not use VMware snapshot backups with applications that are not VSS-aware, such as SAP using Oracle as its database.

Because SAP is not VSS-aware, it will fail when disk reads and writes are temporarily suspended for snapshot creation. You may be able to get around the issue by putting the database tablespaces in backup mode before running the VMware backup.

See: https://www-secure.symantec.com/connect/blogs/nuts-and-bolts-netbackup-vmware-virtual-machine-snapshots-backing-business-critical-applicatio

Since Exchange, SharePoint and MS SQL are fully VSS-aware, it is possible to take consistent backups of those applications with VMware snapshot backups.

See page 18 in the VMware admin guide: http://www.symantec.com/docs/DOC6461

Hi Nicolai,

 

Thanks for the update. These VMs are not the database servers; they are only the frontend application servers.

Do you mean that even on the frontend application servers, transactions and jobs will disconnect during quiescing?

 

Regards,

Rajesh


No, I only had database servers in mind.

If the SAP application server stops working when a VMware snapshot is taken, that does sound strange to me. But it does indicate that SAP has a problem when the disks are quiesced.

Do you have any error codes to show?

 
