
Client-side vs server-side deduplication?

Peter_Muir
Level 2

Hi,

Reading the BE 2014 manual, it states that server-side deduplication reduces the size of the backup, which reduces storage, while client-side deduplication reduces network traffic and therefore shortens the backup window.

Can I deduce from this that if I performed client-side dedup, the data written to storage would be larger than if I used server-side?

1 ACCEPTED SOLUTION

Colin_Weaver
Moderator
Employee Accredited Certified

Well, the deduplication algorithm is the same no matter which side runs it, and information about chunks already in the deduplication store is still provided to the client during the process: there is a local cache of chunk details, which gets augmented by requests back to the server when a chunk does not show up in the local cache.

As such the storage use should be the same either way.

Also, the only way to really find out how much the different choices affect you is to try the different configurations and see how much bandwidth is used, how much CPU and RAM is used on the various servers, and so on.
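To make the mechanism concrete, here is a rough Python sketch of that chunk/fingerprint/local-cache flow. This is purely illustrative, not Backup Exec's actual implementation: the chunk size, the hash, and the class names are all made up. The point it demonstrates is that the store ends up holding the same unique chunks regardless of which side does the lookup.

```python
# Illustrative sketch only -- NOT Backup Exec's actual implementation.
# Both client-side and server-side dedup chunk the data and look up
# each chunk's fingerprint, first in a local cache, then (on a miss)
# against the deduplication store, so the storage used is the same.
import hashlib

class DedupStore:
    """Stands in for the server-side deduplication store."""
    def __init__(self):
        self.chunks = {}          # fingerprint -> chunk bytes

    def has(self, fp):
        return fp in self.chunks

    def put(self, fp, chunk):
        self.chunks[fp] = chunk

def backup(data, store, chunk_size=4096):
    """Deduplicating backup of `data`; returns bytes actually sent."""
    local_cache = set()           # fingerprints this client has seen
    sent = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        if fp in local_cache:
            continue              # local cache hit: nothing to send
        if not store.has(fp):     # cache miss: ask the server
            store.put(fp, chunk)  # genuinely new chunk, must be sent
            sent += len(chunk)
        local_cache.add(fp)       # augment the local cache either way
    return sent

store = DedupStore()
payload = b"A" * 8192 + b"B" * 4096    # two unique 4 KB chunks
first = backup(payload, store)          # sends only the unique chunks
second = backup(payload, store)         # sends nothing: all chunks known
```

Running the same payload twice, the second pass transfers zero bytes, and the store holds only the two unique chunks no matter how many times you back up.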

 

Basically, the key reasons for choosing client-side vs server-side usually relate to how much bandwidth is used (which might affect other applications sharing the same network link) and the overall power (RAM and CPU availability) of the servers doing the work, with the associated impact on other applications on those servers. For this decision, the storage used (and, as it happens, the overall backup window) are usually not key factors.

So:

- Fast servers (loads of CPU and RAM) but a network link carrying key production applications (so bandwidth worries): you would probably go with client-side to minimize the bandwidth used.

- Fast servers, but one of the remote servers runs a key application that also uses lots of CPU and RAM: you would probably consider server-side for that one busy server and client-side for the others (depending on bandwidth and network performance).

- Fast servers with a slow link: you would also go with client-side - although note that an unreliable link, or one that is too slow, may still have problems that neither configuration will solve.

- Backup server is fast but the other servers are slow (or 32-bit): you would go with server-side (as long as the network link is stable/fast and has no bandwidth concerns).

The difficult ones to decide are:

- Backup server is fast but all other servers are slow and network link is slow or you have bandwidth concerns

- All servers are slow, network link speed/bandwidth is also a concern.

Really, these last two come down to: upgrade your hardware, your network, or both, and then revisit the decision.
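Just to summarise those rules of thumb, here is a toy helper that encodes the cases above. It is my own simplification, not anything in the product; treat the boolean inputs as loose judgements about whether the remote server has spare CPU/RAM and how healthy the link is, and the real answer still comes from testing in your own environment.

```python
# Rough rule-of-thumb helper encoding the cases above -- a sketch only;
# the real decision needs testing in your own environment.
def dedup_side(remote_server_fast, link_fast, bandwidth_contended):
    """Suggest client-side or server-side dedup for one remote server."""
    if not remote_server_fast and not link_fast:
        # the two "difficult" cases: weak client AND weak link
        return "upgrade hardware/network first, then revisit"
    if not remote_server_fast:
        return "server-side"      # slow (or 32-bit) client, decent link
    if bandwidth_contended or not link_fast:
        return "client-side"      # protect the shared or slow link
    return "either (test both)"   # fast server, fast uncontended link
```

For example, a powerful remote server on a contended production link comes out as client-side, while a slow 32-bit server on a good link comes out as server-side, matching the bullets above.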


4 REPLIES

CraigV
Moderator
Partner    VIP    Accredited

This TN might explain it better:

https://www.veritas.com/support/en_US/article.000059030

Thanks!

CraigV
Moderator
Partner    VIP    Accredited

Bear in mind your dedupe ratios won't be as big using client-side, as it works on a per-server basis and makes no use of a centralised check of what has already been backed up.

Thanks!


Peter_Muir
Level 2

Thanks Colin and Craig for your responses.

Craig - I had read that TN before, but it didn't really answer my question, hence the posting.

Colin - We run a minimum of 1Gbps through our network. I have noticed that the BE server is getting flooded some days at 99% network utilization, while RAM and CPU never go above 50%. The BE server is fairly powerful (32GB RAM and 8 cores), so I think the solution is to team the two NICs in the server (non-fault-tolerant mode) so the BE server has 2Gbps of aggregate bandwidth, split as 1Gbps from each source server.
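As a quick sanity check on the teaming idea, the arithmetic works out like this (the dataset size below is a made-up example figure, not from this thread). One caveat worth knowing: most NIC-teaming modes balance traffic per flow, so a single backup stream may still be limited to one NIC's 1Gbps; the aggregate 2Gbps only helps when several source servers are sending at once.

```python
# Back-of-envelope transfer-time check for the NIC-teaming idea.
# dataset_gb is a hypothetical example figure; link speeds are Peter's.
dataset_gb = 500                      # hypothetical nightly backup size
link_gbps = 1.0                       # single 1 Gbps NIC
teamed_gbps = 2.0                     # two teamed NICs (ideal aggregate)

def hours(gb, gbps, efficiency=0.9):
    """Transfer time in hours, assuming ~90% of line rate is usable."""
    return (gb * 8) / (gbps * efficiency * 3600)

single = hours(dataset_gb, link_gbps)    # time on one 1 Gbps NIC
teamed = hours(dataset_gb, teamed_gbps)  # time on the 2 Gbps team
```

With these assumed numbers, doubling the aggregate link halves the network transfer time (roughly 1.2 hours down to 0.6), provided the load is actually spread across both NICs.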