Deduplication incremental backup is taking more than 16 hours

Dennis_S_
Level 4

I'm backing up 1.4TB of data from our project drives to a Synology server over iSCSI, and the job is still running.  How should I configure this software so it takes less time?  It started at 11:00PM yesterday and it is now 4:42PM today.  When it completes, there is usually only about 100 gigabytes of changed data stored in the dedupe folder, judging by the file dates on the iSCSI target, but the job acts as if it is backing up the whole data set on every incremental, given how long it takes.  I expect this run to take roughly 20 hours total, which is not what we want.
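Rough math, assuming the job reads the source at roughly 30MB/s (an assumption for illustration only; the figures below are approximate):

    # Back-of-the-envelope estimate: at ~30MB/s, reading all 1.4TB takes most
    # of a day, while copying only the ~100GB of changed data would take under
    # an hour. All figures are approximate assumptions.
    full_dataset_gb = 1400     # ~1.4TB of project data
    changed_data_gb = 100      # ~100GB actually changes between incrementals
    throughput_mb_s = 30       # ~1,800MB/min effective read rate (assumed)

    def hours_to_transfer(size_gb, rate_mb_s):
        return size_gb * 1024 / rate_mb_s / 3600

    print(hours_to_transfer(full_dataset_gb, throughput_mb_s))   # ~13.3 hours
    print(hours_to_transfer(changed_data_gb, throughput_mb_s))   # ~0.9 hours

So the runtime we're seeing is roughly what a full read of the data set would take, which is why it feels like a full backup on every incremental.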

37 REPLIES

teiva-boy
Level 6

Client-side or media server dedupe?  

Any particularly high CPU or memory consumption during the backup?

Have you made sure to exclude the beremote.exe process, the entire installation directories on both the client and the server, and the B2D/dedupe folders in your AV software?

Dennis_S_
Level 4

Media server dedupe.

I haven't checked the resources being consumed during backup.

I'm not sure what excluding the beremote.exe process means, and the installation directories on clients and servers seem irrelevant because I'm only backing up CAD data, PDFs, and the like.  I'm not backing up Windows workstations or servers.  Right now I don't have SEP installed on the Backup Exec system.

Dennis_S_
Level 4

I just started a backup-to-disk job and I'm seeing 1.43GB out of 2GB of memory in use and about 68%-90% CPU load.  My coworker has me installing and running Backup Exec on a Microsoft virtual machine with Windows Server 2008 R2, and I guess it was only given a single core, which is why only one CPU core shows in Task Manager.

Dennis_S_
Level 4

OK, I read the best practices documentation and found that my IT team had me downsize the server to the point that it falls outside what the best practices guide covers.

 

So I should be running 8GB of RAM and a quad-core CPU, which is not what I have right now.  I am going to try to reconfigure accordingly, and I also just found that there was one license that had not been entered on the media server yet.

teiva-boy
Level 6

Because it is best practice to do so with AV software.  You exclude not only the BE installation directories, but also the running processes (e.g. beremote.exe) and the backup targets (e.g. B2D folders, dedupe store, etc.).

AV software scanning the running processes and every file written to disk hurts throughput and overall performance, which is why you exclude them, on both clients and servers.

teiva-boy
Level 6

Dedupe is highly dependent on CPU power and RAM, so that should make a huge difference.  It's also not good practice to virtualize a backup server, due to the high I/O nature of backup tasks.  Trust me, I've been doing backup design for over a decade; it's best when it's physical!

robnicholson
Level 6

Whilst I know where you are coming from here, it depends upon the capacity of the media server. That said, a multi-core system doesn't really help BE, does it, since it's a single-threaded design?

Cheers, Rob.

robnicholson
Level 6

The Backup Exec dedupe engine (PureDisk, using PostgreSQL) is not a very efficient system IMO (sorry guys). Quite why you need 8GB of RAM to basically calculate a hash of a 64KB block, look that hash up in a database, and write the block if it doesn't already exist is perplexing.
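Conceptually, the per-block work is about this much (a rough sketch of the general technique only, not Backup Exec's actual code; the names are made up):

    # Minimal sketch of a hash-based block dedupe loop: fingerprint each 64KB
    # block, look the fingerprint up, and store the block only if it is new.
    # This illustrates the general idea, not Backup Exec's implementation.
    import hashlib

    BLOCK_SIZE = 64 * 1024      # 64KB blocks, as described above
    seen_hashes = set()         # stands in for the hash index in PostgreSQL
    dedupe_store = []           # stands in for the dedupe folder on disk

    def dedupe_file(path):
        with open(path, "rb") as f:
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                if digest not in seen_hashes:   # new block: remember and store it
                    seen_hashes.add(digest)
                    dedupe_store.append(block)

My guess is the RAM goes on keeping enough of that hash index in memory that the lookup doesn't hit disk for every single block.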

I too dabbled with putting the dedupe database on iSCSI, mainly because we've got terabytes of SATA-3 storage there, so we can scale up as needed.

But I've put that on hold, as performance was pants. I'm not sure of the cause, but dedupe on USB 3 (140MB/s) isn't exactly a lot to write home about either.

The first dedupe backup is pretty slow but it does speed up after that.

It could be that the performance of the PostgreSQL database over iSCSI isn't too good. You can split the database from the data, but that's not officially supported.

So we're probably going to stick a 12TB SATA JBOD disk enclosure on the media server.

Cheers, Rob.

teiva-boy
Level 6

The thing about iSCSI is that it needs multi-pathing in order to be fast, which means multiple concurrent jobs.  iSCSI can be fast when done right; in most cases it's not, no matter what the storage admin thinks.  Who wants their baby called ugly?

Riyaj_S
Moderator
Employee Accredited Certified

You may install the Microsoft patches mentioned in the following Technote :

http://www.symantec.com/docs/TECH76051

Hope this helps.

robnicholson
Level 6

I've used multipath to allow multiple servers to connect to the same iSCSI target (in the XenServer virtualisation world with XenApp pools) but are you saying that you can use multi-path on the same iSCSI initiator to allow multiple jobs to run faster?

teiva-boy
Level 6

If you have one Gb connection and one job with 10 servers in it, it will open only one iSCSI connection.

If you have two Gb iSCSI connections, set up MPIO with round-robin, and create one backup job with 10 servers in it, it will still create only one iSCSI connection, using only one link.

Take that second setup but split it into 10 individual jobs, and now iSCSI MPIO will work as advertised, with multiple links improving throughput and performance.

The dirty secret is that BE opens only one connection per backup job/policy, so you must create more, smaller jobs to take advantage of things like link load balancing, MPIO, and so on...
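Rough illustration of why the job count matters (assuming ~110MB/s of usable bandwidth per Gb link; the figures and function are only illustrative):

    # With round-robin MPIO each iSCSI connection rides one link at a time, so
    # aggregate throughput scales with the number of concurrent jobs, capped by
    # the number of links. ~110MB/s usable per Gb link is an assumed figure.
    LINK_SPEED_MB_S = 110

    def aggregate_throughput(num_jobs, num_links):
        active_links = min(num_jobs, num_links)   # one connection per job
        return active_links * LINK_SPEED_MB_S

    print(aggregate_throughput(1, 2))    # 110 MB/s: one big job uses one link
    print(aggregate_throughput(10, 2))   # 220 MB/s: ten smaller jobs fill both links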

Dennis_S_
Level 4

Comment removed at the request of community member

teiva-boy
Level 6

Dennis, you can virtualize the media server just fine, and your data size is not large by any means...  But by the time you get into it, you end up granting almost as many resources as a physical machine would have...

If you think about it, a backup server is meant to move data from A to B.  That means you need lots of bandwidth, lots of CPU, and RAM that scales with the environment: x MB of RAM per host, x KB of RAM per GB of data, and x KB of RAM per 1M files.  It all adds up...  Did you know that folks who move to 10GbE links typically don't have enough CPU to move that kind of data?  A 10GbE link at 70% utilization can have the CPU at or near that number just processing interrupts and managing the PCI bus to get data in and out of the server.
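Purely as an illustration of how it adds up (every coefficient below is a made-up placeholder, not a published sizing figure; plug in the numbers from your vendor's sizing guide):

    # Hypothetical sizing helper: RAM = baseline + per-host + per-GB + per-1M-files.
    # All coefficients are placeholders for illustration only.
    def media_server_ram_gb(hosts, data_gb, millions_of_files,
                            base_gb=4.0,                # OS + application baseline (placeholder)
                            mb_per_host=64,             # placeholder
                            kb_per_gb=16,               # placeholder
                            kb_per_million_files=512):  # placeholder
        ram_mb = base_gb * 1024
        ram_mb += hosts * mb_per_host
        ram_mb += data_gb * kb_per_gb / 1024
        ram_mb += millions_of_files * kb_per_million_files / 1024
        return ram_mb / 1024

    print(round(media_server_ram_gb(hosts=10, data_gb=1400, millions_of_files=2), 1))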

An IT manager often has no clue, and "backup" gets moved to the lowest priority because it's not seen as a "mission critical" application.  Well, tell that to them when their Exchange server dies and they can't recover in a reliable or fast enough manner!

Is Symantec somewhat to blame? Yes. Many of their Backup Exec SEs will just quote the minimum requirements but don't have a clue how to size properly beyond that, and they aren't equipped to recommend specific server models either. They are a software company, that's why, so there isn't much need for them to get into the hardware business.

Atur_S_
Level 3

I work with Dennis, and as you can tell we're new to this software and are trying to get a grip on what configuration is right for us.  We are hoping we can virtualize this.  We're going to try it out, and in the end we'll probably end up placing it on a physical server, but going through this process will be a learning experience for us, and we're okay with taking the time to make mistakes and learn.

Teiva, I think you're right, but we've allocated enough resources for the guest machine, even in our virtual environment.  I think the bottleneck is not our available resources on the server (CPU/RAM) but our network.  Additionally, I don't think we have a good handle on how the data is actually processed during deduplication from point A to B through Backup Exec.  To give a clearer picture: we have a NetApp filer handling the CIFS storage (no Windows filer), we have an iSCSI target on the NetApp that is initiated by a guest machine (2008 R2), and the .VHD files are stored on the NetApp (CSV).

Since we do not have an agent on a Windows filer, could this cause the backup times to be much slower when pulling directly from the NetApp?  I am not sure how the Agent for Windows Systems functions or helps when installed in the environment.  The documentation for the agent says it "Optimizes data transfers for 32-bit and 64-bit remote Windows servers for faster backup".
http://eval.symantec.com/mktginfo/enterprise/fact_sheets/b-be2010_agents_and_options_DS_20983645.en-...

Here's the bottom-line problem we are having: when we do an incremental backup, it processes all the files, and it takes a long time because it processes each file one at a time at about 1,800MB/min.  I'm not sure what is going on with that, and I'm wondering whether it's because we don't have the Agent for Windows Systems installed on the filer.

All thoughts are welcomed and appreciated.

Thanks.

 

teiva-boy
Level 6

If you are backing up your CIFS shares off your NetApp, that's your problem. 1,800MB/min is about par for the course in a CIFS environment, i.e. 30MB/s.

CIFSv2 is better and should be about 20%-40% faster, but implementing and supporting it may require an ONTAP upgrade or other changes that are well outside the scope of this forum.

NDMP will be faster, but you cannot deduplicate it with Backup Exec.  In fact, NDMP is often the preferred choice for NetApp filers.  Just keep in mind its other disadvantages, e.g. it's proprietary.

If you back up over NDMP, it should be faster than your current implementation.

Atur_S_
Level 3

Teiva, thanks for your response, and thanks for bringing up CIFSv2; we'll look into what we're running.  We're on ONTAP 8.  We want deduplication so we can keep an onsite 60-90 day backup store.  We are writing it to a Synology NAS as cheap storage, and because of that we are trying to save on capacity with deduplication.

teiva-boy
Level 6

You may be better off with a small DataDomain device, which I know can deduplicate NetApp NDMP.

1.4TB of data would only need a DD610, or a slightly larger DD630 if you can foot that...  It would perform 1000% better than a Synology array: those models can ingest at some 200MB/s-300MB/s respectively, whereas you may only see around 50MB/s on the Synology for large sequential writes.

Dennis_S_
Level 4

I turned off some of our nightly backups but left one on (which only takes 2 hours to complete) on our older backup server.

The dedupe incremental for some reason only took 12 hours this time rather than 18.  I still find that 12 hours is too long.

Please let me know if there is something I missed in the configuration.  I have it set to incremental backup by time/date; if I change that to anything else, it only goes about 800MB/min rather than the 1,800MB/min we get with time/date.

We want a system that quickly determines which files need to be backed up and then copies only those files into the dedupe structure on the iSCSI target.

Our current backup handles this amount of data on our project drives in 2 hours (we tested the throughput on that server and found it only runs at 30 megabytes per second, yet it still completes the entire task in 2 hours, as I said).  12 hours is 10 hours beyond what we are normally accustomed to.  We would prefer a maximum of 6 hours, and ideally 2 hours, which would give our company a lot of flexibility in our backups.
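Rough math on those numbers (approximate figures):

    # At ~30MB/s, a 2-hour window moves roughly 200GB - about the size of our
    # changed data - while re-reading the full 1.4TB at the same rate would
    # take ~13 hours, which is close to what the dedupe incremental is doing.
    rate_mb_s = 30
    print(rate_mb_s * 3600 * 2 / 1024)       # ~211 GB moved in 2 hours
    print(1400 * 1024 / rate_mb_s / 3600)    # ~13.3 hours to re-read 1.4TB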

 

Thanks