
11d and Lotus Domino

Andreas_N_sbe1
Level 3
Eventlog:
Event Type: Error
Event Source: Backup Exec
Event Category: None
Event ID: 65215
Date: 28.11.2006
Time: 23:26:59
User: N/A
Computer: IVORY
Description:
The connection to the Lotus Domino server has been lost. The following error was reported: 8,26. The job that was running on this server has been stopped.


Job History
Backup- IVORY|D: V-79-57344-65072 - The database server is not responding. Backup set canceled

V-79-57344-65072 - The database server is not responding. Backup set canceled.
WARNING: "IVORY|D:\Lotus\Dominodata\mail\mmagdzio.nsf" is a corrupt file.
This file cannot verify.


But why can it still back up 1074 Domino files?
Set Information - IVORY|D:

Backup completed on 28.11.2006 at 23:33:58.

Backup Set Summary
Backed up 1074 files in 15 directories.
1 corrupt file was backed up
Processed 45'491'408'638 bytes in 19 minutes and 0 seconds.
Throughput rate: 2283 MB/min
31 REPLIES

Russ_Perry
Level 6
Employee
A few possibilities:

1) Did the Remote Agent service on the Domino server stop?
2) Did some process on the Domino server kick off during the backup?
3) Is virus scanning running during the backup?
4) Is the mmagdzio.nsf file healthy?

The corrupt file indication can mean that while we were backing up the mmagdzio.nsf file, some other process cut our connection to Domino. This could be a virus scanner or some Notes process. It could also mean something is wrong with the .nsf file. Try backing it up by itself.

Russ

Andreas_N_sbe1
Level 3
Thanks for your answer.

- Remote Agent was OK, no error.
- No other process was running (I double-checked it).
- Virus scanning excludes the Notes directory.
- It is not always the same file...

If it cannot back up 1-2 files each time, this is not a big problem. The error message displayed is the problem:
The connection to the Lotus Domino server has been lost. The following error was reported: 8,26. The job that was running on this server has been stopped.
Does it mean it aborted the whole job or only one file?

Russ_Perry
Level 6
Employee
The event and the job log seem to contradict each other. The event occurs 7 minutes before the end of the backup job but the job looks like it is properly wrapped up in the log. I would recommend running the backup job with the log settings in Tools | Options | Job Logs set to 'Summary info, directories, and files processed'. This will allow you to see whether databases continue to be backed up after the corrupt file message. I've seen .nsf files be reported as corrupt but I've never seen an event like this during a Domino backup.
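As a quick check on the timing described above, the gap between the event (23:26:59) and the end of the backup set (23:33:58) can be worked out directly. This is only a sketch: the helper name is made up for illustration, and the set start time is inferred from the "19 minutes and 0 seconds" in the set summary, not read from any log.

```python
from datetime import datetime, timedelta

# Date/time layout used in the event log and job log quoted in the first post.
STAMP = "%d.%m.%Y %H:%M:%S"


def parse(stamp: str) -> datetime:
    """Parse a 'DD.MM.YYYY HH:MM:SS' timestamp (hypothetical helper)."""
    return datetime.strptime(stamp, STAMP)


event_time = parse("28.11.2006 23:26:59")  # event ID 65215
set_end = parse("28.11.2006 23:33:58")     # "Backup completed on ..."

# How long before the end of the set the event was logged.
gap = set_end - event_time

# Assumption: the set ran for exactly the 19 minutes shown in the summary.
assumed_start = set_end - timedelta(minutes=19)
into_job = event_time - assumed_start

print(gap)       # 0:06:59
print(into_job)  # 0:12:01
```

So, under that assumption, the "connection lost" event was logged about 12 minutes into the set, and the set still ran for roughly 7 more minutes before wrapping up.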

Russ

Niels_Rasmussen
Not applicable
Actually we encounter the same problem, sort of: we have the same "8,26" event error on the Lotus Notes server. After upgrading to 11d, our backup device suddenly hangs during the job. The only way of canceling is by restarting the backup server. Furthermore, the transfer rate suddenly decreases from 1200 MB/min to ~600 MB/min. It is a brand new tape library (LTO III).
On the few occasions we managed to get a full backup, it ran through perfectly. When choosing "Restore" from one of the successful jobs, some mailboxes show a size of 18.014.398.507.806.464 KB, which is of course not the case. Any idea why?

Niels

Shane_Vincent
Level 4
Hi Niels, regarding the mailbox size: if you look at any file over approximately 2 GB via the Domino agent, it will show a size similar to what you are seeing. I have shown this to a Symantec tech (via web session) and they have reported it. It looks to be only a display problem; however, it should have been picked up in beta...
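One possible reading of the number Niels quotes, offered as an assumption and not something Symantec confirmed in this thread: a file size above 2 GB is squeezed into a signed 32-bit field, goes negative, and is then displayed as an unsigned 64-bit value. The sketch below (function name hypothetical) shows that a file of 2,579,234,816 bytes, about 2.4 GB, would come out as exactly the size reported.

```python
def displayed_kb(actual_bytes: int) -> int:
    """Size in KB that would be shown if the byte count wrapped as assumed.

    Assumption (not from the thread): the size passes through a signed
    32-bit field and is then reinterpreted as an unsigned 64-bit number.
    """
    low32 = actual_bytes & 0xFFFFFFFF          # keep only the low 32 bits
    if low32 >= 2**31:                         # top bit set: negative as int32
        signed32 = low32 - 2**32
    else:
        signed32 = low32
    as_unsigned64 = signed32 % 2**64           # sign-extend, then read as uint64
    return as_unsigned64 // 1024


# A file just over 2 GB produces the huge figure from the restore view.
print(displayed_kb(2_579_234_816))  # 18014398507806464

# A file under 2 GB is unaffected.
print(displayed_kb(1_000_000_000))  # 976562
```

This would fit the observation that only files over roughly 2 GB show the odd size, and that it is a display problem and not a sign of damaged data.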

Shane_Vincent
Level 4
Hello Andreas, I too have had this problem. Here is my current workaround, which seems to be working for now (2 successful Domino backups in succession):

1. Separate the Domino files backup into its own job.
2. Make sure the Advanced Open File Option is unticked.
3. Under the Advanced tab in job settings, change 'Open file backup when AOFO is not used' to Never.
4. If your Domino files are on a remote server, use the 'domains' to find the server and not the 'Favorite Resources' (not sure if this changes much, but it is part of the ruling-things-out process).

I hope this helps, but I guess anything is worth a go...

Seasons Greetings to All
Shane

Shane_Vincent
Level 4
Wow, did I speak too soon... The third Domino backup just failed with the same error...

back to the drawing board...

Andreas_N_sbe1
Level 3
Symantec support told me the following:
Change the settings for the change Journal in the Registry to 0
HKLM\software\VERITAS\Backup Exec\Engine\domino > Enable change Journal.

After that you have to restart the Remote Agent service.
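For anyone who wants to script the change described above, here is a minimal sketch that only builds the `reg` command lines and does not run them. The key path and value name are copied as written in this post; the `REG_DWORD` type is an assumption, so check the existing value with the query command before changing anything. The function names are made up for illustration.

```python
# Key path and value name exactly as quoted in the post above.
KEY = r"HKLM\software\VERITAS\Backup Exec\Engine\domino"
VALUE = "Enable change Journal"


def query_cmd() -> list[str]:
    """Command line that shows the current value and its type."""
    return ["reg", "query", KEY, "/v", VALUE]


def disable_cmd() -> list[str]:
    """Command line that sets the value to 0.

    Assumption: the value is a REG_DWORD. Confirm with query_cmd() first.
    """
    return ["reg", "add", KEY, "/v", VALUE, "/t", "REG_DWORD", "/d", "0", "/f"]


# Print the commands for review instead of executing them.
print(query_cmd())
print(disable_cmd())
```

On the Domino server itself, each list could be passed to `subprocess.run(cmd, check=True)` from an elevated session. The Remote Agent service still has to be restarted afterwards for the change to take effect, as noted above.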

Can someone else please test this and report back in this forum? For me the backup worked after this change. Now I'm waiting to see for how long!

Shane_Vincent
Level 4
Hi Andreas,
I changed this on both my servers. The backup worked once and now it is erroring again, though with a different error this time.

Russ_Perry
Level 6
Employee
Using the Windows change journal for backup is supposed to streamline the process to figure out what files have changed and what needs to be backed up instead of going to each file to determine archive bit/modified time parameters. Disabling the change journal setting might only slow the backup slightly but can eliminate any problems caused by a corrupt or over-active change journal.

Shane: You state that one backup worked but you received a different error after that. What are you getting now?

Russ

Shane_Vincent
Level 4
Thanks Russ. I now have no errors with my backups: out of sheer frustration, last night I fully uninstalled BEX 11d and went back to 10d. I am going to wait until 11d is out of BETA form before I try it again...

Tim_Longbotham
Level 3
We have been suffering with the upgrade to 11d too, and it sounds like similar issues. I get different error messages with no real pattern, which is frustrating.
The one thing I have concluded is that I must restart the BE agent on all my Domino servers before any backup job has a chance of completing.
The error I'm now left with is "connection to server has been lost", event ID 65215.

The job log indicates that this happens on the f drive of 2 of our 6 domino servers.

I had a support call open about this issue and was working well with one of the technicians. However, they closed the call unresolved while I was off for Christmas, and the technician isn't responding to my emails. Grrr.

I'm sure it's an agent issue. Can anyone from Symantec share any info on further patches that may be released soon?

I don't want to downgrade back to v10, because ultimately the Windows and Novell servers are all backed up OK, and the Domino databases are being backed up, although in a messy way. I haven't tried to restore yet though...

I'm happy to work with Symantec to fix it, but obviously my boss only has so much patience...

I have also noticed that on occasion a Domino job fails because a file is corrupt. Shouldn't this file just be skipped?

Tim

Andreas_N_sbe1
Level 3
I had much the same problem on our 2 Notes servers. A support engineer from Symantec told me to change the following key:

Change the settings for the change Journal in the Registry to 0
HKLM\software\VERITAS\Backup Exec\Engine\domino > Enable change Journal.

After this change my backup works.

Tim_Longbotham
Level 3
Thanks, I have tried this but it made no difference... Will try it again though
Did you leave the registry setting at 0? When I did it they told me to change it back once the services had been restarted...

We have installed and reinstalled agents, and changed the AOFO option settings.
The latest thing I'm testing this evening is the Open file backup when open file option is not used set to "never" in the advanced part of job set up.

The painful part is it takes so long for a job to complete to see if the change works :)
I'll let you know how it goes

Russ_Perry
Level 6
Employee
You can leave the change journal setting at 0. This will cause the Lotus backup to do a file scan for changes rather than rely on the change journal.

The AOFO settings should have no effect on Lotus backups as long as you are selecting the Lotus DBs under the Lotus icon in backup selections.

Russ

Mike_Roden
Level 4
I'll test this too. I just set the change journal registry key on each of my Domino servers to 0, cleared out the existing UCJ files, and restarted the services on each server.

I will run mine like this tonight and see what happens.

Mike

Mike_Roden
Level 4
Caution: Long Posting. Ok, here's my report from some afternoon testing:

As stated in my previous posting, I turned off the change journal on each of my Domino servers, cleared the UCJ files, and restarted the services.

I decided to do a test of back-to-back backups on one of my Domino servers. This server only has about 10 GB of Notes data, so it doesn't take too long to test a backup. I ran the backup on this server manually and it took about 30 minutes to do a full backup and verify and was successful. However, I've always gotten a good backup the first time the remote service is restarted.

So I decided to run the backup again. Well, it failed. The job stayed on Backup Scan for 15 minutes before I finally gave up. I tried to cancel the job but of course you can't cancel a job doing a backup scan. So after 15 more minutes waiting for the job to cancel (knowing that it wouldn't), I finally just stopped the service on the remote server and the backup job canceled immediately.

Now here is something I noticed: Even though I had deleted the UCJ files and disabled the change journal on the remote server, I noticed that there was still a UCJ file that got created and was actually updated several times throughout the backup.

Is the change journal actually disabled if a UCJ file is getting created? Is the existing UCJ file what causes the backup job to get stuck scanning on the second backup?

Finally, I have another Domino server that only has 1.3 GB of data. This server has backed up flawlessly since the very first day of the upgrade...er..reinstall. The backup jobs on all Domino servers are set up the exact same way. The only difference is this server has very little Notes data and is very fast. Also, the UCJ files did get created on this server as well even though the change journal was disabled.

Maybe this information will help someone figure out this problem we Domino users are having.

That's my story and I'm sticking to it.

Follow-up: I restarted the remote service on the failed backup server without deleting UCJ files. I ran the backup and as expected, it was successful. Now here's the kink: I immediately started the second backup of the same server and this time it started just fine. Arrggghhhh!

Mike

Russ_Perry
Level 6
Employee
Mike,
Thanks for the great info. I'm trying to determine whether the continued access of the UCJ files is normal with the change journal disabled. I've seen the hanging backup thing a couple of other times but haven't figured it out yet. A couple of related questions:

1) When you create backups in 11d of the Lotus servers, how much longer does it take to display the Lotus selections with 11d compared to 10d?
2) If you don't cancel the job that seems to hang on your larger Lotus server, how long have you let it sit there? Will it ever run/complete? Have you looked at task manager on the Lotus server to see what processes are using CPU and memory during the hang?

Russ

Mike_Roden
Level 4
Thanks Russ.

1) It doesn't seem to make a difference whether or not the change journal is disabled. The time it takes to start the backup (where it goes from scanning to actually counting bytes) is practically the same either way.

What does seem to make a difference is whether or not the UCJ file has been deleted. When the file is deleted, the backup seems to start immediately. When the file is there (and the backup actually works), it takes two to five minutes for the backup to start following the backup scan.

2) I just happened to come here this morning to post this specific thing. After the two successful manual backups in a row on the 10 GB server yesterday evening, the scheduled one which kicked off about an hour later is still stuck this morning going on nearly 14 hours doing the backup scan.

Hope this helps.

Mike