Linux agent glibc crash

Murat_2
Level 2
Hello,
 
The Backup Exec 11d agent is running on x86_64 RHEL 4. The agent crashes intermittently and gives the following error:
 
*** glibc detected *** free(): invalid next size (normal): 0x00000000006283a0 ***

I could export MALLOC_CHECK_=0 before running the agent as a workaround, but is there a way to fix this problem? Is this a glibc bug or an agent bug?
 
# uname -a
Linux host 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:29:47 EST 2005 x86_64 x86_64 x86_64 GNU/Linux
# rpm -q --queryformat "%{NAME}-%{VERSION}.%{RELEASE} (%{ARCH})\n" glibc glibc-devel
glibc-2.3.4.2 (i686)
glibc-2.3.4.2 (x86_64)
glibc-devel-2.3.4.2 (i386)
glibc-devel-2.3.4.2 (x86_64)

Any help is appreciated.
 
Thanks.


Message Edited by Murat on 08-30-2007 06:55 AM

Kobus
Level 3
I had this problem until I eventually installed the latest version.

I don't know whether this will help you, but here is the file I used for 11d:
wget http://mirrors.dedipower.com/centos/4.5/os/i386/CentOS/RPMS/compat-libstdc++-33-3.2.3-47.3.i386.rpm

And this is the file I used for 10d:
wget ftp://mirrors.dedipower.com/centos/4.4/os/i386/CentOS/RPMS/compat-libstdc++-296-2.96-132.7.2.i386.rpm

HTH, or at least puts you on the right track.

Kobus

Murat_2
Level 2
The agent version is 7170, and the compat-libstdc++ version I have is the same.
 
# rpm -qa --queryformat "%{NAME}-%{VERSION}.%{RELEASE} (%{ARCH})\n" | grep compat-lib
compat-libstdc++-33-3.2.3.47.3 (x86_64)
compat-libstdc++-33-3.2.3.47.3 (i386)
compat-libgcc-296-2.96.132.7.2 (i386)
compat-libstdc++-296-2.96.132.7.2 (i386)

Murat_2
Level 2
Symantec employees,
 
Do we have a fix for this problem?
 
This problem has become very annoying.

Murat_2
Level 2
Ok, a little info about the system.
 
We have a bunch of RH Linux servers. Some of them are x64. All of them have Backup agent 7170 installed.
 
But only the agents running on x64 Linux crash. The ones running on 32-bit Linux don't crash; they run fine.
 
I tried exporting MALLOC_CHECK_=0 before launching beremote in the init script, but it didn't help.
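
One thing worth ruling out when the export seems to have no effect is that the variable never reached the daemon, for example because the init script starts beremote through another shell or wrapper. A minimal sketch, assuming a Linux /proc filesystem; `has_env` is a hypothetical helper name, not part of the agent:

```shell
# has_env PID NAME
# Succeeds if process PID was started with NAME in its environment.
# /proc/PID/environ holds the initial environment as NUL-separated
# NAME=value entries, so convert NULs to newlines before matching.
has_env() {
    tr '\0' '\n' < "/proc/$1/environ" | grep -q "^$2="
}

# Example (run as root on the affected host):
#   pid=$(pgrep -x beremote | head -n 1)
#   has_env "$pid" MALLOC_CHECK_ && echo "MALLOC_CHECK_ is set for $pid"
```

If the check fails for the running daemon, the export is probably happening in a different shell from the one that launches beremote.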
 
Here is some output:
 
# cat /var/VRTSralus/ralus.ver
7170
# uname -a
Linux hostname 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:29:47 EST 2005 x86_64 x86_64 x86_64 GNU/Linux

The beremote daemon is launched as:
 
/opt/VRTSralus/bin/beremote --log-file /var/VRTSralus/beremote.service.log 2> /var/VRTSralus/beremote.service.log &

# tail /var/VRTSralus/beremote.service.log
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\proc\*.* /s
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\mnt\nss\pools\ /s
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\mnt\nss\.pools\ /s
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\sys\*.* /s
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\_admin\*.* /s
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\dev\* /s
43204960 Tue Sep 11 12:12:24 2007 :   [ROOT]\xfn\* /s
43204960 Tue Sep 11 12:12:24 2007 : ====>VX_GetLocalDeviceName: device_name is [ROOT]
43204960 Tue Sep 11 12:12:24 2007 : ====>VX_GetLocalDeviceName: device_name is [ROOT]
43204960 Tue Sep 11 12:12:24 2007 : ====>VX_GetLocalDeviceName: device_nam#
 
Note that the last line above was not written completely to the file. I checked all the servers: the agents crash while calling VX_GetLocalDeviceName, though not at exactly the same point as above. So I guess the problem is related to calling this function on the x64 platform.
 
Please advise.
 
Thanks.
 
Murat
 
 


Message Edited by Murat on 09-11-2007 12:31 PM

Allen_K
Level 6


Murat wrote:
Symantec employees,
 
Do we have a fix for this problem?
 
This problem has become very annoying.



Hi Murat, 

Please note that this is a peer-to-peer discussion forum and not technical support.

Usually customers come to the rescue. Other times, when they are available, a Symantec employee will volunteer his/her time to find and help solve issues just like any other member of the community.

However, in this case, it looks like you might be better off contacting Symantec Support Services for 24/7 access to technical support professionals. You can also try using our Knowledge Base Search.

Hope that helps!

Robert_Oschwald
Level 3
Same problem here with CentOS5 x86_64 (same as RHEL5 x86_64)

RALUS Version is 7170

Kernel:
Linux 2.6.18-8.1.8.el5 #1 SMP Tue Jul 10 06:39:17 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

GLIBC:
# rpm -q --queryformat "%{NAME}-%{VERSION}.%{RELEASE} (%{ARCH})\n" glibc glibc-devel
glibc-2.5.12 (i686)
glibc-2.5.12 (x86_64)
glibc-devel-2.5.12 (x86_64)

Error on console is:
beremote[15436]: segfault at 00000000000000be rip 00002aaab1ef33ca rsp 0000000043c00d70 error 4
and sometimes:
beremote[26060] general protection rip:3e0b66e1f3 rsp:43203540 error:0
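
For the first console message (the "segfault at" line), the trailing "error 4" is the x86 page-fault error code, which is a bit mask: bit 0 is set for a protection violation (clear for a page that is not present), bit 1 for a write, and bit 2 for user mode. A minimal sketch that decodes the low three bits; `decode_pf_error` is a hypothetical helper name, and it does not apply to the "general protection" line, whose error field has a different meaning:

```shell
# decode_pf_error CODE
# Prints a short description of the low three bits of an x86
# page-fault error code as shown in the kernel's segfault message.
decode_pf_error() {
    code=$1
    if [ $((code & 1)) -ne 0 ]; then
        cause="protection violation"
    else
        cause="page not present"
    fi
    if [ $((code & 2)) -ne 0 ]; then
        access="write"
    else
        access="read"
    fi
    if [ $((code & 4)) -ne 0 ]; then
        mode="user"
    else
        mode="kernel"
    fi
    echo "$mode-mode $access, $cause"
}

decode_pf_error 4   # prints: user-mode read, page not present
```

A user-mode read of an unmapped page at address 0xbe is consistent with the process reading through a near-NULL pointer, although the message alone does not say which code did it.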

Other machines running 32bit Linux run fine.
It seems to be a problem in the 64-bit version of beremote.

The crash occurs rarely, so we monitor the beremote process with a cron script and restart the agent whenever it has crashed.
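
A cron watchdog of this kind could look roughly like the sketch below. It is not the script used here: `ensure_running` is a hypothetical helper, and the init script path in the usage comment is an assumption that should be checked against the local installation.

```shell
#!/bin/sh
# ensure_running NAME CMD [ARGS...]
# Runs CMD only when no process named exactly NAME is found.
# Returns 0 without doing anything if the process is already running.
ensure_running() {
    name=$1
    shift
    if pgrep -x "$name" >/dev/null 2>&1; then
        return 0
    fi
    "$@"
}

# Example call from a script run by cron every few minutes
# (the init script path is an assumption, adjust as needed):
#   ensure_running beremote /etc/init.d/VRTSralus.init start
```

This only hides the symptom: a backup job that is running when the agent dies still fails, so it is a stopgap until the crash itself is fixed.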

BTW: The old legacy agent is rock solid, but we can't use it on 64-bit Linux, because there we can only traverse the root partition and no other partitions (files and subdirectories are not displayed).

Robert


quesse
Level 3
Hi,
Is there any news on a solution to this problem?
I still encounter it!
Please help me.

thanks

DV