
PureDisk Freeze randomly

ElGringo
Level 6
Partner Accredited

Hi all,

 

Some of my PureDisk nodes freeze randomly.

We run PureDisk 6.6.1 in a clustered Storage Pool.

4 REPLIES

kunal
Level 4
Employee

Hi,

 

What exactly happens when you say that the PureDisk nodes freeze randomly?

Can you log in to the node at that time (telnet/ssh)?

Does it respond to OS commands?

What do you do to resolve this issue?

ElGringo
Level 6
Partner Accredited

The server is completely frozen and I am not able to log in. I did not find any useful information.

The only thing we can do is power it off using the power button.

The motherboard BIOS has now been upgraded. We also upgraded PureDisk to 6.6.1.1.

 

Now, wait and see ...

PureD
Level 4

The BIOS upgrade was a good move, as was PD 6.6.1.1, since a kernel upgrade is included. Check /var/log/mcelog: if it is zero bytes in size, good. If not, there could be CPU issues.
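
A quick way to do the mcelog size check from a shell is sketched below. The function name check_mcelog is made up for this example; the only thing taken from the advice above is the path /var/log/mcelog and the "zero bytes is good" rule. Treat a non-empty file as a hint to look closer, not as proof of a bad CPU.

```shell
#!/bin/sh
# check_mcelog is a hypothetical helper name, not a PureDisk tool.
# It reports whether the machine check log is missing, empty, or non-empty.
check_mcelog() {
    f="${1:-/var/log/mcelog}"      # default path taken from the post above
    if [ ! -e "$f" ]; then
        echo "$f: not present"
    elif [ -s "$f" ]; then         # -s is true when the file size is > 0
        echo "$f: non-empty, hardware errors may have been logged"
    else
        echo "$f: empty, nothing logged"
    fi
}

check_mcelog
```

If it reports a non-empty file, read the file itself and hand it to your hardware vendor along with /var/log/messages.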

PDOS 6.6+ runs some services that most environments don't need. If you aren't using iSCSI, turn off the open-iscsi daemons (service open-iscsi stop; chkconfig open-iscsi off). The same goes for novell-zmd, postfix if you're not using sendmail, and maybe powersaved. Check 'ps' for others.
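
The same stop-and-disable pair can be applied to each of those services in a loop. This is only a sketch: the SERVICES and DRY_RUN variables and the disable_services function are names invented here, and the service list is simply the one mentioned above, so trim it to what you really do not use. By default it only prints what it would do.

```shell
#!/bin/sh
# Services named in the post above; edit the list before running for real.
SERVICES="${SERVICES:-open-iscsi novell-zmd postfix powersaved}"
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 runs them (needs root).
DRY_RUN="${DRY_RUN:-1}"

disable_services() {
    for svc in $SERVICES; do
        # Stop the running daemon, then keep it from starting at boot.
        for cmd in "service $svc stop" "chkconfig $svc off"; do
            if [ "$DRY_RUN" = "1" ]; then
                echo "would run: $cmd"
            else
                $cmd
            fi
        done
    done
}

disable_services
```

Once the printed list looks right, run it again as root with DRY_RUN=0.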

You might start a script that executes a series of commands and appends the output to a file every 3 minutes: commands like 'ps faux', vmstat, iostat, top, free, 'cat /proc/meminfo', 'netstat -s', lsof. If the issue recurs, view the bottom of the output file your script appended to. Doing that, you might find swap depleted, some process using up RAM/CPU, etc. Grab that and /var/log/messages. Hopefully there will be clues.
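
Something along these lines would do for the snapshot script. It is a rough sketch, not a tested PureDisk tool: the log path, the LOG/INTERVAL/CMDS variable names and the snapshot function are all assumptions made for the example, and the command list is just the one above (with top in batch mode so it can write to a file). Commands that are not installed are skipped.

```shell
#!/bin/sh
# Hypothetical defaults; pick a log location with enough free space.
LOG="${LOG:-/var/tmp/pd-snapshot.log}"
INTERVAL="${INTERVAL:-180}"        # seconds between snapshots (3 minutes)
# One command per line; this is the list suggested in the post above.
CMDS="${CMDS:-ps faux
vmstat
iostat
top -b -n 1
free
cat /proc/meminfo
netstat -s
lsof}"

snapshot() {
    # Append one timestamped block of diagnostics to the file given as $1.
    out="$1"
    {
        echo "===== $(date '+%Y-%m-%d %H:%M:%S') ====="
        echo "$CMDS" | while IFS= read -r cmd; do
            [ -n "$cmd" ] || continue
            tool=${cmd%% *}        # first word of the line is the program name
            echo "--- $cmd ---"
            if command -v "$tool" >/dev/null 2>&1; then
                $cmd </dev/null 2>&1
            else
                echo "(not installed)"
            fi
        done
    } >> "$out"
    sync                           # ask the kernel to flush before a possible freeze
}

# Loop forever only when started with the argument "run".
if [ "${1:-}" = "run" ]; then
    while true; do
        snapshot "$LOG"
        sleep "$INTERVAL"
    done
fi
```

Start it in the background with something like "nohup sh pd-snapshot.sh run &" (the file name is up to you), and after the next freeze look at the last block in the log with tail.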

bartman10
Level 4

I just got done with a NASTY rebooting issue with PD and HP DL385 G7s.

No one could figure it out. There was nothing written in the Linux logs, it just POOF rebooted. Linux had no idea what happened or why. Odd errors were found in the iLO log, which HP was completely useless at deciphering.

The only thing we found was high RAM usage for extended periods. We have 3 PD servers, each with 32TB of storage and 32GB of RAM, as per a Symantec PS engagement... "The top" Symantec support guy said that with this config he would like to see 48GB, and said there is currently a disconnect between what sales recommends and what they see the servers needing... Classic Symantec... So, anyway, while I was installing the new RAM I flashed the BIOS to the latest. It was the same BIOS rev, A18, but I noticed it was 2 months newer. Didn't think much of it at the time.

No reboots for over a week now.

During this time I found an HP tech note about the G7 and its AMD 6100 CPUs. There is a nasty issue with VMware and the "purple screen of death", with rebooting and crazy stuff in the iLO logs... Except for the VMware part, the rest of the issue is identical to what I was seeing. I also noted that the new BIOS I flashed "fixes" this issue by not allowing the CPU to enter such a deep power-saving mode.

I can't prove it right now, but I really feel the issue HP has noted with VMware is also present with Linux, and that this is what I was seeing.

 

After typing all this I'm thinking I should make a post about this so others could find it.