Forum Discussion

HoldTheLine
8 years ago
Solved

Odd recurring issue: 5320 eventually times out and nobody can log in until reboot. NBU operations fine

We have many appliances with mostly no issues, but one in particular is a problem child: the master for one of our environments seems to get hosed after a few weeks of operation, and the only solution is to reboot. The problem is that nobody can log in. Whether using individual IDs or the admin login via SSH, after typing in the password the session hangs for a while and eventually just times out.

Backup operations continue without issue; logging into the console takes a while but eventually works. So far the solution has just been to reboot the thing, but I don't like that idea. It's not like this is a Windows box that should be bounced "just because", and I would rather get to the bottom of it.

I suspect network issues of some kind, but it's beyond me what they might be, especially since this only seems to happen after a few weeks of activity and backup/duplication/replication operations are all working just fine.

Any clues as to what I can look for?  Assuming I can ever log in, that is!

 

 

  • HoldTheLine
    8 years ago

    Turns out this had to do with the Active Directory setup; the system kept trying to connect to the DC over and over while timing out, chewing up all the system resources. Very strange, since we set up a bunch of appliances with AD and didn't see this anywhere else, even in the same network segment using the same AD controllers. Guess my suspicion was correct: all of those winbindd entries in /var/log/messages are what led tech support to the smoking gun.

    I would have been here more often and provided more information, but since the WannaCry thing hit in May we have lost all internet access, so the only way for me to post is from my own system.
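For anyone chasing the same symptom, the flood of winbindd entries in /var/log/messages mentioned above can be tallied per day to see whether it builds up over time. This is only a rough sketch: the regular expression assumes the traditional syslog line shape ("Mon DD HH:MM:SS host winbindd[pid]: ..."), and the function names are made up for illustration, not part of any appliance tooling.

```python
import re
from collections import Counter

# Assumed syslog line shape: "Jun 12 03:14:07 host winbindd[1234]: message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+\S+\s+winbindd(?:\[\d+\])?:"
)


def count_winbindd_by_day(lines):
    """Return a Counter mapping 'Mon DD' to the number of winbindd entries."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # Collapse the padded day ("Jun  2") into a single-space key.
            day = " ".join(match.group(1).split())
            counts[day] += 1
    return counts


def count_winbindd_in_file(path="/var/log/messages"):
    """Tally winbindd entries per day for a syslog-style file (hypothetical helper)."""
    with open(path, errors="replace") as handle:
        return count_winbindd_by_day(handle)
```

Calling `count_winbindd_in_file()` on the appliance (it usually needs root to read the log) and printing the result shows how many winbindd lines were logged each day. A count that keeps climbing would be consistent with the retry loop described above, but it does not prove it on its own.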

     

  • I recall a case where /tmp was filling up with log files, which hung the system or at least made it a pain to log in. Rebooting would clean it out, and eventually it would happen again.

    • RiaanBadenhorst
      Level 6

      Checked with a colleague; it was actually not /tmp but some other folder. The point is that the lack of available space was the issue. They had to boot into single-user mode to find the culprit and release the space.
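If space exhaustion is a suspect, a periodic check along these lines could flag a filling filesystem before logins start to hang. It is a minimal sketch using Python's standard `shutil.disk_usage`. The paths and the 90% threshold in the usage note are assumptions for illustration, since the folder that actually filled up in that case is not known.

```python
import shutil


def nearly_full(paths, threshold=0.90):
    """Return (path, used_fraction) for each path at or above the threshold."""
    flagged = []
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Skip paths that do not exist or cannot be queried.
            continue
        used_fraction = usage.used / usage.total if usage.total else 0.0
        if used_fraction >= threshold:
            flagged.append((path, used_fraction))
    return flagged
```

For example, `nearly_full(["/", "/tmp", "/var"])` returns only the locations that are at least 90% used. Note that the fraction is computed from used and total bytes, so it can differ slightly from the percentage `df` reports.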

      • Marianne
        Level 6

        Seemingly not too much of an issue as HoldTheLine has not been back to look for replies ....