Verbose output truncated in bpdbjobs -all_columns?
I wrote some tools that trawl through the output of "bpdbjobs -all_columns -jobid <jobid>" to pull out the write speed over the course of the job, so I can graph it and see how things are going. (I have a lot of NetApps that, when they do NDMP incrementals, do nothing for a while (often hours) and then write at speed, so an "average speed" isn't very useful.)
I mostly did this back under 6.0. I've noticed that I have a lot of 6.5 jobs where the information in bpdbjobs just stops. It might be something to do with 6.5, or it might be that the jobs themselves have changed somehow.
As an example, this job began at 2/17/2011 21:53 and completed at 2/20/2011 08:58 (over 59 hours, ~7TB job). If I look at the end of bpdbjobs -all_columns for it, I get this:
[snip]
02/18/11 16:04:39 - 40005 KB written - 41157.113 KB/sec
02/18/11 16:04:39 - 40005 KB written - 41157.348 KB/sec
02/18/11 16:04:40 - 40005 KB written - 41157.586 KB/sec
02/18/11 16:04:41 - 40005 KB written - 41157.816 KB/sec
02/18/11 16:04:41 - 40005 KB written - 41158.043 KB/sec
02/18/11 16:04:42 - 40005 KB written - 41158.277 KB/sec
02/18/11 16:04:42 - 40005 KB written - 41158.504 KB/sec
... 6967081863 46377009 292129 33078
[snip]
Basically, the last performance/timing bit that it logged was less than 24 hours into the job. I don't have any more performance logs after that point.
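For anyone wanting to do the same kind of extraction, here is a minimal Python sketch of pulling the timing entries out of that output. It is not my actual tool; the function name parse_write_entries is made up for illustration, and it only assumes the entries look like the ones quoted above.

```python
import re
from datetime import datetime

# Matches entries such as:
#   02/18/11 16:04:39 - 40005 KB written - 41157.113 KB/sec
# The pattern is based only on the sample lines quoted in this post.
WRITE_RE = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) - (\d+) KB written - ([\d.]+) KB/sec"
)

def parse_write_entries(text):
    """Return a list of (timestamp, kb_written, kb_per_sec) tuples found in text.

    Works whether the entries arrive one per line or run together on one line.
    Timestamps are naive datetimes in whatever zone the output was written in.
    """
    return [
        (datetime.strptime(ts, "%m/%d/%y %H:%M:%S"), int(kb), float(rate))
        for ts, kb, rate in WRITE_RE.findall(text)
    ]
```

Comparing the timestamp of the last tuple against the job's end time is a quick way to spot a job whose entries stop early.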
Anyone ever seen this or have any idea what limits there might be on how much data is available through bpdbjobs? System is currently Linux, single master/media server, lots of NDMP hosts, 6.5.5, but I've run these tools on 6.0 systems as well.
Darren
Well, lo and behold, the "bpdbjobs" output is truncated, but the <job>.t trylog file is not.
So I need to adapt my program to be able to pull a trylog file directly. Slightly more restrictive, but much better than not having the data at all.
Last bits from bpdbjobs:
03/06/11 12:14:27 - 40005 KB written - 47183.555 KB/sec
03/06/11 12:14:28 - 40005 KB written - 47183.691 KB/sec
... 12147569595
End of the same job trylog:
KBW 1299625965 40005 50522.602
KBW 1299625968 40005 50522.117
KBW 1299625969 40005 50522.062
And 1299625969 is March 8, 15:12 in my timezone, two days after the last entry bpdbjobs shows (March 6, 12:14). So bpdbjobs is just truncating it.
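A rough Python sketch of reading those KBW records, in case it helps someone else. The field meanings are my assumption from lining them up with the bpdbjobs entries: epoch seconds, KB written, then KB/sec. The function name parse_kbw is made up, and I'm only handling the KBW records shown above, not anything else in the trylog.

```python
import re
from datetime import datetime, timezone

# Assumed record layout, inferred from the sample above:
#   KBW <epoch seconds> <KB written> <KB/sec>
KBW_RE = re.compile(r"\bKBW\s+(\d+)\s+(\d+)\s+([\d.]+)")

def parse_kbw(text):
    """Return a list of (utc_time, kb_written, kb_per_sec) tuples from KBW records."""
    return [
        (datetime.fromtimestamp(int(epoch), tz=timezone.utc), int(kb), float(rate))
        for epoch, kb, rate in KBW_RE.findall(text)
    ]

if __name__ == "__main__":
    sample = "KBW 1299625965 40005 50522.602 KBW 1299625969 40005 50522.062"
    for when, kb, rate in parse_kbw(sample):
        # astimezone() with no argument converts to the local timezone
        print(when.astimezone().strftime("%m/%d/%y %H:%M:%S"), kb, rate)
```

The last record comes out as 2011-03-08 23:12:49 UTC, which is 15:12 local for a system eight hours behind UTC.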