Hidden under the covers

Author Info

7 August 2007 - 3:31pm
Submitted by: aevans

Dean already mentioned that we had some Bonsai milestone 3 demos last week, which were very interesting. I am very excited about this new release. What he didn't mention were a couple of things that are hidden under the covers in the POA code. I'll go into them in more detail at some future juncture, but here is a quick overview and why I am so excited:

The first one is an overhaul of the purge code. Over the last year or so we have seen an upswing in the number of 'slow Post Office' calls and, invariably, it's the SAN that is slow and unable to respond to the OS quickly enough. A lot of the time there are a large number of I/O requests coming from GroupWise, and often this is caused by lots of users having auto-archive enabled. Essentially, what auto-archive does is download the mail, purge it from the databases and then send any status tracking updates. This is an expensive operation. The purge code change will batch those updates and move them to another thread, so that the client does not need to wait for them to complete and does not hold a thread that would otherwise be available to other users. I am particularly excited about this change.
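To make the batching idea concrete, here is a minimal Python sketch of the general pattern: the client-facing thread hands each status update to a queue and returns straight away, while a single background thread groups the updates into batches. This is not the POA code. The class name, the `send_batch` callback and the batch size are all hypothetical and only illustrate the technique described above.

```python
import queue
import threading


class StatusUpdateBatcher:
    """Hypothetical helper: queues status updates and sends them in batches
    from a background thread, so the caller never waits on the send."""

    def __init__(self, send_batch, batch_size=50):
        self._send_batch = send_batch      # assumed callback that takes a list of updates
        self._batch_size = batch_size      # assumed tuning value, not a GroupWise setting
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, update):
        """Called on the client-facing thread; returns immediately."""
        self._queue.put(update)

    def close(self):
        """Flush whatever is still queued and stop the background thread."""
        self._queue.put(None)              # sentinel: no more updates
        self._worker.join()

    def _run(self):
        batch = []
        while True:
            item = self._queue.get()
            if item is None:
                break
            batch.append(item)
            if len(batch) >= self._batch_size:
                self._send_batch(batch)
                batch = []                 # start a fresh list for the next batch
        if batch:
            self._send_batch(batch)        # send the final partial batch
```

With a batch size of 3, submitting seven updates and then calling `close()` results in three calls to `send_batch`: two full batches and one partial batch holding the last update.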

The second change is in the HTTP interface of the POA and is probably more useful for GMS troubleshooting than anything else. I had an incident this year that kept me on the phone for something like 30 hours straight, and we ended up calling in developers from vacation to help troubleshoot. We held a post-mortem afterwards and felt that we didn't have enough information at hand to get to the root of the problem quickly enough - hence these changes. When I get a chance I will include screenshots, but the enhancements include the ability to see which users are registered with the event notification system, what their status is, and the option to manually fire off an event notification trigger to anything listening for those notifications (GMS mainly, but also BES and any home-grown SOAP apps you may have).

So, like I said, these are pretty much hidden under the covers but will have a big impact.





User Comments

Submitted by FlyingGuy (not verified) on 7 August 2007 - 9:06pm.

Wow you're good!

Now perhaps you can tell me why POAs, MTAs and GWIAs keep blowing up on Non-Present Page Reads, Red-Zone problems, TSAFSGW problems and, last but certainly not least, GHCHK thread problems.

Not picking on you, just carping about the reliability of the agent software. I think it's a bit of a crap-shoot though. I have 7.0.2HP running on my 5.1 server (cheap clone hardware) and not a single hiccup. Same thing running on a 6.5-sp5 box (on a Compaq/HP ML370) and things get wonky.

Ahh the art & science of software.

I really hope the new POA, MTA, GWIA, WEBACCESS stuff really is put through the wringer hard before it's on the street.

Submitted by FlyingGuy (not verified) on 8 August 2007 - 6:37am.

Just in case you were thinking otherwise, I really meant "Wow you are good" genuinely. Novell tech support, while expensive, has always been great.

Submitted by Alex Evans (not verified) on 8 August 2007 - 6:53am.

This isn't really the place to troubleshoot issues, but if you are having multiple, seemingly random abends in different modules then often the cause is bad hardware, such as memory. You mention non-present page reads and redzone abends, both of which are memory abends. If you have spare hardware, migrate everything to that and see what happens - or if it is the same model then selectively swap hardware.

Submitted by Alex Evans (not verified) on 8 August 2007 - 11:56am.

Thanks for the vote of confidence. I think a lot of it comes from working with a bunch of people that are genuinely passionate about what they do.

Submitted by FlyingGuy (not verified) on 9 August 2007 - 12:36pm.

I wasn't trying to weasel tech support. But it is rather strange that _nothing_ else on the server I mentioned goes haywire.

Then there is the other one where nothing else in GW goes sideways except the Java app that supports WebAccess once in a while, which just requires an unload java, unload apache, then re-load and everything is good.

Ya know, just those little things that make you go, "Hmmmmm".

Submitted by John (not verified) on 9 August 2007 - 4:11pm.

That all depends on who you get. We are forced to use India tech support for GroupWise, and from my experience the Tier I support leaves a lot to be desired. I hear a lot of "I think" or "I guess", not "I know". Please train them better. It was difficult to find someone who knows both GroupWise and Linux. US support for GW is so much better and much more knowledgeable. Why did Novell outsource GroupWise support to India?

Submitted by FlyingGuy (not verified) on 9 August 2007 - 9:49pm.

$$$$$$

Submitted by FlyingGuy (not verified) on 9 August 2007 - 9:54pm.

Basically, if I call Novell Tech Support and fork over my $450.00 or whatever it is these days, it's because I haven't been able to find the answer to the problem and I have tried just about everything I can conjure up from my experience, Google, the Novell Knowledge Base and, last but certainly not least, the documentation.

When I make that call, I want an ENGINEER, not a script reader, not someone who has been working there for 6 months and took a crash course in NetWare, GroupWise, Border Manager, Zen, Linux etc.

I give them 10 minutes max before I start yelling at them to escalate this back to Provo.

Submitted by Alex Evans (not verified) on 10 August 2007 - 8:05am.

The thing is that this is an accusation often levelled at GW: 'None of our other apps have a problem'. The point is that so few of the other apps make quite such intensive use of the memory and disk channels as GW does, which means that GW is more likely to hit that bad spot in memory. Now, I have not looked at your abend logs and don't really know what's going on there, but if you tell me you have two servers doing the same tasks and one is abending in everything it can find to abend in while the other is running just fine, then I need to suspect something other than a GW code issue as a starting point.

As for your Java issue, there have been a number of winsock fixes to resolve similar-sounding problems, including one late last month that is still to be released. There was also a JVM update a couple of years back, and a couple of GW servlet fixes (not sure how old your OS and GW code is).

Submitted by FlyingGuy (not verified) on 12 August 2007 - 8:50am.

I agree with you. GW does really work the system, and at times works it pretty hard, when the message volumes start really ramping up.

Now I know there exists someplace over in engineering an NLM or 20 that is used to really beat on a server. Perhaps some of these could be released as tools to really hammer a server when there seems to be no other direction to turn.

Just a thought.

© 2013 Novell