Novell Home

Ximian Hackers Self Help: Part 3

Novell Cool Solutions: Trench

Digg This - Slashdot This

Posted: 14 Jul 2004
 

These self-help letters began as a post to an internal Ximian development list. We've posted them here in a three-part series to provide you with insight into the development process at Ximian and within the GNOME community at large.


Contents

Forward Declarations
Porting Code to Gnome 2.0
Avoiding Re-Entrancy Bugs
Dereferencing Function Pointers
Electric Fence
Conclusions
Forward Declarations

In the Gtk+ world, it appears that a lot of typing is a positive virtue. For example, why simply type:

static int
get_length (MyObject *object)
{
   return object->priv->length;
}

When you can type: 

static int get_length (MyObject *object);
static int
get_length (MyObject *object)
{
   return object->priv->length;
}

Ok - so in Gtk+ land they don't actually provide a totally redundant function prototype all the time, merely most of the time. The problem is that they don't keep their wasted effort in a sensible place - next to the function itself. Consequently when you want to change the signature of the get_length method, you have to change it twice.

As all right thinking people know, this is extremely foolish. In 98% of cases, there is precisely 0 need for forward declarations for functions[1]. However - the Gtk+ forward declarations are not strictly redundant. In fact, they are neccessary. This is because they have insisted on sticking the 'get_type' method right at the top of the source file, then the class_init / instance_init type methods etc. ie. they have ordered the code almost precisely backwards.

If you take a more enlightened view of method ordering, you will realize that internal functions should be at the top of a module, and bigger chunky public methods below them. Having done this all the forward declarations can be removed, and then the code is forced into a logical structure. If you search from the top for a method, first you hit its implementation, then you see all of its invocations (instead of some sort of big sprawling unordered mess).

Porting Code to Gnome 2.0

Much has changed in Gnome 2.0, and very much of it for the better. However, there are however several pitfalls, and gratuitous differences to trap the unwary.

Firstly, GObject takes a singular joy in multiply destroying pretty much everything, so your destroy methods should be cagey:

   if (o->priv) {
      ... cleanup priv ...
      g_free (o->priv);
   }
   o->priv = NULL;

GtkObject has moved to GObject, but it is not enough just to do a s/Gtk/G/ on your code, here are a few points:

  1. All the macros in your header will not work; things have changed, copy and sed someone's that do work:
  2. -  (GTK_CHECK_CAST ((o), BONOBO_OBJECT_TYPE, BonoboObject))
    +  (G_TYPE_CHECK_INSTANCE_CAST ((o), BONOBO_OBJECT_TYPE,
    BonoboObject))
  3. Remember to kill redundant Gtk+ includes, and alter both the class and instance structures to inherit from GObject, return a GType from the get_type method.
  4. the get_type method - substantially changed; perhaps this will help:
  5.    GtkTypeInfo:            GTypeInfo:
    
       type_name               class_size
       instance_size           base_class_init_func
       class_size              base_class_finalize_func
       class_init_func         class_init_func
       instance_init_func      class_finalize_func
       reserved_1              class_data
       reserved_2              instance_size
       base_class_init_func    n_preallocs
                               instance_init_func.

As might be obvious by now, it's totally different in a tricky to work out way, and this needs total re-writing by hand. Those who use things like E_MAKE_TYPE will have a much easier ride here.

One particularly nice Gtk+ change is the removal of the 'draw' signal and virtual method on the GtkWidget. Unfortunately, due to the duplication between 'draw' and 'expose' methods in Gtk+1.2, some code out there does things like this:

static int
impl_expose_event (GtkWidget *widget,
                   GdkEventExpose *expose)
{
   gtk_widget_draw (widget, &expose->area);
   return FALSE;
}

This works fine with Gtk+ 1.2, but in Gtk+ 2.0, the gtk_widget_draw method is provided for backwards compatibility, and it works by synthesizing exposes - so we get a rather convoluted infinite recursion here.

Make sure that as you deal with the 'draw' method, that you merge it into the expose method cleanly.

Avoiding Re-Entrancy Bugs

There were mutterings that ORBit2's processing of pending CORBA requests via a mechanism that could cause re-entrancy could be 'fixed' by having a thread request handling mechanism, ie. such that a new thread was instantiated (from a pool) to handle each incoming request.

The problem with this is that it's really no solution to the underlying problem, the re-entrancy will happen anyway - but this time it's worse, since you have to protect all your variables with locks. Of course, in the absence of locks - it would appear to be far better, but then you get random memory corruption races instead. Of course, if locks are used that's just great - but then it seems likely that if the locks are coarse grained (critical sections or some such) that you just move a re-entrancy problem to a deadlock issue.

So - we all have to deal with re-entrancy at some stage, so lets deal with it:

One common type issue goes like this:

my_unsafe_finalize_fn (GtkObject *object)
{
   MyObject *o = (MyObject *) object;

   if (o->priv) {
      bonobo_object_release_unref (o->priv->my_ref, NULL);
      g_free (o->priv->segv);
   }
   o->priv = NULL;
}

The problem with this is that the release_unref will (most likely) hit the ORB and could re-enter. Worse, this finalize fn could re-enter, especially with GObject, which takes a real joy from multiply finalizing everything. Thus, having returned from the release_unref o->priv is NULL from another finalize.

So ... how to protect against this one. This type of issue is fairly simple to protect against in fact, simply take a local reference on the object you're interested in staying around over the call. You'll notice this happens over Gtk+ signal emissions and in all well behaved code that could re-enter. In this case, we get:

my_safe_finalize_fn (GtkObject *object)
{
   MyObject *o = (MyObject *) object;

   if (o->priv) {
+     gtk_object_ref (object);
      bonobo_object_release_unref (o->priv->my_ref, NULL);
      g_free (o->priv->segv);
+     gtk_object_unref (object);
   }
   o->priv = NULL;
}

Of course, while this is a common issue, there are plenty of others; particularly iterating over lists of CORBA references and unreffing them:

my_unsafe_unref_loop (MyObject *myobj)
{
   GSList *l;

   for (l = myobj->priv->refs; l; l = l->next)
         bonobo_object_release_unref (l->data, NULL);

   g_slist_free (myobj->priv->refs);
   myobj->priv->refs = NULL;
}

Looks safe enough, the problem is that again on re-entrancy the list can be freed by entering the loop lower down the stack. Thats sounds fine, problem is - that when the g_slist_free is called, the list items are stuck onto the 'free gslist items list' which means that it is most likely that l->next will point to the last free'd list, whose ->data pointer has been adjusted to be a GSList pointer. Next iteration we try to cast that to a Bonobo_Unknown to release it and ... bang.

So ... sadly it costs to overcome these issues, but in this case there is a standard impl. see bonobo_object_list_unref_all.

Dereferencing Function Pointers

Again the Gtk+ typists have decided to excel themselves. The magic thing about function pointers in C is that they are a very special case, you can't dereference a function pointer[2].

Knowing this the Gtk+ people decided that it would be great to spread a bit of confusion by redundantly de-referencing all their function pointers something like:

   a = (**((*(*my_fn_ptr)))) (arg1, arg2);

Of course, one can simply do:

   a = klass->my_fn_ptr (arg1, arg2);

And this is clearer, more beautiful, and generally right. In addition, C also knows about functions, so instead of:

   MyFnPtrType my_fn_ptr = &the_function;

you can just do:

   MyFnPtrType my_fn_ptr = the_function;
Electric Fence

If you go round reading other people's code, ( and everyone should, regularly, if only to scare them ), then sometimes you'll come across something that says:

   free (malloc (8)); /* -lefence */

And you might wonder - what is going on here?

The answer is quite simple: Electric fence. Electric fence is quite simply excellent, the poor man's purify ( or alternatively the lazy man's purify ) and it's very quick to use for small apps.

Electric fence has in the past been a tragic victim of its own success, inasmuch that since it bales out at the first trace of memory corruption - it has died in glibc before it has even got to running the code. Luckily with newer glibc's we no longer have to go through contortions working round the glibc bugs to get a useful result.

To use electric fence, simply re-link against -lefence. This is best done by firing up vi on your Makefile, and doing something like: 1,$s/^LIBS = /LIBS = -lefence /, rm my-binary ; make my-binary.

Why though do we need free (malloc (8)); well, the sad truth is that most gnome apps nowadays don't actually have any references to the native memory allocation code: using g_new / g_malloc / g_free in their place. Sadly, this means that the linker, in all its wisdom notices that -lefence isn't being used and clobbers it silently, leaving people banging their heads against the wall trying to work out why it doesn't help them.

Either way, the trick then is to run my-binary inside gdb, and ( with patience ) hopefully your memory corruption will show early. -lefence has some horrible inefficiencies in its methodology that means that it doesn't scale well to large amounts of allocations, and this is mostly a feature of how processor page table lookup's are implemented in hardware sadly. So don't try it on evolution.

The man page is quite instructive on how to tweak it. Just man efence for more information.

Conclusions
  • Order your code to avoid redundant forward declarations, and as if by magic it will be more readable and maintainable.
  • Port your code to Gnome 2.0, but be slightly wary. Remember:
    • Watch out for multiple object destroys.
    • Type creation macros can protect you from TimJ.
    • Use care when dealing with expose/draw.
  • Avoid typing *'s and ('s just for fun. Function pointers are clean creatures at heart, and you should help keep them that way.
  • Electric fence will find the rat's nest of bugs in your code more quickly than you can alone.

Until next time, happy hacking.

Notes

[1] - With co-recursive functions its neccessary, but not much otherwise.

[2] - What would you get anyway, the first assembly instruction?


Novell Cool Solutions (corporate web communities) are produced by WebWise Solutions. www.webwiseone.com

© 2012 Novell