Since the start of the GTK4 development branch I've had to deal with creating fundamental types to replace ad hoc boxed types with inheritance three times; I thought about writing this stuff down, so the next time somebody thinks "I don't want to use GObject but I want a type hierarchy" they'll do something that doesn't make people using language bindings cry.
The particular nature of our work is up for any amount of debate, but the
basic fact of it comes with a few requirements, and they are by and large
inevitable if you wish to be a well-behaved, well-integrated member of the
GNOME community. One of which is: “please, think of the language bindings”.
These days, luckily for all of us, this means writing introspectable
interfaces that adhere to fairly sensible best practices and conventions.
One of the basic conventions has to do with types. By and large, types
exposed by libraries fall into these two categories:
- plain old data structures, which are represented by what’s called a
  “boxed” type; these are simple types with a copy and a free function,
  mostly meant for marshalling things around so that language bindings can
  implement properties, signal handlers, and abide by ownership transfer
  rules. Boxed types cannot have sub-types.
- object types, used for everything else: properties, emitting signals,
  inheritance, interface implementation, the whole shebang.
Boxed and object types cover most of the functionality in a modern,
GObject-based API, and people can consume the very same API from languages
that are not C.
Except that there’s a third, kind of niche data type:
- fully opaque, with instance fields only known within the scope of the
  project itself
- immutable, or at least with low mutability, after construction
- reference counted, with optional cloning and serialization
- derivable within the scope of the project, typically with a base
  abstract class
- without signals or properties
Boxing
One strategy used to implement this niche type has been to use a boxed type,
and then invent some private, ad hoc derivation technique, with some
structure containing function pointers used as a vtable, for instance:
The code above lets us create derived types that conform to the base type
API contract, while providing additional functionality; for instance:
Since the Base type is also a boxed type, it can be used for signal
marshallers and GObject properties at zero cost.
This whole thing seems pretty efficient, and fairly simple to wrap your head
around, but things fall apart pretty quickly as soon as you make this API
public and tell people to use it from languages that are not C.
As I said above, boxed types cannot have sub-types; the type system has no
idea that DerivedA implements the Base API contract. Additionally, since
the whole introspection system is based on conventions applied on top of
some C API, there is no way for language bindings to know that the
derived_a_get_some_other_field() function is really a DerivedA method,
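meant to operate on DerivedA instances. Instead, you’ll only be able to
access the method as a static function of the namespace, with the instance
passed as an explicit first argument rather than as the receiver of a
method call.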
In short: please, don’t use boxed types for this, unless you’re planning to
hide this functionality from the public API.
Typed instances
At this point the recommendation would be to switch to GObject for your
type; make the type derivable in your project’s scope, avoid properties and
signals, and you get fairly idiomatic code, and a bunch of other features,
like weak references, toggle references, and keyed instance data. You can
use your types for properties and signals, and you’re pretty much done.
But what if you don’t want to use GObject…
Well, in that case GLib lets you create your own type hierarchy, with its
own rules, by using GTypeInstance as the base type.
GTypeInstance is the common ancestor for everything that is meant to be
derivable; it’s the base type for GObject as well. Implementing a
GTypeInstance-derived hierarchy doesn’t take much effort: it’s mostly low
level glue code:
Yes, this is a lot of code.
The base code stays pretty much the same, except:
- the reference counting is explicit, as we must use
  g_type_create_instance() and g_type_free_instance() to allocate and
  free the memory associated with the instance
- you need to get the class structure from the instance using the GType
  macros instead of direct pointer access
Finally, you will need to add code to let you register derived types; since
we want to tightly control the derivation, we use an ad hoc structure for
the virtual functions, and we use a generic class initialization function.
Otherwise, you could re-use the G_DEFINE_TYPE macro—yes, it does not
require GObject—but then you’d have to implement your own class
initialization and instance initialization functions.
After you have defined the base type, you can structure your types in the same
way as the boxed type code:
The nice bit is that you can tell the introspection scanner how to deal with
each derived type through annotations, and keep the API simple to use in C
while idiomatic to use in other languages:
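As an illustration, the documentation comments below use the (type) annotation to tell the scanner that functions taking or returning a Base pointer really belong to DerivedA, and the (ref-func) and (unref-func) annotations to describe memory management for the base type. The function names and the integer field are assumptions; the annotations are the ones I believe GObject-Introspection provides for this purpose, so check them against its annotation reference before relying on them.

```c
typedef struct _Base Base;

/**
 * Base: (ref-func base_ref) (unref-func base_unref)
 *
 * The abstract base type of the hierarchy.
 */

/**
 * derived_a_new:
 * @some_field: the initial value of the field
 *
 * Creates a new `DerivedA` instance.
 *
 * Returns: (transfer full) (type DerivedA): the newly created instance
 */
Base *derived_a_new (int some_field);

/**
 * derived_a_get_some_field:
 * @base: (type DerivedA): a `DerivedA` instance
 *
 * Retrieves the value of the field.
 *
 * Returns: the value of the field
 */
int derived_a_get_some_field (Base *base);
```

The C API keeps passing Base pointers around, while bindings get a constructor and a method on DerivedA.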
Cost-benefit
Of course, there are costs to this approach. In no particular order:
- The type system boilerplate is a lot; the code size more than doubled
  from the boxed type approach. This is quite annoying, but at least it is
  a one-off cost, and you won’t likely ever need to change it. It would be
  nice to have it hidden by some magic macro incantation, but it’s
  understandably hard to do so without imposing restrictions on the kind of
  types you can create; since you’re trying to escape the restrictions of
  GObject, it would not make sense to impose a different set of restrictions.
- If you want to be able to use this new type with properties and you
  cannot use G_TYPE_POINTER as a generic, hands-off container, you
  will need to derive GParamSpec, and add ad hoc API for GValue,
  which is even more annoying boilerplate. I’ve skipped it in the example,
  because that would add about 100 more lines of code.
- Generated signal marshallers, and the generic one using libffi, do not
  know how to marshal typed instances; you will need custom written
  marshallers, or you’re going to use G_TYPE_POINTER everywhere and
  assume the risk of untyped boxing. The same applies to anything that uses
  the type system to perform things like serialization and deserialization,
  or GValue boxing and unboxing. You decided to build your own theme park
  on the Moon, and the type system has no idea how to represent it, or
  access its functionality.
- Language bindings need to be able to deal with GTypeInstance and
  fundamental types; this is not always immediately necessary, so some
  maintainers do not add the code to handle this aspect of the type system.
The benefit is, of course, the fact that you are using a separate type
hierarchy, and you get to make your own rules on things like memory
management, lifetimes, and ownership. You can control the inheritance chain,
and the rules on the overridable virtual functions. Since you control the
whole type, you can add things like serialization and deserialization, or
instance cloning, right at the top of the hierarchy. You could even
implement properties without using GParamSpec.
Conclusion
Please, please use GObject. Writing type system code is already boring and
error prone, which is why we added a ton of macros to avoid people shooting
themselves in both their feet, and we hammered away all the special
snowflake API flourishes that made parsing C API to generate introspection
data impossible.
I can only recommend you go down the GTypeInstance route if you’ve done
your due diligence on what that entails, and are aware that it is a last
resort if GObject simply does not work within your project’s constraints.