11 Feb 2007

Getting rid of virtual method calls

In Nebula2, platform abstraction was done through subclassing and virtual methods. For instance, the nGfxServer2 class defined the platform-independent interface of the graphics server, and a specific subclass (for instance nD3D9Server) implemented the Direct3D9 version of the graphics server, overriding the virtual methods of the base class. Client code talked to the nGfxServer2 class interface and didn't need to care whether rendering was done through Direct3D or some other rendering API.
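To make the starting point concrete, here is a minimal sketch of that style of abstraction. The class and method names (GfxServer, D3D9Server, BackendName, DescribeBackend) are simplified, hypothetical stand-ins and not the actual Nebula2 interface:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a platform-independent graphics server interface.
class GfxServer
{
public:
    virtual ~GfxServer() {}
    /// each backend overrides this; the call is resolved at runtime through the vtable
    virtual std::string BackendName() const { return "none"; }
};

// Hypothetical Direct3D9 backend, overriding the virtual method.
class D3D9Server : public GfxServer
{
public:
    std::string BackendName() const override { return "d3d9"; }
};

// Client code only sees the base class interface.
std::string DescribeBackend(const GfxServer& server)
{
    return server.BackendName();    // virtual dispatch happens here
}
```

Passing a D3D9Server to DescribeBackend() yields "d3d9", even though the function only knows about GfxServer. That flexibility is what the virtual call pays for.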

Depending on the host platform, the performance of virtual method calls ranges from slightly bad to very bad: the call goes through the object's vtable, and this additional memory lookup may miss the cache, while the indirect jump may be mispredicted and stall the instruction pipeline. A virtual call also usually can't be inlined. Virtual method calls are still the fastest way to get runtime polymorphism, but that's usually not necessary for platform abstraction, where "compile-time polymorphism" is often enough.

Nebula3 uses typedefs for platform abstraction. This eliminates most virtual methods and even enables inlining of frequently called methods, without sacrificing platform-independent code.

This is done by first writing a base class which defines the class interface. The base class usually doesn't have virtual methods (except the destructor, since the base class is usually derived from Core::RefCounted). From the base class, a platform-specific class is derived, which redefines most or all of the methods of the base class with platform-specific code. Since these methods aren't virtual, this is plain name hiding rather than overriding in the C++ sense. Finally, the platform-specific class is typedef-ed to the proper platform-independent class name, which is then used by the client code.
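One consequence is worth spelling out: with non-virtual methods, which version gets called depends on the static type of the object. The following sketch uses hypothetical names (DeviceBase, PlatformDevice, Device) purely for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical base class defining the interface, without virtual methods.
class DeviceBase
{
public:
    std::string Name() const { return "base"; }
};

// Hypothetical platform-specific class; Name() hides DeviceBase::Name().
class PlatformDevice : public DeviceBase
{
public:
    std::string Name() const { return "platform"; }
};

// The platform-independent name used by client code.
typedef PlatformDevice Device;

// Call through the typedef-ed type: resolved at compile time to PlatformDevice::Name().
std::string NameViaTypedef()
{
    Device device;
    return device.Name();
}

// Call through a base class reference: resolved at compile time to DeviceBase::Name().
std::string NameViaBase()
{
    Device device;
    const DeviceBase& base = device;
    return base.Name();
}
```

NameViaTypedef() returns "platform", while NameViaBase() returns "base". Client code should therefore always go through the typedef-ed class name and never through the base class.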

Here's an example: let's say we want to implement the RenderDevice class in the CoreGraphics namespace. First, the class interface is defined in the base class:


namespace CoreGraphics
{
class RenderDeviceBase : public Core::RefCounted
{
    DeclareClass(RenderDeviceBase);
    DeclareSingleton(RenderDeviceBase);
public:
    /// constructor
    ...
};

} // namespace CoreGraphics


Note that the class name is RenderDeviceBase, not RenderDevice.

From RenderDeviceBase, a platform-specific subclass called D3D9RenderDevice is derived. This is the Direct3D9 implementation of the RenderDevice class:


namespace CoreGraphics
{
class D3D9RenderDevice : public RenderDeviceBase
{
    ...
};

} // namespace CoreGraphics


Finally, there's the "proper" class header for the RenderDevice class (coregraphics/renderdevice.h), which performs the conditional typedef:


#if __USE_DIRECT3D9__
#include "coregraphics/d3d9/d3d9renderdevice.h"
namespace CoreGraphics
{
typedef D3D9RenderDevice RenderDevice;
}
#elif __USE_DIRECT3D10__
#include "coregraphics/d3d10/d3d10renderdevice.h"
namespace CoreGraphics
{
typedef D3D10RenderDevice RenderDevice;
}
#elif __USE_OPENGL__
#include "coregraphics/ogl/oglrenderdevice.h"
namespace CoreGraphics
{
typedef OGLRenderDevice RenderDevice;
}
#else
#error "RenderDevice class not implemented on this platform!"
#endif


Client code just works with the RenderDevice class and is completely unaware that it is actually using the D3D9RenderDevice or D3D10RenderDevice class. All platform-specific details are resolved at compile time, and all calls into the RenderDevice are normal method calls, not virtual method calls. This also lets the compiler do a much better optimization job (for instance inlining methods and better link-time code generation).
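The three pieces above can be condensed into one self-contained sketch. It is heavily simplified and rests on a few assumptions: the base class does not derive from Core::RefCounted, the macro USE_DIRECT3D9 is a hypothetical stand-in for the real platform define, and the methods Open(), IsOpen() and BackendName() as well as the function OpenAndDescribe() are made up for illustration:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Assumption: normally set by the build system; defined here so the sketch stands alone.
#ifndef USE_DIRECT3D9
#define USE_DIRECT3D9 1
#endif

namespace CoreGraphics
{
// Simplified base class: defines shared interface and state, no virtual methods.
class RenderDeviceBase
{
public:
    bool IsOpen() const { return this->isOpen; }
protected:
    bool isOpen = false;
};

#if USE_DIRECT3D9
// Simplified platform-specific class (hypothetical methods).
class D3D9RenderDevice : public RenderDeviceBase
{
public:
    bool Open() { this->isOpen = true; return true; }
    const char* BackendName() const { return "d3d9"; }
};
typedef D3D9RenderDevice RenderDevice;
#else
#error "RenderDevice class not implemented on this platform!"
#endif
} // namespace CoreGraphics

// The platform-independent name is just another name for the platform class...
static_assert(std::is_same<CoreGraphics::RenderDevice, CoreGraphics::D3D9RenderDevice>::value,
              "RenderDevice should be the D3D9 implementation in this sketch");
// ...and since nothing is virtual here, the class has no vtable at all.
static_assert(!std::is_polymorphic<CoreGraphics::RenderDevice>::value,
              "RenderDevice should have no virtual methods in this sketch");

// Client code: only the platform-independent name appears.
std::string OpenAndDescribe()
{
    CoreGraphics::RenderDevice device;
    if (!device.Open())
    {
        return "failed";
    }
    assert(device.IsOpen());
    return device.BackendName();    // plain, inlinable method call
}
```

In this sketch OpenAndDescribe() returns "d3d9". The two static_asserts check at compile time that the typedef really resolves to the platform class and that no virtual dispatch is involved.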