28 Feb 2008

COD4 FTW!

Hmm, just noticed that COD4 is now my personal most-played game, replacing Oblivion, which had ruled the top spot since I bought my 360 in 2006. The funny thing is that I remember Oblivion as that huge time-eater which I played for the whole summer of 2006, while COD4 feels like I've just barely started to really play it. Scary.

I also finally beat the final boss in Prince Of Persia (the XBLA remake)! The combat system is surprisingly complex; attacking and blocking require very good timing, a bit like the sword fights in Assassin's Creed. Oh, and I started Lost Odyssey. Don't know what to think of it yet. It didn't exactly grab me; I'm about 3 hours in, and there's just nothing happening in this game! Everything is so stretched out, and I keep thinking "man, I could play some COD4 instead of struggling through this borefest"... and that's what I usually end up doing about 10 minutes later...

I'm also currently on my third DMC4 play-through in Son-Of-Sparda mode. Great great game. Nice distraction until the king returns in June ;)

27 Feb 2008

Low-Level Optimizations

I'm currently doing memory optimizations in Drakensang, and together with the new ideas from the asynchronous rendering code in N3, I'm going to do a few low-level optimizations in Nebula3's memory subsystem over the next few weeks. Here's what I'm planning to do:
  • Add a thread-safe flag to the Heap class. Currently a heap is always thread-safe, but there are now quite a few cases where it makes sense to avoid the additional thread-safety overhead in the allocation routines.
  • Add some useful higher-level allocators:
    • FixedSizeAllocator: This optimizes the allocation of many small same-size objects from the heap. It pre-allocates big pages and manages the memory within the pages itself. The main advantage comes from the fact that all blocks in a page are guaranteed to be the same size.
    • BucketAllocator: This is a general allocator which holds a number of buckets for e.g. 16, 32, 48, ... 256 byte blocks (the buckets are just normal FixedSizeAllocators). Small allocations can be satisfied from the buckets, larger allocations go directly through the heap as usual.
  • Override the new operator in all RefCounted-derived classes to use a BucketAllocator (however, I'll actually do some profiling first to check whether this is really faster than Windows' Low Fragmentation Heap). This stuff will happen behind the scenes in the DeclareClass()/ImplementClass() macros, so no changes to the class source code are necessary. See the sketch right after this list for the general idea.
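
To make this more concrete, here's a minimal sketch of how the two allocators could fit together. All names and the page-management details are made up for illustration, the real Nebula3 code may end up looking quite different:

    // hypothetical sketch, not the actual Nebula3 implementation
    #include <cstdlib>
    #include <cassert>

    class FixedSizeAllocator
    {
    public:
        FixedSizeAllocator(size_t blockSize_, size_t blocksPerPage_) :
            blockSize(blockSize_),
            blocksPerPage(blocksPerPage_),
            freeList(0)
        {
            // a free block must be able to hold the free-list link
            assert(this->blockSize >= sizeof(void*));
        }

        void* Alloc()
        {
            if (0 == this->freeList)
            {
                this->AllocPage();
            }
            // pop the first free block from the free list
            void* block = this->freeList;
            this->freeList = *(void**)block;
            return block;
        }

        void Free(void* block)
        {
            // push the block back onto the free list; since all blocks
            // in a page have the same size, no per-block header is needed
            *(void**)block = this->freeList;
            this->freeList = block;
        }

    private:
        void AllocPage()
        {
            // pre-allocate one big page and chain its blocks into the
            // free list (pages are never returned to the OS in this sketch)
            char* page = (char*)std::malloc(this->blockSize * this->blocksPerPage);
            for (size_t i = 0; i < this->blocksPerPage; i++)
            {
                this->Free(page + i * this->blockSize);
            }
        }

        size_t blockSize;
        size_t blocksPerPage;
        void* freeList;
    };

    class BucketAllocator
    {
    public:
        BucketAllocator()
        {
            // buckets for 16, 32, 48, ... 256 byte blocks
            for (int i = 0; i < NumBuckets; i++)
            {
                this->buckets[i] = new FixedSizeAllocator((i + 1) * 16, 256);
            }
        }

        void* Alloc(size_t size)
        {
            assert(size > 0);
            if (size <= NumBuckets * 16)
            {
                // small allocations are satisfied from the matching bucket
                return this->buckets[(size - 1) / 16]->Alloc();
            }
            // large allocations go through the heap as usual
            return std::malloc(size);
        }

        void Free(void* ptr, size_t size)
        {
            if (size <= NumBuckets * 16)
            {
                this->buckets[(size - 1) / 16]->Free(ptr);
            }
            else
            {
                std::free(ptr);
            }
        }

    private:
        enum { NumBuckets = 16 };
        FixedSizeAllocator* buckets[NumBuckets];
    };

The class-level operator new/delete generated by the DeclareClass()/ImplementClass() macros would then simply forward to such a BucketAllocator instance.
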
The biggest new feature (which depends on all of the above) is that I want to split RefCounted classes into thread-safe and thread-local classes. The idea is that a thread-local class promises that creation, manipulation and destruction of its objects happen on the same thread. A thread-local class can do a few important things with less overhead:
  • thread-local classes would create their instances from a thread-local BucketAllocator which doesn't have to be thread-safe
  • thread-local classes could use normal increment and decrement operations for their refcounting instead of Interlocked::Increment() and Interlocked::Decrement(). Since every Ptr<> assignment changes the refcount of an object, this can add up quite a bit (see the sketch below).
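
To illustrate the difference, here's a minimal sketch of the two refcounting flavors. The class layout is made up, and the interlocked calls shown are the plain Win32 ones which I assume Nebula3's Interlocked::Increment()/Decrement() wrap on Windows:

    // hypothetical sketch contrasting the two refcounting flavors
    #include <windows.h>

    class ThreadSafeRefCounted
    {
    public:
        ThreadSafeRefCounted() : refCount(0) { }
        void AddRef()
        {
            // atomic increment: correct from any thread, but implies a bus lock
            InterlockedIncrement(&this->refCount);
        }
        void Release()
        {
            if (0 == InterlockedDecrement(&this->refCount))
            {
                delete this;
            }
        }
    protected:
        virtual ~ThreadSafeRefCounted() { }
    private:
        volatile LONG refCount;
    };

    class ThreadLocalRefCounted
    {
    public:
        ThreadLocalRefCounted() : refCount(0) { }
        void AddRef()
        {
            // plain increment: only valid because all accesses happen
            // on the same thread, but much cheaper per Ptr<> assignment
            ++this->refCount;
        }
        void Release()
        {
            if (0 == --this->refCount)
            {
                delete this;
            }
        }
    protected:
        virtual ~ThreadLocalRefCounted() { }
    private:
        int refCount;
    };
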
By far most classes in Nebula3 can be thread-local; only message objects which are used for inter-thread communication, and some low-level classes, need to be thread-safe. I'm planning to enforce thread-locality in Debug mode by storing a pointer to the local thread-name in each instance, and checking in AddRef(), Release() and other RefCounted methods whether the current thread context is the same (that's a simple pointer comparison, and it happens only in Debug mode).
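
Extending the thread-local sketch from above, the Debug-mode check could look roughly like this; Thread::GetMyThreadName() and NEBULA3_DEBUG are placeholder names, not the actual Nebula3 identifiers:

    // hypothetical Debug-mode thread-locality check
    class ThreadLocalRefCounted
    {
    public:
        ThreadLocalRefCounted() : refCount(0)
        {
    #if NEBULA3_DEBUG
            // remember a pointer to the creator thread's name
            this->creatorThreadName = Thread::GetMyThreadName();
    #endif
        }

        void AddRef()
        {
    #if NEBULA3_DEBUG
            // a simple pointer comparison, compiled away in Release mode
            n_assert(this->creatorThreadName == Thread::GetMyThreadName());
    #endif
            ++this->refCount;
        }

        // ... Release() and the other RefCounted methods perform the same check ...

    private:
    #if NEBULA3_DEBUG
        const char* creatorThreadName;
    #endif
        int refCount;
    };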

The general strategy is to move away as far as possible from generic Alloc()/Free() calls, and to make memory management more "context sensitive" and also more static (in the sense that the memory layout doesn't change wildly over the runtime of the game). It's extremely important for a game application to set aside fixed amounts of memory right from the beginning of the project (and to let the game fail hard if the limits are violated); otherwise everything will grow without control, and towards the end of the project much time must be invested to cut everything back down to reasonable limits.

19 Feb 2008

Nebula3 February 2008 SDK

Alright, here it is finally:

N3SDK_Feb2008.exe

Notable new features:
  • a "quick'n'dirty port" of our current Nebula2 modular character and animation system
  • PSSM-VSM shadow support for global light sources (which I'm not quite happy with yet)
  • some restructuring and cleanup in the Application layer
The internal Wii port is coming along nicely: Johannes got SQLite up and running on the Wii (which will also be helpful for other console ports), and apart from physics and collision detection, the Application Layer is now running on the Wii.

During the last 2 weekends I started to implement a new AsyncGraphics subsystem, which will put the Graphics subsystem and everything beneath it into its own thread. I'll write more about this in another post soon.

Finally, here's a screenshot of the new "demo character":

9 Feb 2008

COD4 Multiplayer

I have developed a serious addiction to COD4 multiplayer. Usually I'm not that big on competitive multiplayer, but I have my phases. Back in the old days it was Counter-Strike, a few years later I was excessively playing Battlefield 2, and now for the last couple of weeks I've been totally hooked on COD4. On a console. With a game-pad. I tried the competitive multiplayer portions of Gears Of War, Rainbow Six Vegas and Halo 3 before, and none of them really clicked with me in multiplayer like the old-school PC shooters. But COD4... one match did it. The reason - I think - is that COD4 is the love-child of CS and BF2, my two favorites of the past. Sessions feel very fast-paced, with an initial rush to the control points, just like in CS. The class system and leveling-up are more like BF2, but more back-to-the-roots, especially the class system, which doesn't feature advanced classes like medics or engineers; instead, all classes are strictly offensive. Every single feature is extremely polished, and everything that would hinder the flow and speed of the game has been removed. And my personal killer feature: you don't have to communicate a lot. Nothing breaks the immersion more than some jerks talking about completely unrelated shit like their last GameStop visit during a Team Deathmatch in some destroyed Middle Eastern city. Fortunately the game plays just as well without a headset. Big win.

2 Feb 2008

D3D Debugging

I just spent a bit of time debugging the D3D9-specific code. Running the test viewer under the D3D debug runtime with the warning level set to highest reveals two warnings:
  • redundant render-state switches (which I'm ignoring for now; the frame shader system already helps to reduce redundant state switches a lot, and fixing the remaining state-switch warnings would involve implementing a D3DXEffectStateManager, but before I do this I want to make sure that my own redundant-state-switch detection would actually be faster than D3D's)
  • more serious is the second warning: "render target was detected as bound, but couldn't detect if texture was actually used in rendering". This is only a real problem if you want to read from the same render target you're currently rendering to, which doesn't happen anywhere in Nebula3 (unless you screw up the frame-shaders). I fixed the warning by adding an "UnbindD3D9Resources()" method to the D3D9RenderDevice, which is called at the end of a rendering pass and before the D3D9 device is shut down. The method simply sets the texture stages, the vertex buffer and the index buffer of the device to NULL (see the sketch below).
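
For reference, here's roughly what the method boils down to. The member names and the texture-stage count are assumptions, but the device calls are the standard IDirect3DDevice9 ones:

    // clear all resource bindings on the D3D9 device, so that no render
    // target, texture or geometry stays bound across passes or at shutdown
    void
    D3D9RenderDevice::UnbindD3D9Resources()
    {
        const DWORD maxTextureStages = 8;   // assumption about the stages in use
        for (DWORD stage = 0; stage < maxTextureStages; stage++)
        {
            this->d3d9Device->SetTexture(stage, NULL);
        }
        this->d3d9Device->SetStreamSource(0, NULL, 0, 0);
        this->d3d9Device->SetIndices(NULL);
    }
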
The most serious problem though was that Direct3D reported memory leaks when shutting down the application. Finding a D3D9 memory leak can be tricky, but thankfully Direct3D has a nice mechanism built into its debug runtime to find the allocation which causes the leak: on shutdown, D3D writes a memleak log to debug-out, where each memory leak is given a unique id. The problem is that even one forgotten Release() call can generate hundreds of memory leaks because of D3D's many internal dependencies. The most interesting leak, however, is usually the last one reported, at the bottom of the leak report. To find the offending allocation, open the DirectX Control Panel, go to the Direct3D 9 tab, and enter the last reported AllocID into the "Break On AllocID" field. Run the application in the debugger, and it should break at the allocation call in question. Turns out I forgot 3 Release() calls: one in D3D9RenderTarget::BeginPass() after obtaining the backbuffer surface from the D3D9 device, one in D3D9RenderTarget::EndPass() after a GetSurfaceLevel(), and one after another GetSurfaceLevel() call in D3D9RenderDevice::SaveScreenShot().
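
The underlying pattern is always the same: getter calls like GetBackBuffer() and GetSurfaceLevel() return a COM interface with an incremented refcount, and each of them needs a matching Release(). A sketch of the general fix (member names assumed):

    // the surface returned by GetSurfaceLevel() arrives with its refcount
    // already incremented; SetRenderTarget() takes its own reference, so
    // the local reference must be dropped afterwards
    IDirect3DSurface9* surface = 0;
    HRESULT hr = this->d3d9Texture->GetSurfaceLevel(0, &surface);
    n_assert(SUCCEEDED(hr));
    this->d3d9Device->SetRenderTarget(0, surface);
    surface->Release();    // without this, D3D reports a memory leak on shutdown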

The moral of the story: read the D3D9 docs carefully even for "uninteresting" functions, and run the application through the debug runtime after each change to the rendering code, to prevent such bugs from piling up to unmanageable levels.

I wanted to get this stuff fixed before the January SDK release, so this will come next week as time permits. Drakensang bugfixing and optimization has full priority for me at the moment.

Completely unrelated:
  • Watched Death Proof yesterday, and I was a little bit disappointed. The first half was outright boring, then came 10 seconds with (probably) the most spectacular (and gory) car crash in movie history. The second half with the "new girls" was actually really good, but the car chase at the end and the finale were quite a letdown as well; guess I was hoping to see Kurt Russell die in a more spectacular way hehe...
  • RezHD on XBLA is ... wow.