With all the focus on the console platforms I didn’t notice one very cool addition to the March DirectX SDK: XNAMath. This is basically the traditional Xbox360 vector math lib, ported to the PC with SSE2 and inlining support. The N3 math classes are now running from the same code base on top of XNAMath for the PC and Xbox360 platforms. Maik spent a few days analyzing the generated code, and after some tweaking the improvements in our simple math benchmarks are absolutely dramatic: up to 4x faster on the PC side!
We had to change our memory allocation routines on the PC to always return 16-byte aligned memory. Without this, XNAMath isn’t really useful, since the aligned load/store functions can’t be used on vectors residing in heap buffers. Really strange that there isn’t a way to do this through the Win32 heap functions directly (or is there?).
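To illustrate the alignment requirement, here is a minimal sketch of one common way to get 16-byte aligned blocks out of a heap that doesn't guarantee them: over-allocate, round the address up, and remember the raw pointer just in front of the aligned block so it can be freed later. The names AllocAligned16 and FreeAligned16 are made up for this example, and std::malloc/std::free merely stand in for the platform heap (HeapAlloc/HeapFree on Win32). This is not the actual N3 code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns a 16-byte aligned block of at least `size` bytes.
// std::malloc is an assumption standing in for the real platform heap call.
void* AllocAligned16(std::size_t size)
{
    const std::size_t align = 16;
    // Reserve worst-case padding plus one pointer slot for the raw address.
    void* raw = std::malloc(size + (align - 1) + sizeof(void*));
    if (raw == nullptr) return nullptr;

    // Leave room for the pointer slot, then round up to the next 16-byte boundary.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned = (start + (align - 1)) & ~static_cast<std::uintptr_t>(align - 1);
    assert((aligned & (align - 1)) == 0);

    // Store the raw pointer directly in front of the aligned block.
    void** header = reinterpret_cast<void**>(aligned) - 1;
    *header = raw;
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical counterpart: frees a block obtained from AllocAligned16.
void FreeAligned16(void* ptr)
{
    if (ptr == nullptr) return;
    // Recover the raw pointer that was stashed in front of the aligned block.
    void* raw = *(reinterpret_cast<void**>(ptr) - 1);
    std::free(raw);
}
```

The important detail is that the aligned pointer must never be handed back to the heap directly. Only the original raw pointer is valid for freeing, which is why it gets stored next to the block.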
Other than that I’m currently deep into “jobifying” the render thread, in order to free the PS3-PPU from the mundane number-crunching tasks. Properly jobified code will also “automatically” run about 2x faster on a 2-core PC, and about 3..4x faster on the Xbox360, since even single jobs will be split and processing will be distributed to worker threads. The actual speedup may even be higher, since the data must be re-organized into small independent chunks (“slices”) of about 16..32 kByte each in order to make the best use of the SPU local memory, and this improved spatial locality is also extremely beneficial for CPU caches on the other platforms. (I think I’m starting to sound like a broken record, but I can’t stress enough how good this data-reorganization will be for N3 on ALL platforms :)
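The slicing idea can be sketched roughly like this: a flat buffer is cut into independent slices of a fixed byte size, and the slices are handed out round-robin to worker threads. Everything here is an assumption for illustration (the names kSliceBytes, ProcessSlice and RunJob, the 16 kByte slice size, the use of std::thread, and the trivial "scale every float" job). The real N3 job system looks different.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical slice size, picked from the 16..32 kByte range mentioned above.
constexpr std::size_t kSliceBytes = 16 * 1024;

// Stand-in for a real job function: scales one slice in place.
// It only touches its own slice, so slices can run in parallel.
void ProcessSlice(float* data, std::size_t count, float factor)
{
    for (std::size_t i = 0; i < count; ++i) {
        data[i] *= factor;
    }
}

// Splits `data` into slices and distributes them to `numWorkers` threads.
void RunJob(std::vector<float>& data, float factor, unsigned numWorkers)
{
    assert(numWorkers > 0);
    const std::size_t sliceElems = kSliceBytes / sizeof(float);
    const std::size_t numSlices = (data.size() + sliceElems - 1) / sliceElems;

    std::vector<std::thread> workers;
    for (unsigned w = 0; w < numWorkers; ++w) {
        workers.emplace_back([&data, factor, w, numWorkers, sliceElems, numSlices] {
            // Worker w takes slices w, w + numWorkers, w + 2 * numWorkers, ...
            for (std::size_t s = w; s < numSlices; s += numWorkers) {
                const std::size_t begin = s * sliceElems;
                const std::size_t count = std::min(sliceElems, data.size() - begin);
                ProcessSlice(data.data() + begin, count, factor);
            }
        });
    }
    for (std::thread& t : workers) {
        t.join();
    }
}
```

Because each slice is a contiguous block that fits comfortably into a small local memory or cache, the same layout works both for an SPU-style worker and for plain CPU threads.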
4 comments:
So, out of curiosity, are you now using _aligned_malloc() or something like
void* data = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size + 15);
void* data_aligned = (void*)(((uintptr_t)data + 15) & ~(uintptr_t)15);
for 16-byte aligned heap allocations? Guessing from various Internet sources, the latter should be faster.
We're using HeapAlloc wrapper functions (called __HeapAlloc16, __HeapFree16, etc...), so it's like your second example.
Hello! I don't know exactly where to ask this, but I'll try here. I've been playing Drakensang and was impressed with its technical level (N2). (I used the nnpack tool to get at the game data :) ). What I found most amazing are the shader trees.
Very tricky and nice looking. So I wonder, why were these trees dropped in DS:DE:River of Time? Is this about the N3 specs?