15 Mar 2008

Vertex Component Packing

I finally got around to optimizing vertex component sizes for Drakensang. A typical vertex (coords, normal, tangent, binormal, one uv-set) is now 28 bytes instead of 56 bytes, a light-mapped mesh vertex (2 uv-sets) is now 32 bytes instead of 64, and a skinned vertex has been reduced to 36 bytes instead of 88. With this step I have finally burned all DX7 bridges; all our projects have a 2.0 minspec now (since Radon Labs also does casual titles, we had to support Win98 and DX7 for much too long). As a result, the size of all mesh resources in Drakensang has been reduced from a whopping 1.2 GByte down to about 650 MByte. This also means reduced loading times and better vertex throughput when transferring vertex data to the graphics chip. Some vertex components need to be scaled to the proper range in the vertex shader, but this is at most one multiply-add operation per component.

I also implemented support for the new vertex formats in Nebula3. N3 always had support for packed vertex components, so all I had to do was to add a few lines to the legacy NVX2 mesh loader and fix a few places in the vertex shaders for unpacking normals and texcoords.

Here's how the vertex components are now packed by default:
  • Position: Float3 (just as before)
  • Normal, Tangent, Binormal: UByte4N (unsigned byte, normalized)
  • TexCoord: Short2 as 4.12 fixed point
  • Color: UByte4N
  • Skin Weights: UByte4N
  • Skin Joint Indices: UByte4
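
To make the list above concrete, here is a rough C++ sketch of the typical packed vertex and of the packing math. The struct and function names are made up for this example and are not Nebula3 types. The mapping of normals from [-1, 1] to [0, 255], the rounding mode, and the reading of "4.12" as 12 fractional bits in a signed 16-bit value are assumptions, since the post does not spell them out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical layout of the "typical" packed vertex (not an actual Nebula3 type).
struct PackedVertex {
    float        position[3];  // Float3:  12 bytes
    std::uint8_t normal[4];    // UByte4N:  4 bytes
    std::uint8_t tangent[4];   // UByte4N:  4 bytes
    std::uint8_t binormal[4];  // UByte4N:  4 bytes
    std::int16_t uv0[2];       // Short2 (4.12 fixed point): 4 bytes
};
static_assert(sizeof(PackedVertex) == 28, "typical packed vertex should be 28 bytes");

// Assumed encoding: map [-1, 1] to [0, 255] with round-to-nearest.
inline std::uint8_t PackNormalComponent(float n) {
    assert(!std::isnan(n));
    const float clamped = std::min(1.0f, std::max(-1.0f, n));
    return static_cast<std::uint8_t>(std::lround((clamped * 0.5f + 0.5f) * 255.0f));
}

// Mirrors the shader side: a UByte4N component arrives as byte / 255 in [0, 1],
// and a single multiply-add brings it back to [-1, 1].
inline float UnpackNormalComponent(std::uint8_t b) {
    const float normalized = static_cast<float>(b) / 255.0f;
    return normalized * 2.0f - 1.0f;
}

// 4.12 fixed point, read here as 12 fractional bits in a signed 16-bit value:
// range [-8, 8) in steps of 1/4096. Out-of-range input is clamped.
inline std::int16_t PackTexCoord(float t) {
    assert(!std::isnan(t));
    const float scaled = std::round(t * 4096.0f);
    const float clamped = std::min(32767.0f, std::max(-32768.0f, scaled));
    return static_cast<std::int16_t>(clamped);
}

// Mirrors the shader side: a Short2 component arrives as the raw integer value,
// so one multiply by 1/4096 recovers the texture coordinate.
inline float UnpackTexCoord(std::int16_t s) {
    return static_cast<float>(s) * (1.0f / 4096.0f);
}
```

The other two sizes follow from the same table: a second Short2 uv-set adds 4 bytes (32 total), and UByte4N skin weights plus UByte4 joint indices add 8 bytes to the typical vertex (36 total).
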
Normals, tangents, binormals and tex-coords need an extra unpacking instruction in the vertex shader. Skin weights need to be "re-normalized" in the vertex shader because they lose too much precision:

float4 weights = packedWeights / dot(packedWeights, float4(1.0, 1.0, 1.0, 1.0));

This will make sure that the components add up to 1.0. In case you're wondering, the dot product is equivalent to s = (x + y + z + w); it's just much more efficient because the dot product is a native vertex shader instruction (although I must confess that I haven't checked yet whether fxc's optimizer is clever enough to optimize the horizontal sum into a dot product automatically).
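
To see what the re-normalization buys, here is a small CPU-side C++ sketch that mimics the shader line. It assumes the exporter simply truncates when converting float weights to bytes, which is a guess on my part for illustration; all names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Weights = std::array<float, 4>;
using PackedWeights = std::array<std::uint8_t, 4>;

// Assumed exporter behaviour: truncate each weight in [0, 1] to a byte.
PackedWeights QuantizeTruncate(const Weights& w) {
    PackedWeights out{};
    for (std::size_t i = 0; i < w.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(w[i] * 255.0f);
    }
    return out;
}

// What a UByte4N component looks like when it reaches the shader: byte / 255.
Weights Decode(const PackedWeights& p) {
    Weights out{};
    for (std::size_t i = 0; i < p.size(); ++i) {
        out[i] = static_cast<float>(p[i]) / 255.0f;
    }
    return out;
}

// Horizontal sum, i.e. dot(w, float4(1, 1, 1, 1)).
float Sum(const Weights& w) {
    return w[0] + w[1] + w[2] + w[3];
}

// CPU equivalent of: packedWeights / dot(packedWeights, float4(1, 1, 1, 1))
Weights Renormalize(const Weights& w) {
    const float sum = Sum(w);
    assert(sum > 0.0f);  // at least one weight must be non-zero
    Weights out{};
    for (std::size_t i = 0; i < w.size(); ++i) {
        out[i] = w[i] / sum;
    }
    return out;
}
```

With weights (0.5, 0.3, 0.15, 0.05), truncation yields the bytes 127, 76, 38 and 12, which sum to 253 rather than 255, so the decoded weights add up to less than 1.0 until they are divided by their sum.
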


Arseny Kapoulkine said...

Renormalizing weights is not needed, you just need to ensure that bytes that form a packed ubyte4 sum to 255 exactly (or, if you have 2 ubyte4 for 8 weights, that all 8 sum to 255).
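
A minimal C++ sketch of how an exporter might enforce this: round every weight down, then hand the missing units to the components with the largest rounding error. This largest-remainder scheme and the function name are assumptions for illustration only; the comment does not prescribe a particular method.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical exporter helper: quantize 4 skin weights so that the
// resulting bytes sum to exactly 255. Negative inputs are treated as 0.
std::array<std::uint8_t, 4> QuantizeWeightsTo255(const std::array<float, 4>& w) {
    std::array<float, 4> clamped{};
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i) {
        clamped[i] = std::max(w[i], 0.0f);
        sum += clamped[i];
    }
    assert(sum > 0.0f);  // at least one weight must be non-zero

    // Scale so the ideal (unrounded) values sum to 255, then round down.
    std::array<int, 4> q{};
    std::array<float, 4> frac{};
    int total = 0;
    for (int i = 0; i < 4; ++i) {
        const float scaled = clamped[i] * (255.0f / sum);
        q[i] = static_cast<int>(scaled);
        frac[i] = scaled - static_cast<float>(q[i]);
        total += q[i];
    }

    // Hand out the missing units, largest fractional remainder first.
    std::array<int, 4> order = {0, 1, 2, 3};
    std::sort(order.begin(), order.end(),
              [&frac](int a, int b) { return frac[a] > frac[b]; });
    const int missing = 255 - total;
    for (int k = 0; k < missing; ++k) {
        ++q[order[k % 4]];
    }

    std::array<std::uint8_t, 4> out{};
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<std::uint8_t>(q[i]);
    }
    return out;
}
```

Because the bytes sum to 255, the UByte4N values seen by the shader sum to 1.0 (up to float rounding) without a per-vertex divide.
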

Floh said...

Yes, makes sense. I didn't pay attention to this issue at first, since at first glance the characters that I tested looked fine. Only after I had exported the Drakensang data for final checking did I notice that fine detail in the character faces was borked, so I did sort of a last-minute fix with the renormalization in the vertex shader. It didn't dawn on me that the real problem is the clamping when converting from float to uchar in the export step. I'll try that out when I'm back in the labs next week.

Floh said...

PS: Arseny, do you actually use 8 weights in your skinning? Is this for normal character models? Is there any noteworthy visual improvement over 4 weights? We're currently considering going back to only 2 weights on some platforms where we're doing CPU skinning (DS and Wii), so I'd be interested in any pro and con arguments...

Arseny Kapoulkine said...

It's 2 bones per vertex for our PSP projects and 4 bones per vertex for PS3/360, our artists are quite happy with it - in fact I think 8 weights are needed only if you want to do facial animation with bones.

Kim Hyoun Woo said...

Supporting vertex component compression will also be good for deferred shading. So, is there any plan to release deferred shading stuff for N3 in the near future?