Ok, so I’ve finally managed to get back to doing more testing on the ‘integrated’ texture shadow techniques this week (formerly known as ‘custom sequence’ texture shadows but I thought that was an ambiguous term). For those who haven’t been following this, ‘integrated’ shadows in Ogre are those where the shadowing calculation is done directly in the primary material of the receiver, and not done as additional passes. The easiest way to use Ogre’s texture shadows (and the only way previously) is to use the non-integrated methods, which basically do one of two things - re-render the receivers in the scene with a modulative pass per light, or break apart your receiver materials into ambient, per-light and decal (albedo) render passes in order to build shadow-masked lighting up additively. This is great in a way, especially for people using fixed-function materials since it’s all automatic and you don’t have to worry about shadows when writing your materials. The downside is that it takes extra passes, burning GPU time and bandwidth.
‘Integrated’ shadows, in contrast, totally get rid of the extra receiver passes, and expect you to use the shadow textures inside your own object materials. This is clearly faster, and you also get total control of exactly how the shadows are applied in your lighting calculations, which can be very handy, especially if you want accurate specular effects or HDR. The downside is that it puts the onus on you to pull in the shadow textures and use them, although in practice this isn’t very hard. You basically use a regular texture_unit directive, but instead of referencing a regular texture, you use the ‘content_type shadow’ attribute, and Ogre matches up the shadow textures for the ‘most important’ lights to fill as many of those units as you declare, syncing them up with the lighting and texture projection parameters that you auto-bind to your shaders too. Pretty funky, and although it’s more advanced I personally wouldn’t want to do it any other way now in a serious project since it’s so much more powerful.
One of the first things I did this week was improve the fallback systems for when you run out of relevant lights, or you ask for spotlight parameters when in fact you don’t have a spotlight in that slot. Now there are safe fallbacks, so for example if your shader is written for 2 shadowed lights but there’s only 1 active light in range, the second shadow texture is bound to a properly initialised 1×1 texture which will result in no shadows. This wouldn’t be necessary if your shader implements additive lighting, since with no light the light colour would be black and garbage in the shadow texture wouldn’t matter, but since I don’t want to assume how people will implement their shaders I wanted to make this as safe as possible.
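To make that fallback reasoning a bit more concrete, here’s a minimal C++ sketch of the idea. It isn’t Ogre code and it isn’t a real shader: every name in it (shadowTerm, kNullShadowDepth, shadeAdditive, shadeModulative) is made up for illustration, lighting is reduced to a single float, and I’m assuming a plain ‘receiver depth <= stored depth’ comparison and a null texel filled with the largest float value.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical value stored in the single texel of the 1x1 fallback texture:
// a depth so large that no receiver can ever be behind it.
constexpr float kNullShadowDepth = std::numeric_limits<float>::max();

// Assumed depth comparison: 1.0 = lit, 0.0 = in shadow.
inline float shadowTerm(float receiverDepth, float storedDepth) {
    return receiverDepth <= storedDepth ? 1.0f : 0.0f;
}

// Additive style: each light's contribution is masked by its own shadow term,
// so a black (absent) light hides whatever its shadow term happens to be.
inline float shadeAdditive(float ambient, const float intensity[2], const float shadow[2]) {
    return ambient + intensity[0] * shadow[0] + intensity[1] * shadow[1];
}

// Modulative style: the shadow terms multiply the already-lit colour,
// so a garbage term in an unused slot darkens the result.
inline float shadeModulative(float lit, const float shadow[2]) {
    return lit * shadow[0] * shadow[1];
}
```

With only one active light, an additive shader gives the same answer whether the second shadow term is 0 or 1, but a modulative one goes black if the unused slot reads as ‘shadowed’. Feeding the second slot from the null texel makes shadowTerm return 1 for any finite receiver depth, so both styles behave.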
One thing this triggered me to do today as I did some more testing was to add support for 3 new pixel formats: PF_SHORT_GR, PF_FLOAT16_GR and PF_FLOAT32_GR. Obviously these are 2-channel versions of the formats we already offer in R, RGB and RGBA forms. I’ve thought about this a couple of times since GR formats can occasionally be useful, but never got around to it. What bumped it up the priority list was that in D3D, when you ask for a D3DFMT_R16F (i.e. PF_FLOAT16_R), D3D actually gives you a D3DFMT_G16R16. This surprised me since I know that using a 16-bit floating point format is faster even with a single channel, but clearly size-wise it doesn’t give you any benefits since it’s padded to 32 bits, understandably for the hardware alignment. However, I was just surprised it didn’t give me a D3DFMT_G16R16F - notice it actually gave me the integer format, for a reason I can’t explain. Most odd. Perhaps this has been the root of some of my precision problems in depth shadow mapping? More investigation needed there.
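The size arithmetic is simple enough to write down. This is just a sketch: the TwoChannelFormat enum and bytesPerTexel function are hypothetical stand-ins for the three new formats, not Ogre’s actual pixel format API, and the padded single-channel size is taken from the behaviour described above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of PF_SHORT_GR, PF_FLOAT16_GR and PF_FLOAT32_GR.
enum class TwoChannelFormat { ShortGR, Float16GR, Float32GR };

// Two channels at 16 bits each is 4 bytes per texel; two at 32 bits is 8.
constexpr std::size_t bytesPerTexel(TwoChannelFormat f) {
    return f == TwoChannelFormat::Float32GR ? 8 : 4;
}

// A single 16-bit channel padded out to 32 bits, as observed above,
// occupies the same space as a real 2-channel 16-bit texel.
constexpr std::size_t kPaddedSingle16BitBytes = 4;
```

So a padded single-channel 16-bit texel and a PF_FLOAT16_GR texel cost the same memory, which is why the padding gives no size benefit.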
Anyway, this was confusing Ogre when you tried to lock the surface since it had no concept of D3DFMT_G16R16 until I added it - until this happened I had no idea D3D was behaving like this. I’ve never locked a 16-bit single-channel floating point texture before; I’ve done it on R32F and RGBA16F, which clearly are already 32-bit aligned so there were no issues. But here I was locking it to fill the 1×1 null shadow texture with high values, so I tripped over it. Anyway, the 2-channel formats are there to be used for whatever reason now. GL doesn’t have a 2-channel green/red format, but I’ve represented these new formats as GL_LUMINANCE_ALPHA16F_ARB and similar variants, which amounts to the same thing in terms of physical size and formatting. If you actually used this format in a shader you might have to read / write different channels in GLSL compared to HLSL since GL treats them differently: a luminance-alpha texture samples as (L, L, L, A), so the second channel turns up in alpha rather than green.
The screenshot shows a fairly simple test of integrated shadows using a shader which supports offset mapping and 2 per-pixel spotlights with shadowmaps (accurate masked-additive mode), all done in a single pass. Doing the shadows directly there adds only 2 extra texture fetches to the pixel shader (bringing it to 5), plus a small handful of arithmetic operations in the vertex shader (of the form mul(worldPos, textureViewProj[i])), which is easily compensated for by the reduced number of passes. Most of the cost of the shader in this case is going into the per-pixel spotlight calculation. I’m not really using much of the power of this technique to make great looking materials here since I’m short of time, but I’m sure people will come up with some great things. Texture shadows & shaders have had the training wheels taken off 😀
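For anyone wondering what that per-light vertex shader arithmetic amounts to, here’s a rough C++ sketch of the projection step. It’s an illustration only: Vec4, Mat4, transform and projectToShadowUV are made-up names, I’m using a column-vector convention (M * v), and I’m assuming the matrix already includes the scale and bias into the 0..1 texture range.

```cpp
#include <cassert>

struct Vec2 { float u, v; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };  // row-major storage: m[row][col]

// One 4x4 matrix by 4-vector multiply per shadowed light (column-vector convention).
inline Vec4 transform(const Mat4& M, const Vec4& p) {
    return {
        M.m[0][0] * p.x + M.m[0][1] * p.y + M.m[0][2] * p.z + M.m[0][3] * p.w,
        M.m[1][0] * p.x + M.m[1][1] * p.y + M.m[1][2] * p.z + M.m[1][3] * p.w,
        M.m[2][0] * p.x + M.m[2][1] * p.y + M.m[2][2] * p.z + M.m[2][3] * p.w,
        M.m[3][0] * p.x + M.m[3][1] * p.y + M.m[3][2] * p.z + M.m[3][3] * p.w
    };
}

// Perspective divide, normally done per pixel before the shadow texture fetch.
inline Vec2 projectToShadowUV(const Vec4& shadowCoord) {
    return { shadowCoord.x / shadowCoord.w, shadowCoord.y / shadowCoord.w };
}
```

In shader terms the transform happens once per light in the vertex shader, and the divide plus the texture fetch happen in the pixel shader, which is where the 2 extra fetches come from.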