Working around GLSL attribute aliasing problems on OS X

As much as I love using OS X, one of the double-edged swords is that the graphics driver updates are controlled by Apple. On the one hand, that’s a bonus because you have a better idea of what you’re dealing with out in the wild, and people get prompted to update their drivers (as part of the regular OS X auto-update). On the other hand, it’s a pain in the ass because the drivers tend to lag behind those from the GPU manufacturers and therefore have bugs the mainstream ones don’t.

I just recently committed a patch from ‘hellcatv’, one of the more prolific Mac users in our community, to deal with a few driver bugs in some of the older PowerBooks, and also some quirks of the recent Intel GMA-based iMacs – stuff like choking on glCompressedTexSubImage2DARB for no good reason (i.e. forget uploading part of a DXT-compressed texture, it’s all or nothing). I’m indebted to him for testing on a huge range of Macs that I’d never have access to without spending a load of cash and filling up my home office with yet more surplus hardware (my wife would not entirely approve of either, methinks). One of the remaining problems we’ve had is that the OS X GLSL drivers on the recent NVIDIA-based MacBook Pros suffered from performance problems associated with vertex attribute aliasing that other platforms did not.

Now, NVIDIA has always had a fixed set of vertex attribute assignments for the built-ins – gl_Vertex is 0, gl_Normal is 2, etc. If you used gl_Normal in a shader, but also bound a custom attribute (say, skeletal blend weights) to index 2, you’d get a performance drop because of the aliasing. That’s fine – so instead, when we used custom attributes, we didn’t fix the indexes we used; we let the linker decide, taking into account what was actually used in the shader. We’d include the attributes in the shader, and then after glLinkProgramARB we’d ask the program object what indexes it had chosen for the custom attributes, and wire them up that way. The well-behaved drivers (Windows, Linux) on NVIDIA would avoid clashing with any built-in attributes that had been referenced in the shader, and we’d have a nice tight list of unique indexes; but on OS X, the driver would stupidly often assign custom attributes to built-in indexes that were in fact being used in the shader. Tsk, bad driver, no treats for you today.
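For context, that automatic approach looked roughly like this. It’s a minimal sketch using the ARB_shader_objects / ARB_vertex_shader entry points; mGLHandle matches the snippet further down, but the rest is illustrative rather than the exact OGRE code:

// Link the program and let the driver pick the attribute indexes,
// then query what it chose so the vertex buffer bindings can match.
glLinkProgramARB(mGLHandle);

GLint linked = 0;
glGetObjectParameterivARB(mGLHandle, GL_OBJECT_LINK_STATUS_ARB, &linked);
if (linked)
{
    // Returns -1 if the attribute isn't active in the linked program
    GLint blendWeightsIndex = glGetAttribLocationARB(mGLHandle, "blendWeights");
    GLint blendIndicesIndex = glGetAttribLocationARB(mGLHandle, "blendIndices");
    // ... wire the hardware buffer bindings up to whatever came back ...
}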

It’s been reported to Apple as a bug, but so far, no dice on the fix front, so I decided it was time to try to work around it. The first thing I tried was simply telling the driver at the pre-link stage that I wanted any occurrences of the custom vertex attributes we supported to be placed out of the way of any possible built-ins that might be used. So for example, I did this:

glBindAttribLocationARB(mGLHandle, 6, "blendWeights");
glBindAttribLocationARB(mGLHandle, 7, "blendIndices");

That seemed to take effect, and calling glGetAttribLocationARB after the link confirmed that I was indeed getting indexes 6/7 bound, rather than the 1/2 that the driver kept picking before (bad, because I used gl_Normal in this shader, which is index 2). However, despite the indexes being out of the way of anything else, the shader still performed really poorly. I tried a few other indexes, like 14 and 15, which overlap with the top two UV sets but are rarely used (you can’t exceed 15, at least on NVIDIA), but the result was the same.

Cue head-scratching. There should have been no aliasing problems any more, yet the shader still performed like an asthmatic ant carrying some heavy shopping. So the last thing I tried was going the whole hog and implementing support for custom attribute replacements for all of the built-ins, all at known, fixed indexes matching currently known hardware defaults and limitations, i.e.:

Index  Built-in           Custom Name
0      gl_Vertex          vertex
1      n/a                blendWeights
2      gl_Normal          normal
3      gl_Color           colour
4      gl_SecondaryColor  secondary_colour
5      gl_FogCoord        n/a
7      n/a                blendIndices
8      gl_MultiTexCoord0  uv0
9      gl_MultiTexCoord1  uv1
10     gl_MultiTexCoord2  uv2
11     gl_MultiTexCoord3  uv3
12     gl_MultiTexCoord4  uv4
13     gl_MultiTexCoord5  uv5
14     gl_MultiTexCoord6  uv6, tangent
15     gl_MultiTexCoord7  uv7, binormal

And what do you know, that works. The skinning shader runs considerably better like that – still not great (I think the Apple GLSL implementation is not that good), but at least 2-3 times faster than it did before. It kinda sucks to have had to do it that way; I really liked being able to leave the attribute bindings up to the driver and only use the ones I needed, because that’s more in the spirit of the GLSL way, but clearly being more rigid is the more reliable approach. I know that I could have packed the tangent into attribute 6 instead to save a UV entry, but using 6 still seemed to have some performance issues, so I’ve gone with the fixed bindings I would have used with ARB programs. In my experience it’s incredibly rare to need more than 5 UVs going into a vertex program anyway.
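In code, the idea boils down to binding each supported custom attribute name to its fixed index before linking. This is a rough sketch of the technique based on the table above, not the exact OGRE implementation; the FixedAttribute struct and bindFixedAttributes helper are just illustrative names:

// Bind every custom attribute we support to its fixed index *before* linking,
// so the driver never gets the chance to alias it onto a built-in.
struct FixedAttribute { const char* name; GLuint index; };

static const FixedAttribute fixedAttributes[] = {
    { "vertex",           0 },
    { "blendWeights",     1 },
    { "normal",           2 },
    { "colour",           3 },
    { "secondary_colour", 4 },
    { "blendIndices",     7 },
    { "uv0",  8 },  { "uv1",  9 },  { "uv2", 10 }, { "uv3", 11 },
    { "uv4", 12 },  { "uv5", 13 },  { "uv6", 14 }, { "uv7", 15 },
    { "tangent",  14 },   // shares 14 with uv6; a shader should only use one of the two
    { "binormal", 15 },   // shares 15 with uv7; likewise
};

void bindFixedAttributes(GLhandleARB programHandle)
{
    for (size_t i = 0; i < sizeof(fixedAttributes) / sizeof(fixedAttributes[0]); ++i)
    {
        // Bindings for names that aren't active in the shader are simply ignored,
        // so it's safe to bind the whole list every time.
        glBindAttribLocationARB(programHandle, fixedAttributes[i].index,
                                fixedAttributes[i].name);
    }
}

// bindFixedAttributes(mGLHandle);
// glLinkProgramARB(mGLHandle);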

So the advice appears to be: if you need a custom attribute binding in GLSL and you want it to run well on a Mac, using custom attribute bindings for everything in the vertex shader, at fixed indexes, is the way to go.

  • btmorex

Speaking of driver bugs, do you have an NVIDIA contact that you can lean on regarding:

    http://www.ogre3d.org/phpBB2/viewtopic.php?t=38037

Seems to be pretty widespread among OGRE Linux users.

  • http://www.stevestreeting.com Steve

    I’ve emailed my main driver support contact to see if we can get some focussed assistance.

  • http://www.imilabs.com Shawn Kendall

    Wow, I am very glad you took the time to post this.

I had a pretty good map (I thought) of what was causing the GLSL driver fallback to the “Apple Software Renderer” (testable in the OpenGL Shader Builder), but I knew it wasn’t totally right. I can confirm that this is a major cause of what appear to be crazy GLSL bugs on Mac OS X, across multiple hardware configs as well.

Once you know, it’s pretty simple to work around; I’ve gotten several really broken shaders to work. I hope they fix this soon, and I hope my text here will help others find this page when searching. Good luck everyone!

  • Rhys

    Any further updates on this from Apple? Do you still find this an issue with 10.5.7 and require the fix?

    We believe we’re seeing the same problem in Freespace 2 Open Source with GLSL falling back to software rendering on OS X – http://www.hard-light.net/wiki/index.php/OpenGL_Shaders_(GLSL)#Mac_users.

    Any assistance or code samples of the above fix would be appreciated.

    http://scp.indiegames.us/

  • http://www.stevestreeting.com Steve

    I never got any specific technical assistance on this from Apple / NVIDIA, but we just continue to work around it as described here. I believe the issue is still there in 10.5.7.

  • Rhys

    Thanks for the update Steve, we’re progressing on the workaround in our codebase and will let you know how it goes if you like.

  • http://www.stevestreeting.com Steve

    Yeah, please let me know – always good to have feedback on awkward things like this, share the pain ;)

  • http://www.stevestreeting.com Steve

    Oh, I realised I didn’t give you any code samples, sorry. Our workaround is here: http://ogre.svn.sourceforge.net/viewvc/ogre/trunk/RenderSystems/GL/src/GLSL/src/OgreGLSLLinkProgram.cpp?revision=8699

GLGpuProgram::getFixedAttributeIndex basically returns the values from the table in my original post.

  • Arseny Kapoulkine

I’d like to note that the code that binds the attributes is, like, REALLY BAD. I mean, choosing names like “normal” for the default built-ins is bad enough (it may be a legacy thing, though; otherwise something like ogre_Normal would have been much better). However, stopping at the first occurrence of the attribute name in the shader source is, how best to put it… confusing: adding a comment containing a word like normal, or calling normalize before the declaration of the attribute, leaves the attribute unbound.