Buffer access modes in DirectX10

by Steve

So, I finally got around to creating that ‘Direct3D10’ folder underneath ogrenew/RenderSystems today. Don’t get too excited, it’s basically just a copy / rename job so far, but I have at least made a start. Anyway, even though I’ve read bits of the DirectX10 docs before, there’s nothing like a practical implementation to really focus the mind.

So, I figured the first thing to at least attempt to get compiling was the vertex / index buffer support. Fairly straightforward stuff (I thought), except that now Dx10 has made vertex and index buffers instances of a single type, the Dx10 ‘buffer’, which in turn shares a common root ‘resource’ type with textures. That of course nicely dovetails with the concept we’ve had for ages in HardwareBuffer and its subtypes (HardwareVertexBuffer, HardwarePixelBuffer etc). Might be able to exploit that at some point, but for now I’m simply aiming to get a running rendersystem which is feature-equivalent to Dx9 as a first step.

I found buffer access modes are significantly tightened up in DirectX10 though. In a way that’s good: it’s far more clear-cut what you can and can’t do in terms of reading and writing data with the CPU and GPU. On the negative side, it does make you jump through hoops. If you set the usage mode in D3D9 to ‘write only’ for example, you can in fact still read from the buffer - it’s just likely to be a little slow. This is usually ok if you’re only doing it once or infrequently; for example you might want to preprocess a mesh after loading, then never touch it again. You don’t want to change the usage mode to ‘read/write’ just for that, because you know that will hurt performance for what the buffer is used for most of the time.

In DirectX10 you can’t do this. If you want one-off access to a buffer which you’ve marked as ‘write only’ (terms differ now, but bear with me for consistency), you have to use a different API to copy the data into a buffer which has a readable mode. As I understand it, this probably reflects what the Dx9 driver would have done anyway; it’s just that now you have to do it explicitly. Again, in a way I like this, because it’s nice to be very precise about what you’re doing. In other ways it’s a pain in the rear end, because it’s really quite common to need this temporary ‘enhanced’ access at startup before settling on the stable usage state - and there is code in Ogre that assumes read access is still available at a temporary performance cost for just this reason. It’s really messy to force the use of a different API just because of a temporary usage discrepancy. At Ogre we like to (try to) provide a unified interface that ‘just works’ in all contexts - with caveats of course about performance.
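To make the copy-then-read pattern concrete, here is a minimal sketch in plain C++ rather than against the real API. Everything in it (ModelBuffer, Usage, copyResource, readBack) is a hypothetical stand-in I made up for illustration. As far as I understand it, the real Direct3D 10 equivalents are roughly a buffer created with D3D10_USAGE_STAGING and CPU read access, ID3D10Device::CopyResource for the copy, and ID3D10Buffer::Map for the read, but treat that mapping as an assumption, not as tested code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical model of Dx10-style access rules; NOT the real D3D10 API.
enum class Usage {
    GpuOnly,  // the 'write only' style buffer: no CPU read mapping allowed
    Staging   // CPU-readable buffer used only as a copy target
};

class ModelBuffer {
public:
    ModelBuffer(Usage usage, std::vector<std::uint8_t> data)
        : usage_(usage), data_(std::move(data)) {}

    Usage usage() const { return usage_; }
    std::size_t size() const { return data_.size(); }

    // CPU read mapping is refused outright unless the buffer is staging.
    const std::uint8_t* mapForRead() const {
        if (usage_ != Usage::Staging)
            throw std::logic_error("buffer is not CPU-readable");
        return data_.data();
    }

private:
    friend void copyResource(ModelBuffer& dst, const ModelBuffer& src);
    Usage usage_;
    std::vector<std::uint8_t> data_;
};

// Stand-in for the device-side copy between two same-sized buffers.
void copyResource(ModelBuffer& dst, const ModelBuffer& src) {
    assert(dst.size() == src.size());
    dst.data_ = src.data_;
}

// One-off read of a non-readable buffer: create a staging buffer,
// copy into it, then map the staging copy instead of the original.
std::vector<std::uint8_t> readBack(const ModelBuffer& src) {
    ModelBuffer staging(Usage::Staging,
                        std::vector<std::uint8_t>(src.size()));
    copyResource(staging, src);
    const std::uint8_t* p = staging.mapForRead();
    return std::vector<std::uint8_t>(p, p + staging.size());
}
```

Calling mapForRead() directly on a GpuOnly buffer throws in this model, whereas readBack() on the same buffer returns its contents via the staging copy. That is the explicit two-step dance that D3D9 let you skip.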

It’s resolvable of course. In our classes I will just have to detect this kind of condition (and there are others with write modes, and now there’s a GPU write mode too for stream out, which I assume RTT uses as well) and cope with it internally. When mapping or reading / writing data from a buffer which has an incompatible mode, we’ll just have to perform the copy internally and map the shadow copy. We already have the concept of shadow buffers in our code, but that’s more for when you want a permanent readable copy that’s fast, not for one-off locks. For the one-off cases I think I’ll use a dynamic scratch buffer pool - I already wrote one of these for GL in fact, to resolve some GL buffer locking performance oddities earlier this year.
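Roughly what I have in mind, as a sketch only: the buffer class checks its own mode on lock, and if the mode is incompatible it quietly borrows a block from a scratch pool, copies into it and hands back that pointer. ScratchPool, PooledBuffer and all their members are hypothetical names for illustration; this is not the existing Ogre or GL code, and the std::copy is just a placeholder for the real GPU-to-CPU copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical names throughout; a model, not Ogre's or Direct3D's API.
using Bytes = std::vector<std::uint8_t>;

// Hands out temporary CPU-side blocks and reuses released ones.
class ScratchPool {
public:
    Bytes acquire(std::size_t size) {
        for (std::size_t i = 0; i < free_.size(); ++i) {
            if (free_[i].capacity() >= size) {
                Bytes block = std::move(free_[i]);
                free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
                block.resize(size);  // fits in capacity, so no reallocation
                ++reused_;
                return block;
            }
        }
        return Bytes(size);  // nothing suitable pooled: allocate a new block
    }
    void release(Bytes block) { free_.push_back(std::move(block)); }
    std::size_t reusedCount() const { return reused_; }

private:
    std::vector<Bytes> free_;
    std::size_t reused_ = 0;
};

// Buffer wrapper that hides the mode mismatch from the caller.
class PooledBuffer {
public:
    PooledBuffer(bool cpuReadable, Bytes contents, ScratchPool& pool)
        : cpuReadable_(cpuReadable), store_(std::move(contents)), pool_(pool) {}

    // Readable buffers are mapped directly; others go via a scratch copy.
    const std::uint8_t* lockForRead() {
        assert(!locked_);
        locked_ = true;
        if (cpuReadable_) return store_.data();
        scratch_ = pool_.acquire(store_.size());
        // Placeholder for the explicit copy into a readable buffer.
        std::copy(store_.begin(), store_.end(), scratch_.begin());
        usedScratch_ = true;
        return scratch_.data();
    }

    // Returns any borrowed scratch block to the pool.
    void unlock() {
        assert(locked_);
        if (!cpuReadable_) {
            pool_.release(std::move(scratch_));
            scratch_.clear();  // leave the moved-from member in a known state
        }
        locked_ = false;
    }

    // True once any lock on this buffer has gone through the scratch path.
    bool usedScratch() const { return usedScratch_; }

private:
    bool cpuReadable_;
    Bytes store_;  // stands in for the GPU-side storage
    ScratchPool& pool_;
    Bytes scratch_;
    bool locked_ = false;
    bool usedScratch_ = false;
};
```

The caller just calls lockForRead() and unlock() regardless of the buffer's mode, which is the ‘just works’ interface I want to keep. The cost of the hidden copy is only paid on the mismatched, one-off locks, and the pool means repeated one-off locks don’t allocate every time.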

So, in the first thing I looked at I found some conceptual adjustments needed. 😕 Ah well, just don’t hold your breath 😉 I’m doing this as a spare-time task for the moment but I hope to get it done in the next couple of months - although if you want to pay me to spend some of my daytime on this too, I’m open to reasonable offers, you know the email 😉