We have been using hardware instancing under D3D9C for some time to render lines as thick tubes.
Our technique uses an overlapping vertex declaration (i.e. the amount of data read through the vertex declaration is twice the stride of the vertex stream), so that each invocation of the vertex shader sees both the "current" vertex position and the "next" vertex position.
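To make the overlap concrete, here is a rough CPU-side sketch of what each vertex shader invocation ends up reading. It is not our actual D3D9 code: it assumes one float3 position per vertex (so a 12-byte stride and a 24-byte declaration), and the names Float3, SegmentInput and fetchSegment are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed layout: one float3 position per vertex (hypothetical, for illustration).
struct Float3 { float x, y, z; };

// What a single vertex shader invocation sees through the overlapping declaration.
struct SegmentInput {
    Float3 current; // declaration element at byte offset 0
    Float3 next;    // declaration element at byte offset 12, i.e. the following vertex
};

constexpr std::size_t kStride   = sizeof(Float3);       // stream stride: 12 bytes
constexpr std::size_t kDeclSize = sizeof(SegmentInput); // bytes read per vertex: 24

static_assert(kDeclSize == 2 * kStride, "declaration reads twice the stride");

// Emulates the fetch for vertex `index`: start at index * stride, but read a full
// declaration's worth of data, which spills over into the next vertex.
SegmentInput fetchSegment(const std::vector<Float3>& stream, std::size_t index) {
    assert(index + 1 < stream.size()); // the last vertex has no "next" to read
    const unsigned char* base =
        reinterpret_cast<const unsigned char*>(stream.data());
    SegmentInput out;
    std::memcpy(&out, base + index * kStride, kDeclSize);
    return out;
}
```

With positions {p0, p1, p2}, fetchSegment(stream, 1) yields current = p1 and next = p2, which is the pair the shader needs to build one tube segment.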
This works fine on every SM3-compliant card we have seen to date, other than the HD7000 series.
Unfortunately on the HD7000 series cards (with Catalyst version 12.8), we get rendering artifacts with this technique.
Specifically, if the number of vertices presented to the DrawIndexedPrimitive() call exceeds 2730, then the primitives that access vertices beyond 2730 will generally not render at all (although occasionally, seemingly at random, they do).
This is highly reproducible (I have attached a small cut-down test bed application that reproduces it) and seems independent of any of the many variations that I have made to the technique.