The GCN whitepaper states the following about the rasterizer: "Each rasterizer can read in a single triangle per cycle, and write out 16 pixels. Afterwards, the hierarchical Z-testing will eliminate any occluded pixels prior to pixel shading."
When we render points, each primitive is projected onto a single pixel, which means that the rasterizer writes out only one (possibly occluded) pixel per clock. If a hierarchical Z-test is performed even for that single occluded pixel, then the Z-test wastes performance, right?
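To put a number on my concern, here is a rough sketch of the arithmetic. It uses only the two figures from the whitepaper quote (one primitive in, 16 pixels out per cycle). The function name and the assumption that a point is consumed at the same rate as a triangle are mine, not something the whitepaper states.

```python
# Figures taken from the whitepaper quote.
PRIMITIVES_IN_PER_CYCLE = 1
PIXELS_OUT_PER_CYCLE = 16


def rasterizer_output_utilization(pixels_per_primitive: int) -> float:
    """Fraction of the peak pixel output used in one cycle.

    Assumption (mine): one primitive is consumed per cycle regardless of
    its type, and output is capped at the quoted peak of 16 pixels.
    """
    pixels_written = min(
        pixels_per_primitive * PRIMITIVES_IN_PER_CYCLE,
        PIXELS_OUT_PER_CYCLE,
    )
    return pixels_written / PIXELS_OUT_PER_CYCLE


# A point covers a single pixel, so only 1 of the 16 output slots is used.
point_utilization = rasterizer_output_utilization(1)
print(f"{point_utilization:.2%}")  # 6.25%
```

Under these assumptions, point rendering uses only 1/16 of the rasterizer's pixel output per cycle, which is why I wonder whether the hierarchical test still pays off in this case.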
Please tell me whether the hardware still applies the hierarchical approach even when the rasterizer delivers only one occluded pixel.