Contents

Implementing order-independent transparency with image blocks

Draw overlapping, transparent surfaces in any order by using tile shaders and image blocks.

Overview

[Image]

In order to draw a convincing transparency effect, apps typically draw a scene’s transparent, 3D geometry in depth order, from farthest to nearest, relative to the camera. These apps use the CPU to sort the geometry by depth before drawing, but this approach has some drawbacks in that it:

  • Can produce incorrect results when two meshes intersect

  • Requires the app to recalculate the depth of each mesh with the CPU each time the camera or objects move within the scene

Order-independent transparency solves these issues by:

  • Eliminating the need to sort meshes

  • Moving the computation workload to the GPU

  • Evaluating the geometry’s depth on a per-fragment basis

The sample takes advantage of tile memory available on Apple silicon GPUs to quickly evaluate the depth and color of each fragment. Tile memory is a fast, temporary storage that shaders can use to store and retrieve information to and from other shaders.

Configure the sample code project

To run this sample, you need Xcode 12 and a physical device including:

  • A Mac with Apple silicon running macOS 12 or later

  • An iOS device with an A11 Bionic chip or later running iOS 15 or later

This sample can only run on a physical device because it uses the image block and tile shader features, which Simulator doesn’t support.

Check for support

The app checks that the device supports the features required to implement this order-independent transparency algorithm.

The techniques implemented by this sample require the image block and tile shader features, which are a part of the MTLGPUFamily.apple4 feature set. Although this app does run on devices that don’t support these features, it incorrectly blends transparent geometries.

Set up storage for transparent fragments

The sample uses tile memory to store data for multiple transparent layers. The processTransparentFragment shader draws the transparent geometry and populates the TransparentFragmentStore structure. This structure contains a single member, fragmentValues, which uses the [[imageblock_data]] attribute to specify that data for this element resides in tile memory.

The TransparentFragmentValues structure stores the color and depth values for each fragment generated by transparent triangles. Because the fragment shader writes this data to tile memory, each member needs to be part of a raster order group. Raster order groups allow precise control of the order of parallel fragment shader threads accessing the same pixel coordinates. For more information, see Tailor your apps for Apple GPUs and tile-based deferred rendering.

The kNumLayers constant defines the maximum number of transparent layers the image block can store.

If the number of transparent layers for any pixel exceeds kNumLayers, the processTransparentFragment shader drops the excess values. The subsequent blending shader ignores those color values that yield a technically incorrect result. Despite this, even if kNumLayers isn’t large enough, this algorithm still produces a convincing result.

Reserve tile memory

When the sample sets up a render pass, it partitions the tile memory between storage for the TransparentFragmentValues structure and render targets.

Once the sample builds a render pipeline using the TransparentFragmentValues structure, the pipeline’s imageblockSampleLength property specifies the amount of tile memory the pipeline requires. As the value of kNumLayers increases and the structure side increases, the value of imageblockSampleLength also increases.

The sample reserves memory in the render pass by setting the imageblockSampleLength property of _forwardRenderPassDescriptor to the pipeline’s imageblockSampleLength property. Once the sample builds a render pipeline using the TransparentFragmentStore structure, the pipeline’s imageblockSampleLength property specifies the amount of tile memory the pipeline requires. As the value of kNumLayers increases, the value of imageblockSampleLength also increases.

The sample chooses the tile size for the rasterized color and depth targets by setting the tileWidth and tileHeight properties of _forwardRenderPassDescriptor.

In general, a larger tile size yields better performance because the GPU incurs less overhead to render fewer tiles. However, the GPU uses tile memory for both rasterization data and image block data. You need to balance the dimensions chosen for the tile size with that of the value for kNumLayers. If these values exceed the amount of tile memory, the GPU can’t execute the tile shader in the render pass. In that case, creation of the render pass fails.

Initialize the image block

The sample begins each render pass by initializing the image block structure.

The initTransparentFragmentStore kernel writes zeros to each color value in the image block and INFINITY to each depth value.

Layer transparent geometry

After the sample initializes the image block and renders opaque objects, it uses the processTransparentFragments shader to populate the layers of the TransparentFragmentValues structure. This fragment shader doesn’t actually render to any of the render pass attachments and only writes to the structure in the image block. Because it doesn’t write to an attachment, the pipeline disables writes to color attachments.

The shader inserts transparent fragment values in order of depth. It also discards fragment values with the farthest depth value when fragment values occupy all layers.

Each index in the array represents a layer. The shader tests the depth value of the incoming fragment against the depth value of the fragment in the current layer. If the depth value is less than the fragment in the current layer, the shader sets insert to true. When insert is true, the shader swaps the fragment values in the current layer with the tested fragment’s values. The values replaced become the new fragment values, which the shader tests in the next iteration of the loop. When the array fills, the shader discards the fragments with the farthest depth.

The following diagram shows an example of a shader populating layers of the image block as the app renders three transparent triangles.

[Image]

In this example, the app first renders a green triangle whose fragments populate the first layer in the image block. It then renders an orange triangle with depth values less than the green triangle. The shader replaces any fragments covering the green triangle with the orange color and moves the fragments from the green triangle to the next layer. Finally, the app renders a blue triangle whose depth is in between the orange triangle and the green triangle. The shader replaces all the values of the green triangle in the second layer and moves them to the third. This results in an image block with fragments ordered by their depth values.

Blend fragments for transparency

After the app layers all transparent geometry, the image block contains a list of fragment colors ordered by depth for every transparent fragment.

The app executes the blendFragments shader by drawing a full screen quad. This shader takes the current values in the color buffer populated by the processOpaqueFragments shader. It also takes the image block data populated by the processTransparentFragments shader.

The shader calculates the output color starting with the opaque value in the color attachment.

It iterates through each layer in the image block structure, accumulating their color values.

The resulting value is a mix of the transparent and opaque colors blended in order of their depth values.

See Also

Render workflows