Synchronizing passes with a fence
Block GPU stages in a pass until another pass unblocks it by signaling a fence.
Overview
A fence resolves access conflicts between commands in different passes that you submit to the same command queue, including the passes you commit in other command buffers.
When your app encodes commands that access a resource from different passes — or different stages within a single pass — it creates an access conflict when at least one command modifies that resource. This conflict happens because the GPU can run multiple commands at the same time, including those from:
Multiple passes
Different stages of a pass, such as the blit and dispatch stages of a compute pass
Multiple instances of a stage, such as two or more dispatch commands within a compute pass
For more information about resource access conflicts and GPU stages, see Resource synchronization and MTLStages, respectively.
For more information about synchronizing within a single pass, see Synchronizing stages within a pass.
Start by identifying which memory operations from different passes introduce a conflict and resolve it with a fence:
Update a fence in the producing pass.
Wait for that fence in the consuming pass.
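As a sketch of those two steps, assuming a Metal 4 setup where `device`, `producerEncoder`, and `consumerEncoder` already exist, and using illustrative `.dispatch` and `.blit` stage values, the fence calls might look like this:

```swift
// Create the fence once from the device; you can reuse it across passes.
let fence = device.makeFence()!

// Producing pass: signal the fence after its dispatch stage finishes writing.
producerEncoder.updateFence(fence, afterEncoderStages: .dispatch)
producerEncoder.endEncoding()

// Consuming pass: block its blit stage until the fence updates.
consumerEncoder.waitForFence(fence, beforeEncoderStages: .blit)
// ... encode the commands that read the shared resource ...
consumerEncoder.endEncoding()
```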
Identify access conflicts between two or more passes
The following code example encodes two compute passes. The first encoder creates a pass with a copy command and a dispatch command:
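A minimal sketch of such a pass, assuming a Metal 4 compute command encoder and assuming `commandBuffer`, `pipelineState`, the buffers, and the threadgroup sizes exist elsewhere (the copy and dispatch signatures follow the familiar blit and compute encoder APIs and may differ slightly in your Metal version):

```swift
let encoder1 = commandBuffer.makeComputeCommandEncoder()!

// Blit stage: copy the initial data into bufferA.
encoder1.copy(from: sourceBuffer, sourceOffset: 0,
              to: bufferA, destinationOffset: 0,
              size: bufferLength)

// Dispatch stage: run a kernel that reads bufferA and stores its result to bufferC.
encoder1.setComputePipelineState(pipelineState)
encoder1.dispatchThreadgroups(threadgroupCount,
                              threadsPerThreadgroup: threadsPerThreadgroup)
encoder1.endEncoding()
```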
The second encoder also creates a pass with a copy command and a dispatch command:
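Continuing the same illustrative setup, the second pass might copy the first pass's results out of bufferC and then dispatch another kernel:

```swift
let encoder2 = commandBuffer.makeComputeCommandEncoder()!

// Blit stage: copy the results out of bufferC — this command loads from bufferC.
encoder2.copy(from: bufferC, sourceOffset: 0,
              to: bufferD, destinationOffset: 0,
              size: bufferLength)

// Dispatch stage: run another kernel on the copied data.
encoder2.setComputePipelineState(otherPipelineState)
encoder2.dispatchThreadgroups(threadgroupCount,
                              threadsPerThreadgroup: threadsPerThreadgroup)
encoder2.endEncoding()
```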
The example has at least one access conflict because both passes access a common resource, bufferC:
The dispatch command from the first pass stores to bufferC.
The copy command from the second pass loads from bufferC.
[Image]
Without synchronization, the GPU can run both passes and their stages in parallel, which can yield inconsistent results in resources with access conflicts.
[Image]
Resolve an access conflict between passes with a fence
Resolve access conflicts between passes from the same command queue with an MTLFence instance by:
Instructing the producing pass to signal any passes waiting on a fence by calling the encoder’s updateFence(_:afterEncoderStages:) method.
Instructing the consuming pass to wait for the fence by calling the encoder’s waitForFence(_:beforeEncoderStages:) method.
The GPU pauses before it runs the commands that follow the wait command in the consuming pass until it finishes running every update command you encode for the same fence in the relevant producing passes.
The following code example modifies the code for the first pass by adding a call that updates the fence:
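Sketching that modification with the same illustrative names as before, and assuming a `fence` the app creates with the device’s makeFence() method, the producing pass signals the fence after its dispatch stage:

```swift
let encoder1 = commandBuffer.makeComputeCommandEncoder()!

// Blit stage: copy the initial data into bufferA.
encoder1.copy(from: sourceBuffer, sourceOffset: 0,
              to: bufferA, destinationOffset: 0,
              size: bufferLength)

// Dispatch stage: run a kernel that stores its result to bufferC.
encoder1.setComputePipelineState(pipelineState)
encoder1.dispatchThreadgroups(threadgroupCount,
                              threadsPerThreadgroup: threadsPerThreadgroup)

// Signal the fence only after the dispatch stage finishes storing to bufferC.
encoder1.updateFence(fence, afterEncoderStages: .dispatch)
encoder1.endEncoding()
```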
The following code example modifies the code for the second pass by adding a call that waits for the fence.
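A corresponding sketch of the consuming pass, under the same assumptions, encodes the wait before the copy command that loads from bufferC:

```swift
let encoder2 = commandBuffer.makeComputeCommandEncoder()!

// Block the blit stage until the first pass's dispatch stage updates the fence.
encoder2.waitForFence(fence, beforeEncoderStages: .blit)

// Blit stage: now safe to load from bufferC.
encoder2.copy(from: bufferC, sourceOffset: 0,
              to: bufferD, destinationOffset: 0,
              size: bufferLength)

// Dispatch stage: run another kernel on the copied data.
encoder2.setComputePipelineState(otherPipelineState)
encoder2.dispatchThreadgroups(threadgroupCount,
                              threadsPerThreadgroup: threadsPerThreadgroup)
encoder2.endEncoding()
```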
The fence forces the GPU to wait before it runs the blit stage of the second pass until the dispatch stage of the first pass finishes storing its modifications to the underlying memory for bufferC.
[Image]
After you encode a pass’s wait command, you can reuse the same fence instance to resolve resource access conflicts in subsequent commands.
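For example, continuing the earlier sketch, a hypothetical third pass that modifies bufferC again could update the same fence instance for a later consumer to wait on:

```swift
// After the consuming pass encodes its wait, reuse the same fence.
let encoder3 = commandBuffer.makeComputeCommandEncoder()!

// ... encode commands that modify bufferC again ...

// Signal the reused fence for the next consuming pass.
encoder3.updateFence(fence, afterEncoderStages: .dispatch)
encoder3.endEncoding()
```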
For more information about other synchronization mechanisms, see the other articles in this series.