GenerationOptions

Options that control how the model generates its response to a prompt.

Declaration

struct GenerationOptions

Overview

Generation options determine the decoding strategy the framework uses when the model chooses output tokens. When you interact with the model, the framework converts your input to a token sequence and uses that sequence to generate the response.
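
For example, you might create options with a temperature value and pass them to a session when you request a response. The sketch below assumes the GenerationOptions(temperature:) initializer and the respond(to:options:) method on LanguageModelSession; the prompt and temperature value are illustrative:

import FoundationModels

// A sketch of requesting a response with custom generation options.
// Assumes GenerationOptions(temperature:) and respond(to:options:);
// the temperature value is illustrative.
func summarize(_ text: String) async throws -> String {
    let options = GenerationOptions(temperature: 0.5) // Lower values favor more predictable output.
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Summarize the following in two sentences: \(text)",
        options: options
    )
    return response.content
}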

Use maximumResponseTokens only when you need to protect against unexpectedly verbose responses. Enforcing a strict response-token limit can lead to the model producing malformed results or grammatically incorrect responses.
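
If you do need to bound a response, a minimal sketch looks like the following, assuming the maximumResponseTokens initializer parameter; the limit of 500 tokens is illustrative, not a recommendation:

// A sketch: cap the response length only when verbosity is a real risk.
let boundedOptions = GenerationOptions(maximumResponseTokens: 500)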

All input to the model contributes tokens to the context window of the LanguageModelSession — including the Instructions, Prompt, Tool, and Generable types, and the model’s responses. If your session exceeds the available context size, it throws LanguageModelSession.GenerationError.exceededContextWindowSize(_:). For more information on managing the context window size, see TN3193: Managing the on-device foundation model’s context window.
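
One way to handle the error is to catch it, start a fresh session, and retry, as in this sketch; retrying with the same prompt and discarding the old transcript is an app-level choice, not framework behavior:

import FoundationModels

// A sketch: recover when a session's context window is exhausted.
func respondWithRecovery(to prompt: String, in session: LanguageModelSession) async throws -> String {
    do {
        return try await session.respond(to: prompt).content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // The session's transcript no longer fits in the context window;
        // start a fresh session and retry the prompt once.
        let freshSession = LanguageModelSession()
        return try await freshSession.respond(to: prompt).content
    }
}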

Topics

Creating options

Configuring the response tokens

Configuring the sampling mode

Configuring the temperature

See Also

Prompting