GenerationOptions
Options that control how the model generates its response to a prompt.
Declaration
struct GenerationOptions
Overview
Generation options determine the decoding strategy the framework uses when the model chooses output tokens. When you interact with the model, the framework converts your input to a token sequence and uses that sequence to generate the response.
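As a minimal sketch of how these options fit together, the following creates a `GenerationOptions` value and passes it to a session when requesting a response. The prompt text and the specific sampling values are illustrative, not recommendations:

```swift
import FoundationModels

// A sketch: configure sampling and temperature, then pass the options
// to a session when requesting a response.
let options = GenerationOptions(
    sampling: .random(top: 50), // sample from the 50 most likely tokens
    temperature: 0.7            // lower values make output more predictable
)

let session = LanguageModelSession()
let response = try await session.respond(
    to: "Suggest a name for a hiking app.",
    options: options
)
print(response.content)
```

Because every property of `GenerationOptions` is optional, you can set only the options you care about and let the framework choose defaults for the rest.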
Only use maximumResponseTokens when you need to protect against unexpectedly verbose responses. Enforcing a strict response-token limit can cause the model to produce malformed or grammatically incorrect responses.
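When you do need a cap, it is a single optional parameter on the options value; the limit of 500 below is an illustrative value, not a recommendation:

```swift
import FoundationModels

// A sketch: cap the response length only to guard against runaway output.
let options = GenerationOptions(maximumResponseTokens: 500)
```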
All input to the model contributes tokens to the context window of the LanguageModelSession — including the Instructions, Prompt, Tool, and Generable types, and the model’s responses. If your session exceeds the available context size, it throws LanguageModelSession.GenerationError.exceededContextWindowSize(_:). For more information on managing the context window size, see TN3193: Managing the on-device foundation model’s context window.
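One way to handle the exceeded-context error, sketched under the assumption that restarting with an empty context is acceptable for your use case (TN3193 describes more refined strategies, such as condensing the transcript before retrying):

```swift
import FoundationModels

// A sketch of recovering from an exceeded context window: catch the
// error, replace the session, and retry the prompt once.
func respondWithRecovery(
    session: inout LanguageModelSession,
    prompt: String
) async throws -> String {
    do {
        return try await session.respond(to: prompt).content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // The accumulated transcript no longer fits in the context
        // window; start a fresh session and retry with empty context.
        session = LanguageModelSession()
        return try await session.respond(to: prompt).content
    }
}
```

Discarding the transcript loses the conversation history, so prefer carrying forward a condensed version of it when continuity matters.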