Optional
extra: Extra parameters to include in the payload.
Optional
grammar: The GBNF grammar to use for grammar-based sampling.
Optional
images
Optional
max_: The number of predictions to return.
Optional
min_: The minimum probability for a token to be considered, relative to the probability of the most likely token.
Optional
model: The model configuration details for inference.
Optional
repeat_: Adjusts the penalty for repeated tokens.
Optional
stop: List of stop words or phrases that halt prediction.
Optional
stream: Indicates whether results should be streamed progressively.
Optional
temperature: Adjusts randomness in sampling; higher values mean more randomness.
Optional
template: The template to use, for backends that support it.
Optional
tfs: Sets the tail-free sampling value.
Optional
top_: Limits the result set to the top K results.
Optional
top_: Filters results based on cumulative probability.
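
The sampling parameters described above (temperature, top K, cumulative probability, and minimum relative probability) are commonly implemented as filters over a token probability distribution. The following is a minimal sketch of those general techniques, not the backend's actual implementation. All function names are hypothetical and chosen for illustration only.

```typescript
// Hypothetical helpers: these names are illustrative and are not part of the
// documented API. They show the usual meaning of each sampling control.

/** Softmax over logits after dividing each logit by the temperature. */
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

/** Indices of the k most probable tokens (top K filtering). */
function topKFilter(probs: number[], k: number): number[] {
  return probs
    .map((_, i) => i)
    .sort((a, b) => probs[b] - probs[a])
    .slice(0, k);
}

/** Smallest set of most probable tokens whose cumulative probability reaches p. */
function topPFilter(probs: number[], p: number): number[] {
  const order = probs.map((_, i) => i).sort((a, b) => probs[b] - probs[a]);
  const kept: number[] = [];
  let cumulative = 0;
  for (const i of order) {
    kept.push(i);
    cumulative += probs[i];
    if (cumulative >= p) break;
  }
  return kept;
}

/** Indices of tokens whose probability is at least minP times the top probability. */
function minPFilter(probs: number[], minP: number): number[] {
  const threshold = Math.max(...probs) * minP;
  return probs.map((_, i) => i).filter((i) => probs[i] >= threshold);
}
```

With these definitions, a higher temperature flattens the distribution before any filter runs, which is why it increases randomness, while each filter only narrows the set of candidate tokens.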
Describes the parameters for making an inference request.
InferenceParams
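
As an illustration of how such a parameter object might be used, here is a hedged sketch. The interface below is hypothetical: it includes only the properties whose names appear in full in this reference, and every value type is an assumption. The `model` property is left out because its shape is not described here. The `buildPayload` helper, the `prompt` field, the `seed` key, and the choice to merge `extra` into the top level of the payload are also assumptions made for this example.

```typescript
// Hypothetical shape: field names come from the reference above, but the
// value types are assumptions and may differ from the real interface.
interface InferenceParamsSketch {
  extra?: Record<string, unknown>;
  grammar?: string;
  images?: string[];
  stop?: string[];
  stream?: boolean;
  temperature?: number;
  template?: string;
  tfs?: number;
}

/**
 * Builds a request payload from a prompt and optional parameters.
 * Unset (undefined) fields are left out, and the keys of `extra` are merged
 * into the top level of the payload (an assumption about how `extra` is used).
 */
function buildPayload(
  prompt: string,
  params: InferenceParamsSketch = {},
): Record<string, unknown> {
  const { extra, ...rest } = params;
  const payload: Record<string, unknown> = { prompt };
  for (const [key, value] of Object.entries(rest)) {
    if (value !== undefined) payload[key] = value;
  }
  return { ...payload, ...(extra ?? {}) };
}

// Usage: every property is optional, so callers set only what they need.
const examplePayload = buildPayload("Write a haiku about rain.", {
  temperature: 0.7,
  stop: ["\n\n"],
  stream: false,
  extra: { seed: 42 }, // hypothetical backend-specific option
});
```

Because all properties are optional, an empty object is a valid parameter set, and the backend's defaults would apply to anything left out.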