Optional
extra: Extra parameters to include in the payload.
Optional
grammar: The GBNF grammar to use for grammar-based sampling.
Optional
images: Base64-encoded image data (for multimodal models).
Optional
max_: The number of predictions to return.
Optional
min_: The minimum probability for a token to be considered, relative to the probability of the most likely token.
Optional
model: The model configuration details for inference.
Optional
repeat_: Adjusts the penalty for repeated tokens.
Optional
schema: A JSON schema used to format the output.
Optional
stop: List of stop words or phrases that halt predictions.
Optional
stream: Indicates whether results should be streamed progressively.
Optional
temperature: Adjusts randomness in sampling; higher values mean more randomness.
Optional
template: The template to use, for backends that support it.
Optional
tfs: Sets the tail-free sampling value.
Optional
top_: Limits the result set to the top K results.
Optional
top_: Filters results based on cumulative probability.
Optional
ts: A TypeScript interface to be converted to a GBNF grammar for grammar-based sampling.
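The three filtering descriptions above (top K, cumulative probability, and minimum probability relative to the most likely token) can be illustrated with a small sketch. The helper names `topK`, `topP` and `minP` are hypothetical and are not part of the documented interface; they only show the general idea on a toy distribution, and a real backend may implement these filters differently.

```typescript
// Illustrative helpers only (hypothetical names, not the library's API).
// Each takes token probabilities that sum to 1 and returns the kept token indices.

/** Token indices sorted by probability, most likely first. */
function byProbability(probs: number[]): number[] {
  return probs.map((_, i) => i).sort((a, b) => probs[b] - probs[a]);
}

/** Top-K: keep only the k most probable tokens. */
function topK(probs: number[], k: number): number[] {
  return byProbability(probs).slice(0, k);
}

/** Top-P: keep the smallest set of tokens whose cumulative probability reaches p. */
function topP(probs: number[], p: number): number[] {
  const kept: number[] = [];
  let cumulative = 0;
  for (const i of byProbability(probs)) {
    kept.push(i);
    cumulative += probs[i];
    if (cumulative >= p) break;
  }
  return kept;
}

/** Min-P: keep tokens at least `ratio` times as probable as the most likely token. */
function minP(probs: number[], ratio: number): number[] {
  const max = Math.max(...probs);
  return byProbability(probs).filter((i) => probs[i] >= ratio * max);
}

// Toy distribution over four tokens.
const probs = [0.5, 0.25, 0.15, 0.1];
console.log(topK(probs, 2), topP(probs, 0.7), minP(probs, 0.4));
```

Lower values of K, P, or a higher minimum ratio all narrow the candidate set, which tends to make output more deterministic.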
Describes the parameters for making an inference request.
InferenceParams
Example
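A minimal sketch of how such a parameter object might be assembled. `InferenceParamsSketch` is a local stand-in that covers only some of the fields listed on this page, with assumed types; `buildPayload` and the `seed` key are hypothetical and serve only to show how `extra` could be merged into a request payload.

```typescript
// Local stand-in for the documented interface. Field types are assumptions.
interface InferenceParamsSketch {
  stream?: boolean;
  temperature?: number;
  stop?: string[];
  grammar?: string;
  schema?: Record<string, unknown>;
  images?: string[];
  template?: string;
  tfs?: number;
  extra?: Record<string, unknown>;
}

// Hypothetical helper: flattens the prompt, the known parameters and `extra`
// into a single payload object. Keys in `extra` override same-named parameters.
function buildPayload(
  prompt: string,
  params: InferenceParamsSketch = {},
): Record<string, unknown> {
  const { extra, ...known } = params;
  return { prompt, ...known, ...(extra ?? {}) };
}

const params: InferenceParamsSketch = {
  stream: false,
  temperature: 0.2,
  stop: ["</s>"],
  extra: { seed: 42 }, // hypothetical backend-specific option
};

const payload = buildPayload("List three planets.", params);
console.log(JSON.stringify(payload, null, 2));
```

Every field is optional, so an empty object is also a valid set of parameters under this sketch.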