Grok – Token Trimming

Each chat message you send has a token value based on the total content of all the messages in the chat session. Because a chat session logs every message, its token total increases as the session is used.

Each model has a limit on the number of tokens that can be used per request.

For methods in the Chats category, the available models have the following token limits:

Name	Model	Token Limit
Grok 4	grok-4-0709	256,000
Grok 3	grok-3	131,072
Grok 3 Mini	grok-3-mini	131,072
Grok 3 Fast	grok-3-fast	131,072
Grok 3 Mini Fast	grok-3-mini-fast	131,072
Grok 2 Vision	grok-2-vision-1212	32,768

To manage chat session tokens automatically, methods in the Chats category have two parameters you can use:

Trim Tokens?
Custom Token Limit

Trim Tokens?

Set this parameter to true to automatically trim the chat session to below the token limit for the model. If no model is found, Cyclr uses a default token limit of 131,072. Cyclr trims the oldest message in the chat session first and repeats this until the chat session is below the token limit. Cyclr then sends the request with the trimmed chat session.
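The oldest-first trimming described above can be sketched as follows. This is an illustration only, not Cyclr's implementation: the Message class, the trim_oldest_first function, and the per-message token counts are hypothetical, and the sketch assumes that "below the token limit" means strictly less than the limit.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    tokens: int  # hypothetical per-message token count


# Default limit stated in this article for when no model is found.
DEFAULT_TOKEN_LIMIT = 131_072


def trim_oldest_first(messages, token_limit=DEFAULT_TOKEN_LIMIT):
    """Drop the oldest messages until the session total is below token_limit."""
    trimmed = list(messages)  # copy; the oldest message is at index 0
    total = sum(m.tokens for m in trimmed)
    # Assumption: the session must end up strictly below the limit.
    while trimmed and total >= token_limit:
        removed = trimmed.pop(0)  # remove the oldest message first
        total -= removed.tokens
    return trimmed


session = [Message("first", 60), Message("second", 50), Message("third", 40)]
kept = trim_oldest_first(session, token_limit=100)
print([m.text for m in kept])  # ['second', 'third']
```

In this example the session totals 150 tokens, so the oldest message is removed to bring the total down to 90, which is below the limit of 100.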

Note: Cyclr calculates the token total for a chat session by calling the tokenizer for each new message sent and by capturing the response tokens for each response message from Grok.

Warning: If you enable Trim Tokens?, messages that Cyclr removes from chat sessions are permanently removed.

Warning: To keep token calculations accurate, do not change the Trim Tokens? parameter after you have created the chat session.

Custom Token Limit

Set this parameter to force a custom token limit that Cyclr trims the chat session down to. This overrides both the limit for the model and the default token limit. The value has a lower limit of 100 tokens to make sure that at least one message can be sent in the request.
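The order of precedence between the custom limit, the model limit, and the default can be sketched as follows. This is a hedged illustration, not Cyclr's code: the effective_token_limit function and its parameter names are hypothetical. The model limits are copied from the table in this article. How Cyclr handles a custom value below 100 is not stated here, so raising an error is purely an assumption of this sketch.

```python
# Model limits copied from the table in this article.
MODEL_TOKEN_LIMITS = {
    "grok-4-0709": 256_000,
    "grok-3": 131_072,
    "grok-3-mini": 131_072,
    "grok-3-fast": 131_072,
    "grok-3-mini-fast": 131_072,
    "grok-2-vision-1212": 32_768,
}
DEFAULT_TOKEN_LIMIT = 131_072   # used when no model is found
MIN_CUSTOM_TOKEN_LIMIT = 100    # lower bound for Custom Token Limit


def effective_token_limit(model=None, custom_limit=None):
    """Return the limit to trim to: custom limit, else model limit, else default."""
    if custom_limit is not None:
        # Assumption: values below the minimum are rejected rather than clamped.
        if custom_limit < MIN_CUSTOM_TOKEN_LIMIT:
            raise ValueError("Custom Token Limit must be at least 100")
        return custom_limit
    return MODEL_TOKEN_LIMITS.get(model, DEFAULT_TOKEN_LIMIT)


print(effective_token_limit("grok-4-0709"))               # model limit
print(effective_token_limit("grok-4-0709", 5_000))        # custom limit wins
print(effective_token_limit("some-unlisted-model"))       # default limit
```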