OpenAI’s language models process text by dividing it into tokens, which can be whole words or fragments of words. For example, the word “fantastic” might be split into the tokens “fan”, “tas”, and “tic”, while a short word like “gold” is a single token. Many tokens begin with a leading space, such as ” hi” and ” greetings”.
The number of tokens processed in a single API request depends on the length of both the input and the output text. As a rough guideline, one token corresponds to about 4 characters or 0.75 words of English text. Keep in mind that the combined length of the text prompt and the generated completion must not exceed the model’s maximum context length, which for the original GPT-3 models is 2,048 tokens (approximately 1,500 words).
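If you want to check these numbers yourself before sending a request, OpenAI’s open-source tiktoken library encodes text with the same tokenizer the GPT-3 models use. A minimal sketch (the sample string is just an illustration; r50k_base is the encoding used by the original GPT-3 models):

```python
import tiktoken  # pip install tiktoken

# r50k_base is the encoding used by the original GPT-3 (davinci) models.
enc = tiktoken.get_encoding("r50k_base")

text = "fantastic gold greetings"
token_ids = enc.encode(text)  # list of integer token IDs

print(len(token_ids), "tokens")               # how much context this text consumes
print([enc.decode([t]) for t in token_ids])   # the individual token strings
```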
Our plugin’s default value for max tokens is 700, which is enough to produce fairly long articles. If you’d like to produce much shorter articles, you can lower this value.
There are several models available, each with its own capabilities and pricing. Ada is the fastest model, while Davinci is the most capable. Our plugin uses Davinci.
Prices are charged per 1,000 tokens. Tokens can be thought of as pieces of words, and 1,000 tokens is approximately 750 words.
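As a rough sketch of the arithmetic (the per-1,000-token rate below is a placeholder, not a real price; check OpenAI’s pricing page for current Davinci rates):

```python
# Placeholder rate for illustration only; look up the current Davinci price.
PRICE_PER_1K_TOKENS = 0.02  # USD per 1,000 tokens (assumed)

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Completions are billed on prompt and completion tokens combined."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

# A 300-token prompt plus a 700-token article (our plugin's default max tokens):
print(f"${estimate_cost(300, 700):.4f}")  # $0.0200 at the assumed rate
```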
Understanding the ‘max_tokens’ parameter in GPT-3 and its impact on generated text
The max_length or max_tokens parameter (in OpenAI’s API it is called max_tokens) controls the maximum number of tokens that can be generated in a single call to the GPT-3 (Generative Pre-trained Transformer 3) model.
A token is a discrete unit of text in natural language processing. In GPT-3, each token is represented by a unique integer, and the set of all possible tokens is called the vocabulary. GPT-3’s vocabulary contains 50,257 tokens (the 175 billion figure often quoted for GPT-3 is its parameter count, not its vocabulary size).
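You can confirm this figure with the tiktoken encoding shown earlier:

```python
import tiktoken

enc = tiktoken.get_encoding("r50k_base")  # GPT-3 (davinci) encoding
print(enc.n_vocab)  # 50257
```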
When generating text with GPT-3, the model takes in a prompt (also called the “seed text”), which is the starting point for the generated text. The model then uses this prompt to generate a sequence of tokens, one at a time. The max_length or max_tokens parameter controls the maximum number of tokens that can be generated in a single call to the model.
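In practice this looks like the call below, shown with OpenAI’s legacy Python completions client (the prompt text is our own example; text-davinci-003 is one of the Davinci-class models):

```python
import os
import openai  # legacy completions-style client (openai<1.0)

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a short introduction to solar energy.",
    max_tokens=700,   # cap on *generated* tokens; the prompt is not counted here
    temperature=0.7,
)

print(response.choices[0].text)
print(response.usage.total_tokens)  # prompt + completion tokens actually used
```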
The max_length or max_tokens parameter acts as a limit on the model’s output: if the model would otherwise generate very long text, setting this parameter lets you control the size of the result. In OpenAI’s API the default value for max_tokens is just 16, so you will usually want to raise it; the upper bound is the model’s context window, 2,048 tokens for the original GPT-3 models and 4,096 for text-davinci-003, minus the length of the prompt.
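A practical consequence: if you want the longest possible completion, work out the budget left over after the prompt. A sketch with tiktoken (the 4,096-token window assumes text-davinci-003):

```python
import tiktoken

CONTEXT_WINDOW = 4096  # text-davinci-003; use 2048 for the original GPT-3 models

enc = tiktoken.encoding_for_model("text-davinci-003")
prompt = "Write a short introduction to solar energy."

prompt_tokens = len(enc.encode(prompt))
max_completion = CONTEXT_WINDOW - prompt_tokens  # largest safe max_tokens value

print(f"{prompt_tokens} prompt tokens, room for {max_completion} completion tokens")
```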
This parameter can be useful in several scenarios:
- When you want to generate a specific amount of text, regardless of how much context the model has.
- When you want to constrain the generated text to a specific size so it fits a particular format or application.
- When you want to reduce latency and cost, since the model stops sooner and you pay for fewer tokens.
However, keep in mind that setting the max_length or max_tokens parameter too low can prevent the model from fully expressing its ideas or completing its thought. It can also lead to incomplete sentences or grammatically incorrect output, because the model is simply cut off mid-generation rather than brought to a natural stop.
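You can detect this truncation programmatically: the API reports why generation stopped, and a finish_reason of “length” means the output hit the token limit rather than ending naturally. Continuing the legacy-client sketch from above:

```python
choice = response.choices[0]

if choice.finish_reason == "length":
    # The completion hit max_tokens and was cut off mid-thought;
    # consider raising max_tokens or shortening the prompt.
    print("Warning: output was truncated by the token limit.")
elif choice.finish_reason == "stop":
    print("The model reached a natural stopping point.")
```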
In summary, the max_length or max_tokens parameter caps the number of tokens the GPT-3 model can generate in a single call, which lets you control the size of the generated text. It is useful for constraining the model’s output and adapting it to specific use cases, but it should be set in a way that does not compromise the quality of the output.