LLM Settings
Temperature
LLM outputs are probabilistic: the model samples the next token from a probability distribution rather than always picking the single most likely option. Temperature controls how deterministic the results are. Lowering the temperature sharpens the distribution toward the most probable tokens, making the output more consistent; raising it flattens the distribution, making the output more varied.
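The effect of temperature can be sketched with a toy sampler: the logits are divided by the temperature before the softmax, so a low temperature exaggerates the gap between likely and unlikely tokens. The function name and the example logits are illustrative, not part of any real API.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=random):
    """Sample a token index; lower temperature -> more deterministic."""
    # Scale logits by 1/temperature, then apply a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw from the resulting distribution.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

With a very low temperature (e.g. 0.01), the sampler effectively always returns the highest-logit token; with a high temperature it picks among all tokens more evenly.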
Top P
Top P (nucleus sampling) caps the pool of candidate tokens by cumulative probability: only the most likely tokens whose probabilities together reach P are kept for sampling. Reducing top P restricts sampling to the most confident responses. The general recommendation is to alter temperature or top P, but not both.
Stop Sequence and max length
Max length sets the maximum number of tokens the model may generate. A stop sequence tells the model to end generation as soon as it produces that sequence, e.g. "11." when you ask for a top 10 of something, so the list ends after ten items.
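Both limits can be sketched as a simple cutoff over a stream of generated tokens; this is a toy illustration of the behavior, not how any particular API implements it.

```python
def truncate_generation(tokens, max_length, stop_sequence=None):
    """Cut a generated token stream at max_length tokens,
    or earlier if the stop sequence is produced."""
    out = []
    for tok in tokens:
        if len(out) >= max_length or tok == stop_sequence:
            break
        out.append(tok)
    return out
```

With stop_sequence="11.", a numbered list is cut off just before the eleventh item; without a stop sequence, only the max length applies.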
Penalties
Penalties reduce the repetition of tokens in a response, which keeps the model from restating the same ideas. There are two types of penalties:
- Frequency penalty penalizes the next token in proportion to how many times it has already appeared
- Presence penalty applies a flat penalty to any token that has appeared at least once, regardless of how many times
As with temperature and top P, it is recommended to adjust one penalty or the other, not both.
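The two penalties described above can be sketched as adjustments to the logits before sampling: each already-seen token is penalized once per occurrence by the frequency penalty, plus a one-time presence penalty. This mirrors the commonly documented formulation, but the function itself is an illustrative sketch.

```python
from collections import Counter

def apply_penalties(logits, generated, freq_penalty, presence_penalty):
    """Lower the logits of tokens that already appeared in the output.
    Frequency penalty scales with the occurrence count; presence
    penalty is applied once per distinct token that has appeared."""
    counts = Counter(generated)
    penalized = list(logits)
    for tok, c in counts.items():
        penalized[tok] -= c * freq_penalty + presence_penalty
    return penalized
```

For example, with equal logits [2.0, 2.0, 2.0] and prior output [0, 0, 1], token 0 (seen twice) is penalized harder than token 1 (seen once), while the unseen token 2 is untouched.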