# Configuration params

Each LLM exposes a set of configuration parameters that affect the model's output. These parameters are set at inference time and give you control over things such as output creativity, the maximum number of tokens in the completion, and the model's likelihood of repeating the same line.

Let's look at a few examples and their usage:

* **max\_tokens** - The maximum number of tokens in the completion
* **temperature** - Controls the randomness of the output. For the OpenAI API, the value is a number between 0 and 2. Higher values like 0.9 make the output more random, while lower values like 0.1 produce more focused and deterministic completions. Changing the temperature rescales the probability distribution from which the model samples the next token.
* **top\_p** - An alternative to sampling with temperature, known as nucleus sampling. Sampling is limited to the most likely tokens whose cumulative probability reaches top\_p, e.g. 0.1 means only the tokens comprising the top 10% probability mass are considered.

  *Usually, adjusting either temperature or top\_p is recommended, but NOT both.*

* **n** - The number of chat completion choices to generate for each prompt.
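To make the effect of temperature and top\_p more concrete, here is a minimal sketch of how the two settings can reshape a next-token distribution. It is an illustration only, not OpenAI's actual implementation: the scores in `logits` are made up, and `softmax_with_temperature` and `top_p_filter` are hypothetical helper names.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize what is left."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Made-up scores for four candidate tokens (assumption for illustration).
logits = [2.0, 1.0, 0.5, 0.1]

# Lower temperature concentrates probability on the top token,
# higher temperature spreads it out.
for t in (0.1, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])

# With top_p=0.5 only the tokens covering the top 50% of the mass remain.
print(top_p_filter(softmax_with_temperature(logits, 1.0), 0.5))
```

The sketch divides by the temperature, so it does not cover a temperature of exactly 0; that case would need to be handled separately, for example by always picking the most likely token.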

These are only a few of the most frequently used parameters. A full list and detailed explanation of all parameters exposed by OpenAI can be found in the official docs - <https://platform.openai.com/docs/api-reference/chat/create>.

The example below shows how to set these parameters in your request at inference time.


{% tabs %}
{% tab title="Python" %}

```python
# Request to OpenAI, making the output more creative and setting a max_tokens limit
# This example is for v1+ of the openai package: https://pypi.org/project/openai/
import os  # needed for os.getenv below

from openai import OpenAI

client = OpenAI( 
    base_url = "https://turbo.gptboost.io/v1",
    api_key = os.getenv("OPENAI_API_KEY"),
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me an interesting fact about zebras"},
    ], 
    temperature=0.9,
    max_tokens=256
)

print(response.choices[0].message.content)
```

{% endtab %}

{% tab title="cURL" %}

```bash
# Double quotes around the Authorization header let the shell expand $OPENAI_API_KEY
curl --location 'https://turbo.gptboost.io/v1/chat/completions' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "How to cheerfully greet a girl in Spanish!"
        }
    ],
    "temperature": 0.9,
    "max_tokens": 33,
    "frequency_penalty": 0,
    "presence_penalty": 0
}'
```

{% endtab %}

{% tab title="NodeJS" %}

```javascript
// This code is for v4 of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

// Send a single chat completion request and print the first choice
async function askGpt() {
    const response = await openai.chat.completions.create({
        model: "gpt-3.5-turbo",
        messages: [{ role: "user", content: "Get me 3 inspirational quotes" }],
        temperature: 0,
        max_tokens: 67,
        frequency_penalty: 0,
        presence_penalty: 0
    });
    console.log(response.choices[0].message.content);
}

askGpt();
```

{% endtab %}
{% endtabs %}
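
The parameters described above can also be checked before a request is sent. The following is a minimal sketch, assuming the value ranges given in the OpenAI API reference; `build_sampling_params` is a hypothetical helper and not part of the openai package.

```python
import warnings


def build_sampling_params(temperature=None, top_p=None, n=1, max_tokens=None):
    """Hypothetical helper: validate common parameters and return them
    as keyword arguments for a chat completion request."""
    params = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("top_p must be between 0 and 1")
        params["top_p"] = top_p
    if temperature is not None and top_p is not None:
        # Setting both is allowed, but usually only one should be adjusted.
        warnings.warn("Adjust either temperature or top_p, not both")
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    params["n"] = n
    if max_tokens is not None:
        if not isinstance(max_tokens, int) or max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")
        params["max_tokens"] = max_tokens
    return params


params = build_sampling_params(temperature=0.9, max_tokens=256)
print(params)
```

The returned dictionary could then be unpacked into the request, for example `client.chat.completions.create(model="gpt-3.5-turbo", messages=messages, **params)`.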

