Streaming enables real-time, dynamic interaction between your application and the OpenAI API. Instead of waiting for the complete response to be assembled, you receive it incrementally, as the data becomes available. This responsiveness makes streaming a valuable addition to your toolkit.
Developers often opt for streaming in scenarios where real-time responses are critical and standard request-response mechanisms may not suffice.
How to stream the response?
Simply add stream=True to your request to receive the response in chunks as they're generated, instead of waiting for the complete response to be assembled.
# This code is for v1+ of the openai package https://pypi.org/project/openai/
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url="https://turbo.gptboost.io/v1",
)

# Make a streaming request to the OpenAI API
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Tell me a story about a happy developer."}
    ],
    stream=True,
)

# Print each chunk of the response as it arrives
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
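Each chunk carries only the text generated since the previous one, so if you also need the complete reply (for logging or storage), accumulate the deltas while you print them. Below is a minimal sketch of that pattern; the mock chunks are stand-ins that mimic the shape of the objects the openai v1 client yields, since a live run would iterate over the `stream` object from the request above.

```python
# Sketch: accumulating streamed deltas into the full reply.
# SimpleNamespace chunks mimic chunk.choices[0].delta.content from the
# openai v1 client; a real run would iterate over the live stream instead.
from types import SimpleNamespace


def collect_stream(stream):
    """Print each delta as it arrives and return the assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="")
        parts.append(delta)
    return "".join(parts)


def mock_chunk(text):
    # Mimic the nested structure: chunk.choices[0].delta.content
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


# The final chunk of a real stream has no content, hence the None
mock_stream = [
    mock_chunk("Once "),
    mock_chunk("upon "),
    mock_chunk("a time."),
    mock_chunk(None),
]
full_reply = collect_stream(mock_stream)
```

The same `collect_stream` helper works unchanged on the real `stream` object, because it only relies on the `choices[0].delta.content` attribute path.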
// This code is for v4+ of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

async function generateStream(prompt) {
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  // Write each chunk as it arrives; the delta may be empty on the final chunk
  for await (const part of stream) {
    process.stdout.write(part.choices[0].delta.content ?? "");
  }
}

generateStream("Tell me one curious fact");
// This code is for v4 of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

// Can be 'nodejs', but Vercel recommends using 'edge'
export const runtime = 'edge';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

// Route handlers are named after the HTTP method they serve
export async function GET() {
  // Make a streaming request to the OpenAI API
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    stream: true,
    messages: [{ role: 'user', content: 'Say this is a test.' }],
  });

  // Convert the response into a friendly text-stream.
  // Note: a stream can only be consumed once; if you also need the chunks
  // elsewhere (e.g. for logging), split it first with response.tee().
  const stream = OpenAIStream(response);

  // Respond with the stream
  return new StreamingTextResponse(stream);
}
What are some use cases?
Here are a few use cases where streaming shines:
Conversational Interfaces: Streaming is perfect for building chatbots and conversational interfaces that require instant responses to user input. It allows for a more natural, interactive conversation flow.
Live Transcriptions: In applications like transcription services or live captioning, streaming provides immediate results as audio or speech input is processed. This is particularly useful in live event scenarios.
Interactive Gaming: For interactive games that demand dynamic and real-time AI-generated content, streaming ensures seamless integration of AI-driven elements.
On-the-Fly Content Generation: In applications where content is generated on the fly, such as dynamic content creation for websites or marketing materials, streaming ensures rapid content delivery.
Simultaneous Data Processing: For tasks requiring parallel processing of multiple inputs, like language translation or sentiment analysis for a stream of social media posts, streaming enhances efficiency.
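The last use case can be sketched with asyncio: several streams are consumed concurrently, so waiting on one does not block the others. The `fake_stream` generator below is a hypothetical stand-in for a real streaming request (which would need an API key); in practice each task would iterate over an async chat-completion stream instead.

```python
# Sketch: consuming several streams concurrently with asyncio.
# fake_stream is a hypothetical stand-in for a streaming API call; in a
# real app each task would iterate over an AsyncOpenAI chat stream.
import asyncio


async def fake_stream(words):
    # Yield one "chunk" at a time, like a streaming completion
    for word in words:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word


async def consume(label, words):
    """Collect one stream's chunks into a complete result."""
    parts = []
    async for chunk in fake_stream(words):
        parts.append(chunk)
    return label, "".join(parts)


async def main():
    # Process multiple inputs in parallel, e.g. sentiment analysis
    # over a stream of social media posts
    tasks = [
        consume("post-1", ["great ", "launch!"]),
        consume("post-2", ["mixed ", "feelings"]),
    ]
    return dict(await asyncio.gather(*tasks))


results = asyncio.run(main())
```

Because each stream is awaited chunk by chunk, the event loop interleaves all of them, which is the efficiency gain the use case above describes.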