Streaming enables real-time, dynamic interaction between your application and the OpenAI API. Instead of waiting for the complete response to be assembled, you receive it incrementally, as the data becomes available. This responsiveness makes streaming a valuable addition to your toolkit.
Developers often opt for streaming in scenarios where real-time responses are critical and standard request-response mechanisms may not suffice.
How to stream the response?
Simply add stream=True to your request to receive the response in chunks as they're generated, instead of waiting for the complete response to be assembled.
# This code is for v1+ of the openai package https://pypi.org/project/openai/
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url="https://turbo.gptboost.io/v1",
)

# Make a streaming request to the OpenAI API
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Tell me a story about a happy developer."}
    ],
    stream=True,
)

# Print each chunk of the response as it arrives
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
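Each chunk carries only the text generated since the previous one, so if you also need the complete reply (for logging or storage), accumulate the deltas while you print them. Below is a minimal sketch of that pattern; the mock chunks are stand-ins that mimic the shape of the objects the openai v1 client yields, since a live run would iterate over the `stream` object from the request above.

```python
# Sketch: accumulating streamed deltas into the full reply.
# SimpleNamespace chunks mimic chunk.choices[0].delta.content from the
# openai v1 client; a real run would iterate over the live stream instead.
from types import SimpleNamespace


def collect_stream(stream):
    """Print each delta as it arrives and return the assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="")
        parts.append(delta)
    return "".join(parts)


def mock_chunk(text):
    # Mimic the nested structure: chunk.choices[0].delta.content
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


# The final chunk of a real stream has no content, hence the None
mock_stream = [
    mock_chunk("Once "),
    mock_chunk("upon "),
    mock_chunk("a time."),
    mock_chunk(None),
]
full_reply = collect_stream(mock_stream)
```

The same `collect_stream` helper works unchanged on the real `stream` object, because it only relies on the `choices[0].delta.content` attribute path.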
// This code is for v4+ of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

async function generateStream(prompt) {
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  // Write each chunk as it arrives; the delta may be empty on the final chunk
  for await (const part of stream) {
    process.stdout.write(part.choices[0].delta.content ?? "");
  }
}

generateStream("Tell me one curious fact");
// This code is for v4 of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

// Can be 'nodejs', but Vercel recommends using 'edge'
export const runtime = 'edge';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

// Route handlers are named after the HTTP method they serve
export async function GET() {
  // Make a streaming request to the OpenAI API
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    stream: true,
    messages: [{ role: 'user', content: 'Say this is a test.' }],
  });

  // Convert the response into a friendly text-stream.
  // Note: a stream can only be consumed once; if you also need the chunks
  // elsewhere (e.g. for logging), split it first with response.tee().
  const stream = OpenAIStream(response);

  // Respond with the stream
  return new StreamingTextResponse(stream);
}
What are some use cases?
Here are a few use cases where streaming shines:
Conversational Interfaces: Streaming is perfect for building chatbots and conversational interfaces that require instant responses to user input. It allows for a more natural, interactive conversation flow.
Live Transcriptions: In applications like transcription services or live captioning, streaming provides immediate results as audio or speech input is processed. This is particularly useful in live event scenarios.
Interactive Gaming: For interactive games that demand dynamic and real-time AI-generated content, streaming ensures seamless integration of AI-driven elements.
On-the-Fly Content Generation: In applications where content is generated on the fly, such as dynamic content creation for websites or marketing materials, streaming ensures rapid content delivery.
Simultaneous Data Processing: For tasks requiring parallel processing of multiple inputs, like language translation or sentiment analysis for a stream of social media posts, streaming enhances efficiency.
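The last use case can be sketched with asyncio: several streams are consumed concurrently, so waiting on one does not block the others. The `fake_stream` generator below is a hypothetical stand-in for a real streaming request (which would need an API key); in practice each task would iterate over an async chat-completion stream instead.

```python
# Sketch: consuming several streams concurrently with asyncio.
# fake_stream is a hypothetical stand-in for a streaming API call; in a
# real app each task would iterate over an AsyncOpenAI chat stream.
import asyncio


async def fake_stream(words):
    # Yield one "chunk" at a time, like a streaming completion
    for word in words:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word


async def consume(label, words):
    """Collect one stream's chunks into a complete result."""
    parts = []
    async for chunk in fake_stream(words):
        parts.append(chunk)
    return label, "".join(parts)


async def main():
    # Process multiple inputs in parallel, e.g. sentiment analysis
    # over a stream of social media posts
    tasks = [
        consume("post-1", ["great ", "launch!"]),
        consume("post-2", ["mixed ", "feelings"]),
    ]
    return dict(await asyncio.gather(*tasks))


results = asyncio.run(main())
```

Because each stream is awaited chunk by chunk, the event loop interleaves all of them, which is the efficiency gain the use case above describes.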