Streaming enables real-time, dynamic interactions between developers and the OpenAI API. Instead of waiting for a complete response, you receive it incrementally as the data becomes available, which makes applications feel noticeably more responsive.
Developers typically opt for streaming in scenarios where real-time feedback is critical and the standard request-response flow is too slow.
How to stream the response?
Simply add stream=True to your request to receive the response in chunks as they're generated, instead of waiting for the complete response to be assembled.
```python
# This code is for v1+ of the openai package: https://pypi.org/project/openai/
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url="https://turbo.gptboost.io/v1",
)

# Make a request to the OpenAI API
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Tell me a story about a happy developer."}
    ],
    stream=True,
)

for chunk in stream:
    # The final chunk's delta may carry no content, so fall back to ""
    print(chunk.choices[0].delta.content or "", end="")
```
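When streaming a reply you often still need the complete text afterwards, for logging or for appending to a conversation history. A minimal sketch of accumulating the deltas, using stubbed chunks in place of a live API call (the `SimpleNamespace` objects only imitate the `chunk.choices[0].delta.content` shape used above; a real run would pass the `stream` object from the request):

```python
from types import SimpleNamespace


def accumulate_stream(stream):
    """Print each delta as it arrives and return the full text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="")
        parts.append(delta)
    return "".join(parts)


# Stubbed chunks imitating the shape of streamed chat completion chunks.
fake_stream = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Once ", "upon ", "a ", "time", None]  # final delta is empty
]

full_text = accumulate_stream(fake_stream)
# full_text now holds the whole reply, ready for logging or chat history
```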
```javascript
// This code is for v4+ of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

async function generateStream(prompt) {
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  for await (const part of stream) {
    // The final chunk's delta may carry no content, so fall back to ''
    process.stdout.write(part.choices[0].delta.content ?? '');
  }
}

generateStream("Tell me one curious fact");
```
```javascript
// This code is for v4 of the openai package: npmjs.com/package/openai
import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

// Can be 'nodejs', but Vercel recommends using 'edge'
export const runtime = 'edge';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://turbo.gptboost.io/v1",
});

// This method must be named GET
export async function GET() {
  // Make a request to OpenAI's API
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    stream: true,
    messages: [{ role: 'user', content: 'Say this is a test.' }],
  });

  // A stream can only be consumed once; tee() splits it so one copy
  // can be inspected or logged without exhausting the one sent to the client
  const [logStream, responseStream] = response.tee();

  // Convert the response into a friendly text-stream
  const stream = OpenAIStream(responseStream);

  // Respond with the stream and headers
  return new StreamingTextResponse(stream);
}
```
What are some use cases?
Here are a few use cases where streaming shines:
Conversational Interfaces: Streaming is perfect for building chatbots and conversational interfaces that require instant responses to user input. It allows for a more natural, interactive conversation flow.
Live Transcriptions: In applications like transcription services or live captioning, streaming provides immediate results as audio or speech input is processed. This is particularly useful in live event scenarios.
Interactive Gaming: For interactive games that demand dynamic and real-time AI-generated content, streaming ensures seamless integration of AI-driven elements.
On-the-Fly Content Generation: In applications where content is generated on the fly, such as dynamic content creation for websites or marketing materials, streaming ensures rapid content delivery.
Simultaneous Data Processing: For tasks requiring parallel processing of multiple inputs, like language translation or sentiment analysis for a stream of social media posts, streaming enhances efficiency.
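For the conversational case in particular, each streamed reply has to be folded back into the message history before the next turn. A sketch of that loop, with a stand-in generator replacing the live `client.chat.completions.create` call (the `fake_stream_deltas` function and its canned "Echo:" reply are illustrative only, not part of any API):

```python
def fake_stream_deltas(prompt):
    """Stand-in for a streaming API call: yields the reply word by word."""
    for word in ["Echo:"] + prompt.split():
        yield word + " "


def chat_turn(messages, user_input, generate=fake_stream_deltas):
    """Run one conversational turn, streaming the reply and updating history."""
    messages.append({"role": "user", "content": user_input})
    parts = []
    for delta in generate(user_input):
        print(delta, end="")  # show text to the user as it arrives
        parts.append(delta)
    reply = "".join(parts).strip()
    # The full reply must be stored so the model sees it on the next turn
    messages.append({"role": "assistant", "content": reply})
    return reply


history = []
chat_turn(history, "hello there")
```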