Streaming responses
Use the TKEN OpenAI-compatible gateway for token-by-token user interfaces, CLI output and agent logs while keeping one base URL: https://www.tken.shop/v1.
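import OpenAI from "openai";

// Point the OpenAI SDK at the TKEN gateway through baseURL.
const client = new OpenAI({
  apiKey: process.env.TKEN_API_KEY,
  baseURL: "https://www.tken.shop/v1"
});

// With stream: true the SDK returns an async iterable of chunks.
const stream = await client.chat.completions.create({
  model: process.env.TKEN_CHAT_MODEL,
  stream: true,
  messages: [{ role: "user", content: "Write three short bullets." }]
});

// Some chunks carry no text (for example role-only or final chunks), so default to "".
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}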
Test the transport before wiring it into a UI. A streaming route should deliver incremental chunks and close cleanly when the completion is finished.
curl -N https://www.tken.shop/v1/chat/completions \
  -H "Authorization: Bearer $TKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"$TKEN_CHAT_MODEL"'",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Stream a one sentence answer."}
    ]
  }'
Confirm your SDK or HTTP parser handles empty deltas, final chunks, disconnects and model-specific stream behavior.
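If you parse the HTTP stream yourself instead of using the SDK, the checks above can be sketched roughly as follows. This is a minimal illustration, not the gateway's documented wire format: it assumes OpenAI-style server-sent events, where each event is a `data: <json>` line and the stream ends with `data: [DONE]`. The helper name `createSseParser` is hypothetical.

```javascript
// Hypothetical helper: incrementally parse an OpenAI-style SSE text stream.
// Assumption: events arrive as "data: <json>" lines and end with "data: [DONE]".
function createSseParser() {
  let buffer = "";   // holds a partial line that was cut by a network chunk boundary
  let done = false;  // set once the [DONE] sentinel has been seen

  return {
    // Feed one raw text chunk; returns the non-empty content strings it completed.
    push(text) {
      buffer += text;
      const lines = buffer.split(/\r?\n/);
      buffer = lines.pop(); // the last piece may be incomplete, keep it for the next push

      const out = [];
      for (const line of lines) {
        // Skip blank separators, comments and non-data fields.
        if (done || !line.startsWith("data:")) continue;
        const payload = line.slice(5).trim();
        if (payload === "[DONE]") {
          done = true;
          continue;
        }
        let event;
        try {
          event = JSON.parse(payload);
        } catch {
          continue; // ignore a malformed event rather than crash the UI
        }
        // Empty deltas and final chunks carry no content and are dropped here.
        const content = event.choices?.[0]?.delta?.content;
        if (typeof content === "string" && content.length > 0) out.push(content);
      }
      return out;
    },
    get finished() {
      return done;
    }
  };
}
```

A disconnect would show up as the transport ending while `finished` is still false, which is a reasonable signal to surface an error instead of treating the text as complete.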
User interfaces should support aborting a request and should show a clear error if the stream stalls or the network drops.
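One way to detect a stalled stream is to race each chunk against a timer and abort the request when the timer wins. The sketch below is an assumption-laden illustration: `withStallTimeout` and the `stallMs` threshold are hypothetical names, and it works on any async iterable so it does not depend on a particular SDK.

```javascript
// Hypothetical helper: re-yield chunks from an async iterable, but fail and
// abort if no chunk arrives within stallMs milliseconds.
async function* withStallTimeout(stream, controller, stallMs) {
  const iterator = stream[Symbol.asyncIterator]();
  let completed = false;
  try {
    while (true) {
      let timer;
      const stalled = new Promise((_, reject) => {
        timer = setTimeout(
          () => reject(new Error("Stream stalled for " + stallMs + " ms")),
          stallMs
        );
      });
      let result;
      try {
        // Whichever settles first wins: the next chunk or the stall timer.
        result = await Promise.race([iterator.next(), stalled]);
      } finally {
        clearTimeout(timer);
      }
      if (result.done) {
        completed = true;
        return;
      }
      yield result.value;
    }
  } finally {
    // On a stall, an error, or the consumer stopping early, cancel the request.
    if (!completed) controller.abort();
  }
}
```

To wire this up, create an `AbortController`, pass its `signal` in the SDK's per-request options when calling `client.chat.completions.create`, and iterate `withStallTimeout(stream, controller, 15000)` instead of the raw stream. A Stop button can call `controller.abort()` on the same controller, and the caught error message can be shown to the user.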
Use a non-streaming fallback for batch jobs, retries and clients that cannot consume server-sent events reliably.
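A fallback along these lines could try streaming first and retry once without it. This is a sketch under assumptions: `completeWithFallback` and `onToken` are hypothetical names, the client is any object exposing the OpenAI SDK's `chat.completions.create` shape, and the rule "fall back only if no text has been shown yet" is one possible policy, not a requirement of the gateway.

```javascript
// Hypothetical helper: stream when possible, otherwise request the full completion.
async function completeWithFallback(client, params, onToken = () => {}) {
  let text = "";
  try {
    const stream = await client.chat.completions.create({ ...params, stream: true });
    for await (const chunk of stream) {
      const piece = chunk.choices[0]?.delta?.content ?? "";
      if (piece) {
        text += piece;
        onToken(piece); // incremental UI update
      }
    }
    return text;
  } catch (err) {
    // If partial output was already delivered, do not silently restart; let the caller decide.
    if (text.length > 0) throw err;

    // Nothing was shown yet, so retry once as a plain request/response call.
    const response = await client.chat.completions.create({ ...params, stream: false });
    const full = response.choices[0]?.message?.content ?? "";
    if (full) onToken(full); // deliver the whole answer in one piece
    return full;
  }
}
```

Batch jobs can skip the streaming attempt entirely and call the API with `stream: false`, since they gain nothing from incremental output.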
Start with a terminal smoke test, then wire the same https://www.tken.shop/v1 base URL into your application.