Tuteliq provides a WebSocket-based voice streaming endpoint that transcribes audio in real time and emits safety alerts as they are detected. This allows you to moderate voice chat, calls, and other live audio without waiting for the full recording to finish.

Endpoint

wss://api.tuteliq.ai/safety/voice/stream?token=YOUR_API_KEY
Authentication is handled via the token query parameter. The connection will be rejected with a 4001 close code if the key is invalid or expired.
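
As a minimal sketch of the authentication step, the helper below builds the endpoint URL with the key in the token query parameter and recognizes the 4001 close code. The names `buildStreamUrl` and `isAuthFailure` are hypothetical and not part of any Tuteliq SDK; only the URL and the close code come from this page.

```javascript
// Hypothetical helper (not a Tuteliq SDK function): builds the streaming URL
// with the API key passed as the documented `token` query parameter.
function buildStreamUrl(apiKey) {
  if (typeof apiKey !== "string" || apiKey.length === 0) {
    throw new TypeError("apiKey must be a non-empty string");
  }
  const url = new URL("wss://api.tuteliq.ai/safety/voice/stream");
  // searchParams.set() escapes reserved characters such as "&" and "=".
  url.searchParams.set("token", apiKey);
  return url.toString();
}

// 4001 is the documented close code for an invalid or expired key.
function isAuthFailure(closeCode) {
  return closeCode === 4001;
}
```

Building the URL with `URL` rather than string concatenation keeps a key that contains reserved characters from corrupting the query string.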

Audio Format

Send audio data as binary WebSocket frames. The recommended format is:

| Parameter | Value |
| --- | --- |
| Encoding | PCM 16-bit LE |
| Sample Rate | 16 kHz |
| Channels | 1 (mono) |
| Chunk Size | 4096–32768 bytes |

Other sample rates (8 kHz, 44.1 kHz, 48 kHz) are accepted but will be resampled server-side, which adds latency. 16 kHz mono gives the best balance of accuracy and speed.
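
The recommended format can be produced from floating-point samples (for example, the output of an audio capture API) roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the input is assumed to be mono samples in the range -1 to 1 already at 16 kHz, and the final chunk of a buffer may be shorter than the recommended minimum.

```javascript
const BYTES_PER_SAMPLE = 2;   // 16-bit PCM
const SAMPLE_RATE_HZ = 16000; // recommended rate from the table above

// Convert float samples in [-1, 1] to signed 16-bit little-endian PCM.
function floatToPcm16le(samples) {
  const out = Buffer.alloc(samples.length * BYTES_PER_SAMPLE);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    out.writeInt16LE(Math.round(value), i * BYTES_PER_SAMPLE);
  }
  return out;
}

// Split a PCM buffer into chunks within the recommended 4096-32768 byte range.
// The last chunk holds whatever remains and may be smaller.
function splitIntoChunks(pcm, chunkBytes = 4096) {
  if (chunkBytes < 4096 || chunkBytes > 32768 || chunkBytes % BYTES_PER_SAMPLE !== 0) {
    throw new RangeError("chunkBytes must be an even number between 4096 and 32768");
  }
  const chunks = [];
  for (let offset = 0; offset < pcm.length; offset += chunkBytes) {
    chunks.push(pcm.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}

// Audio duration covered by one chunk at 16 kHz, 16-bit mono.
function chunkDurationMs(chunkBytes) {
  return ((chunkBytes / BYTES_PER_SAMPLE) * 1000) / SAMPLE_RATE_HZ;
}
```

At the recommended format a 4096-byte chunk carries 128 ms of audio and a 32768-byte chunk carries 1024 ms, so smaller chunks reduce the delay before the server sees each sample.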

Connection Lifecycle

1. Connect: Open a WebSocket connection to the streaming endpoint with your API key.
2. Configure (optional): Send a JSON text frame to adjust settings before streaming audio.
3. Send Audio: Stream binary audio chunks continuously. The server begins transcription and analysis immediately.
4. Receive Alerts: The server sends JSON text frames containing partial transcriptions and safety alerts as they are detected.
5. Close: Close the connection normally. The server will flush any remaining audio and send a final summary frame.

Configuration

After connecting, you can send a JSON text frame to configure the session:
{
  "type": "config",
  "flush_interval_ms": 2000,
  "categories": ["grooming", "bullying", "self_harm", "substance", "sexual_content"],
  "language": "en",
  "min_severity": "medium"
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| flush_interval_ms | number | 3000 | How often (in ms) the server emits transcription results. Lower values give faster feedback but may be less accurate. |
| categories | string[] | all | Safety categories to monitor. Omit to enable all. |
| language | string | "en" | Language hint for transcription. |
| min_severity | string | "low" | Minimum severity level to trigger alerts (low, medium, high, critical). |
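
One way to assemble the config frame is a small builder that validates values on the client before sending. This is only a sketch: `buildConfigFrame` and its camelCase option names are hypothetical, and the checks reflect only what the field descriptions above state (for instance, the accepted range of `flush_interval_ms` is not documented, so the sketch only requires a positive number).

```javascript
// Severity levels listed in the min_severity description above.
const SEVERITIES = ["low", "medium", "high", "critical"];

// Hypothetical builder: returns the JSON text to send as the config frame.
// Fields that are omitted are left out so the server defaults apply.
function buildConfigFrame(options = {}) {
  const frame = { type: "config" };

  if (options.flushIntervalMs !== undefined) {
    if (!Number.isFinite(options.flushIntervalMs) || options.flushIntervalMs <= 0) {
      throw new RangeError("flushIntervalMs must be a positive number");
    }
    frame.flush_interval_ms = options.flushIntervalMs;
  }

  if (options.categories !== undefined) {
    const valid = Array.isArray(options.categories) &&
      options.categories.every((c) => typeof c === "string");
    if (!valid) throw new TypeError("categories must be an array of strings");
    frame.categories = [...options.categories];
  }

  if (options.language !== undefined) {
    frame.language = String(options.language);
  }

  if (options.minSeverity !== undefined) {
    if (!SEVERITIES.includes(options.minSeverity)) {
      throw new RangeError(`minSeverity must be one of: ${SEVERITIES.join(", ")}`);
    }
    frame.min_severity = options.minSeverity;
  }

  return JSON.stringify(frame);
}
```

The result would be sent as a text frame, for example `ws.send(buildConfigFrame({ minSeverity: "medium" }))`, before the first audio chunk.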

Server Messages

Transcription Frame

{
  "type": "transcription",
  "text": "hey do you want to come over to my place after school",
  "is_partial": false,
  "timestamp_ms": 14200
}

Safety Alert Frame

{
  "type": "alert",
  "category": "grooming",
  "severity": "high",
  "risk_score": 0.87,
  "text": "hey do you want to come over to my place after school",
  "description": "Potential grooming pattern detected: private meeting solicitation directed at a minor.",
  "timestamp_ms": 14200
}

Session Summary Frame

Sent when the connection closes:
{
  "type": "summary",
  "duration_ms": 62000,
  "alerts_count": 2,
  "highest_severity": "high",
  "categories_flagged": ["grooming"],
  "transcript_length": 347
}
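
The three frame types above can be routed with a small parser that tolerates malformed or unrecognized frames. The sketch below is illustrative: `parseServerFrame` and `meetsSeverity` are hypothetical names, and the severity ordering (low < medium < high < critical) is inferred from the order in which the levels are listed for `min_severity`.

```javascript
// Assumed ordering, inferred from the min_severity field description.
const SEVERITY_RANK = { low: 0, medium: 1, high: 2, critical: 3 };

// Parse one text frame and classify it by its documented "type" field.
// Returns { kind, frame } where kind is "transcription", "alert", "summary",
// "unknown" (valid JSON, unrecognized type), or "invalid" (not a JSON object).
function parseServerFrame(raw) {
  let frame;
  try {
    frame = JSON.parse(typeof raw === "string" ? raw : raw.toString("utf8"));
  } catch {
    return { kind: "invalid" };
  }
  if (frame === null || typeof frame !== "object") {
    return { kind: "invalid" };
  }
  switch (frame.type) {
    case "transcription":
    case "alert":
    case "summary":
      return { kind: frame.type, frame };
    default:
      return { kind: "unknown", frame };
  }
}

// Client-side check: is this alert at or above the given severity?
// Unrecognized severity strings compare as false.
function meetsSeverity(alert, threshold) {
  return SEVERITY_RANK[alert.severity] >= SEVERITY_RANK[threshold];
}
```

A handler might call `parseServerFrame(data)` inside the `message` event and escalate only alerts for which `meetsSeverity(frame, "high")` is true, independently of the `min_severity` configured on the server.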

Code Example

import WebSocket from "ws";

const ws = new WebSocket(
  "wss://api.tuteliq.ai/safety/voice/stream?token=YOUR_API_KEY"
);

ws.on("open", () => {
  // Optional: configure the session
  ws.send(JSON.stringify({
    type: "config",
    flush_interval_ms: 2000,
    categories: ["grooming", "bullying", "self_harm"],
    min_severity: "medium",
  }));

  // Stream audio from a source (e.g., microphone, file, or RTC track)
  const audioStream = getAudioStream(); // your PCM 16-bit 16kHz mono source
  audioStream.on("data", (chunk) => {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(chunk);
    }
  });

  audioStream.on("end", () => {
    ws.close(1000, "stream_complete");
  });
});

ws.on("message", (data) => {
  const message = JSON.parse(data.toString());

  if (message.type === "alert") {
    console.log(
      `[${message.severity.toUpperCase()}] ${message.category}: ${message.description}`
    );
    // Trigger your moderation workflow here
  }

  if (message.type === "transcription" && !message.is_partial) {
    console.log(`Transcript: ${message.text}`);
  }

  if (message.type === "summary") {
    console.log(`Session ended. Alerts: ${message.alerts_count}`);
  }
});

ws.on("close", (code, reason) => {
  console.log(`Connection closed: ${code} ${reason}`);
});

ws.on("error", (err) => {
  console.error("WebSocket error:", err.message);
});

Close Codes

| Code | Meaning |
| --- | --- |
| 1000 | Normal closure |
| 4001 | Authentication failed |
| 4002 | Rate limit exceeded |
| 4003 | Invalid audio format |
| 4008 | Session duration limit reached |
| 4500 | Internal server error |

Voice streaming sessions have a maximum duration of 10 minutes on the free tier and 60 minutes on paid tiers. The server will send a summary frame and close the connection with code 4008 when the limit is reached.
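
A client can map the close codes above to a reconnect decision. The code meanings in this sketch come from the table; the retry policy and backoff values are assumptions chosen for illustration, not documented Tuteliq behavior, and all function names are hypothetical.

```javascript
// Close codes and meanings as listed in the table above.
const CLOSE_CODES = new Map([
  [1000, "Normal closure"],
  [4001, "Authentication failed"],
  [4002, "Rate limit exceeded"],
  [4003, "Invalid audio format"],
  [4008, "Session duration limit reached"],
  [4500, "Internal server error"],
]);

function describeClose(code) {
  return CLOSE_CODES.get(code) ?? `Unrecognized close code ${code}`;
}

// ASSUMED policy (not from the Tuteliq docs): reconnect after rate limiting,
// a session duration limit, or a server error. Do not reconnect after a
// normal closure, or after failures that need a fix first (bad key, bad audio).
const RETRYABLE_CODES = new Set([4002, 4008, 4500]);

function shouldReconnect(code) {
  return RETRYABLE_CODES.has(code);
}

// Exponential backoff with a cap; attempt starts at 0.
// The 1 s base and 30 s cap are illustrative defaults.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

In a `close` handler this could read as: log `describeClose(code)`, and if `shouldReconnect(code)` is true, wait `backoffDelayMs(attempt)` before opening a new connection.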