Quick-Start Guide for VAD

Important Clarification

⚠️ Connection Protocol: This project uses TCP, not WebSocket.
VAD is built on top of the existing TCP protocol to provide automatic voice-activity detection.

Overview

Version 2.5 of the TCP protocol introduces VAD (Voice Activity Detection).
The server automatically detects when the user starts and stops speaking, which allows hands-free voice interaction without explicit end-of-speech signals from the client.

Quick Start

1. Basic Integration (Recommended for New Users)

2. Upgrade for Existing Users

Simply add the mode:auto parameter to your current authentication message:
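As a minimal sketch of this upgrade, assume the authentication message is a JSON object. The `type` and `token` fields below are hypothetical placeholders, not taken from the protocol specification; only the `mode: auto` parameter comes from this guide.

```python
import json


def add_auto_mode(auth_message: dict) -> dict:
    """Return a copy of an existing auth message with VAD auto mode enabled."""
    upgraded = dict(auth_message)  # copy so the legacy message is left untouched
    upgraded["mode"] = "auto"      # the only change this guide asks for
    return upgraded


# HYPOTHETICAL legacy message: substitute the fields your client already sends.
legacy_auth = {"type": "auth", "token": "YOUR_TOKEN"}

auto_auth = add_auto_mode(legacy_auth)

# ASSUMPTION: the message is serialized as UTF-8 JSON before being written to the socket.
payload = json.dumps(auto_auth).encode("utf-8")
```

Keeping the change in one helper makes it easy to fall back to manual mode by simply not calling it.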

Key Changes:

- No need to send END_FRAME
- Listen for the server's LISTEN messages
- Audio can be sent continuously

Mode Comparison

| Feature | Manual Mode (Legacy) | Auto Mode (New) |
|---|---|---|
| End-of-speech detection | Client sends `END_FRAME` | Server VAD auto-detects |
| Audio format requirement | PCM or Opus | Opus required |
| Interaction style | Push-to-talk | Hands-free |
| Noise handling | None | Auto-filtered |
| Compatibility | Fully backward-compatible | Requires client adaptation |

Step-by-Step Integration

Step 1: Audit Your Current Implementation

Step 2: Update Authentication Logic

Step 3: Handle LISTEN Messages
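Because TCP is a byte stream, a client usually has to reassemble complete messages before it can react to LISTEN. The sketch below shows one way to do that. The wire format is an assumption: it supposes one JSON object per line with a hypothetical `type` field set to `LISTEN` and a hypothetical `state` field; check the protocol specification for the real layout.

```python
import json
from typing import Callable


class ListenDispatcher:
    """Accumulates TCP bytes and invokes a callback for each LISTEN message."""

    def __init__(self, on_listen: Callable[[dict], None]) -> None:
        self._buffer = b""
        self._on_listen = on_listen

    def feed(self, data: bytes) -> None:
        """Add freshly received bytes and dispatch every complete message."""
        self._buffer += data
        # ASSUMPTION: messages are newline-delimited JSON (not confirmed by this guide).
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            if not line.strip():
                continue  # skip blank lines
            message = json.loads(line)
            # HYPOTHETICAL field name and value for identifying LISTEN messages.
            if message.get("type") == "LISTEN":
                self._on_listen(message)


# Usage: collect LISTEN events; a message split across two reads is still handled.
events = []
dispatcher = ListenDispatcher(events.append)
dispatcher.feed(b'{"type": "LISTEN", "state": "start"}\n{"type": "LIS')
dispatcher.feed(b'TEN", "state": "stop"}\n')
```

In a real client, `feed` would be called with whatever `recv` returns, and the callback would update the UI or the recording state.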

Step 4: Modify Audio-Sending Logic
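In auto mode, audio is sent continuously and the client no longer sends END_FRAME. The following sketch illustrates that shape of sending loop. The 2-byte length prefix is a hypothetical framing chosen for illustration only; this guide does not specify how audio frames are delimited on the wire.

```python
import struct
from typing import Iterable, Protocol


class Sender(Protocol):
    """Anything with a socket-like sendall method."""

    def sendall(self, data: bytes) -> None: ...


def stream_audio(sock: Sender, frames: Iterable[bytes]) -> int:
    """Send encoded audio frames back to back and return how many were sent."""
    sent = 0
    for frame in frames:
        # ASSUMPTION: each frame is preceded by a 2-byte big-endian length.
        sock.sendall(struct.pack(">H", len(frame)) + frame)
        sent += 1
    # Auto mode: nothing is sent here. The server's VAD decides when speech ended,
    # so the END_FRAME that manual mode required is intentionally absent.
    return sent
```

A connected `socket.socket` satisfies `Sender`, and `frames` can be any iterable, for example a generator that yields encoder output as it becomes available.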

Best Practices

1. UI Design

2. Audio-Quality Optimization

3. Force-Stop a Dialogue

4. Error-Recovery Mechanism
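One common recovery pattern for a dropped TCP connection is to reconnect with exponential backoff. The sketch below is a generic illustration rather than this project's prescribed mechanism; the delay values are arbitrary defaults, and after a successful reconnect the client would presumably need to authenticate again.

```python
import socket
import time
from typing import Callable, Sequence


def backoff_delays(base: float = 0.5, factor: float = 2.0,
                   cap: float = 30.0, attempts: int = 6) -> list:
    """Delays in seconds that grow geometrically and never exceed `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def connect_with_retry(host: str, port: int, delays: Sequence[float],
                       connect: Callable = socket.create_connection,
                       sleep: Callable[[float], None] = time.sleep):
    """Try to connect once per delay value, sleeping after each failed attempt."""
    last_error = None
    for delay in delays:
        try:
            return connect((host, port), timeout=5.0)
        except OSError as exc:
            last_error = exc
            sleep(delay)  # wait before the next attempt
    raise ConnectionError(f"could not reach {host}:{port}") from last_error
```

The `connect` and `sleep` parameters exist only so the function can be exercised without a real network; production code can rely on the defaults.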

Performance-Tuning Tips

1. Audio-Buffer Management
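A bounded buffer is one simple way to keep latency from growing when the network is slower than the microphone. The sketch below drops the oldest frames once the buffer is full; the class name and the default capacity of 50 frames are illustrative assumptions, not values from this project.

```python
from collections import deque


class FrameBuffer:
    """Fixed-capacity queue of audio frames that discards the oldest on overflow."""

    def __init__(self, max_frames: int = 50) -> None:
        self._frames = deque(maxlen=max_frames)
        self.dropped = 0  # number of frames discarded because the buffer was full

    def push(self, frame: bytes) -> None:
        if len(self._frames) == self._frames.maxlen:
            self.dropped += 1  # deque will evict the oldest frame on append
        self._frames.append(frame)

    def pop_all(self) -> list:
        """Return all buffered frames in arrival order and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames
```

Dropping old audio favors low latency over completeness, which usually suits live conversation; a client that must not lose audio would need a different policy.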

2. Network Optimization
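For small, frequent audio frames over TCP, a commonly used tweak is to disable Nagle's algorithm so that frames are not held back for coalescing. Whether this project recommends it is an assumption; the sketch only shows the standard socket options involved.

```python
import socket


def configure_low_latency(sock: socket.socket) -> None:
    """Apply socket options often used for latency-sensitive TCP streams."""
    # Send small writes immediately instead of waiting to coalesce them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Let the OS probe idle connections so dead peers are noticed eventually.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
```

The options can be set on a socket before or after it is connected.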

Troubleshooting

Common Issues

VAD not working

- Verify Opus format is used
- Confirm server returns mode “auto”
- Check console for VAD-init errors

Frequent false triggers

- Measure ambient-noise level
- Lower microphone gain
- Contact support to tune VAD thresholds
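To put a number on the ambient-noise level, one option is to record a few seconds of silence and compute its RMS level in dBFS. The sketch assumes raw 16-bit little-endian mono PCM captured before encoding; it does not reflect the thresholds the server's VAD actually uses.

```python
import math
import struct


def rms_dbfs(pcm: bytes) -> float:
    """RMS level of 16-bit little-endian PCM, in dB relative to full scale."""
    count = len(pcm) // 2  # two bytes per sample; a trailing odd byte is ignored
    if count == 0:
        return float("-inf")
    samples = struct.unpack(f"<{count}h", pcm[:count * 2])
    mean_square = sum(s * s for s in samples) / count
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square) / 32768.0)
```

Values closer to 0 dBFS mean a louder signal. Comparing the level of a quiet room with the level of normal speech gives a rough idea of how much headroom the VAD has to work with.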

High latency

- Measure network RTT
- Tune encoder settings
- Consider Manual mode for ultra-low delay
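A rough way to measure network RTT without extra tooling is to time the TCP handshake, which takes about one round trip. The sketch below does that and reports the median over several attempts; the host and port are whatever your client already connects to, and name resolution time is included if a hostname is used.

```python
import socket
import statistics
import time


def measure_connect_rtt(host: str, port: int,
                        samples: int = 5, timeout: float = 3.0) -> float:
    """Median time in milliseconds to complete a TCP connect to host:port."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # The connection is closed again as soon as the handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

If the median is already a large share of the delay users perceive, tuning the encoder will help less than moving closer to the server or improving the network path.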

Debug Tips

Summary

VAD delivers a more natural voice-interaction experience.
Keep these essentials in mind:

- Auto mode supports both PCM and Opus (Opus recommended for better performance)
- Listen for the server’s LISTEN messages
- No END_FRAME required
- Implement robust error handling and reconnection logic

For further assistance, please contact technical support.
