Settings
Several aspects of the Whisper pipeline can be customized from the settings on the subsystem.
| Setting | Default | Description |
|---|---|---|
| SmartTurnEnabled | false | Set to true to enable endpoint detection. During realtime audio processing, this attempts to detect whether the user has finished speaking and waits for the speaker to complete their phrase. |
| NoiseSuppressionEnabled | false | Set to true to isolate the user's voice from background noise. This can improve voice encoding accuracy in noisy environments, but may decrease transcription quality, especially with smaller models. |
| AttenuationLimit | 0 | The intensity of noise suppression. Higher values are more aggressive, with 0 being the maximum. |
| VADThreshold | 0.5 | The threshold for detecting voice activity. The default of 0.5 is sufficient for most use cases. |
| EndpointThreshold | 0.5 | The threshold for detecting end of speech when using Smart Turn. |
| SegmentCheckSize | 60 | The number of samples checked when comparing against an interrupting voice. Lower values give more responsive interrupts but may be less accurate. |
| InterruptThreshold | 0.75 | The similarity threshold for interrupting voice comparison. |
| InterruptDelay | 1 | The time in seconds to wait after an interrupt is detected before processing audio again. This helps prevent spillover audio from being processed. |
| StopChunks | 24 | The number of audio chunks to check for voice activity. Lower values shorten the delay but may cut off speech early. |
| SimilarityThreshold | 0.7 | Threshold used for voice encoding similarity checks. |
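
To make the interaction between `VADThreshold` and `StopChunks` more concrete, here is a minimal sketch in plain C++. It is not the plugin's API: `WhisperSettingsSketch` and `EndOfSpeechSketch` are hypothetical names, the field types are guesses based on the defaults in the table, and the end-of-speech rule (a run of `StopChunks` consecutive chunks below `VADThreshold`) is an assumption about how such a pipeline might behave.

```cpp
#include <cassert>

// Hypothetical mirror of the settings table. Field names and defaults follow
// the table; the struct and the field types are assumptions, not plugin API.
struct WhisperSettingsSketch {
    bool  SmartTurnEnabled        = false;
    bool  NoiseSuppressionEnabled = false;
    float AttenuationLimit        = 0.0f;
    float VADThreshold            = 0.5f;
    float EndpointThreshold       = 0.5f;
    int   SegmentCheckSize        = 60;
    float InterruptThreshold      = 0.75f;
    float InterruptDelay          = 1.0f;  // seconds
    int   StopChunks              = 24;
    float SimilarityThreshold     = 0.7f;
};

// Assumed behavior: speech is treated as finished once StopChunks
// consecutive chunks score below VADThreshold.
class EndOfSpeechSketch {
public:
    explicit EndOfSpeechSketch(const WhisperSettingsSketch& settings)
        : threshold_(settings.VADThreshold), stopChunks_(settings.StopChunks) {
        assert(stopChunks_ > 0);
    }

    // Feed the voice probability of one audio chunk. Returns true once the
    // current run of silent chunks has reached StopChunks.
    bool Push(float voiceProbability) {
        if (voiceProbability >= threshold_) {
            silentRun_ = 0;  // voice detected, restart the silent run
        } else {
            ++silentRun_;
        }
        return silentRun_ >= stopChunks_;
    }

private:
    float threshold_;
    int   stopChunks_;
    int   silentRun_ = 0;
};
```

Under these assumptions, lowering `StopChunks` makes `Push` report end of speech after fewer silent chunks, which matches the trade-off described in the table: a shorter delay, but a higher chance of cutting off a speaker who pauses mid-phrase.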