Web STT: speaker diarization via pyannote; whisper_stt snapshot validation
- Add app/diarize.py: merge faster-whisper segments with pyannote (A/B/C)
- Wire /api/jobs and /api/transcribe; job API returns speaker_diarization, diarize_skip_reason
- UI: meta line shows diarization applied/skipped; hint for models path
- requirements.txt: pyannote.audio; README APP_DIARIZE / APP_PYANNOTE_MODEL_DIR
- whisper_stt.py: validate config.yaml before loading pipeline
- requirements-whisper-stt.txt: minor doc updates if any

Made-with: Cursor
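The diff shown here does not include app/diarize.py, so the following is only a minimal sketch of how transcript segments might be merged with diarization turns. It assumes each segment is given the speaker whose turn overlaps it most, and that "A/B/C" means letters assigned in order of first appearance. `Turn`, `Segment`, `speaker_letters` and `assign_speakers` are hypothetical names for illustration, not the commit's code and not part of the faster-whisper or pyannote APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Turn:
    """One diarization turn: a time span attributed to a raw speaker label."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """One transcript segment: a time span with recognized text."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def speaker_letters(turns: List[Turn]) -> Dict[str, str]:
    """Map raw labels to A, B, C, ... in order of first appearance."""
    letters: Dict[str, str] = {}
    for turn in sorted(turns, key=lambda t: t.start):
        if turn.speaker not in letters:
            n = len(letters)
            # Fall back to S27, S28, ... if there are more than 26 speakers.
            letters[turn.speaker] = chr(ord("A") + n) if n < 26 else f"S{n + 1}"
    return letters


def assign_speakers(
    segments: List[Segment], turns: List[Turn]
) -> List[Tuple[Optional[str], str]]:
    """Give each segment the letter of the turn it overlaps most, or None."""
    letters = speaker_letters(turns)
    result: List[Tuple[Optional[str], str]] = []
    for seg in segments:
        best = max(
            turns,
            key=lambda t: overlap(seg.start, seg.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(seg.start, seg.end, best.start, best.end) == 0.0:
            result.append((None, seg.text))  # no turn covers this segment
        else:
            result.append((letters[best.speaker], seg.text))
    return result
```

A segment with no overlapping turn keeps a `None` speaker rather than being guessed, which lets the caller decide how to render unattributed text.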
@@ -314,7 +314,8 @@
 
 <div class="hint">
 - 허용: mp3, m4a, wav, mp4, aac, ogg, flac, webm<br />
-- 첫 실행 시 Whisper 모델 다운로드로 시간이 걸릴 수 있습니다.
+- 첫 실행 시 Whisper 모델 다운로드로 시간이 걸릴 수 있습니다.<br />
+- 완료 후 pyannote로 화자 구분을 시도합니다 (<code>models/pyannote-diarization-3.1</code> 필요).
 </div>
 
 <div class="progress">
@@ -613,7 +614,10 @@
 const lang = body.detected_language ? `${body.detected_language}` : "-";
 const prob = typeof body.language_probability === "number" ? body.language_probability.toFixed(3) : "-";
 const dur = typeof body.duration_sec === "number" ? `${body.duration_sec.toFixed(1)}s` : "-";
-metaEl.textContent = `감지 언어: ${lang} (p=${prob}), 오디오 길이: ${dur}`;
+let diarizeMeta = "";
+if (body.speaker_diarization === true) diarizeMeta = " · 화자 구분: 적용";
+else if (body.diarize_skip_reason) diarizeMeta = " · 화자 구분: 생략";
+metaEl.textContent = `감지 언어: ${lang} (p=${prob}), 오디오 길이: ${dur}${diarizeMeta}`;
 
 if (startedAt) {
 timingEl.textContent = `${((performance.now() - startedAt) / 1000).toFixed(2)}s`;