Speaker Diarization: Optimal Clustering and Learning Speaker Embeddings

Speaker diarization consist of automatically partitioning an input audio stream into homogeneous segments (segmentation) and assigning these segments to the same speaker (speaker clustering). This process can allow to enhance the readability by structuring an audio document, or provide the speaker’s true identity when it’s used in conjunction with speaker recognition system. In this seminar I will talk about two new methods: ILP Clustering and Speaker embeddings. In speaker clustering, a major problem wit

4 views

425

126