Audio Transcript Classification System

I&T Solution

Audio Transcript Classification System
(REF: S-1134)

Solution Feature

A call recording is sent to speech-to-text (STT) software installed on premises.
The STT software transcribes the call recording into Chinese on the first pass, then transcribes the same audio file into English on the second pass.
The Chinese words and English words are combined to form one set of words.
The combined set of words is sent to a text classifier, such as a Naive Bayes text classifier, for classification into a complaint or an appreciation.
Assumptions: a single call recording may contain both Chinese and English words. Data will be collected to train the text classifier. Chinese may be Cantonese, Mandarin, or both.
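
The steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the provider's implementation: it assumes the STT software already returns segmented word lists for each pass, and the names `combine_transcripts` and `NaiveBayes`, the toy training words, and the smoothing value `alpha=1.0` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def combine_transcripts(chinese_words, english_words):
    """Merge the words from the Chinese pass and the English pass into one list."""
    return list(chinese_words) + list(english_words)


class NaiveBayes:
    """Multinomial Naive Bayes over word counts, with Laplace smoothing (assumed)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.doc_counts = Counter()               # documents seen per label
        self.word_counts = defaultdict(Counter)   # word frequencies per label
        self.vocab = set()

    def fit(self, docs, labels):
        for words, label in zip(docs, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, words):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy annotated data (invented for illustration): (Chinese words, English words, label)
train = [
    (["服務", "好差", "投訴"], ["complaint", "refund"], "complaint"),
    (["等", "好耐"], ["slow", "service", "complaint"], "complaint"),
    (["多謝", "幫忙"], ["thank", "you", "helpful"], "appreciation"),
    (["職員", "好好"], ["great", "service", "thank"], "appreciation"),
]
docs = [combine_transcripts(zh, en) for zh, en, _ in train]
labels = [label for _, _, label in train]
clf = NaiveBayes().fit(docs, labels)

prediction = clf.predict(combine_transcripts(["多謝"], ["thank", "you"]))
print(prediction)
```

Because both passes run on the same audio, words that one pass recognises poorly simply add noise to the combined set; the classifier then relies on whichever language carries the stronger signal for that call.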

Trial Application and Expected Outcome

Call recordings are collected and annotated as either ‘complaint’ or ‘appreciation’.
The STT software will be installed and configured on premises, then tested on Chinese and English using sample call recordings.
A (Naive Bayes) text classifier will be trained on transcripts of the annotated call recordings. The classifier will be evaluated using sample transcripts produced from selected call recordings. The transcripts produced will be saved into a database.
A portion of the call recordings will be sampled every week and annotated as new training data, so that the text classifier is updated continuously as newly annotated data become available.
Throughout the trial period, a server will be provided to run the entire pipeline of transcribing call recordings and classifying transcripts.
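
One possible way to store transcripts and fold newly annotated samples into the classifier is sketched below. It is an assumption-laden illustration, not the provider's design: SQLite stands in for the unspecified database, the table layout and the names `open_store`, `save_transcript`, `annotate`, and `CountModel` are hypothetical, and words are assumed to contain no spaces. Since a Naive Bayes classifier is defined by per-label counts, an update only needs to add the counts of transcripts that have been annotated since the last run.

```python
import sqlite3
from collections import Counter, defaultdict


def open_store(path=":memory:"):
    """Open (or create) the transcript store; an in-memory database by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts ("
        "id INTEGER PRIMARY KEY, recording TEXT, words TEXT, label TEXT)"
    )
    return conn


def save_transcript(conn, recording, words, label=None):
    """Save a combined transcript; label stays NULL until someone annotates it."""
    conn.execute(
        "INSERT INTO transcripts (recording, words, label) VALUES (?, ?, ?)",
        (recording, " ".join(words), label),
    )
    conn.commit()


def annotate(conn, recording, label):
    """Record the annotation ('complaint' or 'appreciation') for a sampled call."""
    conn.execute(
        "UPDATE transcripts SET label = ? WHERE recording = ?", (label, recording)
    )
    conn.commit()


class CountModel:
    """Per-label document and word counts, i.e. the statistics Naive Bayes needs."""

    def __init__(self):
        self.doc_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.seen_ids = set()  # rows already folded into the counts

    def update_from(self, conn):
        """Add counts for annotated rows not seen before; return how many were added."""
        rows = conn.execute(
            "SELECT id, words, label FROM transcripts WHERE label IS NOT NULL"
        ).fetchall()
        added = 0
        for row_id, words, label in rows:
            if row_id in self.seen_ids:
                continue
            self.seen_ids.add(row_id)
            self.doc_counts[label] += 1
            self.word_counts[label].update(words.split())
            added += 1
        return added


conn = open_store()
save_transcript(conn, "call-001.wav", ["多謝", "thank", "you"])
save_transcript(conn, "call-002.wav", ["投訴", "refund"])

model = CountModel()
first = model.update_from(conn)    # nothing annotated yet
annotate(conn, "call-001.wav", "appreciation")
second = model.update_from(conn)   # picks up the newly annotated call
third = model.update_from(conn)    # no new annotations since the last run
print(first, second, third)
```

Running `update_from` on a schedule, or right after each annotation is saved, keeps the counts current without retraining from scratch.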

Info on I&T Solution Provider

Solution Provider	:	GIMMICK HOUSE LIMITED
Address	:	Rm 2, 7/F., Block B, Mai Hing Industrial Building, 16-18 Hing Yip Street, Kwun Tong, Kowloon, Hong Kong
Contact Person	:	Alex Mar
Position	:	Director
Tel	:	95120604
Email	:	amar@gimmickhouse.com

For details of the above I&T solution, please contact the I&T solution provider.

I&T Solution - Audio Transcript Classification System 2022-01-20