I&T Solution |
Audio Transcript Classification System
(REF: S-1134) |
Trial Project |
|
Solution Feature |
- A call recording is sent to speech-to-text (STT) software installed on premises.
- The STT software transcribes the call recording into Chinese on the first pass, then transcribes the same audio file into English on the second pass.
- The Chinese words and English words are combined to form one set of words.
- The combined set of words is sent to a text classifier, such as a Naive Bayes text classifier, for classification into a complaint or an appreciation.
- Assumption: a single call recording can contain both Chinese and English words. Chinese can be Cantonese and/or Mandarin. Data will be collected for training the text classifier.
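
The combine-and-classify steps above can be illustrated with a minimal sketch. It is not the provider's implementation: the names `combine_transcripts` and `NaiveBayes` are hypothetical, the sample transcripts are invented for illustration, and the sketch assumes the STT software returns whitespace-separated tokens for both languages. It uses a multinomial Naive Bayes model with add-one (Laplace) smoothing, written with the Python standard library only.

```python
import math
from collections import Counter, defaultdict


def combine_transcripts(chinese_text, english_text):
    """Merge the two STT passes into one list of words.

    Assumption: both transcripts arrive as whitespace-separated tokens.
    """
    return chinese_text.split() + english_text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of calls
        self.vocab = set()

    def train(self, words, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def classify(self, words):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented sample data: (Chinese pass, English pass, annotation).
clf = NaiveBayes()
clf.train(combine_transcripts("服務 很 差", "very bad service"), "complaint")
clf.train(combine_transcripts("多謝 幫忙", "thank you so much"), "appreciation")

label = clf.classify(combine_transcripts("服務 差", "bad"))
print(label)
```

Merging both passes into one bag of words lets a mixed-language call contribute evidence from either language; words that one pass transcribes poorly simply carry little weight once enough training data is collected.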
|
Trial Application and Expected Outcome |
- Call recordings are collected and annotated as either ‘complaint’ or ‘appreciation’.
- The STT software will be installed and configured on premises. It will be tested on Chinese and English using sample call recordings.
- A (Naive Bayes) text classifier will be trained on transcripts of the annotated call recordings. The classifier will be evaluated using sample transcripts produced from selected call recordings. The transcripts produced will be saved to a database.
- A portion of the call recordings will be sampled every week and annotated as training data, so that the text classifier is updated continuously as new annotated data become available.
- Throughout the trial period, a server will be provided to run the entire pipeline of transcribing call recordings and classifying transcripts.
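
One way the weekly update could work is sketched below. A Naive Bayes model is fully described by its word and label counts, so a newly annotated batch can be folded in by adding to those counts, without retraining from scratch. The functions `new_model`, `update`, and `classify` and the weekly batches are hypothetical and invented for illustration; they do not describe the provider's actual update mechanism.

```python
import math
from collections import Counter


def new_model():
    """Empty count tables for a multinomial Naive Bayes model."""
    return {"docs": Counter(), "words": {}, "vocab": set()}


def update(model, batch):
    """Fold a batch of (tokens, label) pairs into the existing counts."""
    for tokens, label in batch:
        model["docs"][label] += 1
        model["words"].setdefault(label, Counter()).update(tokens)
        model["vocab"].update(tokens)


def classify(model, tokens):
    """Return the label with the highest smoothed log posterior."""
    n_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])

    def score(label):
        counts = model["words"][label]
        total = sum(counts.values())
        prior = math.log(model["docs"][label] / n_docs)
        likelihood = sum(
            math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens
        )
        return prior + likelihood

    return max(model["docs"], key=score)


model = new_model()

# Week 1: initial annotated sample (invented data).
week1 = [
    ("slow rude service".split(), "complaint"),
    ("helpful staff thanks".split(), "appreciation"),
]
update(model, week1)

# Week 2: a newly annotated mixed-language call is added to the same counts.
week2 = [("退款 太 慢 refund delayed".split(), "complaint")]
update(model, week2)

result = classify(model, "refund 太 慢".split())
print(result)
```

Because each update only increments counters, the cost of a weekly refresh is proportional to the size of the new batch rather than to the whole training set.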
|
Info on I&T Solution Provider |
Solution Provider | : | GIMMICK HOUSE LIMITED |
Address | : | Rm 2, 7/F., Block B, Mai Hing Industrial Building, 16-18 Hing Yip Street, Kwun Tong, Kowloon, Hong Kong |
Contact Person | : | Alex Mar |
Position | : | Director |
Tel | : | 95120604 |
Email | : | amar@gimmickhouse.com |
|