Speech Recognition
- An Overview of Multi-Task Learning in Speech Recognition
- My INTERSPEECH Schedule
- Kaldi Troubleshooting Head-to-Toe
- Kaldi Hyperparameter Cheatsheet
- Kaldi nnet3 notes
- Kaldi on AWS
- Josh's Speaker ID Challenge
- Seminal Papers in ASR
- How to use an Existing DNN Recognizer for Decoding in Kaldi
- How to Visualize a Word Lattice with Kaldi
- How to Train a Deep Neural Net Acoustic Model with Kaldi
- How to use an Existing GMM Recognizer for Decoding in Kaldi
- Some Kaldi Notes
- CMU-Sphinx Cheatsheet
- Installing Kaldi
- The CMU-Sphinx Speech Recognition Toolkit: First Steps
Speech Synthesis
- Create New Voice with Ossian & Merlin
- Getting started with the Merlin Speech Synthesis Toolkit
- Let's make a Chuvash voice! : Moscow Higher School of Economics Speech Synthesis Workshop
- eSpeak NG Notes
- How to Add a Language to eSpeak NG
Machine Learning
- How to Train practically any Model from practically any Data with TensorFlow
- Maximum Likelihood Estimation of Gaussian Parameters
- A TensorFlow Tutorial: Email Classification
Miscellaneous
- How we added Kyrgyz to Mozilla's Common Voice project
- Some Linux Text Processing Notes
- A List of Other Blogs
- Some SoX(I) Notes
- Soricut & Och (2015): Unsupervised Morphology Induction Using Word Embeddings
- Installing Praat on Ubuntu - getting sound to work
Downloads
You can download an NVDA installer with Kyrgyz language support here. You should be able to install the program by double-clicking the file and following the directions. To turn Kyrgyz language support on or off, navigate to "Voices" under "Settings" after you've installed the program. This project was conducted with Empower Blind People, a non-profit organization for blind people in Kyrgyzstan. Any feedback on the Kyrgyz support (accent, translation, errors, etc.) is welcome!
Lectures & Talks
Practical AI 104: Speech tech and Common Voice at Mozilla – Listen on Changelog.com
News about the Hakha Chin language being added to Mozilla's Common Voice. The project was spearheaded by Peng Hlei Thang and the Linguistics Department at Indiana University Bloomington.
Interview during the Week of Young International Scientific Talents (Semaine des jeunes talents scientifiques internationaux):
Here's another interview from the same week at France Inter.
Here are a couple of videos about our speech synthesis project for the Kyrgyz language. The project was done in collaboration with Empower Blind People to create a Kyrgyz speech synthesizer for use in the open-source project NVDA.
Foreign programmers on the program that helps blind people read text in the Kyrgyz language.
Published by Kloop on October 5, 2016.