Welcome to my little corner of the internet. I'm Josh. I've been working on AI and language systems since 2011. A few things I've done: co-founded a voice cloning startup, released open-source models with 9M+ monthly downloads, led applied AI teams, shipped LLM and multimodal SaaS and consumer products, shipped AI hardware, earned a PhD in automatic speech recognition, and traveled a lot. Currently I'm a founding engineer at Veris AI, building simulation sandboxes for LLM agents.

Speech Recognition

Mar 26, 2026
Replacing Apple Dictation with Moonshine Flow (local)
Mar 21, 2020
An Overview of Multi-Task Learning in Speech Recognition
Aug 17, 2019
My INTERSPEECH Schedule
Aug 17, 2019
Kaldi Troubleshooting Head-to-Toe
Aug 17, 2019
Kaldi Hyperparameter Cheatsheet
Nov 9, 2017
Kaldi nnet3 notes
Oct 13, 2017
Kaldi on AWS
Sep 29, 2017
Josh's Speaker ID Challenge
Apr 5, 2017
Seminal Papers in ASR
Jan 10, 2017
How to use an Existing DNN Recognizer for Decoding in Kaldi
Dec 15, 2016
How to Visualize a Word Lattice with Kaldi
Dec 15, 2016
How to Train a Deep Neural Net Acoustic Model with Kaldi
Sep 12, 2016
How to use an Existing GMM Recognizer for Decoding in Kaldi
Feb 1, 2016
Some Kaldi Notes
Jan 27, 2016
CMU-Sphinx Cheatsheet
Jan 26, 2016
Installing Kaldi
Jan 9, 2016
The CMU-Sphinx Speech Recognition Toolkit: First Steps

Speech Synthesis

Sep 15, 2017
Create New Voice with Ossian & Merlin
Feb 14, 2017
Getting started with the Merlin Speech Synthesis Toolkit
Dec 9, 2016
Let's make a Chuvash voice! : Moscow Higher School of Economics Speech Synthesis Workshop
Sep 19, 2016
eSpeak NG Notes
Jul 3, 2016
How to Add a Language to eSpeak NG

Machine Learning

May 29, 2019
How to Train practically any Model from practically any Data with TensorFlow
Aug 18, 2017
Maximum Likelihood Estimation of Gaussian Parameters
Feb 1, 2016
A TensorFlow Tutorial: Email Classification

Miscellaneous

Jul 30, 2026
What happened to Coqui?
Mar 9, 2026
Displaying Images in Claude Code
Feb 21, 2026
Watercolor Shaders for Ghostty
May 29, 2019
How we added Kyrgyz to Mozilla's Common Voice project
Mar 2, 2019
Some Linux Text Processing Notes
Feb 14, 2017
A List of Other Blogs
Aug 2, 2016
Some SoX(I) Notes
Jan 21, 2016
Soricut & Och (2015): Unsupervised Morphology Induction Using Word Embeddings
Jan 20, 2016
Installing Praat on Ubuntu - getting sound to work