A Node.js wrapper around the CMU Pronouncing Dictionary


cmudict is a basic wrapper around the CMU Pronouncing Dictionary. Its purpose is to extract the phonemes of a given word so that you can perform linguistic operations such as syllable counting and rhyming. Note that the dictionary is finite: although it is large, some words are missing from it. In that case you will have to improvise.


var CMUDict = require('cmudict').CMUDict;
var cmudict = new CMUDict();
var phoneme_str = cmudict.get('prosaic'); // 'P R OW0 Z EY1 IH0 K'

Counting syllables and determining the end rhyme are exercises left to the reader, as there are a number of ways to do each (see: Word Hy-phen-a-tion by Com-put-er).
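As one possible starting point, here is a minimal sketch of both operations. It relies on the dictionary's convention that vowel phonemes end in a stress digit (0, 1 or 2), as in the 'prosaic' example above. The helper names countSyllables and rhymePart are hypothetical and not part of cmudict, and "everything from the last primary-stressed vowel onward" is only one of several reasonable definitions of an end rhyme.

```javascript
// Hypothetical helpers; these are NOT part of the cmudict API.

// Split a phoneme string such as 'P R OW0 Z EY1 IH0 K' into phonemes.
function splitPhonemes(phonemeStr) {
  return phonemeStr.trim().split(/\s+/);
}

// Each vowel phoneme carries a trailing stress digit, and each vowel
// corresponds to one syllable, so counting those phonemes counts syllables.
function countSyllables(phonemeStr) {
  return splitPhonemes(phonemeStr).filter(function (p) {
    return /\d$/.test(p);
  }).length;
}

// Assumed rhyme definition: the phonemes from the last primary-stressed
// vowel (digit 1) to the end of the word. If no vowel has primary stress,
// fall back to the last vowel of any stress level.
function rhymePart(phonemeStr) {
  var phonemes = splitPhonemes(phonemeStr);
  var i;
  for (i = phonemes.length - 1; i >= 0; i--) {
    if (/1$/.test(phonemes[i])) return phonemes.slice(i).join(' ');
  }
  for (i = phonemes.length - 1; i >= 0; i--) {
    if (/\d$/.test(phonemes[i])) return phonemes.slice(i).join(' ');
  }
  return phonemes.join(' ');
}

var prosaic = 'P R OW0 Z EY1 IH0 K';
console.log(countSyllables(prosaic)); // 3
console.log(rhymePart(prosaic));      // 'EY1 IH0 K'
```

Under this definition, two words rhyme when their rhymePart() results are equal.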


npm install cmudict


The dictionary is a large flat file that is lazily read into memory on the first call to .get(). Reading the file takes about 1 second on average, but from that point on each access is a simple object property lookup. I originally used node-mmap but found the performance of fs.readFileSync() to be comparable.
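The lazy-loading pattern described above can be sketched roughly as follows. This is an illustration, not the module's actual source: the names LazyDict and parseDict are hypothetical, and the sketch assumes the dictionary file has one entry per line, with the word separated from its phonemes by two spaces and comment lines starting with ';;;'. Lowercasing the keys is also an assumption made here for convenience.

```javascript
var fs = require('fs');

// Parse dictionary text into a plain object keyed by lowercased word.
// Assumed line format: 'WORD  PH1 PH2 ...' (two spaces after the word).
function parseDict(text) {
  var entries = Object.create(null); // no inherited keys such as 'constructor'
  text.split('\n').forEach(function (line) {
    if (line.indexOf(';;;') === 0) return; // skip comment lines
    var sep = line.indexOf('  ');
    if (sep === -1) return;                // skip blank or malformed lines
    entries[line.slice(0, sep).toLowerCase()] = line.slice(sep + 2).trim();
  });
  return entries;
}

// Hypothetical wrapper: nothing is read from disk until the first get().
function LazyDict(path) {
  this.path = path;
  this.entries = null;
}

LazyDict.prototype.get = function (word) {
  if (this.entries === null) {
    // One-time synchronous read; every later call is a property lookup.
    this.entries = parseDict(fs.readFileSync(this.path, 'utf8'));
  }
  return this.entries[word.toLowerCase()]; // undefined if the word is absent
};
```

Constructing a LazyDict costs almost nothing; the whole read and parse cost is paid once, on the first lookup.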


The Node.js code was written by Nathaniel K Smith. The CMU Pronouncing Dictionary is Copyright (C) 1993-2008 by Carnegie Mellon University.


All code not under copyright by CMU is licensed under a Creative Commons Attribution-ShareAlike 3.0 license.