From the word2vec site I can download GoogleNews-vectors-negative300.bin.gz. The extracted .bin file (about 3.4GB) is in a binary format that I can't use directly. Tomas Mikolov assures us that "It should be fairly straightforward to convert the binary format to text format (though that will take more disk space). Check the code in the distance tool, it's rather trivial to read the binary file." Unfortunately, I don't know enough C to understand http://word2vec.googlecode.com/svn/trunk/distance.c.
Supposedly gensim can do this too, but all the tutorials I've found seem to be about converting from text, not the other way around.
Can someone suggest modifications to the C code or instructions for gensim to emit text?
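
For reference, here is a minimal Python sketch of what the reading loop in distance.c appears to do, under these assumptions about the binary layout: an ASCII header line holding the vocabulary size and vector dimension, then for each word its bytes terminated by a single space, followed by that many little-endian 32-bit floats, with an optional newline between records. The function name `convert_bin_to_text` and the six-decimal output precision are my own choices for illustration, not part of word2vec.

```python
import struct


def convert_bin_to_text(bin_path, txt_path):
    """Convert a word2vec binary file to text; return the number of words written.

    Assumed layout (inferred from distance.c, not an official spec):
      header line: "<vocab_size> <dim>\n" in ASCII
      per word:    word bytes, one space, <dim> little-endian float32 values,
                   optionally followed by a newline
    """
    with open(bin_path, "rb") as src, open(txt_path, "w", encoding="utf-8") as dst:
        vocab_size, dim = (int(x) for x in src.readline().split())
        dst.write(f"{vocab_size} {dim}\n")

        for _ in range(vocab_size):
            # Read the word one byte at a time until the separating space.
            chars = bytearray()
            while True:
                ch = src.read(1)
                if not ch:
                    raise EOFError("file ended in the middle of a word")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline left over from the previous record
                    chars += ch
            word = chars.decode("utf-8", errors="replace")

            # Read the vector: dim floats of 4 bytes each.
            raw = src.read(4 * dim)
            if len(raw) != 4 * dim:
                raise EOFError("file ended in the middle of a vector")
            vector = struct.unpack(f"<{dim}f", raw)

            dst.write(word + " " + " ".join(f"{v:.6f}" for v in vector) + "\n")

    return vocab_size
```

Calling `convert_bin_to_text("GoogleNews-vectors-negative300.bin", "GoogleNews-vectors-negative300.txt")` on the extracted file would stream one word per line, so memory use stays small, though the text output will be considerably larger than the binary. On the gensim side, I believe `KeyedVectors.load_word2vec_format(path, binary=True)` followed by `save_word2vec_format(out_path, binary=False)` does the same conversion, but it loads the whole model into memory first.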