$ ../julius/julius/julius -C julius.jconf -dnnconf dnn.jconf
### read waveform input
Error: adin_file: channel num != 1 (2)
Error: adin_file: error in parsing wav header at mozilla.wav
Error: adin_file: failed to read speech data: "mozilla.wav"
0 files processed
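The error appears to say that the channel count is not 1, i.e. the file is stereo.
A WAV file is the standard Windows audio data file and is built on the RIFF format.
RIFF is organized around units called chunks; a WAV file is a collection of several chunks bundled into one.
Each chunk is laid out as: identifier (4 bytes), size (4 bytes), data (n bytes).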
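The chunk layout described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a plain little-endian RIFF/WAVE file; `list_riff_chunks` and `make_demo_wav` are hypothetical helper names, and the demo data is a silent WAV generated in memory rather than the actual mozilla.wav.

```python
import io
import struct
import wave


def list_riff_chunks(data: bytes):
    """Return (chunk_id, size) pairs for the top-level chunks of a WAV file."""
    # RIFF header: "RIFF", little-endian total size, then the form type "WAVE".
    riff_id, _riff_size, form_type = struct.unpack_from("<4sI4s", data, 0)
    if riff_id != b"RIFF" or form_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    chunks = []
    offset = 12
    while offset + 8 <= len(data):
        # Each chunk: identifier (4 bytes), size (4 bytes), data (size bytes).
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        chunks.append((chunk_id.decode("ascii", errors="replace"), size))
        # Chunk data is padded to an even number of bytes.
        offset += 8 + size + (size & 1)
    return chunks


def make_demo_wav(channels=2, rate=44100, frames=100) -> bytes:
    """Build a small silent 16-bit WAV in memory (hypothetical test data)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * channels * frames)
    return buf.getvalue()


if __name__ == "__main__":
    for chunk_id, size in list_riff_chunks(make_demo_wav()):
        print(repr(chunk_id), size)
```

For a real file, passing `open("mozilla.wav", "rb").read()` to `list_riff_chunks` should list at least a `fmt ` chunk (format, channel count, sampling rate) and a `data` chunk (the samples).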
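### Mono and stereo
Stereo audio carries separate left and right channels, so a different sound can be heard from each side.
Mono audio, unlike stereo, has a single channel, so the sound is heard only from the center.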
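To make the difference concrete: in a stereo WAV the samples are interleaved (left, right, left, right, ...), and one common way to get mono is to average each pair. The sketch below assumes 16-bit samples and simple averaging; `downmix_stereo_to_mono` is a hypothetical name, and SoX's own channel mixing may differ in detail.

```python
import array


def downmix_stereo_to_mono(interleaved):
    """Average left/right 16-bit samples: [L0, R0, L1, R1, ...] -> [M0, M1, ...]."""
    if len(interleaved) % 2 != 0:
        raise ValueError("stereo data must contain an even number of samples")
    mono = array.array("h")
    for i in range(0, len(interleaved), 2):
        left, right = interleaved[i], interleaved[i + 1]
        # The integer average of two 16-bit values stays within the 16-bit range.
        mono.append((left + right) // 2)
    return mono


if __name__ == "__main__":
    stereo = array.array("h", [1000, 3000, -200, 200, 32767, 32767])
    print(list(downmix_stereo_to_mono(stereo)))  # prints [2000, 0, 32767]
```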
Install SoX.
$ sudo git clone git://sox.git.sourceforge.net/gitroot/sox/sox
$ cd sox
$ sudo yum groupinstall "Development Tools"
$ ./configure
-bash: ./configure: No such file or directory
The git checkout does not ship a generated configure script, so install the packaged version with yum instead.
$ yum install sox
$ sox mozilla.wav -c 1 test.wav
$ ../julius/julius/julius -C julius.jconf -dnnconf dnn.jconf
Error: adin_file: sampling rate != 16000 (44100)
Error: adin_file: error in parsing wav header at mozilla.wav
Error: adin_file: failed to read speech data: "mozilla.wav"
$ sox mozilla.wav -c 1 -r 16000 test1.wav
——
### read waveform input
Stat: adin_file: input speechfile: mozilla.wav
STAT: 0 samples (0.00 sec.)
STAT: ### speech analysis (waveform -> MFCC)
WARNING: input too short (0 samples), ignored
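The three failures seen so far (channel count, sampling rate, zero samples) can be caught before running Julius at all. The following is a rough pre-flight check using Python's standard `wave` module; `check_julius_input` and `demo_wav` are hypothetical names, the 16000 Hz default is taken from the error message above, and the returned strings merely mimic the log wording rather than reproduce Julius's own checks.

```python
import io
import wave


def check_julius_input(source, expected_rate=16000):
    """Return a list of problems matching the errors seen in the logs above."""
    problems = []
    with wave.open(source, "rb") as w:
        if w.getnchannels() != 1:
            problems.append(f"channel num != 1 ({w.getnchannels()})")
        if w.getframerate() != expected_rate:
            problems.append(f"sampling rate != {expected_rate} ({w.getframerate()})")
        if w.getnframes() == 0:
            problems.append("input too short (0 samples)")
    return problems


def demo_wav(channels, rate, frames):
    """Build a silent 16-bit WAV in memory (hypothetical test data)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * channels * frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # A stereo 44.1 kHz file, like the original input in this walkthrough.
    print(check_julius_input(demo_wav(2, 44100, 10)))
    # A mono 16 kHz file with audio in it passes with no problems.
    print(check_julius_input(demo_wav(1, 16000, 10)))
```

`source` can also be a path such as `"test1.wav"`, which makes it easy to confirm that the converted file, and not the original, is the one being handed to Julius.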
Trying again with a different file.
id: from to n_score unit
----------------------------------------
[ 0 2] -0.890920 <s> [<s>]
[ 3 43] 1.508327 plans [plans]
[ 44 52] 0.579483 are [are]
[ 53 83] 2.098300 well [well]
[ 84 141] 1.983006 underway [underway]
[ 142 219] 1.388610 already [already]
[ 220 309] 1.076294 martin [martin]
[ 310 364] 1.698448 nineteen [nineteen]
[ 365 398] 2.135265 ninety [ninety]
[ 399 472] 1.064299 two [two]
[ 473 504] 1.476521 five [five]
[ 505 561] 0.660421 dollars [dollars]
[ 562 608] 2.348794 bail [bail]
[ 609 736] 0.248682 </s> [</s>]
re-computed AM score: 920.427368
=== end forced alignment ===
=== begin forced alignment ===
-- word alignment --
id: from to n_score unit
----------------------------------------
[ 0 71] 0.859664 <s> [<s>]
[ 72 111] 1.162892 director [director]
[ 112 164] 1.981413 martin [martin]
[ 165 180] 1.593118 to [to]
[ 181 221] 2.427887 commemorate [commemorate]
[ 222 267] 1.872279 kilometer [kilometer]
[ 268 306] 2.526583 journey [journey]
[ 307 319] 2.079670 to [to]
[ 320 327] 2.000595 the [the]
[ 328 348] 3.200890 new [new]
[ 349 386] 2.590411 world [world]
[ 387 414] 2.556754 five [five]
[ 415 443] 1.544829 hundred [hundred]
[ 444 464] 0.974130 years [years]
[ 465 531] 1.067814 ago [ago]
[ 532 546] 1.595085 and [and]
[ 547 583] 1.752286 wanted [wanted]
[ 584 642] 1.655993 moving [moving]
[ 643 658] 2.205574 it [it]
[ 659 670] 2.086497 to [to]
[ 671 704] 2.005465 promote [promote]
[ 705 732] 1.775316 use [use]
[ 733 755] 1.450466 of [of]
[ 756 773] 1.704210 those [those]
[ 774 856] 1.187828 detailed [detailed]
[ 857 887] 1.474861 in [in]
[ 888 990] 2.152141 exploration [exploration]
[ 991 1010] 0.570776 </s> [</s>]
re-computed AM score: 1743.703125
=== end forced alignment ===
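The `from` and `to` columns above are frame indices, not times. A small parser can turn them into seconds, which makes it easier to check the alignment against the audio. This sketch assumes a 10 ms frame shift (160 samples at 16 kHz), which should be verified against the actual Julius configuration, and that frame ranges are inclusive, as the consecutive numbers in the log suggest. `parse_alignment` is a hypothetical helper, not part of Julius.

```python
import re

# Matches lines such as "[ 3 43] 1.508327 plans [plans]".
ALIGN_LINE = re.compile(r"^\[\s*(\d+)\s+(\d+)\]\s+(-?\d+\.\d+)\s*(\S*)")

FRAME_SHIFT_SEC = 0.010  # assumption: 10 ms frame shift; check your Julius config


def parse_alignment(text, frame_shift=FRAME_SHIFT_SEC):
    """Return (start_sec, end_sec, score, unit) for each word-alignment line."""
    words = []
    for line in text.splitlines():
        m = ALIGN_LINE.match(line.strip())
        if not m:
            continue  # skip headers, separators and score summaries
        start, end = int(m.group(1)), int(m.group(2))
        # Frame ranges are inclusive, so a word ends at the end of frame `end`.
        words.append((round(start * frame_shift, 3),
                      round((end + 1) * frame_shift, 3),
                      float(m.group(3)),
                      m.group(4)))
    return words


if __name__ == "__main__":
    sample = """\
id: from to n_score unit
[ 3 43] 1.508327 plans [plans]
[ 44 52] 0.579483 are [are]
re-computed AM score: 920.427368
"""
    for start, end, score, unit in parse_alignment(sample):
        print(f"{start:6.2f} {end:6.2f} {score:9.6f} {unit}")
```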
The accuracy leaves something to be desired, but the overall workflow seems to be OK.