Components of a Dialogue Content
This page briefly describes what a dialogue content consists of in MMDAgent-EX. See the File Format section for details of each definition file.
A “dialogue content” is a set of files that defines the components of a dialogue system: a content-specific user dictionary for speech recognition, a voice model for speech synthesis, 3-D models, images, text, motions, and dialogue scenarios. By creating these files, you can build any kind of voice conversation or speech interaction.
The files in a dialogue content are listed below. Files marked with [*] should be located in the same folder. See the Files section for details of their formats.
topdir/
 |- Configuration file (.mdf) [*]
 |- Dialogue scenario script (.fst) [*]
 |- Recognition word dictionary (.dic) [*]
 |- Rapid word dictionary (.rapiddic) [*]
 |- Julius JConf file (.jconf) [*]
 |- Open JTalk setting file (.ojt) [*]
 |- Button definitions (BUTTON0.txt - BUTTON9.txt) [*]
 |- Package description (PACKAGE_DESC.txt) [*]
 |- Description text (README.txt) [*]
 +- (SubDirectories)
     |- 3-D models (.pmd)
     |- Motions (.vmd)
     |- TTS Voice model (.htsvoice)
     |- Background/Floor (images)
     |- Sound / Music files (sound files)
     |- Stage models (.pmd)
     |- Other assets (images, text files, etc.)
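The same-folder rule for the [*] files can be checked mechanically. The following is a minimal sketch, not part of MMDAgent-EX: `misplaced_files` is a hypothetical helper, and the extension and file-name sets are simply copied from the listing above. README.txt is left out of the check on purpose, because model archives in subdirectories often ship their own readme files.

```python
from pathlib import Path

# File types the listing marks with [*] (copied from the listing; the
# checker itself is a hypothetical helper, not an MMDAgent-EX tool).
MARKED_SUFFIXES = {".mdf", ".fst", ".dic", ".rapiddic", ".jconf", ".ojt"}
MARKED_NAMES = {"PACKAGE_DESC.txt"} | {f"BUTTON{i}.txt" for i in range(10)}


def misplaced_files(topdir: str) -> list[str]:
    """Return [*]-type files that are not in a folder containing a .mdf file."""
    top = Path(topdir)
    mdf_dirs = {p.parent for p in top.rglob("*.mdf")}
    misplaced = []
    for path in top.rglob("*"):
        if not path.is_file():
            continue
        is_marked = path.suffix in MARKED_SUFFIXES or path.name in MARKED_NAMES
        if is_marked and path.parent not in mdf_dirs:
            misplaced.append(str(path.relative_to(top)))
    return sorted(misplaced)
```

Calling `misplaced_files("topdir")` returns an empty list when every marked file sits next to a configuration file, and otherwise the relative paths of the files that should be moved.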
Here we divide the components into four groups and describe them in this order:
- Core components
  - Speech processing and dialogue scenario
- Scene components
  - Materials and resources for 3-D scene rendering
- Media components
  - Sounds, images, and texts that can be displayed in the course of a dialogue
- Meta components
  - Things outside the dialogue system, such as the package description and readme
Configuration file (.mdf)
A text file containing system configurations and parameters. Open this file with MMDAgent or MMDAgent-EX to start the content. This file is required for every content. See its file format page for the full list of configurable parameters.
# example of .mdf file
display_comment_time=0
stage_size=50.0,25.0,40.0
campus_color=0.722,0.431,0.737
light_direction=0.0,1.0,1.0,0.0
use_shadow_mapping=true
exclude_Plugin_LookAt=true
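For tooling around a content (for example, checking a value before launch), the key=value layout shown in the example is easy to read from a script. This is a rough sketch that assumes only what the example shows: one `key=value` pair per line and `#` for comments. `parse_mdf` is a hypothetical helper, not an MMDAgent-EX API.

```python
def parse_mdf(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            settings[key.strip()] = value.strip()
    return settings


example = """\
# example of .mdf file
display_comment_time=0
stage_size=50.0,25.0,40.0
use_shadow_mapping=true
"""

settings = parse_mdf(example)
# Comma-separated values such as stage_size can be split further as needed.
stage_size = [float(v) for v in settings["stage_size"].split(",")]
```

Values are kept as strings because each parameter has its own type; the caller converts them as needed, as done for `stage_size` here.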
Dialogue Scenario script (.fst)
A text file that defines the dialogue management, written in OpenFST format. See the reference page for how to write it.
# example of .fst file
0 10 RECOG_EVENT_STOP|hello <eps>
10 11 <eps> MOTION_ADD|mei|greet|greet.vmd
11 12 <eps> SYNTH_START|mei|normal|hi
12 0 SYNTH_EVENT_STOP|mei <eps>
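Each line of the example reads as four whitespace-separated fields: current state, next state, input message, and output message, with `<eps>` standing for the empty symbol. The sketch below splits the example into transitions under that assumption. `Transition` and `parse_fst` are hypothetical names, and a message whose arguments contain spaces would need more careful splitting than shown here.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    src: str        # current state
    dst: str        # next state
    condition: str  # input message, or <eps> when no input is awaited
    action: str     # output message, or <eps> when nothing is issued


def parse_fst(text: str) -> list[Transition]:
    """Split four-column .fst lines into Transition tuples."""
    transitions = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 fields, got {len(fields)}")
        transitions.append(Transition(*fields))
    return transitions


example = """\
# example of .fst file
0 10 RECOG_EVENT_STOP|hello <eps>
10 11 <eps> MOTION_ADD|mei|greet|greet.vmd
11 12 <eps> SYNTH_START|mei|normal|hi
12 0 SYNTH_EVENT_STOP|mei <eps>
"""

transitions = parse_fst(example)
# Message names are the part before the first '|'.
issued = [t.action.split("|")[0] for t in transitions if t.action != "<eps>"]
```

Read this way, the example waits for the word "hello" to be recognized, starts a greeting motion and a synthesized reply, and returns to state 0 once synthesis has finished.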
Speech recognition setting files (.dic, .rapiddic, .jconf)
The .dic file is an optional user dictionary for the Julius speech recognizer. Adding task-specific words to this file makes MMDAgent-EX more likely to recognize them. See the reference page for details.
# example of .dic file
<unk> @1.0 <unk> [MMDAgent] e m u e m u d i: e: j e N t o
<unk> @2.0 <unk> [おっはー] O q h a:
The .jconf file is an optional configuration file for the Julius speech recognizer. You can use it to pass any Julius configuration parameters in addition to the system defaults. See the Julius documentation for all available options.
## example of .jconf file
# set lower audio trigger level threshold
-lv 120
# set duration time to reject too long input
-rejectlong 6000
Speech synthesis setting files (.ojt, .htsvoice, etc.)
Definition files for the “Open JTalk” speech synthesis module. They are required for speech synthesis. The .ojt file defines voice names and configuration parameters. See the file format page for how to set up voice parameters in MMDAgent-EX.
## example of .ojt file
# number of voices
5
# voice names
Voice\mei\mei_normal.htsvoice
Voice\mei\mei_angry.htsvoice
Voice\mei\mei_bashful.htsvoice
Voice\mei\mei_happy.htsvoice
Voice\mei\mei_sad.htsvoice
# number of speaking styles
9
# speaking style names, interpolation weight, and synthesis parameter
mei_voice_normal 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.52 1.0
mei_voice_angry 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.1 -0.5 0.52 1.1
mei_voice_bashful 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.5 0.52 0.9
mei_voice_happy 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 1.1 1.5 0.52 1.0
mei_voice_sad 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 1.0 -0.5 0.52 0.9
mei_voice_fast 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 2.0 1.0 0.52 1.0
mei_voice_slow 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.5 1.0 0.52 1.0
mei_voice_high 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 4.0 0.52 1.0
mei_voice_low 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 -2.0 0.52 1.0
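The layout of the example can be sanity-checked with a short script before loading the content. The sketch below assumes the structure visible in the example: `#` comment lines, a voice count followed by that many voice paths, then a style count followed by one line per style holding a name and its numbers. `parse_ojt` is a hypothetical helper. It does not interpret the individual numbers, whose meaning is covered by the file format page.

```python
def parse_ojt(text: str) -> tuple[list[str], dict[str, list[float]]]:
    """Read voice paths and per-style number lists from .ojt text."""
    rows = (ln.strip() for ln in text.splitlines())
    rows = iter([ln for ln in rows if ln and not ln.startswith("#")])
    n_voices = int(next(rows))
    voices = [next(rows) for _ in range(n_voices)]
    n_styles = int(next(rows))
    styles = {}
    for _ in range(n_styles):
        name, *values = next(rows).split()
        styles[name] = [float(v) for v in values]
    return voices, styles


# First two styles of the example above; the style count is adjusted to 2.
example = r"""
# number of voices
5
# voice names
Voice\mei\mei_normal.htsvoice
Voice\mei\mei_angry.htsvoice
Voice\mei\mei_bashful.htsvoice
Voice\mei\mei_happy.htsvoice
Voice\mei\mei_sad.htsvoice
# number of speaking styles
2
# speaking style names, interpolation weight, and synthesis parameter
mei_voice_normal 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.52 1.0
mei_voice_angry 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.1 -0.5 0.52 1.1
"""

voices, styles = parse_ojt(example)
# Every style line should carry the same number of values.
lengths = {len(values) for values in styles.values()}
```

A mismatch between the declared counts and the lines that follow, or style lines of different lengths, is a likely sign of a hand-editing mistake.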
Voice model files (.htsvoice) should be trained from a speech corpus with HTS and can be placed anywhere in the content. (MMDAgent-EX does not include default voice definitions in its distribution.)
3-D models and motions (.pmd, .vmd)
In MMDAgent-EX you can use any PMD model, most PMX models, and their VMD motion files made for MikuMikuDance. The CG rendering part is fully compatible with MikuMikuDance (MMD).
MikuMikuDance is free, lightweight software that lets users create 3-D animated movies. The MMD format is expressive enough for a modern virtual agent, with cartoon-like rendering and physics simulation. Its adequate capability, expressiveness, and availability were the key reasons we adopted the format for an agent-based spoken dialogue system. The following links provide more information:
- MikuMikuDance Official site
- Wikipedia has a detailed description of the software.
- VPVP wiki is a place where people gather various information about MikuMikuDance.
- Niconi-Rittai (Niconi Solid) is a sharing site hosted by DWANGO for user-created 3-D models, including MMD models.
The original MMDAgent supports only PMD, whereas MMDAgent-EX can also render PMX models. However, you must convert PMX to PMD and CSV prior to use. See the PMX file format document for details.
Warning: Be careful about licensing. When you use MMD models or motions obtained from the net, check the license terms set by their authors. For historical reasons, many MMD materials are intended to be shared only within the MMD fan community, and (re-)distributing them outside that community is commonly not welcomed. Pay attention to the readme files included in the archives.
Stage (image, .pmd)
A stage image (background and floor) or a stage 3-D model can be used to set up the scene behind the agent. You can give either background and floor images or a PMD stage model with the STAGE command message inside the dialogue scenario. The size of the background and floor can be changed with the stage_size parameter in the .mdf file.
Here are example messages for setting or changing the stage. See the reference for details.
STAGE|(bitmap file name for floor),(bitmap file name for back)
STAGE|(stage file name, .xpmd or .pmd)
Camera (parameter or .vmd)
You can set the camera position with the CAMERA message, or give a camera movement as a VMD motion file made in MikuMikuDance, inside the dialogue scenario.
Here are example messages for changing the camera position or starting a camera movement. See the reference for details.
CAMERA|x,y,z|rx,ry,rz|(distance)|(fovy)
CAMERA|(camera motion file name)
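When scenario lines are generated by a script, the two CAMERA forms above can be assembled with small helpers. This is only a sketch of the string layout shown in the templates: `camera_message` and `camera_motion_message` are hypothetical names, and the numbers and file name used in the example calls are arbitrary illustration values, not recommended settings.

```python
def camera_message(position, rotation, distance, fovy) -> str:
    """Build CAMERA|x,y,z|rx,ry,rz|(distance)|(fovy) from numeric values."""
    if len(position) != 3 or len(rotation) != 3:
        raise ValueError("position and rotation need three values each")
    pos = ",".join(str(v) for v in position)
    rot = ",".join(str(v) for v in rotation)
    return f"CAMERA|{pos}|{rot}|{distance}|{fovy}"


def camera_motion_message(vmd_file: str) -> str:
    """Build CAMERA|(camera motion file name)."""
    return f"CAMERA|{vmd_file}"


# Arbitrary illustration values and a made-up file name.
static_view = camera_message((0, 13, 0), (10, 0, 0), 30, 16)
moving_view = camera_motion_message("camera.vmd")
```

The resulting strings can then be written into the output field of a scenario line.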
Sound / Music (.mp3, .wav, etc.)
MP3, WAV, and other formats are supported. Place the sound file in the content and use a SOUND_START message to start playing it in the dialogue scenario.
The mp3 and wav formats are supported on all platforms. MMDAgent-EX simply calls the audio API of each OS to play a sound, so the other available formats depend on that API. MMDAgent-EX uses the following sound APIs:
- Windows: Media Control Interface (MCI).
- macOS, iOS: AudioToolbox framework.
- Android: Android MediaCodec.
- Linux: VLC media player (requires
Here is an example message that makes MMDAgent-EX play a sound. See the message description for details on how to use it.
SOUND_START|(sound alias)|(sound file name)
Raw image / text (image, .txt)
You can put any text or image in the scene, or open a text document file in full screen. Use the TEXTAREA_SET message to display short text or an image in the 3-D scene. See the reference for details.
TEXTAREA_ADD|(textarea alias)|(width,height)|(size,margin,exlinespace)|r,g,b,a|r,g,b,a|x,y,z
TEXTAREA_SET|(textarea alias)|(text)
You can also show the content of a text file in full screen in the middle of a dialogue scenario and force the user to respond, using INFOTEXT messages. The text file should be in UTF-8.
You can show a prompt dialogue in the middle of dialogue scenario and give users a chance to respond by tap or click using
Buttons on screen (BUTTON*.txt)
You can configure optional buttons to be displayed on the screen and define the action to execute when they are tapped. The definition files are
Package info (PACKAGE_DESC.txt)
It is recommended that you properly define the package information in PACKAGE_DESC.txt so that MMDAgent-EX can handle the content correctly and present it attractively. See here for details.
If a README text is included in the content, it will be displayed at the first launch of the content and after an update has been detected. The file name of the README should be given in the PACKAGE_DESC.txt file. The character encoding should be UTF-8.