About Audio API
This page describes the audio API that MMDAgent-EX uses on each platform and its setup parameters.
Android
When running on Android 8.0 or later, MMDAgent-EX uses AAudio, the newer Android C API, for audio recording and playback. On older devices where AAudio is not available, OpenSL ES for Android is used instead. Both APIs are supported through Google’s Oboe library.
The setup parameters are shown below. On AAudio, the speech recognition profile is experimentally set as the input preset. There is currently no detailed description of this preset, but it appears to at least select the lowest latency on most devices and to enable automatic gain control (AGC) on some devices. Additional speech processing may be added depending on future changes to Google’s specification.
oboe::AudioStreamBuilder builder;
builder.setFormat(oboe::AudioFormat::I16);                       // 16-bit signed integer samples
builder.setPerformanceMode(oboe::PerformanceMode::LowLatency);   // request the low-latency path
builder.setInputPreset(oboe::InputPreset::VoiceRecognition);     // speech recognition input preset
builder.setContentType(oboe::ContentType::Speech);               // stream carries speech
iOS
MMDAgent-EX uses the AudioUnit framework. Apple’s built-in voice processing feature is experimentally enabled for the input stream. Although there is no detailed description of what processing is included, it appears to enable automatic gain control (AGC), noise suppression, and echo cancellation.
AudioComponentDescription ioUnitDescription;
ioUnitDescription.componentType = kAudioUnitType_Output;
ioUnitDescription.componentSubType = kAudioUnitSubType_VoiceProcessingIO;
ioUnitDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
Windows
For speech recognition and synthesis, the DirectSound API is used.
You can choose the audio device to open by setting the environment variable “PORTAUDIO_DEV” or “PORTAUDIO_DEV_NUM”. The list of available audio devices is shown in the MMDAgent-EX startup log, like this:
id [desc1: desc2]
id [desc1: desc2]
...
Before launching, set the environment variable “PORTAUDIO_DEV” to the desc1: desc2 string of the target device, or set “PORTAUDIO_DEV_NUM” to the id number of the target device.
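The lookup described above can be sketched in C++ as follows. This is only an illustration under stated assumptions, not the MMDAgent-EX implementation: the AudioDevice struct and the selectDevice and selectDeviceFromEnv helpers are hypothetical names, and checking the id variable before the description variable is an assumed precedence that the page does not specify.

```cpp
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// One entry of the startup device list, "id [desc1: desc2]".
// Hypothetical type used only for this sketch.
struct AudioDevice {
    int id;
    std::string description;  // the "desc1: desc2" text
};

// Pick a device from the list given the two variable values.
// Assumption: the id value is tried first, then the description value.
// Returns std::nullopt when neither value matches, so the caller can
// fall back to the default device.
std::optional<AudioDevice> selectDevice(const std::vector<AudioDevice>& devices,
                                        const char* devNum,
                                        const char* devName) {
    if (devNum != nullptr && *devNum != '\0') {
        char* end = nullptr;
        const long id = std::strtol(devNum, &end, 10);
        if (*end == '\0') {  // accept only a fully numeric value
            for (const auto& d : devices) {
                if (d.id == id) return d;
            }
        }
    }
    if (devName != nullptr && *devName != '\0') {
        for (const auto& d : devices) {
            if (d.description == devName) return d;
        }
    }
    return std::nullopt;
}

// Convenience wrapper that reads the two environment variables.
std::optional<AudioDevice> selectDeviceFromEnv(const std::vector<AudioDevice>& devices) {
    return selectDevice(devices,
                        std::getenv("PORTAUDIO_DEV_NUM"),
                        std::getenv("PORTAUDIO_DEV"));
}
```

Keeping the matching logic separate from the std::getenv calls makes it possible to exercise the selection with plain strings. The description must match the logged desc1: desc2 text exactly in this sketch.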
For sound playback with the SOUND_START command, the Media Control Interface (MCI) is used.
macOS
The Core Audio interface is used on macOS. The default device set in the audio profile will be opened.
Linux
The ALSA audio interface is used on Linux. The default device set in the audio profile will be used.