Can we detect a 'scream' or 'loud sound', etc., using the Android Speech Recognition APIs?
Or is there any other software or third-party tool that can do the same?
Thanks,
Kaps
You mean implement a clapper?
There's no need to use fancy math or the speech recognition API. Just use MediaRecorder and its getMaxAmplitude() method.
Here is some of the code you'll need.
The algorithm records for a period of time and then measures the amplitude difference. If the difference is large, the user probably made a loud sound.
public void recordClap()
{
    recorder.start();
    int startAmplitude = recorder.getMaxAmplitude();
    Log.d(D_LOG, "starting amplitude: " + startAmplitude);
    boolean ampDiff;
    do
    {
        Log.d(D_LOG, "waiting while taking in input");
        waitSome(); // helper that sleeps briefly between amplitude samples
        int finishAmplitude = 0;
        try
        {
            finishAmplitude = recorder.getMaxAmplitude();
        }
        catch (RuntimeException re)
        {
            Log.e(D_LOG, "unable to get the max amplitude " + re);
        }
        ampDiff = checkAmplitude(startAmplitude, finishAmplitude);
        Log.d(D_LOG, "finishing amp: " + finishAmplitude + " difference: " + ampDiff);
    }
    while (!ampDiff && recorder.isRecording()); // isRecording() is a helper on a custom recorder wrapper
}

private boolean checkAmplitude(int startAmplitude, int finishAmplitude)
{
    int ampDiff = finishAmplitude - startAmplitude;
    Log.d(D_LOG, "amplitude difference " + ampDiff);
    return (ampDiff >= 10000); // threshold for what counts as "loud"
}
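The snippet assumes recorder has already been configured and prepared. A minimal setup sketch, under the assumption that you only need amplitude metering (writing to /dev/null is a common trick for discarding the actual recording):

import android.media.MediaRecorder;

MediaRecorder buildAmplitudeMeter() throws java.io.IOException {
    MediaRecorder recorder = new MediaRecorder();
    recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
    recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
    recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
    recorder.setOutputFile("/dev/null"); // discard the audio; we only poll getMaxAmplitude()
    recorder.prepare();
    return recorder;
}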
If I were trying to detect a scream or loud sound, I would just look for a high root-mean-squared of the sounds coming through the microphone. I suppose that you can try to train a speech recognition system to recognize a scream, but it seems like overkill.
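For illustration, a minimal sketch of that idea, reading 16-bit PCM from AudioRecord and computing the RMS of each buffer. The sample rate and threshold here are assumptions you would tune empirically:

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

static final int SAMPLE_RATE = 44100;
static final double THRESHOLD = 8000; // loudness cutoff, tune for your device
volatile boolean running = true;

void detectLoudSound() {
    int bufSize = AudioRecord.getMinBufferSize(SAMPLE_RATE,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
    AudioRecord rec = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLE_RATE,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufSize);
    short[] buf = new short[bufSize / 2];
    rec.startRecording();
    while (running) {
        int n = rec.read(buf, 0, buf.length);
        double sum = 0;
        for (int i = 0; i < n; i++) sum += (double) buf[i] * buf[i];
        double rms = Math.sqrt(sum / Math.max(n, 1));
        if (rms > THRESHOLD) {
            // loud sound (scream, clap, ...) detected
        }
    }
    rec.stop();
    rec.release();
}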
Related
I've been struggling with the problem of sending continuous data from an Arduino to Android.
What I want to do is take an analog reading, convert it to 0-5 V information, and send that information to an Android app.
My Arduino code is simply:
// (...) defining pins and levels
SoftwareSerial BTSerial(rxPin, txPin);

void setup()
{
    pinMode(getData, INPUT);
    digitalWrite(keyPin, LOW);
    BTSerial.begin(9600);
}

void loop()
{
    contact = digitalRead(getData);
    if (contact == HIGH) {
        sensorValue = analogRead(sensorPin);
        double voltage = sensorValue * (5.0 / 1023.0);
        if (BTSerial.available()) {
            Serial.write(BTSerial.read());
        }
        BTSerial.println(voltage, 3);
        BTSerial.write("\r");
        if (Serial.available()) {
            BTSerial.write(Serial.read());
        }
    }
    delay(5);
}
I need to send data about the measurement at a frequency of ~200 Hz.
After sending the data to the application, it seems that part of the data is lost.
I tried higher baud rates but the problem still occurs. Is there a way to send continuous data from the Arduino over the serial port without losing some percentage of that data?
I think the problem is in the design of the receiver. I solved Bluetooth communication in .NET Xamarin, but the principle should be the same. In Android, reading from the InputStream must be quick and must not use sleep. Use an endless loop that quickly reads data into a temporary buffer, immediately copy those bytes into an auxiliary large buffer (with read/write cursors), and then process the data elsewhere, for example in a timer (I assume you are using some packet protocol).
public override void Run()
{
    WriteLogInfoToLog("ConnectedThread.Run() - before");
    while (true)
    {
        try
        {
            int readBytes = 0;
            lock (InternaldataReadLock)
            {
                readBytes = clientSocketInStream.Read(InternaldataRead, 0, InternaldataRead.Length);
                Array.Copy(InternaldataRead, TempdataRead, readBytes);
            }
            if (readBytes > 0)
            {
                lock (dataReadLock)
                {
                    dataRead = new byte[readBytes];
                    for (int i = 0; i < readBytes; i++)
                    {
                        dataRead[i] = TempdataRead[i];
                    }
                }
            }
        }
        catch (System.Exception e)
        {
            // Connection dropped; everything must be restarted from the beginning.
            btlManager.btlState = BTLService.BTLState.Nothing;
            WriteLogInfoToLog("ConnectedThread.Run() - EXCEPTION " + e.Message + ", " + e.HResult + ", " + e.StackTrace + ", " + e.InnerException);
            if (e is Java.IO.IOException)
            {
                // expected when the remote side closes the socket
            }
            else
            {
                // unexpected error
            }
            break;
        }
    }
    WriteLogInfoToLog("ConnectedThread.Run() - after");
}
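In Android Java the same pattern might look like the following minimal sketch (the InputStream in and the later packet parsing are assumptions; runReader would run on its own thread):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

byte[] temp = new byte[1024];
final ByteArrayOutputStream buffer = new ByteArrayOutputStream(); // stands in for the large auxiliary buffer

void runReader(InputStream in) {
    try {
        while (true) {
            int n = in.read(temp, 0, temp.length); // blocks until data arrives; no sleep needed
            if (n < 0) break; // stream closed
            synchronized (buffer) {
                buffer.write(temp, 0, n); // copy out immediately, keep the read loop fast
            }
            // parse packets later, e.g. from a timer or Handler, not here
        }
    } catch (IOException e) {
        // connection dropped; restart the connection from scratch
    }
}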
I made a real-time audio recording and playback app, but when I start it, the playback is very noisy along with my voice. I want to capture only my voice, with a clean tone.
I'm implementing this app with reverb, bass boost, equalizer, and visualizer effects,
but for now I just want clean sound without those effects.
The app's target version is at least Android 2.3.3 (API 10).
Q: How do I have to modify this substantial code part? (for audioRecord.startRecording, audioTrack.play, audioTrack.write)
Here is my code snippet:
private void main() {
    for (;;) {
        Log.d("AFX", "Starting audio thread");
        this.audioRecord.startRecording();
        this.audioTrack.play();
        int i;
        if (!this.running) {
            this.audioRecord.stop();
            this.audioTrack.stop();
            this.audioRecord.release();
            return;
        }
        i = this.audioRecord.read(this.buffer, 0, this.chunkSize);
        Log.v("AudioRecord", "read " + this.chunkSize + " bytes");
        try {
            // Log.d("AFX", "Starting audio thread");
            this.audioRecord.startRecording();
            this.audioTrack.play();
            Log.d("AFX", "Starting audio thread");
            boolean flag = this.running; // default value: false
            if (!flag) {
                this.audioRecord.stop();
                this.audioTrack.stop();
                // this.audioRecord.release(); // added
                // this.audioTrack.release(); // added
                Log.d("AFX", "Exiting audio thread");
                return;
            }
            i = this.audioRecord.read(this.buffer, 0, this.chunkSize);
            Log.v("AudioRecord", "read " + this.chunkSize + " bytes");
            if (i < 0) {
                Log.e("AFX", "Record error: " + i);
                this.running = false;
                continue;
                // break;
            }
            if (!this.running) {
                continue;
                // break;
            }
        } finally {
            this.audioRecord.stop();
            // this.audioRecord.release(); // added
            this.audioTrack.stop();
            // this.audioTrack.release(); // added
        }
        this.audioTrack.write(this.buffer, 0, i);
        // Log.d("AFX", "Starting audio thread");
        // this.audioRecord.startRecording();
        // this.audioTrack.play();
    }
}
Any ideas will be appreciated.
Thank you very much.
Try the AudioEffect classes in android.media.audiofx.
ar = new AudioRecord(MediaRecorder.AudioSource.VOICE_RECOGNITION, SAMPLE_RATE_IN_HZ,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bs * 10); // bs = minimum buffer size

if (AutomaticGainControl.isAvailable()) {
    AutomaticGainControl agc = AutomaticGainControl.create(ar.getAudioSessionId());
    Log.d("AudioRecord", "AGC is " + (agc.getEnabled() ? "enabled" : "disabled"));
    agc.setEnabled(true);
    Log.d("AudioRecord", "AGC is " + (agc.getEnabled() ? "enabled" : "disabled") + " after trying to enable");
} else {
    Log.d("AudioRecord", "AGC is unavailable");
}

if (NoiseSuppressor.isAvailable()) {
    NoiseSuppressor ns = NoiseSuppressor.create(ar.getAudioSessionId());
    Log.d("AudioRecord", "NS is " + (ns.getEnabled() ? "enabled" : "disabled"));
    ns.setEnabled(true);
    Log.d("AudioRecord", "NS is " + (ns.getEnabled() ? "enabled" : "disabled") + " after trying to enable");
} else {
    Log.d("AudioRecord", "NS is unavailable");
}

if (AcousticEchoCanceler.isAvailable()) {
    AcousticEchoCanceler aec = AcousticEchoCanceler.create(ar.getAudioSessionId());
    Log.d("AudioRecord", "AEC is " + (aec.getEnabled() ? "enabled" : "disabled"));
    aec.setEnabled(true);
    Log.d("AudioRecord", "AEC is " + (aec.getEnabled() ? "enabled" : "disabled") + " after trying to enable");
} else {
    Log.d("AudioRecord", "AEC is unavailable");
}
These work on API 16+, and the same package also has the Equalizer, BassBoost, and the other effects you mention. Set them up before you start recording.
In your question you are a bit unspecific about the type of noise you are experiencing.
But if you record and play back the same sound at the same time, you are likely to get severe feedback or echoes, usually a very loud, high-pitched sound.
Android has built-in echo cancellation to solve this problem. It is enabled by default if you use AudioSource.VOICE_COMMUNICATION, like this:
recorder = new AudioRecord(AudioSource.VOICE_COMMUNICATION,
        44100,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        n); // n = buffer size in bytes
Also, 44100 is the only sample rate that is guaranteed to work on all Android devices. Don't use anything else, or you may get strange results.
I am working on a streaming radio application. Everything works fine except that changing the equalizer preset does not affect the sound.
Changing the preset by calling usePreset(preset) does not make any change to the sound.
Even though there is no error, usePreset does not change the sound effects.
I have tested on a Samsung Galaxy S II with Android 4.0.3.
public void startPlayer() {
    //
    // Check whether we can acquire the audio focus
    // to start the player
    //
    if (!requestAudioFocus()) {
        return;
    }
    if (null != mAudioPlayer) {
        if (mAudioPlayer.isPlaying()) {
            mAudioPlayer.stop();
        }
        mAudioPlayer.reset();
    } else {
        mAudioPlayer = new MediaPlayer();
        mAudioPlayer.reset();
    }
    try {
        notifyProgressUpdate(PLAYER_INITIALIZING);
        try {
            mEqualizer = new Equalizer(0, mAudioPlayer.getAudioSessionId());
            mEqualizer.setEnabled(true);
            Log.d(TAG, "Audio Session ID " + mAudioPlayer.getAudioSessionId()
                    + " Equalizer " + mEqualizer + " Preset "
                    + mEqualizer.getCurrentPreset());
        } catch (Exception ex) {
            mEqualizer = null;
        }
        mAudioPlayer.setAudioStreamType(AudioManager.STREAM_MUSIC);
        mAudioPlayer.setDataSource(mCurrentTrack.getStreamURL());
        //
        // Add the listeners to track the player status
        //
        mAudioPlayer.setOnCompletionListener(this);
        mAudioPlayer.setOnBufferingUpdateListener(this);
        mAudioPlayer.setOnPreparedListener(this);
        mAudioPlayer.setOnInfoListener(this);
        mAudioPlayer.setOnErrorListener(this);
        notifyProgressUpdate(PLAYER_BUFFERING);
        mAudioPlayer.prepareAsync();
    } catch (IllegalArgumentException e) {
        e.printStackTrace();
    } catch (SecurityException e) {
        e.printStackTrace();
    } catch (IllegalStateException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

// Get the available presets from the equalizer
public String[] getEqualizerPresets() {
    String[] presets = null;
    short noOfPresets = -1;
    if (null != mEqualizer) {
        noOfPresets = mEqualizer.getNumberOfPresets();
        presets = new String[noOfPresets];
        for (short index = 0; index < noOfPresets; index++) {
            presets[index] = mEqualizer.getPresetName(index);
        }
    }
    return presets;
}

// Set the user-preferred preset
public void setEqualizerPreset(int position) {
    if (null != mEqualizer) {
        Log.d(TAG, "setting equalizer preset " + position);
        Log.d(TAG, "Equalizer " + mEqualizer + " set Preset " + position);
        mEqualizer.usePreset((short) position);
        Log.d(TAG, "Equalizer " + mEqualizer + " current Preset "
                + mEqualizer.getCurrentPreset());
    }
}
I'd appreciate your help identifying the issue.
EDIT
This issue is not resolved yet. I did not find any sample code that explains Equalizer preset usage.
Any reference to sample code that uses presets is welcome.
This is a full source code example for an equalizer; I hope this will help you.
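As a reference point, a minimal hedged sketch of preset usage (the player and variable names are illustrative, not from the linked source):

// Create the Equalizer on the player's audio session once the player exists,
// enable it, then apply a named preset.
Equalizer eq = new Equalizer(0, mediaPlayer.getAudioSessionId());
eq.setEnabled(true);
short preset = 0; // index of the first preset, e.g. "Normal"
if (eq.getNumberOfPresets() > 0) {
    eq.usePreset(preset);
    Log.d("EQ", "using preset " + eq.getPresetName(preset));
}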
I have the same problem. When I load it on the emulator it produces an error that I don't really understand; it always mentions audiofx.Equalizer and audiofx.AudioEffect or something similar. But I have discovered that if you have another media player running (n7player in my case), closing it and trying your media player again helps. In my case that works, but I think there must be a method to get hold of whichever equalizer is currently active.
I'm attempting to use the Android 4 (API 14) face detection found in the Camera.Face class.
I'm having difficulty getting values for the face coordinates (left/right eye, mouth).
The device I'm using is a Samsung Galaxy Tab 2 (GT-P5100) with Android 4.0.4.
I'm initialising face detection as in the code snippet below, and camera.getParameters().getMaxNumDetectedFaces() returns 3 when running on the above-mentioned device.
Now when a face is introduced to the surface frame and detected in the face detection listener, the listener returns the values in faces[0].rect.flattenToString() identifying the position of the face on the surface. However, the rest of the values, i.e. face id, left/right eye, and mouth, are returned as -1 and null respectively.
This behaviour is described in the documentation as:
"This is an optional field, may not be supported on all devices. If not supported, the value will always be set to null. The optional fields are supported as a set. Either they are all valid, or none of them are."
So the question is: am I missing something, or is it simply that my device cannot support the optional fields of the face detection API found in Camera.Face?
It is worth mentioning that the same device offers face unlock, which is configured through user settings.
FaceDetectionListener faceDetectionListener = new FaceDetectionListener() {
    @Override
    public void onFaceDetection(Face[] faces, Camera camera) {
        if (faces.length == 0) {
            prompt.setText(" No Face Detected! ");
        } else {
            prompt.setText(String.valueOf(faces.length) + " Face Detected :) [ "
                    + faces[0].rect.flattenToString()
                    + " Coordinates : Left Eye - " + faces[0].leftEye + " ]");
            Log.i("TEST", "face coordinates = Rect : " + faces[0].rect.flattenToString());
            Log.i("TEST", "face coordinates = Left eye : " + String.valueOf(faces[0].leftEye));
            Log.i("TEST", "face coordinates = Right eye - " + String.valueOf(faces[0].rightEye));
            Log.i("TEST", "face coordinates = Mouth - " + String.valueOf(faces[0].mouth));
        }
.....
if (camera != null) {
    try {
        camera.setPreviewDisplay(surfaceHolder);
        camera.startPreview();
        prompt.setText(String.valueOf(
                "Max Face: " + camera.getParameters().getMaxNumDetectedFaces()));
        camera.startFaceDetection();
        previewing = true;
    } catch (IOException e) {
        e.printStackTrace();
    }
}
In your initialization code, you need to set the face detection listener on the camera before starting face detection.
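For example, a minimal sketch reusing the names from your snippet:

camera.setFaceDetectionListener(faceDetectionListener); // register the listener first
camera.setPreviewDisplay(surfaceHolder);
camera.startPreview();
camera.startFaceDetection(); // must be called after startPreview()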
I'm creating a TTS app using Google's unofficial TTS API, and it works fine, but the API can only process a maximum of 100 characters at a time, while my application may have to send strings containing as many as 300 characters.
Here is my code
try {
    String text = "bonjour comment allez-vous faire";
    text = text.replace(" ", "%20"); // crude URL encoding of spaces
    String oLanguage = "fr";
    MediaPlayer player = new MediaPlayer();
    player.setAudioStreamType(AudioManager.STREAM_MUSIC);
    player.setDataSource("http://translate.google.com/translate_tts?tl=" + oLanguage + "&q=" + text);
    player.prepare();
    player.start();
} catch (Exception e) {
    // TODO: handle exception
}
So my questions are:
How do I get it to check the number of characters in the string and send only complete words within the 100-character limit?
How do I detect when the first chunk of TTS has finished playing, so I can send the second without the two speeches overlapping?
Is there any need for me to use AsyncTask for this process?
1. How do I get it to check the number of characters in the string and send only complete words within the 100-character limit?

ArrayList<String> arr = new ArrayList<String>();
String textToSpeech = "Your long text";
int start = 0;
while (start < textToSpeech.length()) {
    int end = Math.min(start + 100, textToSpeech.length());
    // If this is not the last chunk, cut back to the last space so that
    // only complete words are sent (a word longer than 100 chars still gets split).
    if (end < textToSpeech.length()) {
        int lastSpace = textToSpeech.lastIndexOf(' ', end);
        if (lastSpace > start) {
            end = lastSpace;
        }
    }
    arr.add(textToSpeech.substring(start, end).trim());
    start = end;
}
2. How do I detect when the first chunk of TTS has finished, so I can send the second without the speech overlapping?
player.setOnCompletionListener(new OnCompletionListener() {
    @Override
    public void onCompletion(MediaPlayer mp) {
        // pass the next block
    }
});
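Putting the two parts together, a hedged sketch of sequential playback (playChunks and chunkIndex are illustrative names, not part of any API):

import java.util.List;
import android.media.AudioManager;
import android.media.MediaPlayer;

private int chunkIndex = 0;

void playChunks(final MediaPlayer player, final List<String> arr, final String lang) {
    if (chunkIndex >= arr.size()) {
        return; // all chunks spoken
    }
    try {
        player.reset();
        player.setAudioStreamType(AudioManager.STREAM_MUSIC);
        player.setDataSource("http://translate.google.com/translate_tts?tl=" + lang
                + "&q=" + arr.get(chunkIndex).replace(" ", "%20"));
        player.setOnCompletionListener(new MediaPlayer.OnCompletionListener() {
            @Override
            public void onCompletion(MediaPlayer mp) {
                chunkIndex++;
                playChunks(player, arr, lang); // send the next block only when this one is done
            }
        });
        player.prepare();
        player.start();
    } catch (Exception e) {
        e.printStackTrace();
    }
}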
3. Is there any need for me to use AsyncTask for this process?
Right now I don't see any need for that.
Simple: this is not a public API, so don't use it. Use Android's built-in TTS engine for speech synthesis. It does not have string-length limitations.
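A minimal sketch of the built-in engine (the French locale matches the question's example; the Context is assumed to come from your Activity or Service):

import java.util.Locale;
import android.content.Context;
import android.speech.tts.TextToSpeech;

private TextToSpeech tts;

void speakLongText(Context context, final String text) {
    tts = new TextToSpeech(context, new TextToSpeech.OnInitListener() {
        @Override
        public void onInit(int status) {
            if (status == TextToSpeech.SUCCESS) {
                tts.setLanguage(Locale.FRENCH);
                // no 100-character limit; the engine handles long strings itself
                tts.speak(text, TextToSpeech.QUEUE_FLUSH, null);
            }
        }
    });
}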