I've been reading other posts about calculating the amplitude in real time from a MediaPlayer, but it's not clear to me how to get a value that is useful for my case. What I need is a linear amplitude value normalized between 0 and 100. The other posts perform a dB calculation, which doesn't make much sense to me, since the values are not normalized to a 0 dB maximum (from How to calculate the audio amplitude in real time (android)):
double amplitude = 0;
for (int i = 0; i < audioData.length / 2; i++) {
    double y = (audioData[i * 2] | audioData[i * 2 + 1] << 8) / 32768.0;
    // depending on your endianness:
    // double y = (audioData[i * 2] << 8 | audioData[i * 2 + 1]) / 32768.0;
    amplitude += Math.abs(y);
}
amplitude = amplitude / (audioData.length / 2);
I've seen that to calculate the dB I should do the following (from How to compute decibel (dB) of Amplitude from Media Player?):
double sum = 0;
for (int i = 0; i < audioData.length / 2; i++) {
    double y = (audioData[i * 2] | audioData[i * 2 + 1] << 8) / 32768.0;
    sum += y * y;
}
double rms = Math.sqrt(sum / (audioData.length / 2));
dbAmp = 20.0 * Math.log10(rms);
I've tried that solution, but the real-time values are near 0 and sometimes above 0; I mean, somewhere between -Infinity (no sound) and about 1.2 (if I drop the 20.0 * factor), or values of that order. In any case, I'd like to obtain a normalized value in [0, 100], not a dB value.
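Note that rms in the snippet above is already normalized to full scale (each sample is divided by 32768.0, so rms lies in [0, 1]). One way to get a linear value in [0, 100] is therefore a plain rescale; a minimal sketch:

int level = (int) Math.min(100.0, Math.max(0.0, rms * 100.0)); // rms is in [0, 1], clamp against rounding overshoot

Keep in mind that perceived loudness is roughly logarithmic, so a linear meter will hover near 0 for quiet audio; mapping a dBFS window (say -60 dB to 0 dB) onto [0, 100] often looks more natural.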
I'm using the Oboe C++ library for playing sounds in my Android application.
I want to change the pitch of my audio samples.
So I created an "mPos" float value to hold the currently played frame, and I add the "mPitch" value at every step.
It seems like the audio plays correctly with the new pitch, but it doubles itself and makes a weird noise when the pitch is high (e.g. 1.2) or low (e.g. 0.212).
This is my first time doing audio programming.
I did a lot of research before posting this question; I even sent messages directly to the Oboe maintainers, but got no response.
Does anyone have any idea how to implement the pitch change correctly?
streamLength always 192
channelCount always 2
Code:
void Player::renderAudio(float *stream, int32_t streamLength) {
    const int32_t channelCount = mSound->getChannelCount();
    if (mIsPlaying) {
        float framesToRenderFromData = streamLength;
        float totalSourceFrames = mSound->getTotalFrames() / mPitch;
        const float *data = mSound->getData();
        // Check whether we're about to reach the end of the recording
        if (mPos + streamLength >= totalSourceFrames) {
            framesToRenderFromData = (totalSourceFrames - mPos);
            mIsPlaying = false;
        }
        for (int i = 0; i < framesToRenderFromData; ++i) {
            for (int j = 0; j < channelCount; ++j) {
                if (j % 2 == 0) {
                    stream[(i * channelCount) + j] = (data[(size_t)(mPos * channelCount) + j] * mLeftVol) * mVol;
                } else {
                    stream[(i * channelCount) + j] = (data[(size_t)(mPos * channelCount) + j] * mRightVol) * mVol;
                }
            }
            mPos += mPitch;
            if (mPos >= totalSourceFrames) {
                mPos = 0;
            }
        }
        if (framesToRenderFromData < streamLength) {
            renderSilence(&stream[(size_t)framesToRenderFromData], streamLength * channelCount);
        }
    } else {
        renderSilence(stream, streamLength * channelCount);
    }
}

void Player::renderSilence(float *start, int32_t numSamples) {
    for (int i = 0; i < numSamples; ++i) {
        start[i] = 0;
    }
}

void Player::setPitch(float pitchData) {
    mPitch = pitchData;
}
When you multiply a float variable (mPos) by an integer-type variable (channelCount), the result is a float. You are, at the least, messing up your channel interleaving. Instead of
(size_t)(mPos * channelCount)
try
((size_t)mPos) * channelCount
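A tiny illustration of the difference, with Java's int standing in for size_t and made-up values:

float mPos = 2.7f;
int channelCount = 2;
// Cast after multiplying: (int) 5.4 -> 5, an odd index that lands mid-frame.
int a = (int) (mPos * channelCount);  // 5
// Cast before multiplying: (int) 2.7f -> 2, then 2 * 2 = 4, frame-aligned.
int b = ((int) mPos) * channelCount;  // 4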
EDIT:
You are intentionally looping the source when reaching the end, with the if statement that results in mPos = 0;. Instead of doing this, you could calculate the number of source samples independently of the pitch, but break out of the loop when your source samples are exhausted. Also, your comparison of the source and destination samples isn't useful because of the pitch adjustment:
float framesToRenderFromData = streamLength;
float totalSourceFrames = mSound->getTotalFrames(); // Note change here
const float *data = mSound->getData();
// Note: Only check here is whether mPos has reached the end from
// a previous call
if (mPos >= totalSourceFrames) {
    framesToRenderFromData = 0.0f;
}
for (int i = 0; i < framesToRenderFromData; ++i) {
    for (int j = 0; j < channelCount; ++j) {
        if (j % 2 == 0) {
            stream[(i * channelCount) + j] = (data[((size_t)mPos) * channelCount + j] * mLeftVol) * mVol;
        } else {
            stream[(i * channelCount) + j] = (data[((size_t)mPos) * channelCount + j] * mRightVol) * mVol;
        }
    }
    mPos += mPitch;
    if (((size_t)mPos) >= totalSourceFrames) { // Replace this 'if' and its contents
        framesToRenderFromData = (size_t)mPos;
        mPos = 0.0f;
        break;
    }
}
A note, however, for completeness: you really shouldn't accomplish a pitch change this way for any serious application; the sound quality will be terrible. There are free libraries for audio resampling to an arbitrary target rate; these will convert your source samples to a higher or lower number of samples, and provide quality pitch changes when replayed at the same rate as the source.
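For a modest improvement over reading the nearest frame, you can at least interpolate linearly between the two source samples that surround the fractional position. A minimal sketch of the idea (in Java for illustration, mono for brevity; a proper resampler such as libsamplerate or SoundTouch uses windowed-sinc filters and sounds much better):

// Read a fractional position from the source by blending its two neighbours.
// data: source samples; pos: fractional index with 0 <= pos < data.length - 1.
static float sampleAt(float[] data, float pos) {
    int i = (int) pos;     // sample to the left of pos
    float frac = pos - i;  // how far into the gap we are, in [0, 1)
    return data[i] * (1.0f - frac) + data[i + 1] * frac;
}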
I am currently writing a spectrum analyzer for Android for university, and part of this involves plotting the FFT of sound. However, I am having an issue with plotting the frequencies. The frequency values start off correct, but as I move to higher frequencies the error becomes greater and greater (at 3000 Hz, the graph will show ~3750). I feel as though there is an error in the way I am calculating the x-axis or frequency values. This is a manually drawn graph for speed purposes.
If more info/code is needed just let me know, but my guess is that it is something simple that I have overlooked. Thanks.
xVal is the frequency value, and the scale value is there to scale it according to the real graph dimensions.
int length = currentWaveDataDouble.length;
int pow2 = Integer.highestOneBit(length) << 1;
int sampleRate = 44100;
...
// actual plot part
for (int i = 0; i < p2.length; i++) {
    float xVal = (float) (i * scaleX.ScaleValue(((double) sampleRate / (pow2 >> 1))));
    if (xVal < maxFreqPlus1) {
        xVal += axisWidth + yAxisMargin;
        float yVal = (float) scaleY.ScaleValue(p2[i]);
        yVal += axisWidth + xAxisMargin;
        canvas.drawPoint(xVal, yVal, marker);
        if (yVal > yMax) {
            yMax = yVal;
            xMax = xVal;
        }
    }
}
Screenshot: frequency generator set to 4000 Hz.
Screenshot: frequency generator set to 1000 Hz (the graph shows 1250 Hz).
Found the issue: it was in the scaler.
ValueScaler scaleY = new ValueScaler(0,maxAmpPlus1 - yAxisMargin,0,baseY);
ValueScaler scaleX = new ValueScaler(0,maxFreqPlus1 - xAxisMargin,0,baseX);
I wasn't taking the x and y margins into account when scaling the numbers.
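For reference, the underlying x-axis math is just the FFT bin-to-frequency mapping, and any pixel margins or scale factors have to be applied separately from it. A minimal sketch (fftSize and sampleRate stand in for the variables above):

// Frequency represented by FFT bin i, for a transform of fftSize samples.
// Only bins 0 .. fftSize / 2 are unique (they run up to the Nyquist frequency).
static double binToFrequency(int i, int fftSize, double sampleRate) {
    return i * sampleRate / fftSize;
}

For example, with sampleRate = 44100 and fftSize = 4096, bin 93 maps to about 1001 Hz; if a 1000 Hz tone is then plotted at 1250 Hz, the error is in the pixel scaling, not in the FFT itself.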
I am trying to capture the image data in the onFrameAvailable method from a Google Tango. I am using the Leibniz release. The header file says the buffer contains HAL_PIXEL_FORMAT_YV12 pixel data, the release notes say the buffer contains YUV420SP, and the documentation says the pixels are in RGBA8888 format. I am a little confused, and additionally I don't really get image data, just a lot of magenta and green. Right now I am trying to convert from YUV to RGB, similar to this one. I guess there is something wrong with the stride, too. Here is the code of the onFrameAvailable method:
int size = (int)(buffer->width * buffer->height);
for (int i = 0; i < buffer->height; ++i)
{
    for (int j = 0; j < buffer->width; ++j)
    {
        float y = buffer->data[i * buffer->stride + j];
        float v = buffer->data[(i / 2) * (buffer->stride / 2) + (j / 2) + size];
        float u = buffer->data[(i / 2) * (buffer->stride / 2) + (j / 2) + size + (size / 4)];

        const float Umax = 0.436f;
        const float Vmax = 0.615f;

        y = y / 255.0f;
        u = (u / 255.0f - 0.5f);
        v = (v / 255.0f - 0.5f);

        TangoData::GetInstance().color_buffer[3 * (i * width + j)] = y;
        TangoData::GetInstance().color_buffer[3 * (i * width + j) + 1] = u;
        TangoData::GetInstance().color_buffer[3 * (i * width + j) + 2] = v;
    }
}
I am doing the YUV to RGB conversion in the fragment shader.
Has anyone ever obtained an RGB image on the Google Tango Leibniz release? Or has anyone had similar problems when converting from YUV to RGB?
YUV420SP (aka NV21) is correct for the time being. An explanation is here. In this format you have a width x height array where each element is a Y byte, followed by a width/2 x height/2 array where each element is a V byte and a U byte. Your code is implementing YV12, which has separate planes for V and U instead of interleaving them in one array.
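In other words, in NV21 the V and U bytes for each 2x2 block of pixels sit next to each other in a single chroma plane. A minimal indexing sketch in Java, assuming a tightly packed buffer (stride == width):

// NV21 layout: [Y plane: width * height bytes][VU plane: interleaved V,U pairs].
// Returns {y, u, v} for pixel (row, col); assumes stride == width for brevity.
static int[] nv21PixelAt(byte[] data, int width, int height, int row, int col) {
    int ySize = width * height;
    int y = data[row * width + col] & 0xFF;
    int vuIndex = ySize + (row / 2) * width + (col & ~1); // start of this pixel's V,U pair
    int v = data[vuIndex] & 0xFF;
    int u = data[vuIndex + 1] & 0xFF;
    return new int[] { y, u, v };
}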
You mention that you are doing YUV to RGB conversion in a fragment shader. If all you want to do with the camera images is draw then you can use TangoService_connectTextureId() and TangoService_updateTexture() instead of TangoService_connectOnFrameAvailable(). This approach delivers the camera image to you already in an OpenGL texture that gives your fragment shader RGB values without bothering with the pixel format details. You will need to bind to GL_TEXTURE_EXTERNAL_OES (instead of GL_TEXTURE_2D), and your fragment shader would look something like this:
#extension GL_OES_EGL_image_external : require
precision mediump float;
varying vec4 v_t;
uniform samplerExternalOES colorTexture;
void main() {
    gl_FragColor = texture2D(colorTexture, v_t.xy);
}
If you really do want to pass YUV data to a fragment shader for some reason, you can do so without preprocessing it into floats. In fact, you don't need to unpack it at all - for NV21 just define a 1-byte texture for Y and a 2-byte texture for VU, and load the data as-is. Your fragment shader will use the same texture coordinates for both.
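If you take that route, uploading the two NV21 planes from Java might look roughly like this; a sketch using GLES20, where yTexId, vuTexId and the two ByteBuffers are assumed to be created elsewhere (on GLES3 you would use GL_R8/GL_RG8 instead):

import android.opengl.GLES20;
import java.nio.ByteBuffer;

// Upload the NV21 planes as two textures the fragment shader samples together.
static void uploadNv21Planes(int yTexId, int vuTexId, ByteBuffer yPlane,
                             ByteBuffer vuPlane, int width, int height) {
    // Y plane: one byte per texel.
    GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, yTexId);
    GLES20.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES20.GL_LUMINANCE,
            width, height, 0, GLES20.GL_LUMINANCE,
            GLES20.GL_UNSIGNED_BYTE, yPlane);
    // VU plane: half resolution, two bytes (V, U) per texel.
    GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, vuTexId);
    GLES20.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES20.GL_LUMINANCE_ALPHA,
            width / 2, height / 2, 0, GLES20.GL_LUMINANCE_ALPHA,
            GLES20.GL_UNSIGNED_BYTE, vuPlane);
}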
By the way, if someone experienced problems with capturing the image data on the Leibniz release, too: One of the developers told me that there is a bug concerning the camera and that it should be fixed with the Nash release.
The bug caused my buffer to be null but when I used the Nash update I got data again. However, right now the problem is that the data I am using doesn't make sense. I guess/hope the cause is that the Tablet didn't get the OTA update yet (there can be a gap between the actual release date and the OTA software update).
Just try the following code:
//C#
public bool YV12ToPhoto(byte[] data, int width, int height, out Texture2D photo)
{
    photo = new Texture2D(width, height);
    int uv_buffer_offset = width * height;
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            int x_index = j;
            if (j % 2 != 0)
            {
                x_index = j - 1;
            }
            // Get the YUV color for this pixel.
            int yValue = data[(i * width) + j];
            int uValue = data[uv_buffer_offset + ((i / 2) * width) + x_index + 1];
            int vValue = data[uv_buffer_offset + ((i / 2) * width) + x_index];
            // Convert the YUV value to RGB.
            float r = yValue + (1.370705f * (vValue - 128));
            float g = yValue - (0.689001f * (vValue - 128)) - (0.337633f * (uValue - 128));
            float b = yValue + (1.732446f * (uValue - 128));
            Color co = new Color();
            co.b = b < 0 ? 0 : (b > 255 ? 1 : b / 255.0f);
            co.g = g < 0 ? 0 : (g > 255 ? 1 : g / 255.0f);
            co.r = r < 0 ? 0 : (r > 255 ? 1 : r / 255.0f);
            co.a = 1.0f;
            photo.SetPixel(width - j - 1, height - i - 1, co);
        }
    }
    return true;
}
I have succeeded.
I'm trying to build a weighted average from the sensor data I get via the SensorManager.
My problem is that bearing, pitch and roll have a maximum value, and when I'm exactly at that point the values jump from 0 to 359 or back.
At the moment my average is simply the sum of all values divided by the number of values.
Let's say I get the values: 1, 359, 350, 10
In this case, I want to have an average of 0. How do I have to change my equation to get this functionality?
Do I have to check for the "nearest" distance to 0/360 and use this value instead of the real value?
This would also cause some trouble if I have values around 180:
160, 200 -> the average has to be 180, but with my nearest-distance idea it would be 160, because 200 + 160 = 360.
How can I solve this?
Edit: These are the values I get from the SensorManager:
0 <= azimuth < 360
-180 <= pitch <= 180
-90 <= roll <= 90
Edit2: Sorry, I forgot to mention that I'm using a weighted average:
double sum = 0;
for (int i = 0; i < max; i++) {
    sum += value[i] * (i / triangular_number(max));
}
return sum;
To calculate the average of angles, use the following (note that Math.sin and Math.cos expect the values in radians):
public static final float averageAngle(float[] terms, int totalTerm)
{
    float sumSin = 0;
    float sumCos = 0;
    for (int i = 0; i < totalTerm; i++)
    {
        sumSin += Math.sin(terms[i]);
        sumCos += Math.cos(terms[i]);
    }
    return (float) Math.atan2(sumSin / totalTerm, sumCos / totalTerm);
}
I found a blog post about this.
To summarize it briefly: you have to calculate the average of the sines of all your azimuth values and the average of the cosines of the azimuth values, then put these averages into the atan2 function, and, if necessary, make the result positive by adding 2 * PI. Don't forget to convert degree values to radians and vice versa.
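Putting those steps together for degree inputs, a minimal sketch:

// Circular mean of angles given in degrees; result lies in [0, 360).
static double averageAngleDegrees(double[] degrees) {
    double sumSin = 0, sumCos = 0;
    for (double d : degrees) {
        double r = Math.toRadians(d);
        sumSin += Math.sin(r);
        sumCos += Math.cos(r);
    }
    double mean = Math.toDegrees(Math.atan2(sumSin, sumCos));
    return mean < 0 ? mean + 360.0 : mean;  // shift negative results into range
}

For the examples from the question, {1, 359, 350, 10} averages to 0 and {160, 200} averages to 180, as desired.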
I am trying to fetch the noise level of recorded audio in decibels. I am using the following code, but it is not giving the correct output:
byte[] audioData = new byte[bufferSize];
recorder.read(audioData, 0, bufferSize);
ByteBuffer bb = ByteBuffer.wrap(audioData);
int sampleSize = bb.getInt();
Now if I log sampleSize, it gives a very huge value, like 956318464.
Can anybody tell me how to get the correct noise level in decibels?
First off, decibels are a ratio: you can't just "get decibels", you need to compare the volume to a baseline measurement. So the real equation in terms of amplitude is
db = 20 * log10(amplitude / baseline_amplitude);
(the factor is 20 for amplitude ratios; it would be 10 for power ratios).
If you're recording the audio now, to get the amplitude use MediaRecorder.getMaxAmplitude(). For a baseline amplitude, measure the expected background noise.
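A minimal sketch of that idea; baselineAmplitude is a made-up value that you would measure the same way during a known-quiet moment:

// Level in dB relative to a measured quiet baseline; `recorder` is assumed to
// be a started android.media.MediaRecorder.
static double relativeDb(android.media.MediaRecorder recorder, double baselineAmplitude) {
    int maxAmp = recorder.getMaxAmplitude();  // peak amplitude since the last call
    double amp = Math.max(maxAmp, 1);         // avoid log10(0) in silence
    return 20.0 * Math.log10(amp / baselineAmplitude);
}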
public int calculatePowerDb(short[] sdata, int off, int samples)
{
    double sum = 0;
    double sqsum = 0;
    for (int i = 0; i < samples; i++)
    {
        final long v = sdata[off + i];
        sum += v;       // running total, used to remove any DC offset
        sqsum += v * v; // running total of squares
    }
    // Power = mean of squares minus square of mean (i.e. variance, DC removed).
    double power = (sqsum - sum * sum / samples) / samples;
    // Normalize against full scale for 16-bit samples.
    power /= MAX_16_BIT * MAX_16_BIT;
    // Convert the power ratio to dB relative to full scale, plus a small fudge.
    double result = Math.log10(power) * 10f + FUDGE;
    return (int) result;
}

private static final float MAX_16_BIT = 32768;
private static final float FUDGE = 0.6f;
This method works fine for me.
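For context, here is a sketch of how a method like calculatePowerDb above might be fed from an AudioRecord; the sample rate and mono format are assumptions, and the RECORD_AUDIO permission is required:

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

// Reads one buffer of 16-bit mono PCM and reports its power in dB
// relative to full scale (roughly 0 at full scale, negative below it).
static int readLevelOnce() {
    int sampleRate = 44100;  // assumed
    int bufferSize = AudioRecord.getMinBufferSize(sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
    AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);
    short[] pcm = new short[bufferSize / 2];  // bufferSize is in bytes
    record.startRecording();
    int read = record.read(pcm, 0, pcm.length);
    record.stop();
    record.release();
    return read > 0 ? calculatePowerDb(pcm, 0, read) : Integer.MIN_VALUE;
}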