I have a PCM data file that I know is valid. I can play it, cut it into pieces, etc., and it will always play, as will the individual pieces.
But when I try to translate it into shorts from bytes using
bytes[i] | (bytes[i+1] << 8)
I don't see anything that looks like a waveform visually. The file is 16-bit, single channel, 44100 Hz sampling.
As a test I recorded a long silence with one very loud sound in the middle. Still, the chart I made from my input looked like every other chart when I tried this. Am I somehow doing this wrong? Or misunderstanding what I'm reading/attempting?
All I am looking to do is detect a very low threshold to find a word gap.
Thanks
My psychic powers suggest this is a big-endian vs little-endian thing.
If the source file stores samples in big-endian, this is likely what you want:
(bytes[i] << 8) | (bytes[i+1])
For what it's worth, WAV files are little-endian.
Other possibilities include:
I don't see your code, but maybe it is only incrementing i by 1 instead of 2 on every loop iteration (a common mistake I've made in my own code).
Signed types or casting: be explicit about how you do the bit operations with respect to signed vs. unsigned. I'm not sure if "bytes" is an array of "unsigned char" or "char", nor whether "char" defaults to signed or unsigned on your compiler. This might be better:
unsigned char b1 = (unsigned char)(bytes[i]);    /* force unsigned to avoid sign extension */
unsigned char b2 = (unsigned char)(bytes[i+1]);
short sample = (short)((b1 << 8) | (b2));        /* big-endian byte order */
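And since WAV is little-endian, if this is Java (where byte is always signed), a sketch of the sign-safe little-endian version would look like:

// Little-endian: low byte first; mask it to stop Java's sign extension,
// and let the high byte keep the sign.
short[] samples = new short[bytes.length / 2];
for (int i = 0; i < samples.length; i++) {
    int lo = bytes[2 * i] & 0xFF;      // low byte, treated as unsigned
    int hi = bytes[2 * i + 1];         // high byte carries the sign
    samples[i] = (short) ((hi << 8) | lo);
}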
I'm trying to write Android Camera stream frames to the UVC buffer using FileOutputStream. For context: the UVC driver is working on the device, which has a custom-built kernel.
I get 24 frames per second using imageAnalyzer:
imageAnalyzer = ImageAnalysis.Builder()
.setTargetAspectRatio(screenAspectRatio)
.setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_YUV_420_888)
...
imageAnalysis.setAnalyzer(cameraExecutor) { image ->
val buffer = image.planes[0].buffer
val data = buffer.toByteArray()
...
}
Then based on UVC Specifications I build the header of the frame:
val header = ByteBuffer.allocate(26)
val frameSize = image.width * image.height * ImageFormat.getBitsPerPixel(image.format) / 8
val EOH = 0x01
val ERR = 0x00
val STI = 0x01
val REST = 0x00
val SRC = 0x00
val PTS = (System.currentTimeMillis() - referenceTime) * 10000
val endOfFrame = 0x01
val FID = (frameId).toByte()
Add all of the above to the header:
header.putInt(frameSize)
header.putShort(image.width.toShort())
header.putShort(image.height.toShort())
header.put(image.format.toByte())
header.put(((EOH shl 7) or (ERR shl 6) or (STI shl 5) or (REST shl 4) or SRC).toByte())
header.putLong(PTS)
header.put(endOfFrame.toByte())
header.put(FID)
Open the FileOutputStream and try to write the header and the image:
val uvcFileOutputStream = FileOutputStream("/dev/video3", true)
uvcFileOutputStream.write(header.toByteArray() + data)
uvcFileOutputStream.close()
Tried to tweak the header/payload but I'm still getting the same error:
java.io.IOException: write failed: EINVAL (Invalid argument)
at libcore.io.IoBridge.write(IoBridge.java:654)
at java.io.FileOutputStream.write(FileOutputStream.java:401)
at java.io.FileOutputStream.write(FileOutputStream.java:379)
What could I be doing wrong? Is the header format wrong?
I don't know the answer directly, but I was curious to look and have some findings. I focused on the Kotlin part, as I don't know about UVC and because I suspect the problem to be there.
Huge assumption
Since there's no link to the specification I just found this source:
https://www.usb.org/document-library/video-class-v15-document-set
Within the ZIP I looked at USB_Video_Payload_Frame_Based_1.5.pdf,
Page 9, Section 2.1 Payload Header
I'm basing all my findings on this, so if I got this wrong, everything else is. It could still lead to a solution though if you validated the same things.
Finding 1: HLE is wrong
HLE is the length of the header, not the image data; you're putting the whole image size there (all the RGB byte data). Table 2-1 describes how the PTS and SCR bits control whether the PTS and SCR fields are present, which means that if they're 0 in BFH, the header is shorter. This is why HLE is either 2, 6, or 12.
This is confirmed by the fact that the field is 1 byte long (each row of Table 2-1 is 1 byte / 8 bits), which means the header can only be up to 255 bytes long.
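A hypothetical sketch of that dependency (not from the asker's code; field sizes per Table 2-1):

// HLE counts the header's own bytes, and shrinks when optional fields are absent.
static int headerLength(boolean hasPts, boolean hasScr) {
    int length = 2;            // HLE byte + BFH byte
    if (hasPts) length += 4;   // 4-byte PTS field
    if (hasScr) length += 6;   // 6-byte SCR field
    return length;             // 2 with neither, 6 with PTS only, 12 with both
}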
Finding 2: all the header is misaligned
Since you're putting HLE with putInt, you're writing 4 bytes; from this point on, everything in the header is misaligned, the flags end up depending on the image size, etc.
Finding 3: SCR and PTS flag inconsistencies
Assuming I was wrong about 1 and 2: you're still setting the SRC (presumably the spec's SCR) and PTS bits to 0, but pushing a long (8 bytes) for PTS anyway.
Finding 4: wrong source
Actually, something is really off at this point, so I looked at your referenced GitHub ticket and found a better example of what your code represents.
Sadly, I was unable to match up your header structure, so I'm going to assume that you are implementing something very similar to what I was looking at, because all the PDFs had pretty much the same header table.
Finding 5: HLE is wrong or missing
Assuming you do need to start with the image size, the HLE is still wrong, because it's tied to the image format's type rather than to the SCR and PTS flags.
Finding 6: BFH is missing fields
If you're following one of these specs, the BFH is always one byte with 8 bits. This is confirmed by how the shls are putting it together in your code and by the descriptions of each flag (flag = true|false, 1|0). Yet your code writes endOfFrame and FID as separate trailing bytes, when the spec's EOF and FID are bits inside that same single byte.
Finding 7: PTS is wrong
Multiplying something that is millisecond-precise by 10000 looks strange. The doc says "at most 450 microseconds"; if you're trying to convert between ms and µs, I think the multiplier would be just 1000. Either way, it is only an int (4 bytes) large, definitely not a long.
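In ByteBuffer terms (Java here; referenceTime and header as in the question), the sketch would be:

// Assumption: PTS is microseconds in a 4-byte field (multiplier 1000, not 10000).
int pts = (int) ((System.currentTimeMillis() - referenceTime) * 1000);
header.putInt(pts); // writes exactly 4 bytes, not 8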
Finding 8: coding assistant?
I have a feeling after reading all this that Copilot, ChatGPT or another generator wrote your original code. This seems confirmed by you looking for a reputable source.
Finding 9: reputable source example
If I were you, I would try to find a working example of this on GitHub, using a keyword search like this: https://github.com/search?q=hle+pts+sti+eoh+fid+scr+bfh&type=code — the languages don't really matter, since these are binary file/stream formats; regardless of language they should be produced and read the same way.
Finding 10: bit order
Have a look at big-endian / little-endian byte order. If you look at Table 2-1 in the PDF I linked, you can see which bit should map to which byte. You can specify the order you need easily on the buffer BEFORE writing to it; by the looks of the PDF it is header.order(ByteOrder.LITTLE_ENDIAN). I think conventionally 0 is the lowest bit and 31 is the highest (I can't cite a source on this, I seem to remember it from uni): bit 0 is the 2^0 component (1) and bit 7 is the 2^7 component (128); reversing it would make things much harder to compute and comprehend. So PTS [7:0] means that byte is the lowest 8 bits of the 32-bit PTS number.
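For example (plain Java NIO, nothing UVC-specific):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) {
        ByteBuffer header = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        header.putInt(0x11223344);
        // Prints "44 33 22 11": the first byte written is bits [7:0],
        // the lowest 8 bits of the 32-bit value.
        for (byte b : header.array()) System.out.printf("%02x ", b);
    }
}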
If you link to your specification source, I can revise what I wrote, but likely will find very similar guesses.
I'm trying to return a byte array from an Android native service (C++) to an Android application using AIDL. I'm calculating the byte array from two files and trying to split the final array on the client side using the length of one file's byte array. For example: resultFinal = length of privKey + privKey + pubKey
std::ifstream _privKey("/etc/myPrivkey", std::ios::in | std::ios::binary);
std::vector<uint8_t> _privKeyContents((std::istreambuf_iterator<char>(_privKey)), std::istreambuf_iterator<char>());
std::ifstream _pubKey("/etc/myPubkey", std::ios::in | std::ios::binary);
std::vector<uint8_t> _pubKeyContents((std::istreambuf_iterator<char>(_pubKey)), std::istreambuf_iterator<char>());
std::vector<uint8_t> resultFinal;
uint8_t keysize = _privKeyContents.size(); // verified: keysize is 161
resultFinal.insert(resultFinal.begin(), _pubKeyContents.begin(), _pubKeyContents.end());
resultFinal.insert(resultFinal.begin(), _privKeyContents.begin(), _privKeyContents.end());
resultFinal.insert(resultFinal.begin(), keysize);
I assume that on the client side the first element of the byte array will be the size of _privKeyContents, and using that value I can split the byte array in two. I was expecting the first element of the byte array to be 161, but instead I'm getting -95.
Can someone help me identify the issue? Or is my approach wrong? Please let me know if any other input is needed from my end.
Thanks in advance
PS: I don't have much experience with C++.
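The -95 is almost certainly just Java's signed byte: 161 is 0xA1, which is out of range for a signed byte, so it prints as 161 - 256 = -95. The bytes themselves arrive intact; mask with 0xFF on the client side to recover the unsigned length. A minimal sketch (the getKeys() call is hypothetical, standing in for your AIDL method):

// Layout assumed from the question: [keysize][privKey bytes][pubKey bytes]
byte[] result = service.getKeys();      // hypothetical AIDL call
int keysize = result[0] & 0xFF;         // 0xA1 -> 161 instead of -95
byte[] privKey = java.util.Arrays.copyOfRange(result, 1, 1 + keysize);
byte[] pubKey  = java.util.Arrays.copyOfRange(result, 1 + keysize, result.length);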
I'm developing a VoIP application that runs at a sampling rate of 48 kHz. Since it uses Opus, which runs at 48 kHz internally, as its codec, and most current Android hardware natively runs at 48 kHz, AEC is the only piece of the puzzle I'm missing. I've already found the WebRTC implementation but I can't seem to figure out how to make it work. It looks like it corrupts memory randomly and crashes the whole thing sooner or later. When it doesn't crash, the sound is kinda chunky, as if it's quieter for half of the frame. Here's my code that processes a 20 ms frame:
webrtc::SplittingFilter* splittingFilter;
webrtc::IFChannelBuffer* bufferIn;
webrtc::IFChannelBuffer* bufferOut;
webrtc::IFChannelBuffer* bufferOut2;
// ...
splittingFilter=new webrtc::SplittingFilter(1, 3, 960);
bufferIn=new webrtc::IFChannelBuffer(960, 1, 1);
bufferOut=new webrtc::IFChannelBuffer(960, 1, 3);
bufferOut2=new webrtc::IFChannelBuffer(960, 1, 3);
// ...
int16_t* samples=(int16_t*)data;
float* fsamples[3];
float* foutput[3];
int i;
float* fbuf=bufferIn->fbuf()->bands(0)[0];
// convert the data from 16-bit PCM into float
for(i=0;i<960;i++){
fbuf[i]=samples[i]/(float)32767;
}
// split it into three "bands" that the AEC needs and for some reason can't do itself
splittingFilter->Analysis(bufferIn, bufferOut);
// split the frame into 6 consecutive 160-sample blocks and perform AEC on them
for(i=0;i<6;i++){
fsamples[0]=&bufferOut->fbuf()->bands(0)[0][160*i];
fsamples[1]=&bufferOut->fbuf()->bands(0)[1][160*i];
fsamples[2]=&bufferOut->fbuf()->bands(0)[2][160*i];
foutput[0]=&bufferOut2->fbuf()->bands(0)[0][160*i];
foutput[1]=&bufferOut2->fbuf()->bands(0)[1][160*i];
foutput[2]=&bufferOut2->fbuf()->bands(0)[2][160*i];
int32_t res=WebRtcAec_Process(aecState, (const float* const*) fsamples, 3, foutput, 160, 20, 0);
}
// put the "bands" back together
splittingFilter->Synthesis(bufferOut2, bufferIn);
// convert the processed data back into 16-bit PCM
for(i=0;i<960;i++){
samples[i]=(int16_t) (CLAMP(fbuf[i], -1, 1)*32767);
}
If I comment out the actual echo cancellation and just do the float conversion and band splitting back and forth, it doesn't corrupt the memory, doesn't sound weird, and runs indefinitely. (I do pass the farend/speaker signal into AEC; I just didn't want to make a mess of my code by including it in the question.)
I've also tried Android's built-in AEC. While it does work, it upsamples the captured signal from 16 kHz.
Unfortunately, there is no free AEC package that supports 48 kHz. So either move to 32 kHz or use a commercial AEC package at 48 kHz.
For a school project I am creating an Android app that involves streaming image data. I've finished all the requirements about a month and a half early, and am looking for ways to improve my app. One thing I heard of is using the Android NDK to optimize heavily used pieces of code.
What my app does is simulate a live video coming in over a socket. I am simultaneously reading the pixel data from a UDP packet, and writing it to an int array, which I then use to update the image on the screen.
I'm trying to decide if trying to increase my frame rate (which is about 1 fps now, which is sufficient for my project) is the right path to follow for my remaining time, or if I should instead focus on adding new features.
Anyway, here is the code I am looking at:
public void updateBitmap(byte[] buf, int thisPacketLength, int standardOffset, int thisPacketOffset) {
    int pixelCoord = thisPacketOffset / 3 - 1;
    for (int bufCoord = standardOffset; bufCoord < thisPacketLength; bufCoord += 3) {
        pixelCoord++;
        // Pack three colour bytes into one opaque ARGB int; the & masks strip
        // the sign extension Java applies when promoting a byte to int.
        pixelData[pixelCoord] = 0xFF << 24 | (buf[bufCoord + 2] << 16) & 0xFFFFFF | (buf[bufCoord + 1] << 8) & 0xFFFF | buf[bufCoord] & 0xFF;
    }
}
I call this function about 2000 times per second, so it definitely is the most used piece of code in my app. Any feedback on whether this is worth optimizing?
Why not just give it a try? There are many guides to creating functions using the NDK; you seem to have a good grasp of the reasoning to do so and understand the implications, so it should be easy to translate this small function.
Compare the two approaches, you will no doubt learn something which is always good, and it will give you something to write about if you need to write a report to go with the project.
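If you do try it, the Java side of the binding is small (names here are illustrative, not from the project; the native body itself would live in a C file built with the NDK):

public class PixelNative {
    static {
        // Loads libpixelfill.so built by the NDK; the library name is illustrative.
        System.loadLibrary("pixelfill");
    }

    // Native counterpart of updateBitmap(), implemented in C/C++.
    public static native void updateBitmap(byte[] buf, int[] pixelData,
                                           int thisPacketLength, int standardOffset,
                                           int thisPacketOffset);
}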
I am trying to draw a graph of a sound file. I take the bytes of the sound file using FileInputStream, change them to shorts, take samples of the sound, and draw the graph from those. (Note: the first 44 bytes are the header, so I skip them.) I succeeded in doing this:
File-->skip header(44)-->Bytes-->to shorts-->seek to point-->take samples in Shorts-->DRAW THE GRAPH
But the problem is, I can't load the bytes of a large audio file (2 GB) into memory; a memory crash occurs.
So I tried reading the shorts directly from the file using RandomAccessFile. But when I do it like this, I'm not getting a correct graph. I suspect there is some sort of change in the samples I'm reading.
File-->skip header(44)-->seek to point-->take samples in Shorts-->DRAW THE GRAPH
My doubt is: does any change occur to the short samples of the audio data when we read directly from the file? Is RandomAccessFile a good method? Is there any way to get the samples of a 2 GB audio file without any change in the samples?
Note: I skip the header (first 44 bytes).
double mPerwidth = (double) iPixel / 960;
int SampletoDraw;
long totalLen = mRfs.length() - 44;
SampletoDraw = (int) (totalLen * mPerwidth) * 2;
System.out.println(iPixel);
// val=mRfs.readShort()/2;
mRfs.seek(SampletoDraw);
bData[0] = mRfs.readByte();
mRfs.seek(SampletoDraw + 1);
bData[1] = mRfs.readByte();
val = (short) ((bData[1] & 0xff) << 8 | (bData[0] & 0xff));
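Two things stand out in this snippet. First, totalLen is already a byte count, so multiplying the scaled offset by 2 again overshoots: for iPixel > 480 you seek past the end of the data. Second, the seek never adds the 44 header bytes back, so the early offsets land inside the header. RandomAccessFile itself does not change the samples; 16-bit WAV data is just little-endian, which your (bData[1] << 8) | bData[0] assembly already handles. A sketch of the offset math I think you intended (reusing your variable names; 16-bit samples assumed):

double mPerwidth = (double) iPixel / 960;        // fraction of the way across the view
long totalLen = mRfs.length() - 44;              // payload bytes after the header
long sampleCount = totalLen / 2;                 // 16-bit: two bytes per sample
long byteOffset = 44 + ((long) (sampleCount * mPerwidth)) * 2; // skip header, stay sample-aligned

mRfs.seek(byteOffset);
byte[] bData = new byte[2];
mRfs.readFully(bData);                           // one read instead of two seeks
short val = (short) (((bData[1] & 0xff) << 8) | (bData[0] & 0xff)); // little-endian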