I am working with multiple particle systems that I am attempting to composite together into a framebuffer for use in another shader. Each compute shader works when I run it on its own, but something breaks when I call more than one of them sequentially. I have tried this several ways, so hopefully I am close with at least one of them.
Combining buffers via glBufferSubData:
VBO/VAO
//generate vbo and load data
GLES31.glGenBuffers(1, vbo, 0)
GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, vbo[0])
GLES31.glBufferData(GLES31.GL_SHADER_STORAGE_BUFFER, 4 * (particleCoords.size + particleCoords2.size + particleCoords3.size), null, GLES31.GL_STATIC_DRAW)
GLES31.glBufferSubData(GLES31.GL_SHADER_STORAGE_BUFFER, 0, particleCoords.size * 4, particleCoordBuffer)
GLES31.glBufferSubData(GLES31.GL_SHADER_STORAGE_BUFFER, particleCoords.size * 4, particleCoords2.size * 4, particleCoord2Buffer)
GLES31.glBufferSubData(GLES31.GL_SHADER_STORAGE_BUFFER, (particleCoords.size + particleCoords2.size) * 4, particleCoords3.size * 4, particleCoord3Buffer)
GLES31.glBindBufferBase(GLES31.GL_SHADER_STORAGE_BUFFER, 0, vbo[0])
// generate vao
GLES31.glGenVertexArrays(1, vao, 0)
Update positions with compute shader
GLES31.glBindVertexArray(vao[0])
GLES31.glBindBuffer(GLES31.GL_ARRAY_BUFFER, vbo[0])
GLES31.glEnableVertexAttribArray(ShaderManager.particlePositionHandle)
GLES31.glVertexAttribPointer(ShaderManager.particlePositionHandle, COORDS_PER_VERTEX, GLES31.GL_FLOAT, false, COORDS_PER_VERTEX * 4, computeOffsetForIndex(index) / COORDS_PER_VERTEX) // also tried: computeOffsetForIndex(index) * 4
GLES31.glUseProgram(ShaderManager.particleComputeProgram)
GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, vbo[0])
GLES31.glUniform1i(ShaderManager.particleTimeHandle, time)
GLES31.glDispatchCompute((sizeForIndex(index) / COORDS_PER_VERTEX) / 128 + 1, 1, 1)
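// Note: GL_ALL_BARRIER_BITS below is maximally conservative; the specific bit for
// SSBO writes later consumed as vertex attributes is GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT.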
GLES31.glMemoryBarrier(GLES31.GL_ALL_BARRIER_BITS)
// cleanup
GLES31.glBindVertexArray(0)
GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, 0)
GLES31.glDisableVertexAttribArray(ShaderManager.particlePositionHandle)
GLES31.glBindBuffer(GLES31.GL_ARRAY_BUFFER, 0)
Draw particles
GLES31.glUseProgram(ShaderManager.particleDrawProgram)
GLES31.glClear(GLES31.GL_COLOR_BUFFER_BIT)
GLES31.glActiveTexture(GLES31.GL_TEXTURE0)
GLES31.glBindTexture(GLES31.GL_TEXTURE_2D, snowTexture[0])
GLES31.glUniform1i(ShaderManager.particleTextureHandle, 0)
GLES31.glEnableVertexAttribArray(ShaderManager.particlePositionHandle)
GLES31.glVertexAttribPointer(ShaderManager.particlePositionHandle, COORDS_PER_VERTEX, GLES31.GL_FLOAT, false, COORDS_PER_VERTEX * 4, computeOffsetForIndex(index) / COORDS_PER_VERTEX)
GLES31.glBindVertexArray(vao[0])
GLES31.glDrawArrays(GLES31.GL_POINTS, computeOffsetForIndex(index) / COORDS_PER_VERTEX, sizeForIndex(index) / COORDS_PER_VERTEX)
// cleanup
GLES31.glDisableVertexAttribArray(ShaderManager.particlePositionHandle)
GLES31.glBindVertexArray(0)
GLES31.glBindFramebuffer(GLES31.GL_FRAMEBUFFER, 0)
GLES31.glBindTexture(GLES31.GL_TEXTURE_2D, 0)
onDrawFrame
GLES31.glBindFramebuffer(GLES31.GL_FRAMEBUFFER, particleFramebuffer[0])
GLES31.glBindTexture(GLES31.GL_TEXTURE_2D, particleFrameTexture[0])
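// (re)allocates RGBA storage for the FBO color attachment at screen size; as written, this runs every frame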
GLES31.glTexImage2D(GLES31.GL_TEXTURE_2D, 0, GLES31.GL_RGBA, screenWidth, screenHeight, 0, GLES31.GL_RGBA, GLES31.GL_UNSIGNED_BYTE, null)
GLES31.glTexParameteri(GLES31.GL_TEXTURE_2D, GLES31.GL_TEXTURE_MAG_FILTER, GLES31.GL_LINEAR)
GLES31.glTexParameteri(GLES31.GL_TEXTURE_2D, GLES31.GL_TEXTURE_MIN_FILTER, GLES31.GL_LINEAR)
GLES31.glFramebufferTexture2D(GLES31.GL_FRAMEBUFFER, GLES31.GL_COLOR_ATTACHMENT0, GLES31.GL_TEXTURE_2D, particleFrameTexture[0], 0)
for (i in 0 until (ShaderManager.particleShaderInfo?.computeIds?.size ?: 0)) {
updateParticles(i)
drawParticles(i)
}
This results in the first particle system animating and drawing as anticipated. Any systems updated after the first do not animate, and they all draw at (0, 0, 0). Starting the loop from index 1 gives the same result: only the first system processed updates properly, and the rest do not. This makes me think something goes wrong when the shaders run one after another, though it seems it shouldn't be an issue, since the data touched by each call is unrelated.
Multiple VBO/VAO
What made the most sense to me was using a separate VBO/VAO for each system, but it seems I'm missing something about swapping the buffer bound to GL_SHADER_STORAGE_BUFFER: this attempt resulted in none of my systems updating correctly. I understand that glVertexAttribPointer only cares about the buffer currently bound to GL_ARRAY_BUFFER, but perhaps the changes written by the compute shader are lost when I rebind?
// generate vbo and load data
GLES31.glGenBuffers(1, vbo, 0)
GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, vbo[0])
GLES31.glBufferData(GLES31.GL_SHADER_STORAGE_BUFFER, 4 * (particleCoords.size), particleCoordBuffer, GLES31.GL_STATIC_DRAW)
GLES31.glBindBufferBase(GLES31.GL_SHADER_STORAGE_BUFFER, 0, vbo[0])
// generate vao
GLES31.glGenVertexArrays(1, vao, 0)
//-------2----------
// generate vbo and load data
GLES31.glGenBuffers(1, vbo2, 0)
GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, vbo2[0])
GLES31.glBufferData(GLES31.GL_SHADER_STORAGE_BUFFER, 4 * particleCoords2.size, particleCoord2Buffer, GLES31.GL_STATIC_DRAW)
GLES31.glBindBufferBase(GLES31.GL_SHADER_STORAGE_BUFFER, 0, vbo2[0])
// generate vao
GLES31.glGenVertexArrays(1, vao2, 0)
//--------3-----------
// generate vbo and load data
GLES31.glGenBuffers(1, vbo3, 0)
GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, vbo3[0])
GLES31.glBufferData(GLES31.GL_SHADER_STORAGE_BUFFER, 4 * particleCoords3.size, particleCoord3Buffer, GLES31.GL_STATIC_DRAW)
GLES31.glBindBufferBase(GLES31.GL_SHADER_STORAGE_BUFFER, 0, vbo3[0])
// generate vao
GLES31.glGenVertexArrays(1, vao3, 0)
The update and draw functions for this approach are the same as above, except that the appropriate VBO and VAO for the index are swapped in on each call, roughly as sketched below. I understand that binding a buffer replaces the previous binding, so perhaps the data written by the compute shader is lost at that point?
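For illustration, the per-index swap in the update pass looks roughly like this (vboForIndex is a hypothetical helper returning the buffer array for the given system):
GLES31.glUseProgram(ShaderManager.particleComputeProgram)
// rebind SSBO binding point 0 to this system's buffer before dispatching
GLES31.glBindBufferBase(GLES31.GL_SHADER_STORAGE_BUFFER, 0, vboForIndex(index)[0])
GLES31.glUniform1i(ShaderManager.particleTimeHandle, time)
GLES31.glDispatchCompute(sizeForIndex(index) / COORDS_PER_VERTEX / 128 + 1, 1, 1)
GLES31.glMemoryBarrier(GLES31.GL_ALL_BARRIER_BITS)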
Hopefully I'm on the right track, but would welcome a better approach. Thanks for taking a look!
UPDATE
It appears my issue is not actually related to the compute shaders or the data structure being used; it seems to lie in sampling the framebuffer textures and mixing them in the final draw call.
I am currently mixing in a color based on the alpha channel of each framebuffer texture, but the output is puzzling. Mixing all three textures as below successfully adds tex1 and tex2, but not tex3.
outColor = mix(outColor, vec4(1.0), texture2D(tex1, f_texcoord).a);
outColor = mix(outColor, vec4(1.0), texture2D(tex2, f_texcoord).a);
outColor = mix(outColor, vec4(1.0), texture2D(tex3, f_texcoord).a);
Commenting out the tex1 mix results in the last two mixes working as expected. This thoroughly confuses me, and I'm not sure what the cause could be. The textures are clearly assigned to the correct units, since I can sample each of them individually; I just can't add all three. I'm also sampling two other textures in this shader that work as expected.
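For completeness, the framebuffer textures are bound for the final pass along these lines (a sketch; the unit numbers and handle names such as tex1Handle are hypothetical):
GLES31.glActiveTexture(GLES31.GL_TEXTURE1)
GLES31.glBindTexture(GLES31.GL_TEXTURE_2D, particleFrameTexture1[0])
GLES31.glUniform1i(tex1Handle, 1)
// ...and likewise tex2 on unit 2, tex3 on unit 3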
Any thoughts would be greatly appreciated!
I have a TFLite model expecting an input shape of (1, 1000, 12). Just to test it, I intend to load a CSV file and run inference on it. Below is my code and the relevant portion of the error message I get when running it.
I assume I'm making a mistake in properly loading or reading the CSV file. I'm relatively new to Android and would greatly appreciate any help on this matter!
val testModel = myModel.newInstance(context)
// Creates inputs for reference.
val inputFeature0 = TensorBuffer.createFixedSize(intArrayOf(1, 1000, 12), DataType.FLOAT32)
val openRawResource = resources.openRawResource(R.raw.inputdata).readBytes()
val byteBuffer = ByteBuffer.wrap(openRawResource)
inputFeature0.loadBuffer(byteBuffer)
// Runs model inference and gets result.
val outputs = testModel.process(inputFeature0)
val outputFeature0 = outputs.outputFeature0AsTensorBuffer
// Releases model resources if no longer used.
testModel.close()
Caused by: java.lang.IllegalArgumentException: The size of byte buffer and the shape do not match.
at org.tensorflow.lite.support.common.SupportPreconditions.checkArgument(SupportPreconditions.java:104)
at org.tensorflow.lite.support.tensorbuffer.TensorBuffer.loadBuffer(TensorBuffer.java:296)
at org.tensorflow.lite.support.tensorbuffer.TensorBuffer.loadBuffer(TensorBuffer.java:323)
at com.example.ecgclassifier.MainActivity.analyze(MainActivity.kt:47)
at com.example.ecgclassifier.MainActivity.onCreate(MainActivity.kt:23)
Please make sure the byte array read from the raw resource actually contains a float buffer for a tensor shaped [1, 1000, 12].
The error message says that the size of the byte buffer does not match the size required for a FLOAT32 tensor of shape [1, 1000, 12]: it needs exactly 1 * 1000 * 12 * 4 = 48,000 bytes. The raw bytes of a CSV file are ASCII text, not packed floats, so wrapping the file contents in a ByteBuffer and calling loadBuffer cannot work; the CSV has to be parsed into floats first.
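A minimal sketch of that parsing, assuming the CSV in R.raw.inputdata holds 1000 comma-separated rows of 12 values:
val floats = FloatArray(1000 * 12)
var i = 0
resources.openRawResource(R.raw.inputdata).bufferedReader().forEachLine { line ->
    line.split(",").forEach { v ->
        if (i < floats.size) floats[i++] = v.trim().toFloat()
    }
}
inputFeature0.loadArray(floats) // loadArray checks that the flat size (12,000 floats) matches the shape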
I created a TensorFlow model which takes a single 700x700 48-dimension "image" as an input (the input shape is {1, 700, 700, 48}).
To do so, I used NumPy's numpy.concatenate([array_of_images], -1), where array_of_images is an array of 16 700x700 JPEG images.
I converted the model to Tensorflow Lite and I'm running it on Android.
No conversion errors or anything - all ops are valid and supported.
My question is - where in Android (or how) can I create an N-dimensional object (or container) and use it as an input to the model?
I think you have 16 RGB images (16 x 3 channels = 48 dimensions).
On Android, you load each bitmap into an image tensor like this:
val bitmap1: Bitmap = ... // load from anywhere
val tImage1 = TensorImage(DataType.FLOAT32)
tImage1.load(bitmap1)
for each image, then:
val inputs = arrayOf<Any>(tImage1.buffer, tImage2.buffer, /* ... */ tImage16.buffer)
interpreter.runForMultipleInputsOutputs(inputs, outputs)
I'm not sure, but this can give you an idea.
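Since the model actually expects a single {1, 700, 700, 48} input rather than 16 separate inputs, another option is to pack all 16 bitmaps into one direct ByteBuffer yourself. A sketch, assuming bitmaps is a list of sixteen 700x700 Bitmaps and that training used channels-last RGB in image order with /255 normalization (verify against your own pipeline):
val w = 700; val h = 700
val input = ByteBuffer.allocateDirect(1 * h * w * 48 * 4).order(ByteOrder.nativeOrder())
// read each bitmap's pixels once up front
val pixels = Array(bitmaps.size) { i ->
    IntArray(w * h).also { bitmaps[i].getPixels(it, 0, w, 0, 0, w, h) }
}
for (p in 0 until w * h) {          // for every pixel...
    for (img in pixels.indices) {   // ...append R, G, B from each of the 16 images (48 floats total)
        val c = pixels[img][p]
        input.putFloat(((c shr 16) and 0xFF) / 255f) // R
        input.putFloat(((c shr 8) and 0xFF) / 255f)  // G
        input.putFloat((c and 0xFF) / 255f)          // B
    }
}
input.rewind()
interpreter.run(input, output) // single-input path; output shaped for your model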
What I have developed thus far is the capability to write out various devices' raw information using the standard DngCreator scheme, as per below.
On one device that I am encountering, however (the HTC 10), the Image class contains plane information whose row stride is larger than the width. I understand that this can happen with images, but I can't find out how to correct for it with the SDK available to us.
ByteBuffer byteBuffer = ByteBuffer.wrap(cameraImageF.getRawBytes());
byteBuffer.rewind();
dngCreator.writeByteBuffer(new FileOutputStream(rawLoggerFileF),
new Size(cameraImageF.getRawImageSize().getWidth(), cameraImageF.getRawImageSize().getHeight()),
byteBuffer, 0);
I have held onto the bytes from the original Image class, and I do some substantial calculations on them in the meantime (this is the point of the application). So, I need to let go of the Image so that I can keep getting additional frames from the camera.
Now, this approach works fine for various devices (Samsung S7, Nexus 5, Nexus 6p, etc.). However on the HTC 10 the stride is 16 bytes longer per row and it seems as though I have no way of letting the DngCreator know that.
Underneath in the source code, writeByteBuffer defaults to an internal rowStride = width * pixelStride; there is no parameter for passing in a different stride, and the actual rowStride does not equal that default.
The dngCreator.saveImage(OutputStream, Image) overload does use the Image's own stride when it writes out to a buffer, but I can't hold on to an Image from the camera, because it needs to be released and it is not a cloneable object.
I am a bit lost and trying to understand how to write out a valid .dng for a photograph that has rowStride > width.
You'll have to remove the extra bytes manually - that is, copy the raw image to a new ByteBuffer, and remove the extra bytes at the end of each row. So something like:
byte[] rawBytes = cameraImageF.getRawBytes();
ByteBuffer dst = ByteBuffer.allocate(cameraImageF.getRawImageSize().getWidth() * cameraImageF.getRawImageSize().getHeight() * 2);
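// Copy exactly width * 2 bytes of 16-bit raw pixel data from each source row,
// skipping the padding implied by the larger row stride.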
for (int row = 0; row < cameraImageF.getRawImageSize().getHeight(); row++) {
dst.put(rawBytes,
row * cameraImageF.getRawImageRowStride(),
cameraImageF.getRawImageSize().getWidth() * 2);
}
dst.rewind();
dngCreator.writeByteBuffer(new FileOutputStream(rawLoggerFileF),
new Size(cameraImageF.getRawImageSize().getWidth(),
cameraImageF.getRawImageSize().getHeight()),
dst, 0);
That's of course not lovely for performance, but since DngCreator won't let you specify a row stride with the ByteBuffer interface, it's your only option.
Is there a reason you can't just increase your RAW ImageReader's maxImages to a higher count, so that you can hold on to the Image until you're done processing it?
I am not able to create a 3D array with dimensions larger than (100, 100, 3) in Android; however, it works fine with an array smaller than the above-mentioned dimensions.
My code:
double mat[][][] = new double[400][400][3];
Error: java.lang.OutOfMemoryError: OutOfMemoryError thrown while trying to throw OutOfMemoryError; no stack available
However,
double mat[][][] = new double[100][100][3];
works fine. I am using emulated virtual machine to run android application.
It's probably memory. In Android, a double takes 64 bits, which is 8 bytes.
You are creating a 400x400x3 3D array, so its data alone is
400 * 400 * 3 * 8 = 3,840,000 bytes, roughly 3.7 MB,
while the smaller array is
100 * 100 * 3 * 8 = 240,000 bytes, roughly 234 KB.
The runtime is far more likely to be able to find 234 KB for an allocation than 3.7 MB.
Thanks to Luca's comment.
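If the allocation still fails, note that new double[400][400][3] creates 160,000 separate innermost double[3] objects (plus 400 row arrays), each with its own object header; a flat array with manual index math needs a single allocation and less total memory. A Kotlin sketch:
// one allocation instead of ~160,000; (x, y, c) maps to (y * W + x) * C + c
const val W = 400
const val H = 400
const val C = 3
val mat = DoubleArray(W * H * C)
fun idx(x: Int, y: Int, c: Int) = (y * W + x) * C + c

mat[idx(10, 20, 2)] = 1.0 // equivalent of mat[20][10][2] in the nested version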