I'm very new to Android and I'm writing my first app which consists of drawing STL data (a bunch of triangles).
For a small number of triangles (~ 4000), my app works great. But as soon as I try to load large data (~ 100000 triangles), I get memory allocation problems. Here is basically what I do:
I am using a GLSurfaceView for my rendering
a) I read in my data, creating a list of triangles (3 * X/Y/Z and a normal vector for each triangle)
b) I create ByteBuffers using allocateDirect() for the vertex data, the normals and the indices.
c) I add the data into the ByteBuffers
d) I call glVertexPointer and glNormalPointer and assign the ByteBuffers
e) I call glDrawElements() using the indexBuffer
As soon as I try to allocate around 10MB by calling allocateDirect(), my application crashes. I tried to call allocate() instead, which works, but then the glVertexPointer method crashes (even for a small number of triangles).
Am I doing something wrong?
Also, do I have to call glVertexPointer and glNormalPointer every time I redraw, or is it enough to call them in the surfaceChanged method?
Thanks a lot,
Mark
You should only create the VBOs once. After that, if you need to load new data into them, prepare it in local arrays and then load it into the VBOs all at once using a "put" call.
The largest type that you can use for the index VBO is "short", so you can only draw 64k vertices at a time. If you are drawing 100k vertices, you will have to do it in two passes.
Edit: regarding your memory limitations, find out what the heap limit is for your device. The heap limit can be 16, 24, 32, or 48 MB. You can use the android:largeHeap="true" option to greatly increase the limit, but I'm not sure whether that option is available for pre-Honeycomb versions of Android or not.
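For illustration, here is a minimal sketch of the create-once / update-with-put pattern, assuming the android.opengl.GLES11 bindings to match the fixed-function glVertexPointer/glNormalPointer calls in the question; the class and method names (VertexVbo, create, update, bindAsVertexPointer) are placeholders, not an existing API:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import android.opengl.GLES11;

public class VertexVbo {
    private int vboId;
    private FloatBuffer vertexBuffer;   // direct buffer, reused for every update

    // Call once, on the GL thread (e.g. from onSurfaceCreated).
    public void create(int maxFloats) {
        // GLES requires direct buffers in native byte order.
        vertexBuffer = ByteBuffer.allocateDirect(maxFloats * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        int[] ids = new int[1];
        GLES11.glGenBuffers(1, ids, 0);
        vboId = ids[0];
    }

    // Call whenever the vertex data changes: fill the local buffer with one
    // put() call, then push everything to the VBO in one glBufferData call.
    public void update(float[] vertices) {
        vertexBuffer.clear();
        vertexBuffer.put(vertices);
        vertexBuffer.position(0);
        GLES11.glBindBuffer(GLES11.GL_ARRAY_BUFFER, vboId);
        GLES11.glBufferData(GLES11.GL_ARRAY_BUFFER, vertices.length * 4,
                vertexBuffer, GLES11.GL_DYNAMIC_DRAW);
        GLES11.glBindBuffer(GLES11.GL_ARRAY_BUFFER, 0);
    }

    // Call from onDrawFrame before glDrawElements.
    public void bindAsVertexPointer() {
        GLES11.glBindBuffer(GLES11.GL_ARRAY_BUFFER, vboId);
        GLES11.glVertexPointer(3, GLES11.GL_FLOAT, 0, 0);   // offset into the bound VBO
    }
}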
I'm no expert with 3D rendering, but 10MB of vertex data strikes me as quite a lot, especially for a mobile application, so I'm not surprised you're encountering memory allocation issues! Even if it did get allocated successfully, you'd likely find it unusable due to poor performance.
Does the entire scene really need to be in video memory all the time? You may want to explore techniques to cull vertices that are far away from the camera and only upload/draw what is necessary.
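As a purely illustrative sketch (the layout of 9 floats per triangle and all names here are assumptions, not code from the question), coarse distance culling on the CPU before upload could look like this:

import java.util.Arrays;

class TriangleCuller {
    // Keep only the triangles whose centroid lies within maxDistance of the camera.
    static float[] cullByDistance(float[] triangles, float camX, float camY, float camZ,
                                  float maxDistance) {
        float maxDistSq = maxDistance * maxDistance;
        float[] kept = new float[triangles.length];
        int out = 0;
        for (int t = 0; t < triangles.length; t += 9) {
            // Centroid of the triangle (3 vertices, x/y/z each).
            float cx = (triangles[t]     + triangles[t + 3] + triangles[t + 6]) / 3f;
            float cy = (triangles[t + 1] + triangles[t + 4] + triangles[t + 7]) / 3f;
            float cz = (triangles[t + 2] + triangles[t + 5] + triangles[t + 8]) / 3f;
            float dx = cx - camX, dy = cy - camY, dz = cz - camZ;
            if (dx * dx + dy * dy + dz * dz <= maxDistSq) {
                System.arraycopy(triangles, t, kept, out, 9);
                out += 9;
            }
        }
        return Arrays.copyOf(kept, out);   // only the surviving triangles get uploaded
    }
}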
My app uses a proprietary implementation of Canny edge detection based on RenderScript. I tested this on numerous devices with various API levels and it worked very reliably. Now I got the new Samsung S7, running API 23. Here (and only here) I encountered a rather ugly problem. Some of the edge pictures are studded with thousands of artifacts that stem from the magnitude gradient calculation kernel and are NOT based on actual image information. After trying all kinds of target APIs, turning renderscript.support.mode on and off, etc., I finally found that the problem only arises when the RenderScript (and Script) instances are used for the second or further time. It does not arise when they are used for the first time.
For efficiency reasons I created the RenderScript and Script instances only once in the onCreate method of MainActivity and used it repeatedly thereafter. Of course I don't like to change that.
Does anyone have a solution to this problem? Thanks.
UPDATE: Crazy things are going on here. It seems that freshly created Allocations are NOT empty from the outset! When running:
Type.Builder typeUCHAR1 = new Type.Builder(rs, Element.U8(rs));
typeUCHAR1.setX(width).setY(height);
Allocation alloc = Allocation.createTyped(rs, typeUCHAR1.create());

byte[] se = new byte[width * height];
alloc.copyTo(se);                          // copy the freshly created Allocation back to Java
for (int i = 0; i < width * height; i++) {
    if (se[i] != 0) {                      // I expected this to never trigger
        Log.e("content: ", String.valueOf(se[i]));
    }
}
... the byte array se is full of funny numbers... HELP! Any idea what is going on here?
UPDATE2: I stumbled over my own ignorance here - and really don't deserve a point for this masterpiece... However, in my defense I must say that the problem was slightly more subtle than it appears here. The context was that I needed a global Allocation (Byte/U8) which initially should be empty (i.e. zero) and then, within the kernel, gets partially set to 1 (only where the edges are) via rsSetElementAt_uchar(). Since this worked for many months, I was no longer aware of the fact that I had never explicitly assigned the zeros in this Allocation... This only had consequences in API 23, so maybe this can help others not to fall into this trap... So, note: unlike numerical arrays, which Java fills with 0 by default, Allocations cannot be assumed to be full of zeros at creation. Thanks, Sakridge.
Allocation data for primitive types (non-struct/object) is not initialized by default when an Allocation is created, unless it is created from a bitmap using the createFromBitmap API. If you are expecting zero-initialized memory, then you possibly have a bug in your app which is simply not exposed when the driver happens to initialize to 0s. It would help if you could post example code which reproduces the problem.
Initialize your allocations by copying from a bitmap or Java array.
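A minimal sketch of that fix, reusing the rs, width and height variables from the snippet above: copy a freshly created (and therefore zero-filled) Java array into the Allocation before the kernel runs.

Type.Builder typeUCHAR1 = new Type.Builder(rs, Element.U8(rs));
typeUCHAR1.setX(width).setY(height);
Allocation alloc = Allocation.createTyped(rs, typeUCHAR1.create());

byte[] zeros = new byte[width * height];   // Java arrays are zero-initialized
alloc.copyFrom(zeros);                     // now the Allocation really starts out as all zeros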
I have an infinite loop in my render thread. I tried measuring the frame rate by assuming that every call to eglSwapBuffers draws a new frame, but that gives me results like 200 fps, which is not possible, right? The refresh rate cannot exceed 60.
Now I am doing the same thing, but also using surfaceTexture.getTimeStamp() on the SurfaceTexture of the SurfaceView. I count a frame as having been drawn only if the timestamp returned in the previous iteration differs from the current one. Is this an acceptable way to measure? It shows 50-55 fps when I do no drawing, i.e. the loop contains only the eglSwapBuffers() and getTimeStamp() calls.
The surfaceTexture.getTimeStamp() seems to be giving the correct result. I tested it by adding up all the differences between results returned by consecutive getTimeStamp() calls and it is equal to the total time the code ran for. This indicates that no frames are being left unconsidered etc.
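For what it's worth, here is a rough sketch of that measurement as I understand it (the FrameCounter class and its method name are made up; the actual framework method is SurfaceTexture.getTimestamp(), which returns nanoseconds):

import android.graphics.SurfaceTexture;
import android.util.Log;

class FrameCounter {
    private long lastTimestamp = -1;
    private long windowStart = System.nanoTime();
    private int frames = 0;

    // Call right after eglSwapBuffers() in the render loop.
    void onFrameSwapped(SurfaceTexture surfaceTexture) {
        long ts = surfaceTexture.getTimestamp();
        if (ts != lastTimestamp) {              // unchanged timestamp => no new frame was latched
            frames++;
            lastTimestamp = ts;
        }
        long now = System.nanoTime();
        if (now - windowStart >= 1000000000L) { // log once per second
            Log.d("FPS", "fps = " + frames);
            frames = 0;
            windowStart = now;
        }
    }
}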
Another solution I found is this Android app. I do not know how it works but it is giving approximately the same results as the above method.
I'm using the cvHaarDetectObjects C function to detect faces in my Android application, but the execution time is not fast enough to process a certain number of video frames per second. So, I'm thinking of commenting out code that is unnecessary for me, e.g. I've noticed a lot of branching conditions for the flags and memory allocation statements that can be commented out. The same thing can be done for the functions that are called from cvHaarDetectObjects.
Has anyone tried doing this sort of optimization before? Any help is much appreciated.
Code:
// Load the face cascade and run the detector on the current frame.
cascadeFile1 = (CvHaarClassifierCascade *) cvLoad(cascadeFace, 0, 0, 0);
CvSeq *face = cvHaarDetectObjects(img1, cascadeFile1, storage,
                                  1.1, 3, CV_HAAR_DO_CANNY_PRUNING, cvSize(0, 0));
As a first step you should try to tune the input parameters, as these have a big impact on the performance of the classifier.
You could try to:
reduce the source image resolution to a reasonable value
increase the scaleFactor parameter by small amounts (e.g. 0.1 steps)
depending on your resolution, camera field of view and distance of faces, define values for the min_size and max_size parameters. This can dramatically influence the number of operations the algorithm needs to perform (see the sketch below this list).
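To make the tuning suggestions concrete, here is a hedged sketch written against the OpenCV Java bindings' CascadeClassifier.detectMultiScale (the Java-side counterpart of cvHaarDetectObjects, not the exact C call from the question); the cascade path, target resolution and size limits are assumptions to be adapted to the actual camera setup:

import org.opencv.core.Mat;
import org.opencv.core.MatOfRect;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;
import org.opencv.objdetect.CascadeClassifier;
import org.opencv.objdetect.Objdetect;

class FaceDetector {
    private final CascadeClassifier cascade =
            new CascadeClassifier("/sdcard/haarcascade_frontalface_alt.xml");  // assumed path

    MatOfRect detect(Mat frame) {
        // 1) Reduce the input resolution before detection.
        Mat small = new Mat();
        Imgproc.resize(frame, small, new Size(320, 240));

        // 2) A larger scaleFactor (1.2 instead of 1.1) means fewer pyramid levels.
        // 3) minSize/maxSize bound the face sizes the detector has to consider.
        MatOfRect faces = new MatOfRect();
        cascade.detectMultiScale(small, faces,
                1.2,                              // scaleFactor
                3,                                // minNeighbors
                Objdetect.CASCADE_SCALE_IMAGE,    // flags
                new Size(30, 30),                 // min face size in the downscaled image
                new Size(150, 150));              // max face size
        return faces;
    }
}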
Second, you could post your actual parameters and your profiling results, and people here can surely give some more hints on what to improve.
As a side note: I don't think that commenting out branching conditions will make a noticeable difference in speed if you want to leave the algorithm functioning.
I'm trying to load all the game data at the same time, in my game scene's constructor. But it fails, because texture loading only works in an OpenGL context, e.g. when the load method is called from onDrawFrame or onSurfaceChanged. But I think it's ugly to load textures when onDrawFrame is first called, or something similar. So is it possible to somehow separate my loading part from the OpenGL functions?
I have exactly the same problem.
My solution is to use proxy textures. That means that when you create a texture from data in memory or from a file, you actually create a dummy texture that holds a copy of that data or the file path (you can preload the data into memory for faster loading).
After that, the next time my renderer calls bind() (which is essentially a wrapper around glBindTexture), I check whether there is data to load, and if so I create the real texture and upload the data.
This approach fits me best, because in my case textures can be created from any thread at any time.
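For illustration, here is a stripped-down sketch of such a proxy texture (the class and field names are made up, and GLES20/GLUtils are just one possible binding):

import android.graphics.Bitmap;
import android.opengl.GLES20;
import android.opengl.GLUtils;

class ProxyTexture {
    private Bitmap pendingBitmap;   // CPU-side copy, can be loaded from any thread
    private int textureId = 0;      // 0 = no GL texture created yet

    ProxyTexture(Bitmap bitmap) {
        this.pendingBitmap = bitmap;
    }

    // Must be called on the GL thread, e.g. from onDrawFrame().
    void bind() {
        if (textureId == 0 && pendingBitmap != null) {
            int[] ids = new int[1];
            GLES20.glGenTextures(1, ids, 0);
            textureId = ids[0];
            GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureId);
            GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D,
                    GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
            GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D,
                    GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);
            GLUtils.texImage2D(GLES20.GL_TEXTURE_2D, 0, pendingBitmap, 0);
            pendingBitmap.recycle();
            pendingBitmap = null;   // the data now lives on the GPU
        }
        GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureId);
    }
}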
But if you want to preload all textures, you can simply do that in onSurfaceCreated or onSurfaceChanged.
The same applies to buffers.
Another approach is using the native activity (check the NDK example). In this case you can handle context manually but it requires API level 9.
"But I think it's ugly to load textures when onDrawFrame is first called, or something similar."
Actually, deferred texture loading is the most elegant method. It's one of the key ingredients for games that let you travel the world without being interrupted by loading screens. Just don't try to load the whole world at once; load things as soon as they are about to become visible. Use Pixel Buffer Objects to do the transfers asynchronously.
I simply need to add floatArray1 to floatArray2, storing the result in floatArray2; no third array. All arrays are one-dimensional but very large, probably as large as the OS will let me get away with. The maximum I would need is two float arrays with 40,000 floats each, but I suppose I could get away with 1/10th of that at minimum.
I would love to do this in 1/30th or 1/60th of a second, but that does not seem possible? Also, if the code is JNI, NDK or OpenGL ES, that's fine. Does Android have an assembly language or machine code I could use somehow?
Since a float is 32 bits and you have 40,000 floats in each array, you would need:
40000 * 32 * 2 = 2,560,000 bits
which is 320,000 bytes (about 320 KB). That is not much memory-wise, I would say, since the default heap limit for an Android app is 16 MB.
Regarding performance, you would definitely gain some speed using JNI. OpenGL would not give you enough benefit, I would think, since the OpenGL context creation takes some time as well.
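To put the workload in perspective, here is a minimal plain-Java sketch of the in-place addition plus a rough timing check (the class and method names are made up); 40,000 additions per frame should normally fit well inside a 1/60 s budget even before reaching for JNI:

public class ArrayAdd {
    // Adds src into dst element-wise; the result lives in dst, no third array needed.
    static void addInPlace(float[] src, float[] dst) {
        for (int i = 0; i < dst.length; i++) {
            dst[i] += src[i];
        }
    }

    public static void main(String[] args) {
        float[] a = new float[40000];
        float[] b = new float[40000];
        long t0 = System.nanoTime();
        addInPlace(a, b);
        long t1 = System.nanoTime();
        System.out.println("took " + (t1 - t0) / 1e6 + " ms");
    }
}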