I've been working on this game at the native Android/NDK level. I started out with a single texture, but once my texture count hit 5, my FPS slowly dropped from around 60 to about 20, with stutters.
Currently I'm performing all my operations on a single thread. When I introduced another thread using POSIX threads, with a start_routine that loops infinitely and does nothing, my FPS rose to about 40 for no apparent reason.
Another point: after introducing that thread, the FPS was stable at 42-43. Without the thread, there were stutters (18-28 FPS) causing jerky animation.
My doubts:
Why is the above (thread-related behaviour) happening?
Also, the only difference from when I was using 1 texture is that the calculations in my fragment shader are heavier now. Does that mean the GPU is being overloaded, and hence eglSwapBuffers is taking more time?
Assuming eglSwapBuffers does take time, does that mean my game logic is always going to run ahead of my renderer?
How exactly do I go about feeding the render thread the information it needs to render a frame? Do I make the render thread wait on a queue that is fed by my game-logic thread? (Code related)
Code:
void * start_render (void * param)
{
while (1) {
}
return NULL;
}
void android_main(struct android_app* state) {
// Creating this thread increased my FPS to around 40, even though start_render isn't doing anything
pthread_t renderthread;
pthread_create(&renderthread,NULL,start_render,NULL);
struct engine engine;
memset(&engine, 0, sizeof(engine));
state->userData = &engine;
state->onAppCmd = engine_handle_cmd;
state->onInputEvent = engine_handle_input;
engine.assetManager = state->activity->assetManager;
engine.app = state;
engine.texsize = 4;
if (state->savedState != NULL) {
// We are starting with a previous saved state; restore from it.
engine.state = *(struct saved_state*)state->savedState;
}
// loop waiting for stuff to do.
while (1) {
// Read all pending events.
int ident;
int events;
struct android_poll_source* source;
// If not animating, we will block forever waiting for events.
// If animating, we loop until all events are read, then continue
// to draw the next frame of animation.
while ((ident=ALooper_pollAll(engine.animating ? 0 : -1, NULL, &events,
(void**)&source)) >= 0) {
// Process this event.
if (source != NULL) {
source->process(state, source);
}
// Check if we are exiting.
if (state->destroyRequested != 0) {
engine_term_display(&engine);
return;
}
}
if (engine.animating) {
for (int i = 0; i < 4;i++)
{
float cur = engine.mytextures[i].currentposition;
if (cur < 1.0)
engine.mytextures[i].currentposition = cur + engine.mytextures[i].relativespeed;
else
engine.mytextures[i].currentposition = cur - 1.0;
}
// How do I enable the render thread (created above) to call the function below?
on_draw_frame(&engine);
}
}
}
void on_draw_frame(struct engine * engine) {
glUseProgram(program);
engine->texsize = 4;
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[0].textureid);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[1].textureid);
glActiveTexture(GL_TEXTURE2);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[2].textureid);
glActiveTexture(GL_TEXTURE3);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[3].textureid);
glUniform1i(u_texture_unit_location1,0);
glUniform1i(u_texture_unit_location2,1);
glUniform1i(u_texture_unit_location3,2);
glUniform1i(u_texture_unit_location4,3);
glUniform1f(timeCoord1,engine->mytextures[0].currentposition);
glUniform1f(timeCoord2,engine->mytextures[1].currentposition);
glUniform1f(timeCoord3,engine->mytextures[2].currentposition);
glUniform1f(timeCoord4,engine->mytextures[3].currentposition);
glUniform1i(texSize,engine->texsize);
glBindBuffer(GL_ARRAY_BUFFER, buffer);
glVertexAttribPointer(a_position_location, 2, GL_FLOAT, GL_FALSE,
4 * sizeof(GLfloat), BUFFER_OFFSET(0));
glVertexAttribPointer(a_texture_coordinates_location, 2, GL_FLOAT, GL_FALSE,
4 * sizeof(GLfloat), BUFFER_OFFSET(2 * sizeof(GLfloat)));
glEnableVertexAttribArray(a_position_location);
glEnableVertexAttribArray(a_texture_coordinates_location);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glBindBuffer(GL_ARRAY_BUFFER, 0);
eglSwapBuffers(engine->display, engine->surface);
// FPS calculation
if (fps == 0)
clock_gettime(CLOCK_MONOTONIC, &starttime);
else
clock_gettime (CLOCK_MONOTONIC,&stoptime);
if (stoptime.tv_sec - starttime.tv_sec == 1) {
__android_log_print(ANDROID_LOG_VERBOSE, "GAME", "FPS %d",fps);
fps = 0;
} else
fps++;
}
Let me know if you need more information regarding the code.
I can't be completely certain, but this looks a lot like bad effects of power management on the device.
The symptoms you describe can be caused by a power management strategy that focuses on CPU usage. With a strategy like this, it can happen that if you have very low CPU usage (because you're mostly GPU limited), the whole system goes to a lower power state, and effectively slows down the GPU, even though the GPU is fully loaded.
In this situation, when you add additional CPU load by starting another thread that burns CPU time, you keep the system in a higher power state, and allow the GPU to run faster.
This kind of power management is completely broken, IMHO. Slowing down the GPU if it's fully busy just because CPU utilization is low does not make any sense to me. But power management on some devices is very primitive, so this kind of behavior is not uncommon.
If this is indeed your problem, there's not much you can do about it as an application developer, beyond filing bugs. Creating artificial CPU load to work around it is of course not satisfying. Using more power to defeat power management is not exactly what you want. Many games will probably generate a significant amount of CPU load to handle their game logic/physics, so they would not be affected.
Related
I'm facing the (for newbies like me) quite famous issue of game stuttering. It happens almost continuously and I can't find the root cause.
I already found and fixed issues related to object creation and GC activity, thanks to the Memory Monitor in Android Studio.
Thanks to some answers found here on the site, I replaced the Thread.sleep approach to controlling the frame rate with frame-dropping logic, to avoid unresponsiveness caused by OS scheduling.
Moreover, I noticed that switching to "Performance" mode on my Xiaomi Mi 4, i.e. running at a stable and high processor frequency, makes it run smoothly; this makes me suspect "unexpected" peaks in processor load. Sadly, lowering the target frame rate to 30 doesn't change anything.
I'll begin by showing the main loop, which includes the frame-dropping logic mentioned above:
private static final float TARGET_FPS=60;
private final SurfaceHolder mSurfaceHolder;
private Paint mBackgroundPaint;
boolean mIsOnRun;
/**
* This is the main nucleus of our program.
* From here, all the methods associated with the display in the GameEngine object will be called
* */
@Override
public void run()
{
long currTime, lastFrameTime=0;
float nFrameTot=1, nUpdateTot=1;
float totalRenderTime=0.002f, totalUpdateTime=0.002f;
float estimatedFrameRenderTime=0.002f, estimatedUpdateTime;
//Looping until the boolean is false
while (mIsOnRun)
{
currTime=System.nanoTime();
//Updates the game objects business logic
AppConstants.getEngine().update();
if ((nUpdateTot + 1) > Float.MAX_VALUE || (totalUpdateTime + 0.002f) > Float.MAX_VALUE) {
nUpdateTot=1;
totalUpdateTime=0.002f;
}
nUpdateTot++;
totalUpdateTime=totalUpdateTime + ((System.nanoTime() - currTime) / 1000000000f);
estimatedUpdateTime = totalUpdateTime / nUpdateTot;
currTime=System.nanoTime();
if ((currTime - lastFrameTime)/1000000000f > (1/TARGET_FPS)-estimatedFrameRenderTime-estimatedUpdateTime) {
lastFrameTime=currTime;
//locking the canvas
Canvas canvas = mSurfaceHolder.lockCanvas(null);
if (canvas != null)
{
//Clears the screen with black paint and draws object on the canvas
synchronized (mSurfaceHolder)
{
canvas.drawRect(0, 0, canvas.getWidth(), canvas.getHeight(), mBackgroundPaint);
AppConstants.getEngine().draw(canvas);
}
//unlocking the Canvas
mSurfaceHolder.unlockCanvasAndPost(canvas);
}
nFrame++;
if ((nFrameTot + 1) > Float.MAX_VALUE || (totalRenderTime + 0.002f) > Float.MAX_VALUE) {
nFrameTot=1;
totalRenderTime=0.002f;
}
nFrameTot++;
totalRenderTime=totalRenderTime + ((System.nanoTime() - lastFrameTime) / 1000000000f);
estimatedFrameRenderTime = totalRenderTime / nFrameTot;
}
}
}
Now, as for what the game does: it's a card game with a color-filled background, and at most 16 bitmaps are drawn at a time; the only "animations" are these bitmaps moving around.
I honestly don't know whether I'm doing something simple in a stupid way, and I'd be happy to share more code snippets if it helps you understand my problem.
Thank you for any time you spend on this!
I know that by default glReadPixels() waits until all the drawing commands have executed on the GL thread, but that when you bind a Pixel Buffer Object (PBO) and then call glReadPixels() it should be asynchronous and not wait for anything.
But when I bind a PBO and do the glReadPixels(), it still blocks for some time.
Here's how I initialize the PBO:
mPboIds = IntBuffer.allocate(2);
GLES30.glGenBuffers(2, mPboIds);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, mPboIds.get(0));
GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, mPboSize, null, GLES30.GL_STATIC_READ); //allocates only memory space given data size
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, mPboIds.get(1));
GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, mPboSize, null, GLES30.GL_STATIC_READ);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
and then I use the two buffers to ping-pong around:
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, mPboIds.get(mPboIndex)); //1st PBO
JNIWrapper.glReadPixels(0, 0, mRowStride / mPixelStride, (int)height, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE); //read pixel from the screen and write to 1st buffer(native C++ code)
//don't load anything in the first frame
if (mInitRecord) {
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
//reverse the index
mPboIndex = (mPboIndex + 1) % 2;
mPboNewIndex = (mPboNewIndex + 1) % 2;
mInitRecord = false;
return;
}
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, mPboIds.get(mPboNewIndex)); //2nd PBO
//glMapBufferRange returns a pointer to the buffer object's data
//this is the same thing as calling glReadPixels() without a bound PBO
//The key point is that we can pipeline this call
ByteBuffer byteBuffer = (ByteBuffer) GLES30.glMapBufferRange(GLES30.GL_PIXEL_PACK_BUFFER, 0, mPboSize, GLES30.GL_MAP_READ_BIT); //download from the GPU to the CPU
Bitmap bitmap = Bitmap.createBitmap((int)mScreenWidth,(int)mScreenHeight, Bitmap.Config.ARGB_8888);
bitmap.copyPixelsFromBuffer(byteBuffer);
GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
//reverse the index
mPboIndex = (mPboIndex + 1) % 2;
mPboNewIndex = (mPboNewIndex + 1) % 2;
This is called in my draw method every frame.
From my understanding, glReadPixels should not take any time at all, but it's taking around 25 ms (on a Google Pixel 2), and creating the bitmap takes another 40 ms. This only achieves about 13 FPS, which is worse than glReadPixels without a PBO.
Is there anything that I'm missing or wrong in my code?
EDITED since you pointed out that my original hypothesis was incorrect (initial mPboIndex == mPboNewIndex). Hoping to be helpful, here is C++ code that I just wrote on the native side, called through JNI from Android using GLES 3. It seems to work and not block on glReadPixels(...). Note there is only a single glPboIndex variable:
glBindBuffer(GL_PIXEL_PACK_BUFFER, glPboIds[glPboIndex]);
glReadPixels(0, 0, frameWidth_, frameHeight_, GL_RGBA, GL_UNSIGNED_BYTE, 0);
glPboReady[glPboIndex] = true;
glPboIndex = (glPboIndex + 1) % 2;
if (glPboReady[glPboIndex]) {
glBindBuffer(GL_PIXEL_PACK_BUFFER, glPboIds[glPboIndex]);
GLubyte* rgbaBytes = (GLubyte*)glMapBufferRange(
GL_PIXEL_PACK_BUFFER, 0, frameByteCount_, GL_MAP_READ_BIT);
if (rgbaBytes) {
size_t minYuvByteCount = frameWidth_ * frameHeight_ * 3 / 2; // 12 bits/pixel
if (videoFrameBufferSize_ < minYuvByteCount) {
return; // !!! not logging error inside render loop
}
convertToVideoYuv420NV21FromRgbaInverted(
videoFrameBufferAddress_, rgbaBytes,
frameWidth_, frameHeight_);
}
glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
glPboReady[glPboIndex] = false;
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
...
previous unfounded hypothesis:
Your question doesn't show the code that sets the initial values of mPboIndex and mPboNewIndex, but if they are set to identical initial values, such as 0, then they will have matching values within each loop, which results in mapping the same PBO that has just been read. In that hypothetical/real scenario, even if 2 PBOs are being used, they are not alternated between glReadPixels and glMapBufferRange, which will then block until the GPU completes the data transfer. I suggest this change to ensure that the PBOs alternate:
mPboNewIndex = mPboIndex;
mPboIndex = (mPboNewIndex + 1) % 2;
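In that hypothetical scenario, the fix amounts to making sure the two indices never coincide. For instance (hypothetical field initialization, not shown in the original question, reusing the question's own names):
private int mPboIndex = 0;      // PBO that glReadPixels writes into this frame
private int mPboNewIndex = 1;   // PBO filled on the previous frame, i.e. the one that gets mapped
private boolean mInitRecord = true;  // skip mapping on the very first frame, as in the question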
I have an overheating issue: my phone turns itself off after running this for a couple of hours. I want to run it 24/7, so please help me improve it.
I use the Camera2 interface in RAW format, followed by a RenderScript that converts YUV_420_888 to RGBA. My RenderScript is as below:
#pragma version(1)
#pragma rs java_package_name(com.sensennetworks.sengaze)
#pragma rs_fp_relaxed
rs_allocation gCurrentFrame;
rs_allocation gByteFrame;
int32_t gFrameWidth;
uchar4 __attribute__((kernel)) yuv2RGBAByteArray(uchar4 prevPixel,uint32_t x,uint32_t y)
{
// Read in pixel values from latest frame - YUV color space
// The functions rsGetElementAtYuv_uchar_? require API 18
uchar4 curPixel;
curPixel.r = rsGetElementAtYuv_uchar_Y(gCurrentFrame, x, y);
curPixel.g = rsGetElementAtYuv_uchar_U(gCurrentFrame, x, y);
curPixel.b = rsGetElementAtYuv_uchar_V(gCurrentFrame, x, y);
// uchar4 rsYuvToRGBA_uchar4(uchar y, uchar u, uchar v);
// This function uses the NTSC formulae to convert YUV to RGB
uchar4 out = rsYuvToRGBA_uchar4(curPixel.r, curPixel.g, curPixel.b);
rsSetElementAt_uchar(gByteFrame, out.r, 4 * (y*gFrameWidth + x) + 0 );
rsSetElementAt_uchar(gByteFrame, out.g, 4 * (y*gFrameWidth + x) + 1 );
rsSetElementAt_uchar(gByteFrame, out.b, 4 * (y*gFrameWidth + x) + 2 );
rsSetElementAt_uchar(gByteFrame, 255, 4 * (y*gFrameWidth + x) + 3 );
return out;
}
This is where I call the renderscript to convert to rgba:
@Override
public void onBufferAvailable(Allocation a) {
inputAllocation.ioReceive();
// Run processing pass if we should send a frame
final long current = System.currentTimeMillis();
if ((current - lastProcessed) >= frameEveryMs) {
yuv2rgbaScript.forEach_yuv2RGBAByteArray(scriptAllocation, outputAllocation);
if (rgbaByteArrayCallback != null) {
outputAllocationByte.copyTo(outBufferByte);
rgbaByteArrayCallback.onRGBAArrayByte(outBufferByte);
}
lastProcessed = current;
}
}
And this is the callback to run image processing using OpenCV:
@Override
public void onRGBAArrayByte(byte[] rgbaByteArray) {
try {
/* Fill images. */
rgbaMat.put(0, 0, rgbaByteArray);
analytic.processFrame(rgbaMat);
/* Send fps to UI for debug purpose. */
calcFPS(true);
} catch (Exception e) {
e.printStackTrace();
}
}
The whole thing runs at ~22 FPS. I've checked carefully and there are no memory leaks. But after running for some time, even with the screen off, the phone gets very hot and turns itself off. Note that the issue persists even if I remove the image-processing part. What could be wrong with this? I can turn on the phone's camera app and leave it running for hours without a problem.
Does RenderScript cause the heat?
Does 22 FPS cause the heat? Maybe I should reduce it?
Does an Android background service cause heat?
Thanks.
PS: I tested this on an LG G4 with full Camera2 interface support.
In theory, your device should throttle itself if it starts to overheat, and never shut down. This would just reduce your frame rate as the device warms up. But some devices aren't as good at this as they should be, unfortunately.
Basically, anything that reduces your CPU / GPU usage will reduce power consumption and heat generation. Basic tips:
Do not copy buffers. Each copy is very expensive when you're doing it at ~30 FPS. Here, you're copying from the Allocation to a byte[], and then from that byte[] to the rgbaMat. That's twice as expensive as copying straight from the Allocation to the rgbaMat. Unfortunately, I'm not sure there's a direct way to copy from the Allocation to the rgbaMat, or to create an Allocation that's backed by the same memory as the rgbaMat.
Are you sure you can't do your OpenCV processing on YUV data instead? That would save you a lot of overhead here; the YUV->RGB conversion is not cheap when not done in hardware.
There's also an RS intrinsic, ScriptIntrinsicYuvToRGB, which may give you better performance than your hand-written kernel.
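For illustration, a minimal sketch of using the intrinsic (assuming a RenderScript context rs, the camera-fed inputAllocation from your code, and an RGBA outputAllocation created with Element.U8_4 and the frame's dimensions):
// Created once, outside the per-frame path.
ScriptIntrinsicYuvToRGB yuvToRgb = ScriptIntrinsicYuvToRGB.create(rs, Element.U8_4(rs));
yuvToRgb.setInput(inputAllocation);   // the YUV Allocation that receives camera frames

// Per frame, e.g. inside onBufferAvailable() after ioReceive():
yuvToRgb.forEach(outputAllocation);   // writes the converted RGBA pixels into outputAllocation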
I want to write a text reader with special effects using cocos2d-x, so most of the time the graphics will be static. If I just use cocos2d-x as-is, it consumes a lot of battery.
So is it possible to adjust cocos2d-x's frame rate programmatically? And how? I want to reduce the frame rate while the text is static, and increase it when paging up or down.
Or is there any better approach to this goal on Android? (Page-turning animations and more efficient text rendering.)
You can change frame rate using cocos2d::Director::setAnimationInterval method.
https://github.com/cocos2d/cocos2d-x/blob/1361f2c6195ce338a70b65c17e3d46f38e6bcce2/cocos/base/CCDirector.h#L140-L141
/** Set the FPS value. */
virtual void setAnimationInterval(double interval) = 0;
However, I suspect that if you set the frame rate low, your C++ code won't be called promptly when paging up or down, precisely because the frame rate is low. So you might need to modify onDrawFrame to call Cocos2dxRenderer.nativeRender immediately when the user touches the screen.
https://github.com/cocos2d/cocos2d-x/blob/1361f2c6195ce338a70b65c17e3d46f38e6bcce2/cocos/platform/android/java/src/org/cocos2dx/lib/Cocos2dxRenderer.java#L84-L106
@Override
public void onDrawFrame(final GL10 gl) {
/*
* No need to use algorithm in default(60 FPS) situation,
* since onDrawFrame() was called by system 60 times per second by default.
*/
if (sAnimationInterval <= 1.0 / 60 * Cocos2dxRenderer.NANOSECONDSPERSECOND) {
Cocos2dxRenderer.nativeRender();
} else {
final long now = System.nanoTime();
final long interval = now - this.mLastTickInNanoSeconds;
if (interval < Cocos2dxRenderer.sAnimationInterval) {
try {
Thread.sleep((Cocos2dxRenderer.sAnimationInterval - interval) / Cocos2dxRenderer.NANOSECONDSPERMICROSECOND);
} catch (final Exception e) {
}
}
/*
* Render time MUST be counted in, or the FPS will slower than appointed.
*/
this.mLastTickInNanoSeconds = System.nanoTime();
Cocos2dxRenderer.nativeRender();
}
}
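One possible shape of that modification (just a rough sketch of the idea, not code that exists in cocos2d-x; the sRenderImmediately flag, and wherever you set it from your touch handling, are assumptions you would add yourself):
// Hypothetical flag, set to true from your touch-handling code when the user pages up or down.
private static volatile boolean sRenderImmediately = false;

@Override
public void onDrawFrame(final GL10 gl) {
    if (sRenderImmediately
            || sAnimationInterval <= 1.0 / 60 * Cocos2dxRenderer.NANOSECONDSPERSECOND) {
        // Render right away: either a touch just happened or we are at full frame rate anyway.
        sRenderImmediately = false;
        Cocos2dxRenderer.nativeRender();
    } else {
        // ... keep the existing frame-limiting logic shown above ...
    }
}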
I'm shooting for an animation in a live wallpaper. Here's the code. It pretty much follows the CubeWallpaper demo:
void drawFrame() {
    final SurfaceHolder holder = getSurfaceHolder();
    final BufferedInputStream buf;
    final Bitmap bitmap, rbitmap;
    Canvas c = null;
    try {
        c = holder.lockCanvas();
        if (c != null) {
            try {
                buf = new BufferedInputStream(
                        assets.open(folder + "/" + imageList[ilen++]));
                bitmap = BitmapFactory.decodeStream(buf);
                rbitmap = Bitmap.createBitmap(bitmap, 0, 0,
                        imageWidth, imageHeight, transMatrix, false);
                c.drawBitmap(rbitmap, paddingX, paddingY, null);
                if (ilen >= imageCount) ilen = 0;
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    } finally {
        if (c != null) holder.unlockCanvasAndPost(c);
    }
    // Reschedule the next redraw
    mHandler.removeCallbacks(mDrawCube);
    if (mVisible) {
        mHandler.postDelayed(mDrawCube, fps);
    }
}
where "transMatrix" is a scaling and rotation matrix predefined before.
It's supposed to render at 30fps but of course it doesn't do that. My initial guess is that the BufferedInputStream is one factor. I should probably cache a few of these as I go along along with the Bitmaps. But any other ideas? Will I have to use the OpenGL hack for live wallpapers?
Yes, the BufferedInputStream and BitmapFactory really shouldn't be in drawFrame() at all. You're loading and creating resources on every single frame, and that's a huge waste. Like you said, cache as many as you can beforehand, and if you find the need to load more during drawing, use a separate thread to do it so it doesn't slow the drawing.
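A minimal sketch of the caching idea (reusing folder, imageList, imageCount, imageWidth, imageHeight and transMatrix from the question; call this once when the engine/surface is created, not inside drawFrame()):
private Bitmap[] cachedFrames;   // hypothetical field holding the pre-decoded, pre-transformed frames

private void preloadFrames() throws IOException {
    cachedFrames = new Bitmap[imageCount];
    for (int i = 0; i < imageCount; i++) {
        InputStream in = new BufferedInputStream(assets.open(folder + "/" + imageList[i]));
        Bitmap raw = BitmapFactory.decodeStream(in);
        in.close();
        // Apply the scale/rotation once here instead of on every frame.
        cachedFrames[i] = Bitmap.createBitmap(raw, 0, 0, imageWidth, imageHeight, transMatrix, false);
        if (cachedFrames[i] != raw) raw.recycle();   // createBitmap may return the source bitmap itself
    }
}

// drawFrame() then just picks the next cached bitmap:
// c.drawBitmap(cachedFrames[ilen], paddingX, paddingY, null);
// ilen = (ilen + 1) % imageCount;
If the sequence is too large to keep in memory all at once, the same loop can instead decode the next few frames ahead of time on a background thread, as suggested above.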
I had the same problem: slow canvas rendering in the context of live wallpapers.
I agree with the others: you shouldn't do anything CPU/IO-heavy while rendering, e.g. loading images, especially on the UI thread.
However, there is one more thing you should note. You request a redraw (mHandler.postDelayed(...)) AFTER the frame has been rendered. If you want 30 FPS and therefore request a redraw in (1000 / 30) = 33 ms, it will NOT result in 30 frames per second. Why?
Let's assume it takes 28 ms to render all your stuff to the canvas. After that's done, you request a redraw 33 ms later. That is, the period between redraws is 61 ms, which corresponds to roughly 16 frames per second.
You have two options to solve it:
1) Put the mHandler.postDelayed(...) call at the beginning of the drawFrame(...) method. This seems OK, but it has a disadvantage: if your actual FPS is very close to the maximum possible FPS on a given device, in other words the UI thread is busy all the time with your canvas rendering, then there won't be time for the UI thread to do other stuff. It doesn't necessarily mean that your LWP or the home screen will lag, but you (your LWP) might start missing some touch events (as my LWP did).
2) The better solution is to start a separate thread when the surface is created and pass it a reference to the SurfaceHolder. Do the rendering in that separate thread. The render method in that thread would look like this:
private static final int DESIRED_FPS = 25;
private static final int DESIRED_PERIOD_BETWEEN_FRAMES_MS = (int) (1000.0 / DESIRED_FPS + 0.5);
@Override
public void run() {
while (mRunning) {
long beforeRenderMs = SystemClock.currentThreadTimeMillis();
// do actual canvas rendering
long afterRenderMs = SystemClock.currentThreadTimeMillis();
long renderLengthMs = afterRenderMs - beforeRenderMs;
try {
Thread.sleep(Math.max(DESIRED_PERIOD_BETWEEN_FRAMES_MS - renderLengthMs, 0));
} catch (InterruptedException e) {
return; // stop rendering if the thread is interrupted
}
}
}