Android Camera2 RenderScript overheat issue

I have an overheat issue: the phone turns itself off after my app has been running for a couple of hours. I want to run this 24/7, so please help me improve it.
I use the Camera2 interface with the RAW format, followed by a RenderScript that converts YUV_420_888 to RGBA. My RenderScript is as below:
#pragma version(1)
#pragma rs java_package_name(com.sensennetworks.sengaze)
#pragma rs_fp_relaxed
rs_allocation gCurrentFrame;
rs_allocation gByteFrame;
int32_t gFrameWidth;
uchar4 __attribute__((kernel)) yuv2RGBAByteArray(uchar4 prevPixel,uint32_t x,uint32_t y)
{
// Read in pixel values from latest frame - YUV color space
// The functions rsGetElementAtYuv_uchar_? require API 18
uchar4 curPixel;
curPixel.r = rsGetElementAtYuv_uchar_Y(gCurrentFrame, x, y);
curPixel.g = rsGetElementAtYuv_uchar_U(gCurrentFrame, x, y);
curPixel.b = rsGetElementAtYuv_uchar_V(gCurrentFrame, x, y);
// uchar4 rsYuvToRGBA_uchar4(uchar y, uchar u, uchar v);
// This function uses the NTSC formulae to convert YUV to RGB
uchar4 out = rsYuvToRGBA_uchar4(curPixel.r, curPixel.g, curPixel.b);
rsSetElementAt_uchar(gByteFrame, out.r, 4 * (y*gFrameWidth + x) + 0 );
rsSetElementAt_uchar(gByteFrame, out.g, 4 * (y*gFrameWidth + x) + 1 );
rsSetElementAt_uchar(gByteFrame, out.b, 4 * (y*gFrameWidth + x) + 2 );
rsSetElementAt_uchar(gByteFrame, 255, 4 * (y*gFrameWidth + x) + 3 );
return out;
}
This is where I call the RenderScript to convert to RGBA:
@Override
public void onBufferAvailable(Allocation a) {
inputAllocation.ioReceive();
// Run processing pass if we should send a frame
final long current = System.currentTimeMillis();
if ((current - lastProcessed) >= frameEveryMs) {
yuv2rgbaScript.forEach_yuv2RGBAByteArray(scriptAllocation, outputAllocation);
if (rgbaByteArrayCallback != null) {
outputAllocationByte.copyTo(outBufferByte);
rgbaByteArrayCallback.onRGBAArrayByte(outBufferByte);
}
lastProcessed = current;
}
}
And this is the callback to run image processing using OpenCV:
@Override
public void onRGBAArrayByte(byte[] rgbaByteArray) {
try {
/* Fill images. */
rgbaMat.put(0, 0, rgbaByteArray);
analytic.processFrame(rgbaMat);
/* Send fps to UI for debug purpose. */
calcFPS(true);
} catch (Exception e) {
e.printStackTrace();
}
}
The whole thing runs at ~22 fps. I've checked carefully and there are no memory leaks. But after running this for some time, even with the screen off, the phone gets very hot and turns itself off. Note that if I remove the image-processing part, the issue still persists. What could be wrong with this? I can turn on the phone's camera app and leave it running for hours without a problem.
Does RenderScript cause the heat?
Does 22 fps cause the heat? Maybe I should reduce it?
Does running this in an Android background service cause the heat?
Thanks.
PS: I tested this on an LG G4, which has full Camera2 interface support.

In theory, your device should throttle itself if it starts to overheat, and never shut down. This would just reduce your frame rate as the device warms up. But some devices aren't as good at this as they should be, unfortunately.
Basically, anything that reduces your CPU / GPU usage will reduce power consumption and heat generation. Basic tips:
Do not copy buffers. Each copy is very expensive when you're doing it at ~30fps. Here, you're copying from Allocation to byte[], and then from that byte[] to the rgbaMat. That's 2x as expensive as just copying from the Allocation to the rgbaMat. Unfortunately, I'm not sure there's a direct way to copy from the Allocation to the rgbaMat, or to create an Allocation that's backed by the same memory as the rgbaMat.
Are you sure you can't do your OpenCV processing on YUV data instead? That'll save you a lot of overhead here; the YUV->RGB conversion is not cheap when not done in hardware.
There's also an RS intrinsic, ScriptIntrinsicYuvToRGB, which may give you better performance than your hand-written kernel.

Related

Android game stutters; gc under control and frame dropping in place of Thread.sleep

I'm facing the quite famous issue (for newbies like me) of game stuttering. It happens almost continuously and I can't find the root cause.
I already found issues related to object creation and GC activity thanks to the memory monitor in Android Studio, and those are now solved.
Thanks to some answers found here on the site, I replaced the Thread.sleep approach to controlling the frame rate with a frame-dropping logic, to avoid unresponsiveness due to OS scheduling.
Moreover, I noticed that switching to "Performance" mode on my Xiaomi Mi 4, i.e. a stable and high processor frequency, makes it run smoothly; this makes me think of unexpected peaks in processor load. Sadly, lowering the target frame rate to 30 doesn't change anything.
I would like to begin showing the main loop that includes the above mentioned logic of frame-dropping:
private static final float TARGET_FPS=60;
private final SurfaceHolder mSurfaceHolder;
private Paint mBackgroundPaint;
boolean mIsOnRun;
/**
* This is the main nucleus of our program.
* From here will be called all the method that are associated with the display in GameEngine object
* */
@Override
public void run()
{
long currTime, lastFrameTime=0;
float nFrameTot=1, nUpdateTot=1;
float totalRenderTime=0.002f, totalUpdateTime=0.002f;
float estimatedFrameRenderTime=0.002f, estimatedUpdateTime;
//Looping until the boolean is false
while (mIsOnRun)
{
currTime=System.nanoTime();
//Updates the game objects business logic
AppConstants.getEngine().update();
if ((nUpdateTot + 1) > Float.MAX_VALUE || (totalUpdateTime + 0.002f) > Float.MAX_VALUE) {
nUpdateTot=1;
totalUpdateTime=0.002f;
}
nUpdateTot++;
totalUpdateTime=totalUpdateTime + ((System.nanoTime() - currTime) / 1000000000f);
estimatedUpdateTime = totalUpdateTime / nUpdateTot;
currTime=System.nanoTime();
if ((currTime - lastFrameTime)/1000000000f > (1/TARGET_FPS)-estimatedFrameRenderTime-estimatedUpdateTime) {
lastFrameTime=currTime;
//locking the canvas
Canvas canvas = mSurfaceHolder.lockCanvas(null);
if (canvas != null)
{
//Clears the screen with black paint and draws object on the canvas
synchronized (mSurfaceHolder)
{
canvas.drawRect(0, 0, canvas.getWidth(), canvas.getHeight(), mBackgroundPaint);
AppConstants.getEngine().draw(canvas);
}
//unlocking the Canvas
mSurfaceHolder.unlockCanvasAndPost(canvas);
}
nFrame++;
if ((nFrameTot + 1) > Float.MAX_VALUE || (totalRenderTime + 0.002f) > Float.MAX_VALUE) {
nFrameTot=1;
totalRenderTime=0.002f;
}
nFrameTot++;
totalRenderTime=totalRenderTime + ((System.nanoTime() - lastFrameTime) / 1000000000f);
estimatedFrameRenderTime = totalRenderTime / nFrameTot;
}
}
}
Now, speaking of what the game does: it's a card game; a color-filled background and at most 16 bitmaps at a time are drawn, and the only "animations" are these bitmaps moving around.
I honestly don't know whether I'm doing something simple in a silly way, and I'd be happy to share more code snippets if it helps you understand my problem better.
Thank you for any time you spend on this!

Renderscript fails on GPU enabled driver if USAGE_SHARED

We are using RenderScript for audio DSP processing. It is simple and improves performance significantly for our use case. But we have run into an annoying issue with USAGE_SHARED on devices that have a custom driver with GPU execution enabled.
As you may know, the USAGE_SHARED flag makes the RenderScript allocation reuse the given memory without having to create a copy of it. As a consequence, it not only saves memory but, in our case, also improves performance to the desired level.
The following code with USAGE_SHARED works fine on the default RenderScript driver (libRSDriver.so). With the custom driver (libRSDriver_adreno.so), USAGE_SHARED does not reuse the given memory, and thus the data is not shared.
This is the code that makes use of USAGE_SHARED and calls the RenderScript kernel:
void process(float* in1, float* in2, float* out, size_t size) {
sp<RS> rs = new RS();
rs->init(app_cache_dir);
sp<const Element> e = Element::F32(rs);
sp<const Type> t = Type::create(rs, e, size, 0, 0);
sp<Allocation> in1Alloc = Allocation::createTyped(
rs, t,
RS_ALLOCATION_MIPMAP_NONE,
RS_ALLOCATION_USAGE_SCRIPT | RS_ALLOCATION_USAGE_SHARED,
in1);
sp<Allocation> in2Alloc = Allocation::createTyped(
rs, t,
RS_ALLOCATION_MIPMAP_NONE,
RS_ALLOCATION_USAGE_SCRIPT | RS_ALLOCATION_USAGE_SHARED,
in2);
sp<Allocation> outAlloc = Allocation::createTyped(
rs, t,
RS_ALLOCATION_MIPMAP_NONE,
RS_ALLOCATION_USAGE_SCRIPT | RS_ALLOCATION_USAGE_SHARED,
out);
ScriptC_x* rsX = new ScriptC_x(rs);
rsX->set_in1Alloc(in1Alloc);
rsX->set_in2Alloc(in2Alloc);
rsX->set_size(size);
rsX->forEach_compute(in1Alloc, outAlloc);
}
NOTE: This variation of Allocation::createTyped() is not mentioned in the documentation, but the code in rsCppStructs.h has it. It is the allocation factory method that allows providing a backing pointer and respects the USAGE_SHARED flag. This is how it is declared:
/**
* Creates an Allocation for use by scripts with a given Type and a backing pointer. For use
* with RS_ALLOCATION_USAGE_SHARED.
* @param[in] rs Context to which the Allocation will belong
* @param[in] type Type of the Allocation
* @param[in] mipmaps desired mipmap behavior for the Allocation
* @param[in] usage usage for the Allocation
* @param[in] pointer existing backing store to use for this Allocation if possible
* @return new Allocation
*/
static sp<Allocation> createTyped(
const sp<RS>& rs, const sp<const Type>& type,
RsAllocationMipmapControl mipmaps,
uint32_t usage,
void * pointer);
This is the renderscript kernel
rs_allocation in1Alloc, in2Alloc;
uint32_t size;
// JUST AN EXAMPLE KERNEL
// Not using reduction kernel since it is only available in later API levels.
// Not sure if support library helps here. Anyways, unrelated to the current problem
float compute(float ignored, uint32_t x) {
float result = 0.0f;
for (uint32_t i=0; i<size; i++) {
result += rsGetElementAt_float(in1Alloc, x) * rsGetElementAt_float(in2Alloc, size-i-1); // just an example computation
}
return result;
}
As mentioned, out doesn't contain any of the results of the computation.
syncAll(RS_ALLOCATION_USAGE_SHARED) also didn't help.
The following works, though (but is much slower):
void process(float* in1, float* in2, float* out, size_t size) {
sp<RS> rs = new RS();
rs->init(app_cache_dir);
sp<const Element> e = Element::F32(rs);
sp<const Type> t = Type::create(rs, e, size, 0, 0);
sp<Allocation> in1Alloc = Allocation::createTyped(rs, t);
in1Alloc->copy1DFrom(in1);
sp<Allocation> in2Alloc = Allocation::createTyped(rs, t);
in2Alloc->copy1DFrom(in2);
sp<Allocation> outAlloc = Allocation::createTyped(rs, t);
ScriptC_x* rsX = new ScriptC_x(rs);
rsX->set_in1Alloc(in1Alloc);
rsX->set_in2Alloc(in2Alloc);
rsX->set_size(size);
rsX->forEach_compute(in1Alloc, outAlloc);
outAlloc->copy1DTo(out);
}
Copying makes it work, but in our testing, copying back and forth significantly degrades performance.
If we switch off GPU execution through the debug.rs.default-CPU-driver system property, we can see that the custom driver works well, with the desired performance.
Aligning the memory given to RenderScript to 16, 32, ..., or 1024 bytes, etc. did not help make the custom driver respect USAGE_SHARED.
Question
So, our question is this: how can we make this kernel work on devices that use a custom RenderScript driver with GPU execution enabled?
You need to have the copy even if you use USAGE_SHARED.
USAGE_SHARED is just a hint to the driver; it doesn't have to honor it.
If the driver does share the memory, the copy will be ignored and performance will be the same.

Android renderscript never runs on the gpu

Exactly as the title says.
I have a parallelized image creation/processing algorithm that I'd like to use. It is a kind of Perlin noise implementation.
// Logging is never used here
#pragma version(1)
#pragma rs java_package_name(my.package.name)
#pragma rs_fp_full
float sizeX, sizeY;
float ratio;
static float fbm(float2 coord)
{ ... }
uchar4 RS_KERNEL root(uint32_t x, uint32_t y)
{
float u = x / sizeX * ratio;
float v = y / sizeY;
float2 p = {u, v};
float res = fbm(p) * 2.0f; // rs.: 8245 ms, fs: 8307 ms; fs 9842 ms on tablet
float4 color = {res, res, res, 1.0f};
//float4 color = {p.x, p.y, 0.0, 1.0}; // rs.: 96 ms
return rsPackColorTo8888(color);
}
As a comparison, this exact algorithm runs at 30 fps or more when I implement it on the GPU via a fragment shader on a textured quad.
The overhead of running the RenderScript should be at most 100 ms, which I calculated by making a simple bitmap that just returns the normalized x and y coordinates.
Which means that if it were using the GPU, it would surely not take 10 seconds.
The code I am using the RenderScript with:
// The non-support version gives at least an extra 25% performance boost
import android.renderscript.Allocation;
import android.renderscript.RenderScript;
public class RSNoise {
private RenderScript renderScript;
private ScriptC_noise noiseScript;
private Allocation allOut;
private Bitmap outBitmap;
final int sizeX = 1536;
final int sizeY = 2048;
public RSNoise(Context context) {
renderScript = RenderScript.create(context);
outBitmap = Bitmap.createBitmap(sizeX, sizeY, Bitmap.Config.ARGB_8888);
allOut = Allocation.createFromBitmap(renderScript, outBitmap, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_GRAPHICS_TEXTURE);
noiseScript = new ScriptC_noise(renderScript);
}
// The render function is benchmarked only
public Bitmap render() {
noiseScript.set_sizeX((float) sizeX);
noiseScript.set_sizeY((float) sizeY);
noiseScript.set_ratio((float) sizeX / (float) sizeY);
noiseScript.forEach_root(allOut);
allOut.copyTo(outBitmap);
return outBitmap;
}
}
If I change it to FilterScript, following this help (https://stackoverflow.com/a/14942723/4420543), the result is several hundred milliseconds worse with the support library and about twice as slow with the non-support one. The precision did not influence the results.
I have also checked every question on Stack Overflow, but most of them are outdated, and I have also tried it on a Nexus 5 (OS version 7.1.1) among several other new devices, but the problem still remains.
So, when does RenderScript run on the GPU? It would be enough if someone could give me an example of a RenderScript that runs on the GPU.
Can you try to run it with rs_fp_relaxed instead of rs_fp_full?
#pragma rs_fp_relaxed
rs_fp_full will force your script to run on the CPU, since most GPUs don't support full-precision floating-point operations.
I can agree with your guess.
On a Nexus 7 (2013, Jelly Bean 4.3) I wrote a RenderScript and a FilterScript, respectively, to calculate the famous Mandelbrot set.
Compared to an OpenGL fragment shader doing the same thing (all with 32-bit floats), the scripts were about 3 times slower. I assume OpenGL uses the GPU where RenderScript (and FilterScript!) do not.
Then I compared camera preview conversion (NV21 format -> RGB) with a RenderScript, a FilterScript and the ScriptIntrinsicYuvToRGB, respectively.
Here the intrinsic is about 4 times faster than the self-written scripts.
Again I see no difference in performance between RenderScript and FilterScript. In this case I assume the self-written scripts again use the CPU only, whereas the intrinsic makes use of the GPU (too?).

Android Renderscript, remove white/whiteish background from Bitmap

I'm working on a picture based app and I'm blocked on an issue with Renderscript.
My purpose is pretty simple in theory: I want to remove the white background from the images loaded by the user, to show them on another image I've set as background. More specifically, what I want to do is simulate the effect of printing a user-uploaded graphic on a paper canvas (also a picture) with a realistic effect.
I cannot assume the user is able to upload nice PNGs with alpha channels, and one of the requirements is to operate on JPGs.
I've been trying to solve this with RenderScript, with something like the following, which sets alpha to 0 for anything with R, G and B all equal to or greater than 240:
#pragma version(1)
#pragma rs java_package_name(mypackagename)
rs_allocation gIn;
rs_allocation gOut;
rs_script gScript;
const static float th = 239.f/256.f;
void root(const uchar4 *v_in, uchar4 *v_out, const void* usrData, uint32_t x, uint32_t y){
float4 f4 = rsUnpackColor8888(*v_in);
if(f4.r > th && f4.g > th && f4.b > th)
{
f4.a = 0;
}
*v_out = rsPackColorTo8888(f4);
}
void filter() {
rsForEach(gScript, gIn, gOut);
}
but the results are not satisfactory, for two main reasons:
if a photo has a whitish gradient that is not part of the background, the script causes an ugly noise effect
images with shadows close to the edges get a noise effect close to the edges
I understand that jumping from alpha 0 to alpha 1 is too extreme, and I've tried different solutions involving linearly increasing the alpha as the sum of the R, G, B components decreases, but I still get noisy pixels and blocks around.
With plain white, or a regular background (e.g. a snapshot of the Google home page), it works perfectly, but with photos it's very far from anything acceptable.
I think that if I were able to process one "line" of pixels or one "block" of pixels instead of a single one, it would be easier to detect flat backgrounds and to avoid hitting gradients, but I don't know enough about RenderScript to do that.
Can anyone point me in the right direction?
PS
I can't use PorterDuff and multiply, because the background and the foreground have different dimensions, and moreover I need to be able to drag the uploaded image around the background canvas once the effect is applied. If I multiplied the image with a region of the background, moving the resulting image around would cause a section of the background to move around as well.
If I understand it right, you want to determine whether the current pixel is part of a white background based on a line/block of neighboring pixels.
You can try using rsGetElementAt. For example, to process a line in your original code:
#pragma version(1)
#pragma rs java_package_name(mypackagename)
rs_allocation gIn;
rs_allocation gOut;
rs_script gScript;
const static float th = 239.f/256.f;
void root(const uchar4 *v_in, uchar4 *v_out, const void* usrData, uint32_t x, uint32_t y){
float4 f4 = rsUnpackColor8888(*v_in);
uint32_t width = rsAllocationGetDimX(gIn);
// E.g: Processing a line from x to x+5.
bool isBackground = true;
for (uint32_t i=0; i<=5 && x+i<width; i++) {
uchar4 nPixel_u4 = rsGetElementAt_uchar4(gIn, x+i, y);
float4 nPixel_f4 = rsUnpackColor8888(nPixel_u4);
if(nPixel_f4.r <= th || nPixel_f4.g <= th || nPixel_f4.b <= th) {
isBackground = false;
break;
}
}
if (isBackground) {
f4.a = 0.0f;
*v_out = rsPackColorTo8888(f4);
}
}
void filter() {
rsForEach(gScript, gIn, gOut);
}
This is just a naive example of how you can use rsGetElementAt to get data from a given position in a global Allocation. There is a corresponding rsSetElementAt for saving data to a global Allocation. I hope it helps your project.

Native Android multi-threaded odd occurrence

I've been working on this game at the native Android/NDK level. To start off with I had only a single texture, but as my texture count hit 5, my fps slowly dropped from around 60 to about 20 (with stutters).
Currently I'm performing all my operations on a single thread. On introducing another thread using POSIX threads with a start_routine (which loops infinitely and has no implementation), my fps seemed to jump to about 40 for no apparent reason.
Another point here is that after introducing that thread, the FPS was stable at 42-43, whereas without the thread there were stutters (18-28 fps) causing jerky animation.
My doubts:
Why is the above-mentioned (thread-related) behavior happening?
Also, the only difference from when I was using 1 texture is that the calculations in my fragment shader are heavier now. Does that mean the GPU is being overloaded and hence glSwapBuffers is taking more time?
Assuming glSwapBuffers does take time, does that mean my game logic is always going to run ahead of my renderer?
How exactly do I go about feeding the render thread with the information needed to render a frame? As in, do I make the render thread wait on a queue which is fed by my game-logic thread? (Code related)
Code :
void * start_render (void * param)
{
while (1) {
}
return NULL;
}
void android_main(struct android_app* state) {
// Creating this thread increased my FPS to around 40, even though start_render wasn't doing anything
pthread_t renderthread;
pthread_create(&renderthread,NULL,start_render,NULL);
struct engine engine;
memset(&engine, 0, sizeof(engine));
state->userData = &engine;
state->onAppCmd = engine_handle_cmd;
state->onInputEvent = engine_handle_input;
engine.assetManager = state->activity->assetManager;
engine.app = state;
engine.texsize = 4;
if (state->savedState != NULL) {
// We are starting with a previous saved state; restore from it.
engine.state = *(struct saved_state*)state->savedState;
}
// loop waiting for stuff to do.
while (1) {
// Read all pending events.
int ident;
int events;
struct android_poll_source* source;
// If not animating, we will block forever waiting for events.
// If animating, we loop until all events are read, then continue
// to draw the next frame of animation.
while ((ident=ALooper_pollAll(engine.animating ? 0 : -1, NULL, &events,
(void**)&source)) >= 0) {
// Process this event.
if (source != NULL) {
source->process(state, source);
}
// Check if we are exiting.
if (state->destroyRequested != 0) {
engine_term_display(&engine);
return;
}
}
if (engine.animating) {
for (int i = 0; i < 4;i++)
{
float cur = engine.mytextures[i].currentposition;
if (cur < 1.0)
engine.mytextures[i].currentposition = cur + engine.mytextures[i].relativespeed;
else
engine.mytextures[i].currentposition = cur - 1.0;
}
// How do I enable the render thread (created above) to call the function below?
on_draw_frame(&engine);
}
}
}
void on_draw_frame(engine * engine) {
glUseProgram(program);
engine->texsize = 4;
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[0].textureid);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[1].textureid);
glActiveTexture(GL_TEXTURE2);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[2].textureid);
glActiveTexture(GL_TEXTURE3);
glBindTexture(GL_TEXTURE_2D, engine->mytextures[3].textureid);
glUniform1i(u_texture_unit_location1,0);
glUniform1i(u_texture_unit_location2,1);
glUniform1i(u_texture_unit_location3,2);
glUniform1i(u_texture_unit_location4,3);
glUniform1f(timeCoord1,engine->mytextures[0].currentposition);
glUniform1f(timeCoord2,engine->mytextures[1].currentposition);
glUniform1f(timeCoord3,engine->mytextures[2].currentposition);
glUniform1f(timeCoord4,engine->mytextures[3].currentposition);
glUniform1i(texSize,engine->texsize);
glBindBuffer(GL_ARRAY_BUFFER, buffer);
glVertexAttribPointer(a_position_location, 2, GL_FLOAT, GL_FALSE,
4 * sizeof(GL_FLOAT), BUFFER_OFFSET(0));
glVertexAttribPointer(a_texture_coordinates_location, 2, GL_FLOAT, GL_FALSE,
4 * sizeof(GL_FLOAT), BUFFER_OFFSET(2 * sizeof(GL_FLOAT)));
glEnableVertexAttribArray(a_position_location);
glEnableVertexAttribArray(a_texture_coordinates_location);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glBindBuffer(GL_ARRAY_BUFFER, 0);
eglSwapBuffers(engine->display, engine->surface);
// FPS calculation
if (fps == 0)
clock_gettime(CLOCK_MONOTONIC, &starttime);
else
clock_gettime (CLOCK_MONOTONIC,&stoptime);
if (stoptime.tv_sec - starttime.tv_sec == 1) {
__android_log_print(ANDROID_LOG_VERBOSE, "GAME", "FPS %d",fps);
fps = 0;
} else
fps++;
}
Let me know if you need more information regarding the code.
I can't be completely certain, but this looks a lot like bad effects of power management on the device.
The symptoms you describe can be caused by a power management strategy that focuses on CPU usage. With a strategy like this, it can happen that if you have very low CPU usage (because you're mostly GPU limited), the whole system goes to a lower power state, and effectively slows down the GPU, even though the GPU is fully loaded.
In this situation, when you add additional CPU load by starting another thread that burns CPU time, you keep the system in a higher power state, and allow the GPU to run faster.
This kind of power management is completely broken, IMHO. Slowing down the GPU if it's fully busy just because CPU utilization is low does not make any sense to me. But power management on some devices is very primitive, so this kind of behavior is not uncommon.
If this is indeed your problem, there's not much you can do about it as an application developer, beyond filing bugs. Creating artificial CPU load to work around it is of course not satisfying. Using more power to defeat power management is not exactly what you want. Many games will probably generate a significant amount of CPU load to handle their game logic/physics, so they would not be affected.
