I’m developing a project which requires complex Photoshop type blend effects. I am using custom RenderScript
scripts to solve this.
I’ve been testing it on a Samsung Galaxy S4 device running KitKat, and everything works well and runs very quickly.
I then tried testing it on a Nexus 5 running Lollipop, and I noticed a sudden drop in performance.
I started timing separate sections in the code to see which parts slow down, and came up with this:
Allocation.createFromBitmap
- Runtime on Kitkat - ~5-10 millisec
- Runtime on Lollipop - ~100-150 millisec
mRenderScript.destroy()
- Runtime on Kitkat - ~1-3 millisec
- Runtime on Lollipop - ~60-100 millisec
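Measurements like the ones above can be collected with a small wall-clock helper. The following is only a minimal sketch in plain Java; the class name SectionTimer and the method timeMillis are hypothetical and not taken from the original project. On a device, the Runnable would wrap the call being measured, such as Allocation.createFromBitmap or destroy().

```java
/** Hypothetical helper for timing one section of code. */
final class SectionTimer {
    private SectionTimer() {}

    /** Runs the section once and returns the elapsed wall-clock time in milliseconds. */
    static double timeMillis(Runnable section) {
        long start = System.nanoTime();   // monotonic clock, suitable for measuring intervals
        section.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }
}
```

A single run can be noisy, so it is probably worth timing each call several times and comparing the range, as the figures above do.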
I’m curious as to why there is a sudden drop in performance when creating Allocation objects and destroying RenderScript objects on a device which should be stronger, and on an OS which should be more advanced.
Is there anything I can do specific to API 21 OS’s which can make these methods run faster?
Has anyone even encountered this issue or can reproduce it?
I should note that the actual running of the script (i.e., the ScriptC.forEach method) runs very fast on both devices / OS’s. Also, I am using the native RenderScript APIs and not any support libraries.
Any input would be appreciated.
Edit:
I copied here the relevant method from Allocation.java in Android's Lollipop-release source code on GitHub:
static public Allocation createFromBitmap(RenderScript rs, Bitmap b) {
    if (rs.getApplicationContext().getApplicationInfo().targetSdkVersion >= 18) {
        return createFromBitmap(rs, b, MipmapControl.MIPMAP_NONE,
                                USAGE_SHARED | USAGE_SCRIPT | USAGE_GRAPHICS_TEXTURE);
    }
    return createFromBitmap(rs, b, MipmapControl.MIPMAP_NONE,
                            USAGE_GRAPHICS_TEXTURE);
}
Notice how when the target SDK is higher than 17, the Allocation is created by default with the USAGE_SHARED
flag. Could it be that these extra flags are causing the problem? Should I be using the USAGE_GRAPHICS_TEXTURE
flag instead?
Edit 2
Following R. Jason Sam's advice, I ran the following command while the Nexus 5 was connected to my computer:
adb shell setprop debug.rs.default-CPU-driver 1
After this, the two functions run significantly faster (~30-40 millisec and ~20-50 millisec respectively). That is still not as fast as on pre-Lollipop devices, but it is within an acceptable performance range.
My only issue is that, unless I am misunderstanding something, this cannot be considered a real solution, since it would require me to run this command on every problematic device before running the app on it.
Is there anything I can do in my code which can simulate this adb call?
Final Edit
Okay, so it seems like the issue stemmed from the fact that I was creating a new RenderScript object each time I called the function which performs the blend effect using RenderScript.
I did some code refactoring and now, instead of creating a new RenderScript object on each call to the effect method, I reuse the same one each time. The first creation of the RenderScript object still takes much longer on Lollipop devices, but the problem is mitigated now since I continue to reuse the same object throughout multiple method calls.
I will add this as an answer.
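The refactoring described above follows a create-once, reuse, destroy-at-the-end pattern. Here is a minimal sketch of that pattern in plain Java, not the actual code from the project. The class SharedResource and its methods are hypothetical names. On Android, the factory would presumably be something like () -> RenderScript.create(context) and the disposer RenderScript::destroy.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical holder: builds an expensive object on first use and reuses it afterwards. */
final class SharedResource<T> {
    private final Supplier<T> factory;   // assumed on Android: () -> RenderScript.create(context)
    private final Consumer<T> disposer;  // assumed on Android: RenderScript::destroy
    private T instance;

    SharedResource(Supplier<T> factory, Consumer<T> disposer) {
        this.factory = factory;
        this.disposer = disposer;
    }

    /** Returns the shared instance, paying the creation cost only on the first call. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Disposes of the instance once it is no longer needed; safe to call repeatedly. */
    synchronized void release() {
        if (instance != null) {
            disposer.accept(instance);
            instance = null;
        }
    }
}
```

With this shape, each call to the effect method would use get() instead of creating a new object, and release() would be called once, when the effects are no longer needed.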
2 Answers
It seems like the issue stemmed from the fact that I was creating a new RenderScript object each time I called the function which performs the blend effect using RenderScript.
I did some code refactoring and now, instead of creating a new RenderScript object on each call to the effect method, I reuse the same one each time. The first creation of the RenderScript object still takes much longer on Lollipop devices, but the problem is mitigated now since I continue to reuse the same object throughout multiple method calls.
I make sure to call destroy() on the shared RenderScript object once I am sure I no longer need it, to ensure there are no memory leaks.
According to this post, it seems that it's fair practice to reuse RenderScript objects rather than creating a new one each time, but I would be glad to hear input from other people regarding their experience on the matter. It's a shame there's not much documentation on this topic online, but so far, everything seems to be working well across multiple devices / OS's.
There was an Allocation type that was added in API 18 (USAGE_SHARED). If you are forcing RenderScript to copy the bitmap backing memory (instead of using it in place), that might account for the difference.