I am trying to extract the user’s silhouette and draw it on top of my images. I was able to make a mask and cut the user out of the RGB image, but the contour is messy.
The question is how I can make the mask more precise (so it fits the real user). I’ve tried the ERODE and DILATE filters, but they don’t do much. Maybe I need some kind of feather filter like in Photoshop, or something else I don’t know about.
Here is my code:
import SimpleOpenNI.*;

SimpleOpenNI context;
PImage mask;

void setup()
{
  size(640*2, 480);
  context = new SimpleOpenNI(this);
  if (context.isInit() == false)
  {
    exit();
    return;
  }
  context.enableDepth();
  context.enableRGB();
  context.enableUser();
  context.alternativeViewPointDepthToImage();
}

void draw()
{
  frame.setTitle(int(frameRate) + " fps");
  context.update();
  int[] userMap = context.userMap();
  background(0, 0, 0);
  mask = loadImage("black640.jpg"); //just a black image
  int xSize = context.depthWidth();
  int ySize = context.depthHeight();
  mask.loadPixels();
  for (int y = 0; y < ySize; y++) {
    for (int x = 0; x < xSize; x++) {
      int index = x + y*xSize;
      if (userMap[index]>0) {
        mask.pixels[index]=color(255, 255, 255);
      }
    }
  }
  mask.updatePixels();
  image(mask, 0, 0);
  mask.filter(DILATE);
  mask.filter(DILATE);
  PImage rgb = context.rgbImage();
  rgb.mask(mask);
  image(rgb, context.depthWidth() + 10, 0);
}
2 Answers
I've tried the built-in erode, dilate and blur filters in Processing, but they are very inefficient. Every time I increment blurAmount in img.filter(BLUR, blurAmount), my frame rate drops by about 5 fps. So I decided to try OpenCV instead. It performs much better, and the result is satisfactory.
It’s good you’re aligning the RGB and depth streams.
There are a few things that could be improved in terms of efficiency:
There’s no need to reload a black image every single frame (in the draw() loop), since you’re modifying all the pixels anyway. Load or create it once in setup() and reuse it.
Also, since you don’t need the x,y coordinates as you loop through the user data, you can use a single for loop over the array instead of the two nested x/y loops, which should be a bit faster.
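To illustrate both points, here is a minimal sketch in plain Java rather than a full Processing sketch. The class UserMaskBuilder and its fillMask() method are hypothetical names, not part of SimpleOpenNI. It assumes userMap holds one value per depth pixel (greater than 0 where a user was detected, as in the question's code) and that maskPixels is a buffer allocated once, for example the pixels array of an image made with createImage() in setup().

```java
// Hypothetical helper, not part of SimpleOpenNI: builds a black/white mask
// from a user map with a single pass over the flat array.
class UserMaskBuilder {
    static final int WHITE = 0xFFFFFFFF; // opaque white: user pixel
    static final int BLACK = 0xFF000000; // opaque black: background pixel

    // userMap: one entry per depth pixel, > 0 where a user was detected.
    // maskPixels: buffer allocated once and reused on every frame.
    static void fillMask(int[] userMap, int[] maskPixels) {
        for (int i = 0; i < userMap.length; i++) {
            // Every pixel is written, so nothing stale survives from the
            // previous frame and no fresh black image is needed.
            maskPixels[i] = userMap[i] > 0 ? WHITE : BLACK;
        }
    }

    public static void main(String[] args) {
        int[] userMap = {0, 1, 1, 0, 2, 0};
        int[] mask = new int[userMap.length]; // allocated once
        fillMask(userMap, mask);
        for (int p : mask) {
            System.out.print(p == WHITE ? '#' : '.');
        }
        System.out.println();
    }
}
```

Note that the loop writes black as well as white. Once the image is no longer reloaded each frame, the background pixels have to be reset explicitly, otherwise the silhouette from earlier frames would linger in the mask.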
Another hacky thing you could do is retrieve the userImage() from SimpleOpenNI, instead of the userMap() data, and apply a THRESHOLD filter to it, which in theory should give you the same result as above.
In terms of filtering, if you want to shrink the silhouette you should ERODE, and blurring should give you a bit of that Photoshop-like feathering.
Note that some filter() calls take arguments (like BLUR), but others, like the ERODE/DILATE morphological filters, don’t. You can still roll your own loops to deal with that.
I also recommend having some sort of easy-to-tweak interface (it can be a fancy slider or a simple keyboard shortcut) when playing with filters.
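As a rough illustration of "rolling your own loops", the plain Java sketch below works on a grayscale mask stored as one int per pixel (0 = background, 255 = user). MaskFilters and its methods are hypothetical helpers, not Processing or SimpleOpenNI API. The erosion uses a simple 4-neighbour minimum, and the feathering uses a box blur rather than the blur Processing's BLUR filter applies, so treat it as a stand-in for the idea rather than a drop-in replacement.

```java
// Hypothetical helpers working on a grayscale mask stored row by row,
// one int per pixel (0 = background, 255 = user).
class MaskFilters {

    // One erosion pass: each pixel takes the minimum of itself and its
    // four direct neighbours, so the white silhouette shrinks by one pixel.
    static int[] erode(int[] src, int w, int h) {
        int[] dst = new int[src.length];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int min = src[x + y * w];
                if (x > 0)     min = Math.min(min, src[(x - 1) + y * w]);
                if (x < w - 1) min = Math.min(min, src[(x + 1) + y * w]);
                if (y > 0)     min = Math.min(min, src[x + (y - 1) * w]);
                if (y < h - 1) min = Math.min(min, src[x + (y + 1) * w]);
                dst[x + y * w] = min;
            }
        }
        return dst;
    }

    // Erosion has no strength argument here, so repeat the pass in a loop.
    // The input array is left untouched.
    static int[] erodeTimes(int[] src, int w, int h, int passes) {
        int[] out = src.clone();
        for (int i = 0; i < passes; i++) {
            out = erode(out, w, h);
        }
        return out;
    }

    // Box blur: each pixel becomes the average of the (2r+1) x (2r+1)
    // window around it, using only the pixels that lie inside the image.
    // This turns the hard mask edge into a soft gradient (feathering).
    static int[] boxBlur(int[] src, int w, int h, int r) {
        int[] dst = new int[src.length];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = 0;
                int count = 0;
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int nx = x + dx;
                        int ny = y + dy;
                        if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
                        sum += src[nx + ny * w];
                        count++;
                    }
                }
                dst[x + y * w] = sum / count;
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int w = 5, h = 5;
        int[] mask = new int[w * h];
        for (int y = 1; y <= 3; y++) {
            for (int x = 1; x <= 3; x++) mask[x + y * w] = 255;
        }
        // Shrink by one pixel, then feather the edge.
        int[] feathered = boxBlur(erodeTimes(mask, w, h, 1), w, h, 1);
        for (int y = 0; y < h; y++) {
            StringBuilder row = new StringBuilder();
            for (int x = 0; x < w; x++) {
                row.append(String.format("%4d", feathered[x + y * w]));
            }
            System.out.println(row);
        }
    }
}
```

Eroding before blurring pulls the edge inside the noisy depth contour first, so the soft gradient ends up over the user rather than over the background. The naive box blur costs more as the radius grows, which is worth keeping in mind at 640x480 per frame.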
Here’s a rough attempt at the refactored sketch with the above comments:
Unfortunately I can’t test with an actual sensor right now, so please use the concepts explained, but bear in mind that the full sketch code isn’t tested.
The above sketch (if it runs) should allow you to use keys to control the filter parameters (e/E to decrease/increase erosion, d/D for dilation, b/B for blur). Hopefully you’ll get satisfactory results.
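The key mapping described above could be handled along these lines. This is only a sketch in plain Java: FilterParams, its fields and handleKey() are hypothetical names, and it assumes you would call handleKey(key) from Processing's keyPressed() and then read the three values when applying the filters in draw().

```java
// Hypothetical parameter holder; names are illustrative only.
class FilterParams {
    int erodePasses = 0;   // how many erosion passes to apply
    int dilatePasses = 0;  // how many dilation passes to apply
    int blurAmount = 0;    // blur strength

    // Lowercase key decreases a value, uppercase increases it.
    // Values never drop below zero; unknown keys are ignored.
    void handleKey(char key) {
        switch (key) {
            case 'e': erodePasses = Math.max(0, erodePasses - 1); break;
            case 'E': erodePasses++; break;
            case 'd': dilatePasses = Math.max(0, dilatePasses - 1); break;
            case 'D': dilatePasses++; break;
            case 'b': blurAmount = Math.max(0, blurAmount - 1); break;
            case 'B': blurAmount++; break;
            default: break;
        }
    }

    public static void main(String[] args) {
        FilterParams params = new FilterParams();
        for (char c : "EEeBD".toCharArray()) {
            params.handleKey(c);
        }
        System.out.println("erode=" + params.erodePasses
            + " dilate=" + params.dilatePasses
            + " blur=" + params.blurAmount);
    }
}
```

Keeping the parameters in one place makes it easy to print them on screen while tweaking, so you can note down the combination that works for your setup.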
When working with SimpleOpenNI in general, I advise recording an .oni file (check out the RecorderPlay example for that) of a person for the most common use case. This will save you some time in the long run when testing and will allow you to work remotely with the sensor detached. One thing to bear in mind: the depth resolution is reduced to half on recordings (but using a usingRecording boolean flag should keep things safe).
The last and probably most important point is about the quality of the end result. Your resulting image can’t be that much better if the source image isn’t easy to work with to begin with. The depth data from the original Kinect sensor isn’t great. The Asus sensors feel a wee bit more stable, but still the difference is negligible in most cases. If you are going to stick to one of these sensors, make sure you’ve got a clear background and decent lighting, without too much direct warm light (sunlight, incandescent lightbulbs, etc.), since it may interfere with the sensor.
If you want a more accurate user cut and the above filtering doesn’t get the results you’re after, consider switching to a better sensor like the Kinect V2. The depth quality is much better and the sensor is less susceptible to direct warm light. This may mean you need to use Windows (I see there’s a KinectPV2 wrapper available) or openFrameworks (a C++ collection of libraries similar to Processing) with ofxKinectV2.