I am using CoreML with the DeepLabV3 model to remove the background from an image: https://developer.apple.com/machine-learning/models/

This works well for removing the background from photos where the subject is a person, dog, or car, but in other cases, such as skylines and some objects on a table (please see the example images), it fails to separate the object from the background.

Should I be using a different method for this?

Thank you

var imageSegmentationModel = DeepLabV3()
var request: VNCoreMLRequest?

func setUpModel() {
    if let visionModel = try? VNCoreMLModel(for: imageSegmentationModel.model) {
        request = VNCoreMLRequest(model: visionModel, completionHandler: visionRequestDidComplete)
        // .scaleFill stretches the input to the model's fixed input size;
        // the mask is scaled back to the original image size later.
        request?.imageCropAndScaleOption = .scaleFill
    } else {
        fatalError("Failed to create VNCoreMLModel")
    }
}

func predict() {
    DispatchQueue.global(qos: .userInitiated).async {
        // Avoid force-unwrapping: bail out if there is no request or no image.
        guard let request = self.request,
              let cgImage = self.originalImage?.cgImage else { return }
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
        } catch {
            print(error)
        }
    }
}

func visionRequestDidComplete(request: VNRequest, error: Error?) {
    DispatchQueue.main.async {
        guard let observations = request.results as? [VNCoreMLFeatureValueObservation],
              let segmentationMap = observations.first?.featureValue.multiArrayValue,
              let originalImage = self.originalImage else { return }

        // image(min:max:) and resizedImage(for:) are helper extensions
        // (e.g. from CoreMLHelpers), not part of UIKit or Core ML.
        self.maskImage = segmentationMap.image(min: 0, max: 255)
        self.maskImage = self.maskImage?.resizedImage(for: originalImage.size)

        if let image = self.maskOriginalImage() {
            self.outputImageView.image = image
        }
    }
}

func maskOriginalImage() -> UIImage? {
    // Build a CGImage mask from the segmentation mask and apply it to the
    // original image; guard lets replace the chain of force unwraps.
    guard let maskReference = self.maskImage?.cgImage,
          let originalReference = self.originalImage?.cgImage,
          let dataProvider = maskReference.dataProvider,
          let imageMask = CGImage(maskWidth: maskReference.width,
                                  height: maskReference.height,
                                  bitsPerComponent: maskReference.bitsPerComponent,
                                  bitsPerPixel: maskReference.bitsPerPixel,
                                  bytesPerRow: maskReference.bytesPerRow,
                                  provider: dataProvider,
                                  decode: nil,
                                  shouldInterpolate: true),
          let maskedReference = originalReference.masking(imageMask) else {
        return nil
    }
    return UIImage(cgImage: maskedReference)
}

Good Image Example

Bad Image Example

3 Answers


  1. If you look at the supported labels for the DeepLabV3 model, they include ["background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningTable", "dog", "horse", "motorbike", "person", "pottedPlant", "sheep", "sofa", "train", "tvOrMonitor"]. The model is not trained to segment a MacBook.

    You can either train your own model that includes the object classes you need, or look at saliency, which does a decent job of detecting the primary object in an image; a sketch using Vision's saliency request follows below.

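    A minimal sketch of the saliency approach, using Vision's objectness-based saliency request (the function name and the cgImage parameter are illustrative, not from the question's code):

    import Vision

    func detectSalientObject(in cgImage: CGImage) {
        // Unlike DeepLabV3, saliency is class-agnostic: it highlights the most
        // prominent object(s) in the image regardless of what they are.
        let request = VNGenerateObjectnessBasedSaliencyImageRequest()
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
            if let observation = request.results?.first as? VNSaliencyImageObservation {
                // Low-resolution heat map; brighter pixels are more salient.
                let heatMap: CVPixelBuffer = observation.pixelBuffer
                // Bounding boxes (in normalized coordinates) of salient objects.
                let salientRects = observation.salientObjects ?? []
                print(heatMap, salientRects)
            }
        } catch {
            print(error)
        }
    }
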
  2. Similarly, you can look at the output labels of the SqueezeNet model: those 999 labels are the objects on which that model was trained.

    If you wish, you can train your own custom model using Create ML, which comes integrated with Xcode (see the sketch below).

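    If you go the Create ML route, here is a minimal sketch (it runs on macOS, e.g. in a playground; the paths are placeholders). Note that MLImageClassifier produces a whole-image classifier, not a segmentation model, so on its own it will not give you a pixel mask:

    import CreateML
    import Foundation

    // Training data layout: one subfolder per label, each containing
    // example images of that label.
    let trainingDir = URL(fileURLWithPath: "/path/to/TrainingData")

    let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingDir))
    print(classifier.trainingMetrics)

    // Save the model, then drag the .mlmodel file into your Xcode project.
    try classifier.write(to: URL(fileURLWithPath: "/path/to/ObjectClassifier.mlmodel"))
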
  3. You can use a salient object detection neural network (the sketch below shows how to apply a saliency map as a mask).

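    To turn a saliency result into an actual background removal, one option is to upscale the saliency heat map and use it as an alpha mask with Core Image's CIBlendWithMask filter. A sketch under that assumption (the function name is illustrative; saliencyMap is the pixel buffer from a VNSaliencyImageObservation, as in the earlier sketch):

    import CoreImage

    // Composites `image` over a transparent background, keeping only the
    // regions the saliency heat map marks as salient.
    func removeBackground(from image: CIImage, using saliencyMap: CVPixelBuffer) -> CIImage? {
        var mask = CIImage(cvPixelBuffer: saliencyMap)
        // The heat map is small (e.g. 68x68); scale it up to the image size.
        mask = mask.transformed(by: CGAffineTransform(
            scaleX: image.extent.width / mask.extent.width,
            y: image.extent.height / mask.extent.height))

        let clearBackground = CIImage(color: .clear).cropped(to: image.extent)
        guard let filter = CIFilter(name: "CIBlendWithMask") else { return nil }
        filter.setValue(image, forKey: kCIInputImageKey)
        filter.setValue(clearBackground, forKey: kCIInputBackgroundImageKey)
        filter.setValue(mask, forKey: kCIInputMaskImageKey)
        return filter.outputImage
    }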