I am new to artificial intelligence and am using the TensorFlow Object Detection API to detect a product in images. It already detects the object, but I want to get the coordinates Xmax, Xmin, Ymax, and Ymin for each object in the images.
This is the image with the detections; in this case, 2 objects were detected.
Image:
I got the coordinates of the objects, but the output is not clear: it contains more than 3 rows of coordinates, and I only want as many rows as there are objects in the image.
This is the code that produces the output:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        print(detection_graph.get_tensor_by_name('detection_boxes:0'))
        for image_path in TEST_IMAGE_PATHS:
            boxes = detect_objects(image_path)
            print(boxes)
Output
Tensor("detection_boxes:0", dtype=float32)
[[[0.16593058 0.06630109 0.8009524 0.5019088 ]
[0.15757088 0.5376015 0.8869156 0.9394863 ]
[0.5966009 0.88420665 0.6564093 0.9339011 ]
...
[0. 0. 0. 0. ]
[0. 0. 0. 0. ]
[0. 0. 0. 0. ]]]
I want to get something like the following: only the bounding-box coordinates of the detected objects. I am assuming that these rows are the coordinates of the objects.
[0.16593058 0.06630109 0.8009524 0.5019088 ]
[0.15757088 0.5376015 0.8869156 0.9394863 ]
Answers
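You should be aware of two things:
1. The model returns a fixed number of detections, sorted by score, no matter how many objects are in the image; the low-scoring and all-zero rows are not real objects.
2. The box coordinates are normalized to the range [0, 1] rather than given in pixels.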
Therefore, to filter the detections by score, use detection_scores to decide which indices to drop (they are sorted), and multiply the normalized coordinates by the original image size to get the absolute coordinates. The normalized coordinates are given in the format [ymin, xmin, ymax, xmax], so multiply the first and third coordinates by y_size and the second and fourth by x_size. You can compute x_size and y_size by evaluating the shape of image_tensor.
Code:
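A minimal sketch of these steps using NumPy, not a definitive implementation. It assumes a batch of one image and that boxes and scores are the already evaluated detection_boxes and detection_scores arrays (for example, from sess.run). The helper name boxes_to_pixels, the 0.5 score threshold, the example scores, and the 480x640 image size are all hypothetical; only the box values come from the output in the question.

```python
import numpy as np


def boxes_to_pixels(boxes, scores, y_size, x_size, min_score=0.5):
    """Keep detections scoring at least min_score and convert them to pixels.

    boxes  : normalized [ymin, xmin, ymax, xmax] rows, shape (1, N, 4) or (N, 4)
    scores : detection scores, shape (1, N) or (N,)
    Returns an integer array of [xmin, ymin, xmax, ymax] rows in pixels.
    """
    # Drop the batch dimension (assumes a single image per batch).
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float64).reshape(-1)

    # Low-scoring and zero-padded rows are filtered out here.
    keep = scores >= min_score
    ymin, xmin, ymax, xmax = boxes[keep].T

    # y values scale with the image height, x values with the image width.
    pixels = np.stack(
        [xmin * x_size, ymin * y_size, xmax * x_size, ymax * y_size], axis=1
    )
    return np.rint(pixels).astype(int)


# Box values taken from the output in the question.
boxes = np.array([[
    [0.16593058, 0.06630109, 0.8009524, 0.5019088],
    [0.15757088, 0.5376015, 0.8869156, 0.9394863],
    [0.5966009, 0.88420665, 0.6564093, 0.9339011],
    [0.0, 0.0, 0.0, 0.0],
]])
# Hypothetical scores: the question does not show detection_scores.
scores = np.array([[0.98, 0.95, 0.12, 0.0]])

# Hypothetical image size: 480 pixels high, 640 pixels wide.
pixel_boxes = boxes_to_pixels(boxes, scores, y_size=480, x_size=640)
print(len(pixel_boxes))      # number of detections kept
print(pixel_boxes.tolist())  # [[42, 80, 321, 384], [344, 76, 601, 426]]
```

With a real image loaded as a NumPy array of shape (height, width, channels), y_size and x_size could be read from its first two dimensions instead of being hard-coded.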