Consider the following image:
enter image description here

and the following bounding contour (a smoothed version of the output of a text-detection neural network run on the image above), which is given:
enter image description here

I need to warp both images so that I end up with a text line that is straight enough to be fed to a text-recognition neural network:
enter image description here

using Piecewise Affine Transformation or some other method, ideally with an implementation in Python, or at least the key points of one.

I know how to find the medial axis, order its points, simplify it (e.g. using the Douglas-Peucker algorithm), and find the corresponding points on a straight line.

EDIT: the question can be rephrased, naively, as follows:
Have you tried the "puppet warp" feature in Adobe Photoshop? You specify "joint" points on an image and move them to the desired positions to warp the image. We can calculate the source points from a simplified medial axis (e.g. 20 points instead of 200) and the corresponding target points on a straight line. How can a Piecewise Affine Transformation be performed using these two sets of points (source and target)?

EDIT: modified the images, my bad


Here’s a paper that achieves the desired result:
A Novel Technique for Unwarping Curved Handwritten Texts Using Mathematical Morphology and Piecewise Affine Transformation

Another paper: A novel method for straightening curved text-lines in stylistic documents




  1. Chosen as BEST ANSWER

    Full code is also available in this notebook; use Runtime -> Run all to reproduce the result.

    import cv2
    import matplotlib.pyplot as plt
    import numpy as np
    from PIL import Image
    from scipy import interpolate
    from scipy.spatial import distance
    from shapely.geometry import LineString, GeometryCollection, MultiPoint
    from skimage.morphology import skeletonize
    from sklearn.decomposition import PCA
    from warp import PiecewiseAffineTransform  # warp.py: the helper code linked in step 5 below (assumed to be a local file)
    # Helper functions
    def extendline(line, length):
        a = line[0]
        b = line[1]
        lenab = distance.euclidean(a, b)
        cx = b[0] + ((b[0] - a[0]) / lenab * length)
        cy = b[1] + ((b[1] - a[1]) / lenab * length)
        return [cx, cy]
    def XYclean(x, y):
        xy = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1)), axis=1)
        # make PCA object
        pca = PCA(2)
        # fit on data
        pca.fit(xy)
        # transform into PCA space
        xypca = pca.transform(xy)
        newx = xypca[:, 0]
        newy = xypca[:, 1]
        # sort
        indexSort = np.argsort(x)
        newx = newx[indexSort]
        newy = newy[indexSort]
        # add some more points (optional)
        f = interpolate.interp1d(newx, newy, kind='linear')
        newX = np.linspace(np.min(newx), np.max(newx), 100)
        newY = f(newX)
        # #smooth with a filter (optional)
        # window = 43
        # newY = savgol_filter(newY, window, 2)
        # return back to old coordinates
        xyclean = pca.inverse_transform(np.concatenate((newX.reshape(-1, 1), newY.reshape(-1, 1)), axis=1))
        xc = xyclean[:, 0]
        yc = xyclean[:, 1]
        return np.hstack((xc.reshape(-1, 1), yc.reshape(-1, 1))).astype(int)
    def contour2skeleton(cnt):
        x, y, w, h = cv2.boundingRect(cnt)
        cnt_trans = cnt - [x, y]
        bim = np.zeros((h, w))
        bim = cv2.drawContours(bim, [cnt_trans], -1, color=255, thickness=cv2.FILLED) // 255
        sk = skeletonize(bim > 0)
        skeleton_yx = np.argwhere(sk > 0)
        skeleton_xy = np.flip(skeleton_yx, axis=None)
        xx, yy = skeleton_xy[:, 0], skeleton_xy[:, 1]
        skeleton_xy = XYclean(xx, yy)
        skeleton_xy = skeleton_xy + [x, y]
        return skeleton_xy
    mm = cv2.imread('cont.png', cv2.IMREAD_GRAYSCALE)
    cnts, _ = cv2.findContours(mm.astype('uint8'), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cont = cnts[0].reshape(-1, 2)
    # find skeleton
    sk = contour2skeleton(cont)
    mm = np.zeros_like(mm)
    cv2.polylines(mm, [sk], False, 255, 2)
    # simplify the skeleton
    ln = LineString(sk).simplify(2)
    sk_simp = np.int0(ln.coords)
    mm = np.zeros_like(mm)
    for pt in sk_simp:
        # draw each simplified skeleton point as a filled white dot
        cv2.circle(mm, tuple(map(int, pt)), 5, 255, -1)
    # extend both ends of the skeleton
    a, b = sk_simp[1], sk_simp[0]
    c1 = np.int0(extendline([a, b], 50))
    sk_simp = np.vstack([c1, sk_simp])
    a, b = sk_simp[-2], sk_simp[-1]
    c2 = np.int0(extendline([a, b], 50))
    sk_simp = np.vstack([sk_simp, c2])
    print(len(sk_simp))
    # mark the two extension points
    cv2.circle(mm, tuple(map(int, c1)), 10, 255, -1)
    cv2.circle(mm, tuple(map(int, c2)), 10, 255, -1)
    # find the target points
    pts1 = sk_simp.copy()
    dists = [distance.euclidean(p1, p2) for p1, p2 in zip(pts1[:-1], pts1[1:])]
    zip1 = list(zip(pts1[:-1], dists))
    # find the first 2 target points
    a = pts1[0]
    b = a - (dists[0], 0)
    pts2 = [a, b, ]
    for z in zip1[1:]:
        lastpt = pts2[-1]
        pt, dst = z
        ln = [a, lastpt]
        c = extendline(ln, dst)
        # keep the new target point so pts2 ends up as long as pts1
        pts2.append(c)
    pts2 = np.int0(pts2)
    ln1 = LineString(pts1)
    ln2 = LineString(pts2)
    GeometryCollection([ln1.buffer(5), ln2.buffer(5),
                        MultiPoint(pts2), MultiPoint(pts1)])
    # create translated copies of source and target points
    # 50 is arbitrary
    pts1 = np.vstack([pts1 + [0, 50], pts1 + [0, -50]])
    pts2 = np.vstack([pts2 + [0, 50], pts2 + [0, -50]])
    # performing the warping
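    im = Image.open('orig.png')
    # blank white canvas of the same size (using im.mode here is an assumption)
    dstIm = Image.new(im.mode, im.size, color=(255, 255, 255))
    # Perform transform
    PiecewiseAffineTransform(im, pts1, dstIm, pts2)
    plt.figure(figsize=(10, 10))
    # show the warped result (assumed; the original snippet ends at plt.figure)
    plt.imshow(dstIm)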

    1- Find the medial axis, e.g. using skimage.morphology.skeletonize, and simplify it, e.g. using shapely's object.simplify (I used a tolerance of 2). The medial axis points are in white: enter image description here

    2- Find the corresponding points on a straight line, using the distance between each point and the next: enter image description here

    3- Add extra points at both ends (colored blue) so that the points cover the entire contour length: enter image description here

    4- Create two copies of the source and target points, one translated up and the other translated down (I chose an offset of 50 here), so the source points now look like the image below. Note that a simple upward/downward displacement may not be the best approach for all contours, e.g. if the contour curves by more than 45 degrees: enter image description here

    5- Using the code here, perform PiecewiseAffineTransform with the source and target points. Here's the result, which is straight enough: enter image description here
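
    Steps 2 and 4 can also be sketched with NumPy alone. This is a minimal sketch, not the code from the notebook: `straight_targets` and `offset_along_normals` are hypothetical helper names, the target line is assumed to run in the +x direction from the first source point, and the source band is offset along per-point normals instead of straight up/down, which may behave better on strongly curved contours. It assumes consecutive points are distinct.

```python
import numpy as np


def straight_targets(src_pts):
    """One target point per source point on a horizontal line,
    preserving the distance between consecutive points."""
    src = np.asarray(src_pts, dtype=float)
    seg_len = np.linalg.norm(np.diff(src, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg_len)])  # arc length at each point
    # start at the first source point and advance in +x by arc length
    return np.column_stack([src[0, 0] + arc, np.full(len(src), src[0, 1])])


def offset_along_normals(src_pts, offset):
    """Two copies of the polyline, pushed `offset` pixels to either side
    along the local normal (assumption: replaces the fixed +/- y shift)."""
    src = np.asarray(src_pts, dtype=float)
    tangents = np.gradient(src, axis=0)  # finite-difference tangent per point
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    normals = np.column_stack([-tangents[:, 1], tangents[:, 0]])
    return src + offset * normals, src - offset * normals


# usage: build matching source/target point sets for the warp
src = np.array([[0, 0], [3, 4], [6, 8]])
tgt = straight_targets(src)
src_a, src_b = offset_along_normals(src, 10)
pts1 = np.vstack([src_a, src_b])                       # curved band (source)
pts2 = np.vstack([tgt + [0, 10], tgt - [0, 10]])       # straight band (target)
```

    For a horizontal tangent the normal is (0, 1), so `src_a` pairs with `tgt + [0, offset]` and `src_b` with `tgt - [0, offset]`, which keeps the two point sets in the same order.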

  2. If the goal is to just unshift each column, then:

    import numpy as np
    from PIL import Image
    source_img = Image.open("73614379-input-v2.png")
    contour_img = Image.open("73614379-map-v3.png").convert("L")
    assert source_img.size == contour_img.size
    contour_arr = np.array(contour_img) != 0  # convert to boolean array
    col_offsets = np.argmax(
        contour_arr, axis=0
    )  # find the first non-zero row for each column
    assert len(col_offsets) == source_img.size[0]  # sanity check
    min_nonzero_col_offset = np.min(
        col_offsets[col_offsets > 0]
    )  # find the minimum non-zero row
    target_img = Image.new("RGB", source_img.size, (255, 255, 255))
    for x, col_offset in enumerate(col_offsets):
        offset = col_offset - min_nonzero_col_offset if col_offset > 0 else 0
        # copy the column, shifted up by its offset, into the target image
        target_img.paste(
            source_img.crop((x, offset, x + 1, source_img.size[1])), (x, 0)
        )

    With the new input and the new contour from the OP, this outputs the following image:

    enter image description here
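
    The same column shift can be done without a Python loop. The following is only a sketch of that idea on NumPy arrays: `unshift_columns` is a hypothetical name, and unlike the loop above it detects empty columns with `any()` rather than `col_offset > 0`, so a contour touching row 0 is still handled.

```python
import numpy as np


def unshift_columns(img, mask, fill=255):
    """Shift every column of `img` up so that the first masked row of each
    column lands on the same row; uncovered pixels are set to `fill`."""
    img = np.asarray(img)
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    has_mask = mask.any(axis=0)            # columns the contour passes through
    first_row = mask.argmax(axis=0)        # first masked row per column
    base = first_row[has_mask].min() if has_mask.any() else 0
    shift = np.where(has_mask, first_row - base, 0)
    # source row for every output pixel; rows past the bottom stay as `fill`
    rows = np.arange(h)[:, None] + shift[None, :]
    cols = np.broadcast_to(np.arange(w), (h, w))
    valid = rows < h
    out = np.full_like(img, fill)
    out[valid] = img[rows[valid], cols[valid]]
    return out


# usage on a tiny synthetic example
img = np.arange(12).reshape(4, 3)
mask = np.zeros((4, 3), dtype=bool)
mask[1:, 0] = True   # contour starts at row 1 in column 0
mask[2:, 1] = True   # contour starts at row 2 in column 1
result = unshift_columns(img, mask)
```

    The function works for both grayscale `(h, w)` and color `(h, w, 3)` arrays, e.g. `np.array(source_img)` with `np.array(contour_img) != 0` as the mask.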

