Question

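Which image processing techniques could be used to implement an application that detects the Christmas trees displayed in the following images?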

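I'm searching for solutions that will work on all of these images. Therefore, approaches that require training Haar cascade classifiers, or that rely on template matching, are not very interesting.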

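I'm looking for something that can be written in any programming language, as long as it uses only Open Source technologies. The solution must be tested with the images shared in this question. There are 6 input images, and the answer should display the result of processing each of them. Finally, each output image must have red lines drawn to surround the detected tree.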

How would you go about programmatically detecting the trees in these images?

Best answer

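I have an approach which I think is interesting and a bit different from the rest. The main difference in my approach, compared to some of the others, is in how the image segmentation step is performed: I used the DBSCAN clustering algorithm from Python's scikit-learn, which is well suited to finding somewhat amorphous shapes that may not have a single clear centroid.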

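At the top level, my approach is fairly simple and can be broken down into about 3 steps. First, I apply a threshold (or actually, the logical "or" of two separate and distinct thresholds). As with many of the other answers, I assume that the Christmas tree will be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test: any pixel with a value above 220 on a 0-255 scale (where black is 0 and white is 255) is saved to a binary black-and-white image. The second threshold tries to look for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images and stand out against the background. I convert the RGB image to HSV space and require that the hue be either less than 0.2 or greater than 0.95 on a 0.0-1.0 scale, and additionally that the colors be bright and saturated: saturation and value must both be above 0.7. The results of the two thresholds are logically "or"-ed together, and the resulting black-and-white binary images are shown below: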

Christmas trees, after thresholding on HSV as well as monochrome brightness
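To make the two-threshold test concrete, here is a minimal sketch applied to four hand-picked pixels. The function name `threshold_mask` and the mean-of-channels grayscale are assumptions made for illustration; the answer's own code further down uses PIL's 'L' conversion for the brightness test.

```python
import numpy as np
import matplotlib.colors as colors

def threshold_mask(rgbimg, hue_lo=0.2, hue_hi=0.95, sat_min=0.7,
                   val_min=0.7, mono_min=220):
    """Boolean mask: very bright pixels OR saturated red/yellow pixels."""
    # rgb_to_hsv expects floats in the range 0.0-1.0
    hsv = colors.rgb_to_hsv(rgbimg.astype(float) / 255.0)
    hue, sat, val = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    # Assumption: plain channel mean as the monochrome brightness
    gray = rgbimg.astype(float).mean(axis=2)
    colored = ((hue < hue_lo) | (hue > hue_hi)) & (sat > sat_min) & (val > val_min)
    bright = gray > mono_min
    return colored | bright

# 2x2 test image: white, pure red / dark blue, mid gray
pixels = np.array([[[255, 255, 255], [255, 0, 0]],
                   [[0, 0, 80], [128, 128, 128]]], dtype=np.uint8)
mask = threshold_mask(pixels)
print(mask)
```

White passes the brightness test and pure red passes the hue/saturation/value test, while dark blue and mid gray fail both. Note that gray has a hue of 0, so it is the saturation requirement that keeps it out.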

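You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree. A few of the images also have some other small clusters, corresponding either to lights in the windows of some of the buildings or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and to label each pixel correctly with a cluster membership ID number.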

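For this task I chose DBSCAN. A fairly good visual comparison of how DBSCAN typically behaves, relative to other clustering algorithms, is available here. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown here: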

DBSCAN clustering output
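As a small illustration of the labeling step, the sketch below runs scikit-learn's `DBSCAN` on made-up points: two dense grids plus one isolated point. The `grid` helper, the data, and the `eps` value are all hypothetical and chosen only so the outcome is easy to reason about.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def grid(origin, n):
    """n x n block of unit-spaced points starting at origin (illustrative helper)."""
    ys, xs = np.mgrid[0:n, 0:n]
    return np.column_stack([ys.ravel(), xs.ravel()]) + np.asarray(origin)

blob_a = grid((20, 20), 6)          # 36 points
blob_b = grid((80, 80), 5)          # 25 points
outlier = np.array([[50.0, 0.0]])   # far from both blobs
points = np.vstack([blob_a, blob_b, outlier])

labels = DBSCAN(eps=3.0, min_samples=10).fit(points).labels_
n_clusters = len(set(labels) - {-1})   # label -1 means "noise", not a cluster
print(n_clusters, labels[-1])
```

Each grid comes back as its own cluster, and the isolated point is labeled -1, which is how DBSCAN marks points that belong to no cluster.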

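There are a few things to be aware of when looking at this result. First, DBSCAN requires the user to set a "proximity" parameter to regulate its behavior, which effectively controls how far apart a pair of points must be for the algorithm to declare a new, separate cluster rather than agglomerating a test point onto an already existing cluster. I set this value to 0.04 times the length of the diagonal of each image. Since the images vary in size from roughly VGA up to about HD 1080, this type of scale-relative definition is critical.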
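The effect of the scale-relative definition is easy to see with a little arithmetic. The sketch below assumes nominal sizes of 640x480 for VGA and 1920x1080 for HD 1080 (the actual images are only roughly these sizes), and `pixel_eps` is a hypothetical helper name.

```python
from math import hypot

def pixel_eps(height, width, proxthresh=0.04):
    """Proximity threshold in pixels: a fixed fraction of the image diagonal."""
    return proxthresh * hypot(height, width)

for name, (h, w) in [("VGA", (480, 640)), ("HD 1080", (1080, 1920))]:
    print(name, round(pixel_eps(h, w), 1))
```

Under these assumptions the threshold is 32 pixels for the VGA image and about 88 pixels for the HD 1080 image, so a fixed pixel value would be either too loose for the small images or too tight for the large ones.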

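Another point worth noting is that the DBSCAN algorithm, as implemented in scikit-learn, has memory limits that are fairly challenging for some of the larger images in this sample. Therefore, for a few of the larger images, I actually had to "decimate" each cluster (i.e., retain only every 3rd or 4th pixel and drop the others) in order to stay within this limit. As a result of this culling process, the remaining sparse pixels are difficult to see on some of the larger images. Therefore, for display purposes only, the color-coded pixels in the images above have been "dilated" slightly so that they stand out better. It is purely a cosmetic operation; although there are comments mentioning this dilation in my code, it has nothing to do with any of the calculations that actually matter.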
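The decimation can be sketched as a simple stride over the list of thresholded pixels. The helper name `decimate` and the sample of 12001 made-up coordinates are assumptions; the stride formula mirrors the one used in the answer's code below.

```python
import numpy as np
from math import ceil

def decimate(X, maxpoints=5000):
    """Keep every step-th row so that at most maxpoints rows remain."""
    n = len(X)
    if n <= maxpoints:
        return X
    step = int(ceil(float(n) / maxpoints))
    return X[::step]

X = np.arange(24002).reshape(-1, 2)   # 12001 hypothetical pixel coordinates
Xslice = decimate(X)
print(len(X), len(Xslice))
```

With 12001 points and a cap of 5000, the stride works out to 3, so every third pixel is kept. Rounding the stride up is what guarantees the result never exceeds the cap.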

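Once the clusters are identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure "size" by the total number of member pixels, although one could just as easily use some type of metric that gauges physical extent) and compute the convex hull of that cluster. The convex hull then becomes the tree border. The six convex hulls computed via this method are shown below: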

Christmas trees with their calculated borders
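The final step can be illustrated on a handful of made-up points with hand-assigned labels. The function name `largest_cluster_hull` is hypothetical, and skipping the noise label -1 is an assumption of this sketch rather than something the answer states.

```python
import numpy as np
from scipy.spatial import ConvexHull

def largest_cluster_hull(points, labels):
    """Hull vertices of the cluster with the most members, ignoring noise (-1)."""
    labels = np.asarray(labels)
    ids, counts = np.unique(labels[labels != -1], return_counts=True)
    if len(ids) == 0:
        return None                      # no cluster found at all
    biggest = ids[np.argmax(counts)]
    members = points[labels == biggest]
    hull = ConvexHull(members)
    return members[hull.vertices]        # vertices in counterclockwise order

points = np.array([[0, 0], [0, 4], [4, 4], [4, 0], [2, 2],   # cluster 0
                   [10, 10], [10, 11], [11, 10],              # cluster 1
                   [30, 30]], dtype=float)                    # noise
labels = [0, 0, 0, 0, 0, 1, 1, 1, -1]
border = largest_cluster_hull(points, labels)
print(border)
```

Cluster 0 has the most members, and its hull consists of the four corner points only; the interior point (2, 2) does not appear on the border.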

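The source code is written for Python 2.7.6 and depends on numpy, scipy, matplotlib, and scikit-learn. I've divided it into two parts. The first part is responsible for the actual image processing: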

from PIL import Image
import numpy as np
import scipy as sp
import scipy.spatial  # load the submodule explicitly so sp.spatial.ConvexHull resolves
import matplotlib.colors as colors
from sklearn.cluster import DBSCAN
from math import ceil, sqrt

"""
Inputs:

    rgbimg:         [M,N,3] numpy array containing (uint, 0-255) color image

    hueleftthr:     Scalar constant to select maximum allowed hue in the
                    yellow-green region

    huerightthr:    Scalar constant to select minimum allowed hue in the
                    blue-purple region

    satthr:         Scalar constant to select minimum allowed saturation

    valthr:         Scalar constant to select minimum allowed value

    monothr:        Scalar constant to select minimum allowed monochrome
                    brightness

    maxpoints:      Scalar constant maximum number of pixels to forward to
                    the DBSCAN clustering algorithm

    proxthresh:     Proximity threshold to use for DBSCAN, as a fraction of
                    the diagonal size of the image

Outputs:

    borderseg:      [K,2,2] Nested list containing K pairs of x- and y- pixel
                    values for drawing the tree border

    X:              [P,2] List of pixels that passed the threshold step

    labels:         [Q] List of cluster labels for points in Xslice (see
                    below)

    Xslice:         [Q,2] Reduced list of pixels to be passed to DBSCAN

"""

def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7, 
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):

    # Convert rgb image to monochrome for the brightness threshold
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
                np.logical_and(
                  np.logical_or((hsvimg[:,:,0] < hueleftthr),
                                (hsvimg[:,:,0] > huerightthr)),
                                (hsvimg[:,:,1] > satthr)),
                                (hsvimg[:,:,2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[np.where(boolidx)] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[np.where(gryimg > monothr)] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        Xslice = X[range(0,nsample,int(ceil(float(nsample)/maxpoints)))]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull
    unique_labels = set(labels)
    maxclustpt = 0
    borderseg = []  # stays empty if DBSCAN finds no cluster at all
    for k in unique_labels:
        if k == -1:
            # Label -1 marks unclustered noise points, not a real cluster
            continue
        class_members = np.where(labels == k)[0]
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = sp.spatial.ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex,0], points[simplex,1]] for simplex
                          in hull.simplices]

    return borderseg, X, labels, Xslice
 

The second part is a user-level script that calls the first file and generates all of the plots above:

#!/usr/bin/env python

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16,7)        
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust  = plt.figure(figsize=fgsz, facecolor='w')
figcltwo  = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
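# Set window titles through the figure manager; newer matplotlib releases
# no longer provide set_window_title on the canvas itself
figthresh.canvas.manager.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.manager.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.manager.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.manager.set_window_title('Trees with Borders')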

for ii, name in enumerate(fname):
    # Open the file and convert to rgb image (drops any alpha channel or palette)
    rgbimg = np.asarray(Image.open(name).convert('RGB'))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2,3,ii+1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v,h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2,3,ii+1) # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)    
    unique_labels = set(labels)
    # Generate a unique color for each cluster 
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    for lbl, pix in zip(labels, Xslice):
        for col, unqlbl in zip(plcol, unique_labels):
            if lbl == unqlbl:
                # Cluster label of -1 indicates no cluster membership;
                # override default color with black
                if lbl == -1:
                    col = [0.0, 0.0, 0.0, 1.0]
                # Raw version
                for ij in range(3):
                    clustimg[pix[0],pix[1],ij] = col[ij]
                # Dilated just for display
                axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col, 
                    markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1]-1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with red borders around the trees
    axborder = figborder.add_subplot(2,3,ii+1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1]-1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()