Watershed merging method for color images

The watershed transformation can be applied to color as well as to gray-scale images. With color images a problem arises: their pixels are vectors that describe all color components, whereas the watershed transformation requires a scalar height function as its input. Several gradient magnitude definitions for color images allow for the needed conversion. As in the case of gray-scale images, the image after watershed transformation is heavily over-segmented. One can blur the image before calculating the gradient magnitude, threshold the gradient image or merge the resulting watersheds. Unfortunately, the result is still over-segmented. The solution presented in this paper complements those mentioned above. It uses hierarchical cluster analysis to join similar classes of the over-segmented image into a given number of clusters. After the image has been preprocessed and segmented, the attribute values of each watershed are calculated in each color component and clustering is performed. The resulting similarity hierarchy allows for simple selection of the number of clusters in the final segmentation. Several clustering methods, including complete linkage and Ward's method, have been tested with different sets of components. Selected results are presented.


Introduction to the watershed transformation
Beucher and Lantuejoul [1] introduced the watershed transformation, an image segmentation method that mimics pouring water onto a landscape. It divides an image into watersheds (or catchment basins) whose edges are continuous. The method requires an image with scalar-valued pixels as its input. The input image is treated as a height function: higher values indicate the presence of edges, while local minima are where the catchment basins originate (poured water collects in valleys). Gray-scale images can be used directly as the input of the watershed transformation, but this is usually not the case because in most cases high [...]

Two types of filters have been used for the purpose of this paper. The first type is a heuristic method that finds the gradient magnitude as the square root of the sum of the squared derivatives of the individual vector components [5]. For a two-dimensional image with three components, the formula takes the following form:

$$|\nabla I| = \sqrt{\sum_{i=1}^{3}\left[\left(\frac{\partial I_i}{\partial x}\right)^2 + \left(\frac{\partial I_i}{\partial y}\right)^2\right]}$$

This method will be referred to as "sum-of-squares" further in this paper. The second type of filter is based on principal component analysis and was proposed by Sapiro and Ringach [4]. It calculates the gradient magnitude as the difference between the two largest eigenvalues in the principal component analysis:

$$|\nabla I| = \lambda_+ - \lambda_-$$

The $\lambda_+$ and $\lambda_-$ eigenvalues can be interpreted, correspondingly, as the maximal and minimal rates of change [4]. This method will be referred to as "PCA gradient" further in this paper. The main practical difference between these two ways of obtaining the gradient magnitude is that the PCA gradient brings out significant edges while diminishing less important ones; the sum-of-squares gradient, on the other hand, is better when minor details are important. The latter usually causes stronger over-segmentation.
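The two gradient definitions can be compared with a minimal NumPy sketch. It is an illustration under stated assumptions, not the filter implementation used in the paper: derivatives are taken with central differences (`np.gradient`), and the eigenvalues are assumed to be those of the 2x2 matrix with entries $g_{11} = \sum_i (\partial I_i/\partial x)^2$, $g_{22} = \sum_i (\partial I_i/\partial y)^2$ and $g_{12} = \sum_i (\partial I_i/\partial x)(\partial I_i/\partial y)$, for which $\lambda_\pm = \frac{1}{2}\left(g_{11} + g_{22} \pm \sqrt{(g_{11} - g_{22})^2 + 4 g_{12}^2}\right)$. The function names are hypothetical.

```python
import numpy as np

def channel_derivatives(image):
    """Central-difference derivatives (d/dy, d/dx) of every channel of an (H, W, C) image."""
    dy, dx = np.gradient(np.asarray(image, dtype=float), axis=(0, 1))
    return dy, dx

def sum_of_squares_gradient(image):
    """Square root of the sum of squared derivatives over all channels."""
    dy, dx = channel_derivatives(image)
    return np.sqrt((dx ** 2 + dy ** 2).sum(axis=2))

def pca_gradient(image):
    """Difference lambda_plus - lambda_minus of the assumed 2x2 matrix [[g11, g12], [g12, g22]]."""
    dy, dx = channel_derivatives(image)
    g11 = (dx * dx).sum(axis=2)
    g22 = (dy * dy).sum(axis=2)
    g12 = (dx * dy).sum(axis=2)
    # lambda_plus - lambda_minus equals the discriminant root of the eigenvalue formula.
    return np.sqrt((g11 - g22) ** 2 + 4.0 * g12 ** 2)

# Usage: a vertical step edge that is identical in all three channels.
image = np.zeros((4, 4, 3))
image[:, 2:, :] = 1.0
sos = sum_of_squares_gradient(image)
pca = pca_gradient(image)
```

Both outputs are scalar images of the same size as the input, so either can serve as the height function for the watershed transformation.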
There are many more methods for calculating the gradient magnitude of a color image, including: luminance gradient, hue circular gradient, saturation weighting-based color gradient, supremum-based color gradient, chromatic gradient and perceptual gradient. These methods have been described by Angulo and Serra [7].

If the watershed transformation is performed on a gradient image (regardless of the type of the applied gradient filter) without any other processing, its result will be heavily over-segmented (see Fig. 2b). In most cases such a large number [...]

There are two simple methods (which can be used together) that allow for a significant reduction in the number of watersheds without a loss of segmentation quality. The first is preprocessing by thresholding, performed on the gradient image before the watershed transformation. Its goal is to eliminate low gradient magnitude values, which are caused by noise, texture or weak edges. In this paper the applied threshold is expressed as a percentage of the maximum watershed depth in the gradient image [6]. The second method is a post-processing step that consists in merging neighbouring watersheds. One watershed floods its neighbour if the neighbour's depth is below a certain level called the merging level [6]. Like the threshold, it is expressed as a percentage of the maximum watershed depth.

Figure 2 shows the effects of using the described methods. Fig. 2a shows the original image and Fig. 2b the result of applying an edge-preserving smoothing filter [8], a sum-of-squares gradient and a watershed transformation. The watersheds have been colored using a hashing scheme in order to make them more visible. Fig. 2c presents the effect of thresholding the gradient image with a 5% threshold, and Fig. 2d depicts the effect of thresholding and merging watersheds up to a 10% level.
As one can see, thresholding reduced the number of watersheds by a factor of 2.9, and thresholding combined with merging reduced it by a factor of 5.1. It should also be noted that increasing the threshold and the merging level may cause watersheds belonging to completely different objects to be merged (under-segmentation). That is why the two described methods are usually insufficient.
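The thresholding step discussed above can be illustrated with a short sketch. It rests on an assumption that may differ from the definition in [6]: the maximum watershed depth is approximated by the range of the gradient values, and every value below the resulting level is raised to that level so that shallow minima disappear. The function name `threshold_gradient` and its parameters are hypothetical.

```python
import numpy as np

def threshold_gradient(gradient, fraction):
    """Flatten all gradient values below a level given as a fraction of the value range.

    Assumption: the range (max - min) stands in for the maximum watershed depth.
    """
    gradient = np.asarray(gradient, dtype=float)
    lowest, highest = gradient.min(), gradient.max()
    level = lowest + fraction * (highest - lowest)
    # Values under the level are raised to it; stronger edges stay unchanged.
    return np.maximum(gradient, level)

# Usage: a 5% threshold on a tiny gradient image with values between 0 and 100.
gradient = np.array([[0.0, 1.0], [5.0, 100.0]])
flattened = threshold_gradient(gradient, 0.05)
```

Merging by depth is not shown here, because it needs the basin depths computed by the watershed transformation itself.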

Watershed attributes used for merging
For the purpose of this paper, results were obtained using three different attributes. In the initial research on the usability of hierarchical cluster analysis for watershed merging in gray-scale images [9], four attributes were used, but only three turned out to be useful: the watershed's size did not help create good segmentations. The other attributes are described in more detail below.
Since RGB images have three color components, each attribute is calculated separately for each of them. Consequently, each watershed is described with three times as many values as in the case of a gray-scale image. Of course, it is possible to calculate an attribute only for selected color components. This possibility will be investigated, especially using other color spaces than sRGB.
The first kind of watershed attribute is its mean value. The remaining attributes are its variance and its standard deviation, $\sigma_i$. The latter two are, to some extent, sensitive to texture within the watershed. Even regions that are practically identical when their mean values are compared (Fig. 3b) may be visibly different when their variance or standard deviation is taken into account (Figs. 3c and 3d).

Downloaded from the journal Annales AI-Informatica http://ai.annales.umcs.
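As an illustration of how such attributes could be gathered, the sketch below computes the mean, variance and standard deviation of every watershed in every color component. It is written under assumptions that the text does not confirm: the population variance (division by the pixel count) is used, the watersheds are given as an integer label image numbered from 0, and each label owns at least one pixel. The name `watershed_attributes` is hypothetical.

```python
import numpy as np

def watershed_attributes(image, labels):
    """Per-watershed mean, variance and standard deviation for each channel.

    image:  (H, W, C) array of color components.
    labels: (H, W) integer array, watershed ids 0..K-1 (every id is assumed to occur).
    Returns three (K, C) arrays.
    """
    flat = np.asarray(labels).ravel()
    pixels = np.asarray(image, dtype=float).reshape(flat.size, -1)
    count = np.bincount(flat).astype(float)
    k, c = count.size, pixels.shape[1]
    mean = np.empty((k, c))
    var = np.empty((k, c))
    for ch in range(c):
        total = np.bincount(flat, weights=pixels[:, ch], minlength=k)
        total_sq = np.bincount(flat, weights=pixels[:, ch] ** 2, minlength=k)
        mean[:, ch] = total / count
        var[:, ch] = total_sq / count - mean[:, ch] ** 2
    var = np.clip(var, 0.0, None)  # guard against tiny negative rounding errors
    return mean, var, np.sqrt(var)

# Usage: two watersheds in a 2x2 RGB image.
image = np.array([[[0, 10, 20], [2, 10, 20]],
                  [[5, 5, 5], [5, 5, 5]]], dtype=float)
labels = np.array([[0, 0], [1, 1]])
mean, var, std = watershed_attributes(image, labels)
```

Stacking the selected arrays side by side (for example with `np.hstack`) gives one attribute row per watershed, which is the form the clustering step expects.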

Cluster analysis used for watershed merging
Since the basic methods for removing watersheds usually cannot eliminate over-segmentation, the use of cluster analysis for merging watersheds in color images is proposed. This approach was quite successful with gray-scale images [9,10]. Cluster analysis is an iterative process where, in each iteration, the two most similar clusters are found and merged. The merges are based on a similarity/dissimilarity matrix which is updated in each iteration. Three clustering methods were used for this preliminary comparison: complete linkage (CLINK), the unweighted pair-group method using arithmetic averages (UPGMA) and Ward's minimum variance method. They were chosen because they performed best in the comparison described in [10]. Single linkage (SLINK) has been left out because it did not give satisfactory results [9,10].
The CLINK and UPGMA methods are very similar [11]. They differ only in the way the similarity matrix is modified, that is, in how the distances between the newly merged cluster and the remaining clusters are determined. With the CLINK method, the distance between two clusters is the same as that between the two most dissimilar objects (i.e. watersheds) they consist of [11]. The UPGMA method averages the distances between all possible pairs of objects, where each pair consists of objects belonging to different clusters before the merge [11]. More formally, the CLINK method calculates new distances using the following equation:

$$D_{C_1 C_2} = \max_{m \in C_1,\, n \in C_2} d_{nm}$$

while UPGMA uses

$$D_{C_1 C_2} = \frac{1}{N_{C_1} N_{C_2}} \sum_{m \in C_1} \sum_{n \in C_2} d_{nm}$$

where: $C_1$, $C_2$ - clusters, $m$ - object that belongs to cluster $C_1$, $n$ - object that belongs to cluster $C_2$, $d_{nm}$ - distance between objects $m$ and $n$, $D_{C_1 C_2}$ - dissimilarity measure of clusters $C_1$ and $C_2$ (one of them is a cluster that has just been merged), $N_{C_1}$ - number of objects in cluster $C_1$, $N_{C_2}$ - number of objects in cluster $C_2$.

Ward's method is sometimes called the minimum variance method. It differs from CLINK and UPGMA not only in the way it finds clusters to merge but also in that it does not need any additional distance measuring coefficient. In this method the fusion of two clusters is based on the size of an error sum of squares criterion [12]. The algorithm does not search for the most similar clusters (with the help of a similarity matrix); instead, it attempts to find an optimal merger, one that causes a minimal increase in the total within-cluster error sum of squares, $E$, given by

$$E = \sum_{m=1}^{c} E_m, \qquad E_m = \sum_{l=1}^{t_m} \sum_{i=1}^{n} \left(X_{ilm} - \bar{X}_{im}\right)^2$$

where: $E$ - total error sum of squares, $E_m$ - error sum of squares of the m-th cluster, $c$ - number of clusters, $n$ - number of attributes, $t_m$ - number of objects in the m-th cluster, $\bar{X}_{im}$ - the average value of the i-th attribute in the m-th cluster, $X_{ilm}$ - value of the i-th attribute of the l-th object in the m-th cluster.
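The three merge criteria can be written down in a few lines. The sketch below is only an illustration with hypothetical function names: the CLINK and UPGMA distances are read from a matrix `d` of pairwise object distances, and the cost of a Ward merger is computed as the increase in the error sum of squares caused by joining two clusters.

```python
import numpy as np

def clink_distance(d, c1, c2):
    """Complete linkage: distance of the two most dissimilar objects."""
    return d[np.ix_(c1, c2)].max()

def upgma_distance(d, c1, c2):
    """UPGMA: average distance over all pairs with one object in each cluster."""
    return d[np.ix_(c1, c2)].mean()

def error_sum_of_squares(x):
    """E_m of one cluster; x holds one row of attribute values per object."""
    x = np.asarray(x, dtype=float)
    return ((x - x.mean(axis=0)) ** 2).sum()

def ward_increase(x1, x2):
    """Increase in the total error sum of squares if the two clusters are merged."""
    merged = np.vstack([x1, x2])
    return (error_sum_of_squares(merged)
            - error_sum_of_squares(x1) - error_sum_of_squares(x2))

# Usage: three objects with a single attribute, at positions 0, 1 and 4.
points = np.array([[0.0], [1.0], [4.0]])
d = np.abs(points - points.T)          # pairwise distances in one dimension
c1, c2 = [0, 1], [2]
clink = clink_distance(d, c1, c2)      # max(4, 3)
upgma = upgma_distance(d, c1, c2)      # (4 + 3) / 2
ward = ward_increase(points[c1], points[c2])
```

In each iteration Ward's method would evaluate `ward_increase` for every pair of clusters and merge the pair with the smallest value.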
As a result, in each iteration the algorithm has to check all possible mergers.

As mentioned above, the CLINK and UPGMA methods need a similarity/dissimilarity matrix as their input. Such a matrix can be obtained by using the Euclidean distance coefficient [11] given by

$$d_{jk} = \sqrt{\sum_{i=1}^{n} \left(X_{ij} - X_{ik}\right)^2}$$

where: $j$, $k$ - numbers of objects, $n$ - number of attributes, $X_{ij}$ - value of the i-th attribute of the j-th object, $X_{ik}$ - value of the i-th attribute of the k-th object.
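A dissimilarity matrix of this kind can be obtained with NumPy broadcasting. The snippet is a minimal sketch; the name `euclidean_matrix` is hypothetical, and the input is assumed to hold one row of attribute values per watershed.

```python
import numpy as np

def euclidean_matrix(attributes):
    """Pairwise Euclidean distances between the rows of an (objects, attributes) array."""
    a = np.asarray(attributes, dtype=float)
    diff = a[:, None, :] - a[None, :, :]      # shape (objects, objects, attributes)
    return np.sqrt((diff ** 2).sum(axis=2))

# Usage: two objects described by two attributes each.
s = euclidean_matrix([[0.0, 0.0], [3.0, 4.0]])
```

The result is symmetric with zeros on the diagonal, so only one triangle needs to be stored when many watersheds are involved.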
The Euclidean distance is a dissimilarity measure which represents the distance between two points in the n-dimensional space.

The complete segmentation process based on the watershed transformation proceeds as follows:
(1) the image is filtered using an edge-preserving smoothing filter [8],
(2) either the sum-of-squares or the PCA gradient is applied to the filtered image,
(3) the obtained gradient magnitude image is thresholded,
(4) the watershed transformation is applied,
(5) the neighbouring watersheds are merged based on their depth,
(6) the number of resulting watersheds is determined,
(7) the attribute values are calculated for each watershed,
(8) the similarity/dissimilarity matrix is determined using the distance coefficient (in the case of the CLINK and UPGMA methods),
(9) the algorithm finds the two most similar clusters and merges them; additionally, it updates the similarity hierarchy, which is represented by a tree,
(10) the similarity/dissimilarity matrix is updated (in the case of the CLINK and UPGMA methods),
(11) if there is more than one cluster left, the algorithm goes back to step (9),
(12) based on the similarity hierarchy (tree) and the requested number of classes, the final segmentation is generated.

The final step is not time-consuming; hence, the class count can be changed interactively.
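Steps (8) to (12) can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it covers only CLINK and UPGMA, it recomputes cluster distances from the object distances in every iteration instead of updating the matrix in place, and the names `build_hierarchy` and `cut_hierarchy` are hypothetical.

```python
import numpy as np

def build_hierarchy(attributes, method="upgma"):
    """Steps (8)-(11): return the list of merges (first, second, distance)."""
    a = np.asarray(attributes, dtype=float)
    diff = a[:, None, :] - a[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))               # Euclidean object distances
    reduce_fn = np.max if method == "clink" else np.mean
    members = {i: [i] for i in range(len(a))}          # cluster id -> object ids
    tree = []
    while len(members) > 1:
        ids = sorted(members)
        best = None
        for pos, first in enumerate(ids):
            for second in ids[pos + 1:]:
                dist = reduce_fn(d[np.ix_(members[first], members[second])])
                if best is None or dist < best[2]:
                    best = (first, second, float(dist))
        first, second, dist = best
        members[first] = members[first] + members.pop(second)  # merge second into first
        tree.append((first, second, dist))
    return tree

def cut_hierarchy(tree, n_objects, classes):
    """Step (12): replay merges until the requested number of classes remains."""
    label = list(range(n_objects))
    for first, second, _ in tree[:max(0, n_objects - classes)]:
        label = [first if value == second else value for value in label]
    return label

# Usage: five watersheds described by their RGB means (made-up values).
attributes = [[0, 0, 0], [1, 0, 0], [10, 0, 0], [11, 0, 0], [50, 0, 0]]
tree = build_hierarchy(attributes, method="upgma")
three_classes = cut_hierarchy(tree, len(attributes), 3)
two_classes = cut_hierarchy(tree, len(attributes), 2)
```

Because the hierarchy is built once and only `cut_hierarchy` depends on the class count, changing the number of classes requires no new clustering, which matches the remark that the final step is cheap enough to be interactive.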
The following pseudo-code provides a more formal description of the proposed method: [...]

where: [...] - number of clusters (initially the number of watersheds), A - array holding watershed attribute values, S - dissimilarity (or similarity) matrix, first/second - numbers of the clusters to be combined, distance - dissimilarity (or similarity) measure of the clusters to be combined, tree - tree holding the similarity hierarchy, classes - requested number of classes.

[...] small (5 for example). The least successful attribute set was the watershed's average combined with its standard deviation. During testing, the UPGMA and Ward's clustering methods usually gave good results (Figs. 5-8). The CLINK method can also be used successfully, as shown in Fig. 4. For methods that require a similarity measure, the Euclidean distance was used.
The number of classes in the presented segmentations was chosen arbitrarily. The goal was to create a segmented image where all significant objects are still visible while selecting the smallest number of classes. When the proper clustering parameters are used, increasing the number of classes in the segmented image visibly increases the number of details present. Improper parameters cause small insignificant classes to appear.

Conclusions
Clustering methods can be used for eliminating over-segmentation not only in black and white medical images, as shown in [9,10], but also in color images depicting different types of objects. This approach is useful when over-segmentation occurs in the entire image (Figs. 4 and 5), in a certain class (Fig. 7: the mortar between the bricks, or Fig. 6: the seagull) or in a certain region of an image (Fig. 8: the heavily over-segmented upper right corner of the image, caused by camera sensor noise). Preliminary tests, whose results are shown in this paper, lead to the conclusion that even the simplest set of attributes, the RGB averages, combined with the UPGMA or Ward's method allows for eliminating over-segmentation. The main advantage of this approach is that it does not disregard the available color information, whereas in plain watershed segmentation this information is no longer taken into account once the gradient image has been calculated.