Complexity International      ISSN 1320-0682     
Volume 02 April 1995

AI and Medical Imagery: Strategy and Evaluation of Inexact Relational Matching

Stevan M. Vajdic
Department of Electrical and Electronic Engineering
University of Adelaide, SA 5005
Michael J. Brooks
Department of Computer Science
University of Adelaide, SA 5005
Andrew Downing
School of Engineering
Flinders University of SA, SA 5001
Henry E. Katz
ISCS Inc., 1B Forest Hills, NY 11375, USA

Email: svajdic@eleceng.adelaide.edu.au

Abstract:

Matching/fusion strategies are discussed and a set of functions for evaluation of relational inexact matching is introduced. They are implemented in the domain of 2D/3D medical imagery and are based on Artificial Intelligence (AI) paradigms - in particular, rule-based knowledge representation and tree search. The 2D reference and target images are selected from 3D sets and are segmented into non-touching and non-overlapping regions, using iterative thresholding and/or knowledge about the anatomical regional shapes of human organs. Selected image region attributes and their inter-relationships (relations) are calculated. Region matches are obtained using a tree search, and the error is minimised by evaluating a "goodness of matching" function based on similarities of region attributes/relationships. Once the matched regions are found and the spline geometric transform is applied to regional centres of gravity, images are ready for fusion into a single 2D and/or 3D image of higher clarity.



Introduction

Image fusion (composing images of the same objects taken by different means) has played a significant role in disciplines such as medical imaging, pattern recognition, computer vision and remote image sensing. The idea is to combine images of a single object obtained using either different sensors, different illumination conditions, with perhaps different sections of the object observed at different times, from various positions or viewing angles, under different illumination conditions or polarisations. The resultant single image would be of a better clarity and contain more information than could be obtained from any of the individual "matched" images.

The idea of combining images in medicine becomes very important because of the necessity to improve the diagnosis of diseases and the treatment and care of patients. 2D/3D fusion of medical images acquired by medical scanners - for example, Computerised Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET) and Single Photon Emission Computer Tomography (SPECT) - is of special interest for medical experts because of the different image characteristics produced by each device that are useful for combining.

It is known that CT and MRI can provide anatomical information (structure, texture, grey-level shapes) of human organs and PET and SPECT can provide functional information (tissue function, blood vessels, cell function, metabolism). Moreover, CT can provide good bone structure with less distortion but with less tissue texture, as opposed to MRI that can give a high quality structure description but with more distortion due to secondary magnetic field effects formed around the examined object. As a consequence, any one of the following five image pairs - (CT,PET), (CT,SPECT), (MRI,PET), (MRI,SPECT), (CT,MRI) is useful for fusion. The first four are classified as structure-function fusion, the last one as structure-structure fusion. The first component in each pair is referred to as a reference image (the one that is not changed), the second component is referred to as a target one (the one that is changed). The mutual characteristic for all four modalities is that they contain a finite set of 2D cross-sections (slices, scans) of a human organ, forming a 3D structure of the organ, with a chosen distance between the 2D slices.


Matching/Fusion strategy

Before the images are fused they have to be matched (registered); that is, the target image needs to be modified somehow to match the reference image. All image matching/fusion algorithms developed and classified  [1, 2, 3, 4] so far generally adopt a four-step strategy: a) extraction of key structures, b) matching of key structures, c) evaluation of similarities between structures, and d) transformation of target image.

Classification of matching/fusion algorithms is a very complex task in general, since it involves many classification criteria [2] such as image dimensionality (2D, 3D), image properties (attributes and their inter-relationships - extrinsic, intrinsic) and image property coupling (interpolation, approximation), domain (global, local) and elasticity (rigid, affine, projective, curved) of transformation, parameter determination (direct, search-oriented) and interaction (interactive, semi-automatic, automatic). A complex classification with all explanations is beyond the scope of this paper. Yet, it would be worth mentioning that different classifications would be possible from the following global set of principles of matching methods: a) control points, b) moments, c) knowledge base representation of object characteristics, d) similarity optimisation, e) stochastic sign change, f) Fourier analysis, g) cross correlation, h) eigenvalue decomposition, i) warping, j) 3D surface-fitting, k) anatomical atlas, l) internal landmarks, m) external landmarks, n) principal axes, o) inexact relational matching, and p) neural network segmentation.

Which method is to be chosen depends on the type of images to be matched/fused (for example, natural scene images, stereo images, Landsat images or medical images). Moreover, a set of possible problems related to the matching task and type of images would be given as: a) choosing a matching method, b) choosing the reference image, c) selecting image properties, d) imposing constraints, e) evaluating "goodness of matching", f) choosing an appropriate geometric transformation, g) coping with distortion, and h) choosing a fusion method.

This project is concerned with medical images. Hence, with regard to the suggested four-step matching methodology and related problems, the following key steps are chosen: a) image regions, similar to the anatomical shapes of human organs, and their attributes, are extracted from 2D slices to make less complex the image-matching procedure and the finding of the best match; b) CT or MRI image modality is selected to be the reference since they provide anatomical information which is closely related to segmented regions; c) tree search is employed as the best method for finding the region matches, and similarity between regions (for finding the best match) is evaluated by applying functions that incorporate comparisons of region attributes; d) a spline geometric transform is applied to correct the local distortion. Based on these key steps and knowledge base representation (IF-THEN rule-based analysis) 2D and 3D generalised approaches were proposed and discussed by the authors of this paper [5, 6]. In order to get a single 2D (3D) composite image out of two modality components, two (three) major procedures needed to be performed: matching and fusion (and 3D visualisation). These procedures were broken into steps: a) selection of two 3D medical image data sets, b) resampling of the 3D target modality using bilinear interpolation (getting the same dimension of both 3D sets (number and size of slices)), c) selection of 2D slices of interest, d) generation of non-touching regions (region segmentation via iterative thresholding and/or using knowledge base representation of the anatomical shapes of human organs), d) calculation of region attributes and their inter-relationships, e) application of inexact relational region matching, f) application of the geometric transformation (spline transform applied to the region centres of gravity as the best transform for eliminating local distortions [7], and g) fusion of images [8] - composition of weighted average of corresponding pixels (voxels) of either any chosen pair of 2D slices or complete 3D sets. In the case of 3D fusion, a 3D visualisation was also performed. Two distinct, sequential step-designed approaches were proposed: a) initially supervised segmentation based on iterative thresholding to form the image regions, region-based relational matching, and a tree search for finding the best match, and b) unsupervised knowledge-based segmentation to form image regions, region-based relational matching, and a tree search for finding the best match. The shapes of closed region boundaries, produced by segmentation, were as similar as possible to those based on the anatomical structure of human organs. Segmentation by iterative thresholding was manually supervised only for the initial trials, either until the shapes of significant parts of organs in all of the matching modalities became similar to those initially suggested, or until the number and sizes of regions became close enough to match. Once the set of threshold values tex2html_wrap_inline446 (where m represents modality - CT, MRI, PET, SPECT) was found, it could be used as the set of initial values for any future iterative threshold procedures. That would provide an automatic approach without significant human intervention. Also, separate sets of threshold values would have to be found and stored for different organs or structures such as the head, chest, spine or neck. An automatic knowledge-based unsupervised segmentation procedure was based on prior knowledge about the regions that expected to be obtained; for example, in the case of brain images, a definition of knowledge-based shapes of characteristic parts - thalamus, sinus, skull, cerebellum, white matter, gray matter, etc., was demanded. Each small region, obtained using initial low-level segmentation, was checked against the prior knowledge about the chosen anatomical region shapes and merged into larger regions. The prior knowledge was represented in the form of masks consisting of anatomical shapes of specific organs, and was driven by rule-based (IF-THEN) facts that used the image features and guided the region merging process. Afterwards, segmentation regions of both images were to be matched.

In this paper the emphasis is given to evaluation of inexact relational matching (match evaluation functions based on a similarity measure between region attributes and their inter-relationships), and finding the best match within a set of "good" matches (AI-based tree search methods).


Relational region matching

  Among many image primitives (points, lines, regions), regions are adopted because of the special segmentation methods applied for their extraction (iterative thresholding or knowledge-based segmentation). More attributes can be extracted from regions than from points and lines. Nevertheless, by applying Hough or medial axis transformations, points and lines could also be used with the same algorithm proposed here, but the accuracy of matching may well be below the level that can be reached with regions.

Image region attributes including perimeter, area, shape, mean/max/min radius, centre of gravity, and amplitude, histogram and average gray values are computed. Either the attributes of the regions, or calculable inter-relationships between them, can be used. If the inter-relationships are used, then attributes of each region within either the reference or target images are recalculated in terms of their relationships with other regions in order to provide uniformity. A major experimental search for the best combination of region attributes and/or relationships is underway. The main criterion for a good choice is invariance under linear transformations (translation, rotation, scaling, skewing) and possibly non-linear distortions. The non-linear distortions are to be corrected mainly by applying the spline geometric transformation using the centres of gravity of matched regions after the region matching has been performed.

Let sets tex2html_wrap_inline450 and tex2html_wrap_inline452 represent regions as image primitives in the reference and target images, where m and n are the numbers of regions in each (in general, m is not equal to n). Each region is described by a finite number tex2html_wrap_inline462 of region properties tex2html_wrap_inline464 with values tex2html_wrap_inline466 , k=1,2,...,r (the term "property" will stand for attributes and their inter-relationships (usually called relations) - numerical or symbolical) to form hierarchical structural image pattern descriptions - representations. All numerical property values are scaled into the interval [0,1] using the expression

  equation35

or, in the case of Gaussian-distributed properties, normalised using the expression

  equation44

where tex2html_wrap_inline472 and tex2html_wrap_inline474 are the minimum and maximum values, and tex2html_wrap_inline476 and tex2html_wrap_inline478 are the mean value and standard deviation of property tex2html_wrap_inline464 . Properties are weighted depending on what pair of modalities (CT, MRI, PET, SPECT) is to be fused. The property weights differ from each other, especially when a structure-function fusion takes place, due to different regions being extracted in the segmentation process. For example, properties "area" and "perimeter" (or "relative area" and "relative perimeter" in the case of forming relations), are given a higher weight than shape; the property "mean radius" (or "relative mean radius"), is considered more influential than property "distance between region centres of gravity", etc. Unfortunately, there is no mathematical apparatus that can theoretically describe the way of assigning weights to properties  - experiments are the only alternative researchers have for the time being.

The matching procedure is complete when l regions ( tex2html_wrap_inline484 ) are found in both matched images. Relational region matching is solved by means of tree search methods. The matching procedure is guided by the match evaluation function that compares similarities between weighted region property values in the reference and the target images in order to find the best match. A tree search is applied only to find region matches prior to the application of a geometric transformation.


Evaluation of matching

The match evaluation function has to be applied to two data descriptions (here, reference and target image regions as primitives and their properties  - attributes and their relationships-relations) in order to guide the search method in finding the best match. The best match between two sets of image region primitives is to be found as the mapping for which the regions show the best similarity in their property values. The major problems of definition of similarity of properties  - what property is more important and/or how to combine numerical and symbolical properties  - need to be overcome. Thus, the property values have to be weighted and their inter-relationships defined in terms of all the relations over the set of property values, and/or a conditional probability function of properties needs to be employed.

Three types of evaluation function that compare similarities of two descriptions (here, sets of regions) have been proposed so far: a distance measure (cost function) and a merit function, both involving mainly comparison of pairs of primitive property values, and the function that involves more sophisticated parameters - for example, the number of primitives, the number of properties and the number of successful matchings of each pair of properties  [9]. The distance measure (distance function) is to be minimised, the merit function is to be maximised, and the third function could be defined in such a way to allow either minimisation or maximisation. In the case of the two sets of region primitives defined in Section 3, the cost function of similarity between regions tex2html_wrap_inline486 and tex2html_wrap_inline488 is usually defined as the sum of the absolute or square value differences, or even as the square root of the sum of the square value differences of all region properties. These forms can be applied to numerical properties only. In the case of either only symbolical properties or a combination of both symbolical and numerical properties, a conditional probability function can be successfully applied [11], or some penalty for non-similarity needs to be incorporated into an evaluation function expression. The conditional probability (density) function (or conditional information) tex2html_wrap_inline490 of property tex2html_wrap_inline464 , is defined as the likelihood that the property tex2html_wrap_inline494 of the primitive tex2html_wrap_inline488 in the target image will have the value tex2html_wrap_inline498 if it is known that the property tex2html_wrap_inline500 of the corresponding primitive tex2html_wrap_inline486 in the reference image has the value tex2html_wrap_inline504 . The contribution to the distance measure between these two primitives from property tex2html_wrap_inline464 could then be defined as:

  equation81

Minimising the sum of all distance measures will lead to a maximising of the product of all probabilities and, consequently, a maximising of the likelihood mapping. The maximum likelihood mapping can be obtained by minimising the sum of the square root property value differences in the case of Gaussian-distributed properties or by minimising the sum of the absolute property value differences in the case of Laplacian-distributed properties. In the case of symbolical properties, beside adding a penalty value to the total distance measure if compared property values are not the same, which is rather a rough method since it does not distinguish the level of similarity, the only smooth method involves the above mentioned probabilistic approach.

Boyer and Kak [10] developed a cost function whereas Vosselman [11] developed a merit function based on probabilistic information theory and the formula defined in Equation (3). At issue here is the calculation of property conditional probabilities. All methods (analytical, numerical and hand-training) require either complex functions to be found (analytical and numerical methods) or many matches to be done by hand to guarantee correctness.

Two well-known cost (distance) functions were designed by Sanfeliu and Fu [12] and by Shapiro and Haralick [13, 14]. The first one defines a distance measure on graphs (primitives are graph nodes, branches are inter-relationships/relations between primitives). The function primarily includes a measure of differences between normalised attribute values, and some heuristics involving structural differences between graphs. The second one is based on comparing weighted primitives, weighted attributes (using attribute value thresholds) and weighted relation vectors/tuples, and adding a normalised distance (between 0 and 1) for each corresponding pair of primitive properties (relation vectors/tuples) that is inexactly matched. Price and Reddy [15] and Faugeras and Price [16] extended this weighted attributes comparison to assigning a strength for attribute contribution in particular matching tasks; for example, the same attributes may have more or less influence on the evaluation of matching depending on the type of images being used and primitives being extracted. Difficulties here are the need for an experimental (non-theoretical) search for weights and strengths.

Barrow and Popplestone [9] proposed an evaluation function involving more parameters than a comparison of attributes and their relationships:

  equation104

where tex2html_wrap_inline508 is the number of successful matchings of relations (inter-relationships between region attributes), tex2html_wrap_inline510 and tex2html_wrap_inline512 are respectively the number of relations and the number of regions in the reference image region descriptions. Relations between reference and target images ( tex2html_wrap_inline514 and tex2html_wrap_inline516) are considered to be matched if their difference is within some threshold. The threshold is chosen experimentally to be tex2html_wrap_inline518 , where tex2html_wrap_inline520 is the standard deviation of the relation tex2html_wrap_inline514. The drawback is that neither attribute/relation weights have been introduced, nor penalties applied, for combining numerical and symbolical region properties.

To overcome some of the problems mentioned above the authors propose a set of new evaluation functions:

  equation118

  equation138

  equation158

where tex2html_wrap_inline524 , tex2html_wrap_inline526 and tex2html_wrap_inline528 are the weights that define the level of contribution of the number of regions segmented in reference and target images and the number of the properties used; tex2html_wrap_inline530 and tex2html_wrap_inline532 are the numbers of regions in reference and target images; tex2html_wrap_inline534 is the number of initially suggested anatomical regions that should be segmented; tex2html_wrap_inline536 is the number of properties extracted; and tex2html_wrap_inline538 is the number of successful matchings of property values. The factor tex2html_wrap_inline540 represents the sum of penalty values for each pair of unmatched properties. The first two factors contribute to the function as the relative difference between the number of extracted and initially suggested regions in both images. All the functions are to be minimised for good matches. All properties are weighted and are compared with regard to the threshold given by tex2html_wrap_inline542 , k=1,...,r, where tex2html_wrap_inline546 is a constant to be found experimentally, and tex2html_wrap_inline548 is the standard deviation of all region property values of one kind in the reference image. If the difference between two property values is above the threshold, a penalty, equal to the weight of that property, is added to the evaluation function. Properties of regions of both images can either be attributes or relations (all attributes of one kind for all regions are recalculated with regard to each other forming inter-relationships/relations).


Tree search

Although detailed explanations about tree search algorithms and related problems are not the prime concern of this paper, some pointers related to evaluation of matching will be given, to make the matching/fusion algorithm complete.

Tree search methods have been well developed in AI. Here, they are used to search the space of similarities between region property values of reference and target images and find the best match based on the proposed evaluation function. In general, the search for the best match is very time-consuming since there are many regions and their properties of both images involved. The search time could be reduced by imposing some constraints. Hard constraints limit the set of all mappings within the search space to those for which the values of the evaluation function are inside experimentally predefined thresholds. Soft constraints define a relative preference over the space of possible mappings by implementing some heuristics in the evaluation function that direct the search method towards "better mappings". To minimise the search time for the best match, both hard and soft constraints could be employed. For example, with regard to the defined evaluation functions (Equations (5), (6) and (7)), the authors propose soft constraints that direct the search algorithm to compare the properties with high weights first; after each comparison the partial value of the evaluation function of currently compared regions is calculated and compared with the stored minimum value of the function of some previously completed region comparisons; the process of the calculation of the current function (and, consequently, the process of the comparison of the rest region properties) is stopped when its value exceeds the minimum stored value of another pair of compared regions; the current match is then assigned the tag "not the best one"; the track of "the best match" is always kept.


Conclusion

Evaluation of inexact relational matching in 2D/3D medical image matching/fusion based on tree search principles has been considered. Improvements are given in considering the evaluation function that involves the weighted attribute/relation comparison, the weighted penalty for non-similarity, the number of attributes/relations, the number of successful attribute/relation matches, and the weighted relative difference between the segmented and initially suggested anatomical regions of human organs of both the reference and the target images. Although both relaxation labelling ([17, 18]) and relational matching methods can be used for matching relational image descriptions, the latter has been chosen for this work because it always finds the best mapping between the primitives of image descriptions (in this case, the image regions). The trade-off between the computational cost and accuracy also needs to be taken into account whereby the more image regions and region attributes/relationships adopted, the better the accuracy, but the greater the computation.


Acknowledgements

The authors are grateful to Prof. Marilyn Noz of the Department of Nuclear Medicine, New York University, for her unreserved support in supplying matched medical image data sets used in this project.


References

1
Brown L. G. (1992), "A survey of image registration techniques", ACM Computing Surveys, 24(4), pp. 325-376.

2
Van den Elsen P. A., Pol D. E. & Viergever M. A. (1993), "Medical image matching - A review with classification", IEEE Engineering in Medicine and Biology, pp. 26-38.

3
Chen Q. (1993), Image Registration and its Applications in Medical Imaging, PhD thesis, Vrije University Brussel, Faculty of Applied Sciences, Brussels.

4
Chapnick J. V., Noz M. E., Maguire G. Q. Jr., Kramer E. L., Sanger J. J., Birnbaum B. A. & Megibow A. J. (1993), "Techniques for multimodality image registration", Proceedings of the 1993 IEEE 19th Annual Northeast Bioengineering Conference, pp. 221-222.

5
Vajdic S. M., Katz H. E., Downing A. R. & Brooks M. J. (1994), "Artificial Intelligence and registration of functional and structural images: Generalised 2D approaches", Proceedings of the 1994 IEEE International Conference on Systems, Man and Cybernetics, to appear, October 1994.

6
Vajdic S. M., Katz H. E., Downing A. R. & Brooks M. J. (1994), "AI-based relational matching and multi-modal medical image fusion: Generalised 3D approaches", Proceedings of the 1994 IEEE International Conference on Visual Communications and Image Processing, to appear, September 1994.

7
Goshtasby A. (1988), "Registration of images with geometric distortions", IEEE Transaction of Geoscience and Remote Sensing, 26, pp. 60-64.

8
Porter T. & Duff T. (1984), "Compositing digital images", Computer Graphics, 18(3), pp. 253-259.

9
Barrow H. G. & Popplestone R. J., (1971), "Relational description in picture processing", Machine Intelligence, 6, pp. 377-396.

10
Boyer K. L. & Kak A. C., (1988), "Structural stereopsis for 3-D vision", IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(2), pp. 144-166.

11
Vosselman G., (1992), Relational Matching, Berlin Heidelberg: Springer-Verlag.

12
Sanfeliu A. & Fu K. S., (1983), "A distance measure between attributed relational graphs for pattern recognition", IEEE Transaction on Systems, Man and Cybernetics, 13(3), pp. 353-362.

13
Shapiro L. G. & Haralick R. M., (1981), "Structural descriptions and inexact matching", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-3(5), pp. 505-519.

14
Shapiro L. G. & Haralick R. M., (1987), "Relational matching", Applied Optics, 26, pp. 1845-1851.

15
Price K. & Reddy R., (1979), "Matching segments of images", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1(1), pp. 110-118.

16
Faugeras O. D. & Price K. E., "Semantic description of aerial images using stochastic labeling", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-3(6), pp. 633-642.

17
Rosenfeld A., Hummel R. A. & Zucker S. W. (1976), "Scene labeling by relaxation operations", IEEE Transaction on Systems, Man and Cybernetics, SMC-6, pp. 420-453.

18
Medioni G. & Nevatia R. (1984), "Matching images using linear features", IEEE Transaction on Pattern Analysis and Machine Intelligence, PAMI-6(6).

About this document ...

AI and Medical Imagery: Strategy and Evaluation of Inexact Relational Matching

This document was generated using the LaTeX2HTML translator Version 96.1 (Feb 5, 1996) Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.

The command line arguments were:
latex2html vajdic.tex.

The translation was initiated by Pam Milliken on Mon Oct 28 15:19:03 EST 1996


Complexity International (1995) 2