Dong Hua 1 and Dechang Chen 2 and Fang Liu 3 and Abdou Youssef 1
Recommended by Zhenqiu Liu
1, Department of Computer Science, The George Washington University, 801 22nd Street NW, Washington, DC 20052, USA
2, Division of Epidemiology and Biostatistics, Uniformed Services University of the Health Sciences, 4301 Jones Bridge Road, Bethesda, MD 20814, USA
3, Department of Computer Science, University of Texas-Pan American, 1201 W. University Drive, Edinburg, TX 78539, USA
Received 1 December 2008; Accepted 17 March 2009
1. Introduction
The hand X-ray shown in Figure 1 is commonly used for skeletal age assessment in pediatric radiology. A discrepancy between skeletal maturity and chronological age may indicate an abnormality in skeletal growth. Such abnormalities have been found to be related to various diseases such as endocrine disorders [1], metabolic/growth abnormalities [2], malformations and bone dysplasias [3], and gonadal dysgenesis [4]. The assessment of skeletal maturity has therefore become increasingly important clinically, and the accuracy of the assessment is the primary concern.
Figure 1: Hand X-ray used in skeletal age assessment.
[figure omitted; refer to PDF]
Features encoded in ossification centers form the basis for assessment. If the exact characteristics of these features at different ages were known, assessment would be straightforward. In practice, one needs a mechanism to capture such characteristics. Given data on a feature with respect to skeletal age, a simple and common approach is to fit a line or a curve, which is then used to make predictions for new patients or to help radiologists understand how the feature varies.
For instance, Figure 2(a) shows the variation of a ratio feature [5, 6] (vertical axis) with increasing skeletal age (horizontal axis) for boys from newborn to 19 years old. (More details on this ratio are provided in Section 3.2.) In the figure, a single line is fitted to the values of the feature. Clearly, a line is not enough to capture the behavior of the feature. A quadratic curve, shown in Figure 2(c), does not do a good job either. Fitting a more complex curve does not seem feasible, for two reasons: often only a small amount of data is available, which restricts the learning of complex curves, and local properties (with respect to time) of the feature are often lost when fitting a global complex curve, which leads to inaccurate predictions.
Figure 2: Examples of fitting the variation of the ratio feature. The horizontal axis represents the skeletal age and the vertical axis corresponds to the values of the feature.
(a) [figure omitted; refer to PDF]
(b) [figure omitted; refer to PDF]
(c) [figure omitted; refer to PDF]
(d) [figure omitted; refer to PDF]
In this paper, we propose to fit the variation of features with skeletal age via a multistage fitting approach. We divide the skeletal age axis into several stages or phases, and within each stage a relatively simple model (line or curve) is used for fitting. Usually, the variation of a feature does not follow a simple rule as skeletal age increases. Instead, it often shows different variation patterns in different stages of age. As shown in Figures 2(b) and 2(d), multistage fitting not only captures the overall pattern of feature variation but also preserves its local properties with respect to skeletal age. A critical question is then how to determine the appropriate positions at which to separate the stages. The Bayesian cut proposed in this paper provides an answer via a Bayesian approach.
The rest of the paper is organized as follows. In Section 2, we describe our models for fitting, where the Bayesian cut is introduced. In Section 3, we present our experimental results on multi-stage fitting for artificial and real data. We conclude our paper in Section 4.
2. The Proposed Method
In this section, we first describe our proposed method for a simple case and then extend it to a general scenario.
Let $f_1, f_2, \ldots, f_n$ denote the skeletal ages $f$ in ascending order, and consider the linear relationship between $f$ and one feature $y$ found in the hand X-ray (e.g., the length of a digit). Usually, such a linear relationship varies as the skeletal age increases. That is, a linear form established for one interval of the skeletal age may not hold for the next interval, where a different linear form should be used. The time at which two linear forms differ is called a change point. Our model, which takes into account linear relationships and change points, is stated as follows:
$$y_i = \beta_{j1} + \beta_{j2} f_i + \varepsilon_{ji}, \quad t_{j-1} < i \le t_j, \quad j = 1, \ldots, k, \tag{1}$$
where $t_0 = 0$, $t_k = n$, $t_1, \ldots, t_{k-1}$ (correspondingly $f_{t_1}, \ldots, f_{t_{k-1}}$) indicate the sequential change points, $t_j - t_{j-1} \ge 3$ ($j = 1, \ldots, k$), the errors $\varepsilon_{ji}$ (for all $i$) are independent $N(0, \sigma_j^2)$, and the $\varepsilon_{ji}$ (for all $i$, $j$) are independent of each other. In the model, the parameters $\beta_{j1}$, $\beta_{j2}$, $\sigma_j^2$, and $t_j$ are all unknown and are estimated from the given data. The index range $(t_{j-1}, t_j]$ represents the $j$th stage or phase, denoted by $ph_j$. The main task here is to estimate the times $t_j$. Given the estimates of $t_j$, the linear forms and the associated parameters can be obtained through the traditional regression technique. We note that the requirement $t_j - t_{j-1} \ge 3$ ($j = 1, \ldots, k$) is needed for estimation of the regression lines. When $k = 2$, the model reduces to the two-phase regression with a single change point in [7].
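Once the change points are fixed, the piecewise linear model above reduces to one ordinary least-squares line per phase. The following is a minimal sketch of that step, not the authors' implementation: the names `fit_line` and `fit_phases`, the toy data, and the convention that `cuts` holds the 1-based indices $t_1, \ldots, t_{k-1}$ are all assumptions made for illustration.

```python
def fit_line(f, y):
    """Least-squares fit of y = b1 + b2 * f; returns (b1, b2, rss)."""
    n = len(f)
    mean_f = sum(f) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_f) ** 2 for x in f)
    sxy = sum((x - mean_f) * (v - mean_y) for x, v in zip(f, y))
    b2 = sxy / sxx
    b1 = mean_y - b2 * mean_f
    rss = sum((v - b1 - b2 * x) ** 2 for x, v in zip(f, y))
    return b1, b2, rss


def fit_phases(f, y, cuts):
    """Fit one line per phase; phase j covers indices t_{j-1}+1 .. t_j (1-based)."""
    bounds = [0] + list(cuts) + [len(f)]
    return [fit_line(f[a:b], y[a:b]) for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy data: slope +1 on the first four points, slope -1 afterwards.
f = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 2, 3, 4, 3, 2, 1, 0]
phases = fit_phases(f, y, cuts=[4])
for j, (b1, b2, rss) in enumerate(phases, start=1):
    print(f"phase {j}: intercept={b1:.3f}, slope={b2:.3f}, rss={rss:.3f}")
```

Each phase needs at least three points here, matching the requirement $t_j - t_{j-1} \ge 3$ for a two-parameter line.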
The above model, which uses only one independent variable $f$, can be generalized to include multiple independent variables. This generalization leads to the following model:
$$y_i = f_i^T \boldsymbol{\beta}_j + \varepsilon_{ji}, \quad t_{j-1} < i \le t_j, \quad j = 1, \ldots, k, \tag{2}$$
where $f_i$ is a $p$-dimensional vector of variables, $\boldsymbol{\beta}_j$ ($j = 1, \ldots, k$) is a $p$-dimensional vector of parameters, $t_j - t_{j-1} \ge p + 1$, and the $\varepsilon_{ji}$ are the same as before. We refer to $p$ as the cardinality of the input vector $f_i$, denoted by $C(f_i)$, and to the number of sample points in $ph_j$ as the cardinality of $(t_{j-1}, t_j]$, denoted by $C(ph_j)$. We note that although linear regression is used for each phase in model (2), the model also encompasses forms that are nonlinear in $f$, such as polynomials.
We now describe a Bayesian approach to estimating the change points. Denote $(f_{t_{j-1}+1}, \ldots, f_{t_j})^T$ by $F_j$, $(F_1^T, \ldots, F_k^T)^T$ by $F$, $(y_{t_{j-1}+1}, \ldots, y_{t_j})^T$ by $y_j$, $(y_1^T, \ldots, y_k^T)^T$ by $y$, and $(t_1, \ldots, t_{k-1})$ by $t$. For simplicity, we assume the noninformative or uniform prior for $\boldsymbol{\beta}_j$ ($j = 1, \ldots, k$), $\ln(\sigma_j^2)$, and $t$. Noninformative priors are used when nothing is known about the parameters or when proper priors such as conjugate priors do not apply. (For a rigorous discussion on the choice of priors, see [8].) We can show the following main result (see the Appendix). Given the data $y$ and the uniform prior for $\boldsymbol{\beta}_j$ ($j = 1, \ldots, k$), $\ln(\sigma_j^2)$, and $t$, where the number $k$ is predetermined, the posterior probability that change points occur at $t$ is
$$p(t \mid y) = J \, 2^{(n-kp)/2} \prod_{j=1}^{k} |F_j^T F_j|^{-1/2} \, \Gamma\!\left(\frac{t_j - t_{j-1} - p}{2}\right) S_j^{-(t_j - t_{j-1} - p)/2}, \tag{3}$$
where $J = \left( \sum_{t} 2^{(n-kp)/2} \prod_{j=1}^{k} |F_j^T F_j|^{-1/2} \, \Gamma\!\left(\frac{t_j - t_{j-1} - p}{2}\right) S_j^{-(t_j - t_{j-1} - p)/2} \right)^{-1}$, and $S_j = (y_j - F_j \hat{\boldsymbol{\beta}}_j)^T (y_j - F_j \hat{\boldsymbol{\beta}}_j)$ with $\hat{\boldsymbol{\beta}}_j = (F_j^T F_j)^{-1} F_j^T y_j$ denoting the least-squares estimator of $\boldsymbol{\beta}_j$. Using this result, we estimate $t$ by $t^*$, at which $p(t \mid y)$ attains its maximum, that is, $t^* = \arg\max_t p(t \mid y)$. We call $t^*$ the Bayesian cut, and the value $2^{(n-kp)/2} \prod_{j=1}^{k} |F_j^T F_j|^{-1/2} \, \Gamma\!\left(\frac{t_j - t_{j-1} - p}{2}\right) S_j^{-(t_j - t_{j-1} - p)/2}$ the proportional posterior ($pp$).
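The search for the Bayesian cut can be sketched as an exhaustive maximization of the logarithm of the proportional posterior over all admissible change-point vectors. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: it handles only the straight-line case (rows of $F_j$ equal to $(1, f_i)$, so $p = 2$), it drops the factor $2^{(n-kp)/2}$ because it is the same for every $t$, and the function names, the numerical floor on $S_j$, and the toy data are hypothetical.

```python
import math
from itertools import combinations


def log_phase_term(f, y, p=2):
    """Log of |F'F|^(-1/2) * Gamma((m-p)/2) * S^(-(m-p)/2) for one phase.

    Assumes the straight-line model, so each row of F is (1, f_i) and p = 2.
    """
    m = len(f)
    mean_f = sum(f) / m
    mean_y = sum(y) / m
    sxx = sum((x - mean_f) ** 2 for x in f)
    sxy = sum((x - mean_f) * (v - mean_y) for x, v in zip(f, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_f
    rss = sum((v - intercept - slope * x) ** 2 for x, v in zip(f, y))
    rss = max(rss, 1e-300)  # numerical guard for a perfect fit (our addition)
    det = m * sxx           # determinant of F'F when the rows of F are (1, f_i)
    half = (m - p) / 2
    return -0.5 * math.log(det) + math.lgamma(half) - half * math.log(rss)


def bayesian_cut(f, y, k, p=2):
    """Return (cuts, log_pp) maximizing the proportional posterior.

    log_pp omits the constant (n - k*p)/2 * log(2), which does not affect the argmax.
    """
    n = len(f)
    best_cuts, best_lp = None, -math.inf
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        spans = list(zip(bounds, bounds[1:]))
        if any(b - a < p + 1 for a, b in spans):
            continue  # every phase needs at least p + 1 points
        lp = sum(log_phase_term(f[a:b], y[a:b], p) for a, b in spans)
        if lp > best_lp:
            best_cuts, best_lp = cuts, lp
    return best_cuts, best_lp


# Hypothetical toy data: two lines with a jump after the sixth point,
# plus a small deterministic perturbation so that no phase fits perfectly.
f = list(range(1, 13))
noise = [0.1 if i % 2 else -0.1 for i in f]
y = [(1 + 0.5 * x if x <= 6 else 20 - 2 * x) + e for x, e in zip(f, noise)]

cuts, log_pp = bayesian_cut(f, y, k=2)
print(cuts)
```

Working in log space avoids overflow in the gamma function and underflow in the powers of $S_j$. The number of candidate vectors grows combinatorially with $k$ and $n$, so this brute-force enumeration is practical only for short sequences such as the ones considered here.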
3. Experiments
In this section, we apply the Bayesian cut to two data sets: one synthesized and one real. We use the synthesized data to evaluate performance in terms of recovery of change points. The real data are used to find the Bayesian cut and to describe the feature in a multistage way, which predicts the skeletal age more accurately than fitting a single line or curve. Both linear and nonlinear regression are used for comparison. For convenience, we call fitting with a single line or curve the single fitting and fitting with the Bayesian cut the Bayesian cut fitting.
3.1. Synthesized Data
We consider five cases or models describing the relationship between the dependent and independent variables. These are shown in Table 1, where the input vector $f_i$ for models m1, m2, m3, m4, and m5 is $(1, f_i)^T$, $(1, f_i, f_i^2)^T$, $(1, f_i, f_i^2, f_i^3)^T$, $(1, f_i, f_i^2, f_i^3, f_i^4)^T$, and $(1, f_i, f_i^2, f_i^3, f_i^4, f_i^5)^T$, respectively. The data are generated according to the setting given in Table 2. Specifically, $\beta_{ji}$ is randomly chosen from $(-5.0, 5.0)$; $\varepsilon_{ji}$ is generated from a normal distribution with mean 0 and variance $\sigma_j^2$ randomly selected from $(0, 5^{C(f_i)-1})$; the number of sample points of the $j$th phase, $C(ph_j)$, is randomly selected from the set $\{(C(f_i)+1), \ldots, (C(f_i)+1)+s\}$, where $s$ is predetermined; and $f_i$ takes the value $i$ for $i = 1, 2, \ldots, t_k$. Note that the bound for $\sigma_j^2$ varies with the model to account for the influence of the highest degree of the polynomial. Also, the number of sample points varies from phase to phase, which introduces unbalance and scalability factors so that the performance evaluation is more objective.
To quantify the performance of the Bayesian cut, we use the absolute deviation (AD), defined as [figure omitted; refer to PDF] where $t_j^*$ represents the $j$th element of $t^*$ (the Bayesian cut). Intuitively, the smaller the AD, the closer the Bayesian cut $t^*$ is to the true change points $t$.
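Because the defining equation of AD is not available in this text, the sketch below rests on an assumption: AD is taken to be the mean absolute difference between estimated and true change points, $\frac{1}{k-1}\sum_{j=1}^{k-1} |t_j^* - t_j|$. This reading is consistent with the granularity of the scores in Table 3 (multiples of 1/50, 1/100, and 1/150 for $k = 2, 3, 4$ with 50 trials), but it remains our inference. The function names are hypothetical.

```python
def absolute_deviation(t_star, t_true):
    """Mean absolute difference between estimated and true change points.

    Assumed definition of AD; the original equation is not reproduced in the text.
    """
    if len(t_star) != len(t_true) or not t_true:
        raise ValueError("both vectors must have the same nonzero length k - 1")
    return sum(abs(a - b) for a, b in zip(t_star, t_true)) / len(t_true)


def mean_ad(results):
    """Average AD over trials; results is a list of (t_star, t_true) pairs."""
    return sum(absolute_deviation(a, b) for a, b in results) / len(results)


# Hypothetical example with k = 3 (two change points).
print(absolute_deviation((5, 14), (6, 12)))
print(mean_ad([((5, 14), (6, 12)), ((6, 12), (6, 12))]))
```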
Table 1: Models for testing the performance of the Bayesian cut.
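Model | Form | Change points |
m1 | $y_i = \beta_{j1} + \beta_{j2} f_i + \varepsilon_{ji}$ | $t = (t_1, \ldots, t_{k-1})$ |
m2 | $y_i = \beta_{j1} + \beta_{j2} f_i + \beta_{j3} f_i^2 + \varepsilon_{ji}$ | $t = (t_1, \ldots, t_{k-1})$ |
m3 | $y_i = \beta_{j1} + \beta_{j2} f_i + \beta_{j3} f_i^2 + \beta_{j4} f_i^3 + \varepsilon_{ji}$ | $t = (t_1, \ldots, t_{k-1})$ |
m4 | $y_i = \beta_{j1} + \beta_{j2} f_i + \beta_{j3} f_i^2 + \beta_{j4} f_i^3 + \beta_{j5} f_i^4 + \varepsilon_{ji}$ | $t = (t_1, \ldots, t_{k-1})$ |
m5 | $y_i = \beta_{j1} + \beta_{j2} f_i + \beta_{j3} f_i^2 + \beta_{j4} f_i^3 + \beta_{j5} f_i^4 + \beta_{j6} f_i^5 + \varepsilon_{ji}$ | $t = (t_1, \ldots, t_{k-1})$ |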
Table 2: Experimental setting.
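Quantity | Setting |
$\beta_{ji}$ | $(-5.0, 5.0)$ |
$\varepsilon_{ji}$ | $\sim N(0, \sigma_j^2)$, $\sigma_j^2 \in (0, 5^{C(f_i)-1})$ |
$k$ | 2, 3, 4 |
$C(ph_j)$ | $(C(f_i)+1), \ldots, (C(f_i)+1)+s$ |
scale $s$ | $1, \ldots, 10$ |
$t_0$ | 0 |
$t_j$ | $t_{j-1} + C(ph_{j-1})$ |
$f_i$ | $1, \ldots, t_k$ |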
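The data-generating setting in Table 2 can be sketched as follows. This is our reading of the table rather than the authors' generator: in particular, the variance bound is interpreted as $5^{C(f_i)-1}$, the phase sizes are drawn uniformly from the stated integer range, and the function name `generate_dataset` and its arguments are hypothetical.

```python
import random


def generate_dataset(degree, k, s, rng):
    """Draw one synthetic dataset (f, y, true_cuts) for a polynomial model.

    degree = 1..5 corresponds to models m1..m5; p = degree + 1 plays the role of C(f_i).
    """
    p = degree + 1
    f, y, ends, t = [], [], [], 0
    for _ in range(k):
        size = rng.randint(p + 1, p + 1 + s)             # C(ph_j)
        beta = [rng.uniform(-5.0, 5.0) for _ in range(p)]
        var = rng.uniform(0.0, 5.0 ** (p - 1))           # assumed bound 5^(C(f_i)-1)
        for i in range(t + 1, t + size + 1):             # f_i takes the value i
            mean = sum(b * i ** d for d, b in enumerate(beta))
            f.append(i)
            y.append(mean + rng.gauss(0.0, math_sqrt(var)))
        t += size                                        # end index of this phase
        ends.append(t)
    return f, y, tuple(ends[:-1])                        # t_1, ..., t_{k-1}


def math_sqrt(x):
    """Square root via exponentiation, kept local so the sketch has no extra imports."""
    return x ** 0.5


rng = random.Random(0)  # fixed seed so that runs are repeatable
f, y, true_cuts = generate_dataset(degree=1, k=3, s=2, rng=rng)
print(len(f), true_cuts)
```

Passing an explicit `random.Random` instance keeps repeated trials (50 per configuration in the experiments) independent of global state.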
Table 3 shows the AD values. They are obtained by varying $k$ from 2 to 4 and $s$ from 1 to 10. For given $k$, $s$, and a given model, 50 trials are performed to generate data, leading to 50 datasets $\{(F, y)\}$. We find the Bayesian cut $t^*$ for each $(F, y)$ and the given model. The final AD score is obtained by averaging over the 50 runs.
Table 3: AD scores for models in Table 1.
k | s | m1 | m2 | m3 | m4 | m5 |
2 | 1 | 0.280 | 0.340 | 0.320 | 0.080 | 0.180 |
2 | 2 | 0.300 | 0.460 | 0.360 | 0.200 | 0.100 |
2 | 3 | 0.260 | 0.400 | 0.320 | 0.100 | 0.100 |
2 | 4 | 0.640 | 0.380 | 0.260 | 0.180 | 0.180 |
2 | 5 | 0.480 | 0.680 | 0.480 | 0.100 | 0.060 |
2 | 6 | 0.380 | 0.300 | 0.560 | 0.220 | 0.100 |
2 | 7 | 0.540 | 0.520 | 0.340 | 0.280 | 0.100 |
2 | 8 | 0.900 | 0.520 | 0.440 | 0.120 | 0.020 |
2 | 9 | 0.740 | 0.340 | 0.080 | 0.040 | 0.020 |
2 | 10 | 0.740 | 0.720 | 0.160 | 0.200 | 0.020 |
3 | 1 | 0.230 | 0.360 | 0.210 | 0.240 | 0.090 |
3 | 2 | 0.440 | 0.390 | 0.190 | 0.080 | 0.060 |
3 | 3 | 0.590 | 0.340 | 0.210 | 0.220 | 0.060 |
3 | 4 | 0.820 | 0.590 | 0.260 | 0.060 | 0.010 |
3 | 5 | 0.970 | 0.690 | 0.530 | 0.020 | 0.090 |
3 | 6 | 0.670 | 0.580 | 0.120 | 0.060 | 0.070 |
3 | 7 | 1.220 | 0.750 | 0.160 | 0.080 | 0.190 |
3 | 8 | 1.260 | 0.680 | 0.650 | 0.040 | 0.030 |
3 | 9 | 1.210 | 0.860 | 0.370 | 0.380 | 0.010 |
3 | 10 | 1.340 | 0.360 | 0.680 | 0.020 | 0.020 |
4 | 1 | 0.333 | 0.300 | 0.133 | 0.040 | 0.053 |
4 | 2 | 0.440 | 0.433 | 0.227 | 0.060 | 0.033 |
4 | 3 | 0.867 | 0.480 | 0.113 | 0.080 | 0.033 |
4 | 4 | 0.780 | 0.513 | 0.093 | 0.080 | 0.133 |
4 | 5 | 1.020 | 0.887 | 0.453 | 0.133 | 0.173 |
4 | 6 | 1.360 | 0.760 | 0.193 | 0.093 | 0.180 |
4 | 7 | 1.007 | 0.593 | 0.353 | 0.047 | 0.040 |
4 | 8 | 0.727 | 0.587 | 0.453 | 0.093 | 0.113 |
4 | 9 | 1.080 | 1.240 | 0.867 | 0.360 | 0.087 |
4 | 10 | 1.213 | 0.873 | 0.333 | 0.120 | 0.140 |
Our findings can be summarized as follows. Regardless of linear or nonlinear regression, the Bayesian cut performs well with low AD scores. Introducing the unbalance and scalability factors does not deteriorate the performance of the Bayesian cut significantly. The Bayesian cut scales well when the number of change points increases.
3.2. Real Data
In this part, we apply the Bayesian cut fitting to real data from our database, shown in Table 4. The table lists feature values against increasing skeletal age (column 1), as labeled by radiology experts, for boys from newborn to 19 years old. To obtain features independent of the size and length of the digits, two ratio features are used, following [5]. One is $L_1/L_2$, the ratio of the length of the distal phalanx $L_1$ to that of the middle phalanx $L_2$ of the middle digit, and the other is $L_2/L_3$, the ratio of the length of the middle phalanx $L_2$ to that of the proximal phalanx $L_3$. See Figure 3 for an illustration of $L_1$, $L_2$, and $L_3$. These two features correspond to columns 2 and 3, which are generated using the algorithm in [6]. Columns 4 and 5 contain the normalized values of $L_1/L_2$ and $L_2/L_3$, respectively. The normalization is $(x - \mu)/\sigma$, where $\mu$ is the expectation of $x$ and $\sigma$ is the variance. In our experiments, only normalized values are used. Figure 4 shows some of the Bayesian cut fittings, where the features $n(L_1/L_2)$ and $n(L_2/L_3)$ are used, the models describing the relationship between the feature and the skeletal age are m1 and m2 from Table 1, and $k$ takes the values 2, 3, and 4. In Figure 4, the horizontal axis represents the age and the vertical axis indicates the feature. For model m1, the blue straight line across the entire age range is from the single (line) fitting. For model m2, the blue curve across the entire age range is from the single (quadratic) fitting. All red (broken) lines are from the Bayesian cut fitting.
Table 4: Some features of the skeletal age.
Age (yr) | L1 /L2 | L2 /L3 | n(L1 /L2 ) | n(L2 /L3 ) |
0 | 0.6795 | 0.7016 | 41.8212 | 51.1987 |
3 | 0.6307 | 0.5853 | 6.4071 | -17.6281 |
3.5 | 0.6220 | 0.6298 | 0.1020 | 8.6933 |
4.0 | 0.6060 | 0.5993 | -11.4491 | -9.3140 |
4.5 | 0.6111 | 0.5708 | -7.7721 | -26.1616 |
5.0 | 0.6172 | 0.5070 | -3.3303 | -63.8970 |
6.0 | 0.5675 | 0.5924 | -39.3612 | -13.4245 |
7.0 | 0.5947 | 0.6626 | -19.6939 | 28.0937 |
8.0 | 0.5820 | 0.6097 | -28.9032 | -3.1878 |
9.0 | 0.5939 | 0.5968 | -20.2149 | -10.7828 |
10.0 | 0.5680 | 0.6643 | -39.0383 | 29.1323 |
11.0 | 0.5776 | 0.6696 | -32.0541 | 32.2560 |
11.5 | 0.5845 | 0.6550 | -27.0602 | 23.6424 |
12.5 | 0.5979 | 0.6266 | -17.3472 | 6.8003 |
13.0 | 0.6292 | 0.5670 | 5.3295 | -28.4227 |
13.5 | 0.6000 | 0.6219 | -15.8024 | 4.0436 |
14.0 | 0.6436 | 0.6065 | 15.7982 | -5.0842 |
15.0 | 0.6703 | 0.6319 | 35.1558 | 9.9431 |
15.5 | 0.6843 | 0.5937 | 45.2891 | -12.6564 |
16.0 | 0.6746 | 0.5843 | 38.2966 | -18.2156 |
17.0 | 0.6632 | 0.6153 | 30.0081 | 0.1412 |
18.0 | 0.6589 | 0.6236 | 26.8770 | 5.0546 |
19.0 | 0.6452 | 0.6316 | 16.9420 | 9.7754 |
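The normalization can be checked against the $L_1/L_2$ column of Table 4. The sketch below assumes that $\mu$ is the sample mean and that the divisor is the sample variance (denominator $n - 1$); this is our inference from the tabulated numbers, which the computation reproduces only approximately, presumably because the published ratios are rounded to four decimals. The function name `normalize` is hypothetical.

```python
from statistics import mean, variance

# L1/L2 column of Table 4, in row order (ages 0 to 19.0).
l1_l2 = [
    0.6795, 0.6307, 0.6220, 0.6060, 0.6111, 0.6172, 0.5675, 0.5947,
    0.5820, 0.5939, 0.5680, 0.5776, 0.5845, 0.5979, 0.6292, 0.6000,
    0.6436, 0.6703, 0.6843, 0.6746, 0.6632, 0.6589, 0.6452,
]


def normalize(values):
    """Return (x - mu) / var for each x, with var the sample variance (assumed)."""
    mu = mean(values)
    var = variance(values)  # statistics.variance uses the n - 1 denominator
    return [(x - mu) / var for x in values]


n_l1_l2 = normalize(l1_l2)
for raw, norm in zip(l1_l2[:3], n_l1_l2[:3]):
    print(f"{raw:.4f} -> {norm:.4f}")
```

Dividing by the variance rather than the standard deviation explains why the normalized values in Table 4 span roughly -40 to 45 instead of a few units.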
Figure 3: Illustration of $L_1$, $L_2$, and $L_3$.
[figure omitted; refer to PDF]
Figure 4: Illustration of the Bayesian cut fitting applied to the real data on features of the skeletal age.
(a) [figure omitted; refer to PDF]
(b) [figure omitted; refer to PDF]
(c) [figure omitted; refer to PDF]
(d) [figure omitted; refer to PDF]
(e) [figure omitted; refer to PDF]
(f) [figure omitted; refer to PDF]
(g) [figure omitted; refer to PDF]
(h) [figure omitted; refer to PDF]
(i) [figure omitted; refer to PDF]
(j) [figure omitted; refer to PDF]
(k) [figure omitted; refer to PDF]
(l) [figure omitted; refer to PDF]
4. Conclusion
In this paper, we propose the Bayesian cut fitting to describe features in response to the skeletal age. With our approach, the axis of skeletal age is divided into meaningful stages, within each of which the variation pattern of a feature is consistent, so that a traditional regression technique can be applied to model the relationship between the skeletal age and the feature. Our approach is inspired by the observation that the variation pattern of a feature can differ between periods of the skeletal age. A critical issue is to determine the times, or change points, at which the variation pattern of a feature changes. This is handled by the Bayesian cut proposed in this paper. Simulations have been used to demonstrate the effectiveness of the Bayesian cut fitting in terms of recovery of change points. The experiments on real data show that, given a type of relationship (e.g., linear or quadratic) between the skeletal age and a feature, the Bayesian cut fitting surpasses the traditional single fitting when the variation pattern of the feature is suspected not to be consistent over the entire skeletal age range. One major issue that is not addressed in this paper is the determination of $k$, the number of stages. The selection of $k$ depends on the given data and the practical need. We leave this for future research.
Acknowledgments
Dechang Chen was partially supported by the National Science Foundation grant CCF-0729080.
References
[1] D. B. Darling, Radiography of Infants and Children, chapter 6, Charles C. Thomas, Springfield, Ill, USA, 1st edition, 1979.
[2] A. K. Poznanski, S. M. Garn, J. M. Nagy, and J. C. Gall Jr., "Metacarpophalangeal pattern profiles in the evaluation of skeletal malformations," Radiology, vol. 104, no. 1, pp. 1-11, 1972.
[3] D. R. Kirks, Practical Pediatric Imaging: Diagnostic Radiology of Infants and Children, chapter 6, Little, Brown, Boston, Mass, USA, 1st edition, 1984.
[4] J. Kosowicz, "The roentgen appearance of the hand and wrist in gonadal dysgenesis," The American Journal of Roentgenology, Radium Therapy and Nuclear Medicine, vol. 93, pp. 354-361, 1965.
[5] E. Pietka, M. F. McNitt-Gray, M. L. Kuo, and H. K. Huang, "Computer-assisted phalangeal analysis in skeletal age assessment," IEEE Transactions on Medical Imaging, vol. 10, no. 4, pp. 616-620, 1991.
[6] E. Pietka, A. Gertych, S. Pospiech, F. Cao, H. K. Huang, and V. Gilsanz, "Computer-assisted bone age assessment: image preprocessing and epiphyseal/metaphyseal ROI extraction," IEEE Transactions on Medical Imaging, vol. 20, no. 8, pp. 715-729, 2001.
[7] D. Chen, M. Fries, and J. M. Lyon, "A statistical method of detecting bioremediation," Journal of Data Science, vol. 1, no. 1, pp. 27-41, 2003.
[8] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, John Wiley & Sons, New York, NY, USA, 1992.
Appendix
A. Derivation of (3)
Proof.
According to the Pythagorean theorem, we have the following likelihood [figure omitted; refer to PDF] where $S_j = (y_j - F_j \hat{\boldsymbol{\beta}}_j)^T (y_j - F_j \hat{\boldsymbol{\beta}}_j)$ and $\hat{\boldsymbol{\beta}}_j = (F_j^T F_j)^{-1} F_j^T y_j$. Since the $\varepsilon_{ji}$ are independent of each other, the likelihood function of $\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_k$, $\sigma_1^2, \ldots, \sigma_k^2$, $t$ is then [figure omitted; refer to PDF] Due to the assumption of the uniform prior for $\boldsymbol{\beta}_j$, $\ln(\sigma_j^2)$, and $t$, we have [figure omitted; refer to PDF] Using (A.2) and (A.6), we have [figure omitted; refer to PDF]
Note that [figure omitted; refer to PDF] This equation exploits the fact [figure omitted; refer to PDF] from the normal density for the $p$-dimensional random vector $X$ [figure omitted; refer to PDF] where $\boldsymbol{\mu}$ is the expected value of $X$ and $\Sigma$ is the variance-covariance matrix of $X$.
Substituting (A.10) into (A.7), we have [figure omitted; refer to PDF]
In addition, we have [figure omitted; refer to PDF] from the probability density function of $X = aU$ [figure omitted; refer to PDF] where the constant $a > 0$ and $U^{-1} \sim \chi_m^2$.
By applying (A.9) to (A.11), we get [figure omitted; refer to PDF] where $J = \left( \sum_{t} 2^{(n-kp)/2} \prod_{j=1}^{k} |F_j^T F_j|^{-1/2} \, \Gamma\!\left(\frac{t_j - t_{j-1} - p}{2}\right) S_j^{-(t_j - t_{j-1} - p)/2} \right)^{-1}$. This completes the proof.
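The integration steps that the proof relies on can be written out explicitly. The following is a sketch in our own notation, with $n_j = t_j - t_{j-1}$ introduced as shorthand; it is not a reproduction of the omitted equations (A.1) to (A.12), and the prior density on $\sigma_j^2$ is taken to be proportional to $1/\sigma_j^2$, which is what a uniform prior on $\ln(\sigma_j^2)$ implies.

```latex
% Orthogonal (Pythagorean) decomposition of the residual sum of squares in phase j
\[
(y_j - F_j\beta_j)^T (y_j - F_j\beta_j)
  = S_j + (\beta_j - \hat\beta_j)^T F_j^T F_j (\beta_j - \hat\beta_j)
\]

% Gaussian integral over beta_j (normalizing constant of a p-variate normal density)
\[
\int_{\mathbb{R}^p}
  \exp\!\Big(-\frac{1}{2\sigma_j^2}
  (\beta_j - \hat\beta_j)^T F_j^T F_j (\beta_j - \hat\beta_j)\Big)\, d\beta_j
  = (2\pi\sigma_j^2)^{p/2}\, |F_j^T F_j|^{-1/2}
\]

% Integral over sigma_j^2 after including the prior factor 1/sigma_j^2
\[
\int_0^\infty (\sigma_j^2)^{-(n_j - p)/2 - 1}
  \exp\!\Big(-\frac{S_j}{2\sigma_j^2}\Big)\, d\sigma_j^2
  = \Gamma\!\Big(\frac{n_j - p}{2}\Big)\, 2^{(n_j - p)/2}\, S_j^{-(n_j - p)/2}
\]

% Product over the k phases, using n_1 + ... + n_k = n
\[
p(t \mid y) \propto 2^{(n - kp)/2} \prod_{j=1}^{k}
  |F_j^T F_j|^{-1/2}\, \Gamma\!\Big(\frac{n_j - p}{2}\Big)\, S_j^{-(n_j - p)/2}
\]
```

The factor $(2\pi)^{-(n-kp)/2}$ left over from the normal densities does not depend on $t$ and is absorbed into the normalizing constant $J$.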
Copyright © 2009 Dong Hua et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Accurate assessment of skeletal maturity is important clinically. Skeletal age assessment is usually based on features encoded in ossification centers. Therefore, it is critical to design a mechanism that captures as many characteristics of the features as possible. We have observed that, for a given feature, there exist stages of the skeletal age such that the variation pattern of the feature differs between these stages. Based on this observation, we propose a Bayesian cut fitting to describe features in response to the skeletal age. With our approach, appropriate positions for stage separation are determined automatically by a Bayesian approach, and a model is used to fit the variation of a feature within each stage. Our experimental results show that the proposed method surpasses traditional fitting with only one line or one curve, not only in the efficiency and accuracy of fitting but also in global and local feature characterization.