Optimal Estimation of Calibration Parameters in Polarimetric Microwave Radiometers
Methods for internal calibration of a certain class of microwave polarimetric radiometers are presented by Piepmeier. In that work, the calibration parameters are estimated algebraically. We demonstrate that Bayesian estimation decreases the root-mean-square error of the estimates by a factor of two. This improvement is obtained by using knowledge of the noise structure of the measurements and by utilizing all of the information provided by the measurements. Drawbacks are the increased complexity of the method and an increase in computation. We also extend the method to estimate several hardware component parameters of interest in system calibration.
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 46, NO. 10, OCTOBER 2008

Derek Hudson and David G. Long, Fellow, IEEE

Index Terms: Polarimetric radiometer, radiometer calibration, Stokes parameters.

I. INTRODUCTION

Radiometer calibration is the process of estimating radiometer channel gains and the internal noise (represented by an equivalent noise temperature) generated by the radiometer. Because the gains and noise temperatures can change rapidly during operation, internal calibration is performed frequently during radiometer operation. Internal calibration is accomplished by applying known inputs and measuring the voltage outputs of the radiometer. For the calibration scheme considered in this paper, the known inputs are internal sources (hot and cold sources of temperature TH and TC) rather than external targets. For this calibration scheme, the estimation of antenna gains is a separate process and, hence, is not addressed in this paper.
Internal calibration of radiometers which measure the third Stokes parameter (TU) can be accomplished using an additional known input TCN, which is split and fed into both the vertical and horizontal channels so that the fluctuations in the electric fields of the two channels are correlated, simulating a third Stokes parameter input (see [1, Fig. 1]). This technique is introduced in [1] for microwave radiometers which use a hybrid coupler to synthesize ±45° linear polarizations from the vertical and horizontal signals. The noise-free forward model for the calibration measurements of such a radiometer is given in Section II. The algebraic method of [1] for estimating channel gains and temperatures using this noise-free forward model is summarized in Section III. In the remainder of this paper, we improve on the method of [1]. In Section IV, we expand the forward model to include the noise in the measurements and then solve the calibration problem using Bayes' theorem rather than algebraically as was done in [1]. This results in a joint probability density function (pdf) for the calibration parameters. This joint pdf is itself the most complete answer to the calibration problem and is explored in Section VI. In Section V, we compare numerical estimates extracted from this pdf with the algebraic estimates of [1].

Manuscript received April 10, 2007; revised November 21, 2007. Current version published October 1, 2008. This work was supported by the NASA Goddard Space Flight Center. D. Hudson is with Brigham Young University, Provo, UT 84602 USA, and also with the NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA (e-mail: Derek.L.Hudson@nasa.gov). D. G. Long is with the Department of Electrical and Computer Engineering, Brigham Young University, Provo, UT 84602 USA (e-mail: long@ee.byu.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TGRS.2008.921636
We show that our estimates are optimal in the sense of minimizing the root-mean-square error (rmse). They are unbiased, and their rmse is approximately half the rmse of the algebraic estimates. Section VII explores an extension: we use Bayes' theorem to find the first-ever estimates and pdfs for the hardware parameters that comprise the end-to-end radiometer channel gains. Finally, conclusions are offered in Section VIII.

II. CALIBRATION FORWARD PROBLEM

The noise-free forward model for a single cycle of the full (Case 4) polarimetric radiometer calibration algorithm described in [1] can be written as

$$
\begin{bmatrix}
v_{v,C} & v_{v,H} & v_{v,CH} & v_{v,CN} \\
v_{h,C} & v_{h,H} & v_{h,CH} & v_{h,CN} \\
v_{p,C} & v_{p,H} & v_{p,CH} & v_{p,CN} \\
v_{m,C} & v_{m,H} & v_{m,CH} & v_{m,CN}
\end{bmatrix}
=
\begin{bmatrix}
G_{vv} & 0 & 0 \\
0 & G_{hh} & 0 \\
G_{pv} & G_{ph} & G_{pU} \\
G_{mv} & G_{mh} & G_{mU}
\end{bmatrix}
\begin{bmatrix}
T_C+T_1 & T_H+T_1 & T_C+T_1 & T_C+\tfrac12 T_{CN}+T_1 \\
T_C+T_2 & T_H+T_2 & T_H+T_2 & T_C+\tfrac12 T_{CN}+T_2 \\
0 & 0 & 0 & T_{CN}
\end{bmatrix}. \tag{1}
$$

This forward model is [1, eq. (3)], with o_v, o_h, o_p, and o_m given in [1, eq. (8)] and with the vector [T_v T_h T_U]^T replaced by a matrix. On the left side are the 16 voltages measured in one radiometer calibration cycle: for each of the four calibration looks (cold load, hot load, mixed load, and cold load plus correlated noise load), the voltage outputs of the four polarimetric channels (v, h, p, and m) are measured. On the right side of (1) are ten calibration parameters which are unknown to some degree: eight radiometer gains Gxx plus two receiver noise temperatures, T1 and T2 (mostly determined by the noise figures of the first-stage amplifiers). The four columns of the rightmost matrix in (1) represent the brightness temperature inputs used in the four calibration subcycles. By row, the top row contains the temperature inputs to the vertically polarized (v) channel, the second row contains the inputs to the horizontally polarized (h) channel, and the third row contains the third-Stokes-parameter inputs TU (representing correlation between the vertically and horizontally polarized signals; see [1, Fig. 1]).

Fig. 1. Marginal pdfs for the eight radiometer gains and two noise temperatures, along with various estimates of the parameters. The height scale is arbitrary, being the number of samples in each bin. As more samples are used to generate the empirical pdfs, they match the Gaussian fits more closely. The true parameter values used for this simulation were randomized rather than set to the typical values in Table I.

For this paper, we treat TC, TH, and TCN as perfectly known. Optimal estimation of them from thermometer measurements, on-ground calibration, and the voltages on the left side of (1) is left for future work.

Various expansions of this model are possible. An additional column, the vector [Tv + T1, Th + T2, TU]^T, could be added to the rightmost matrix, with a column of corresponding voltages added to the left side. This vector contains the Tv, Th, and TU brightnesses of the scene under observation, whose estimation is the goal of radiometry. We have left off these columns in order to simplify the problem, focusing only on the estimation of the calibration parameters; however, this paper can readily be extended to the larger problem. Other possible extensions include the following: 1) allowing for nonzero gains where there are zeros (although approximating them as zero is fairly accurate); 2) adding a small TU term, generated by the radiometer, to the last row of the temperature matrix; 3) jointly estimating many consecutive sets of the calibration parameters, exploiting the correlation between them to improve the estimates; and 4) adapting the method expounded in this paper to other classes of radiometers by using their forward models in place of (1).

III. ALGEBRAIC ESTIMATION (PIEPMEIER'S METHOD [1])

Estimating the ten calibration parameters on the right side of (1) by the method of [1] can be summarized as follows. The method is algebraic, making no attempt to model the noise in the measurements. Gvv is estimated using vv,C and vv,H, which is the conventional cold/hot calibration method of nonpolarimetric radiometers. From (1), the equations for these two voltages have the same form as in [1, eq. (20)], viz.,

$$
\begin{bmatrix} v_{v,C} \\ v_{v,H} \end{bmatrix}
=
\begin{bmatrix} T_C & 1 \\ T_H & 1 \end{bmatrix}
\begin{bmatrix} G_{vv} \\ G_{vv} T_1 \end{bmatrix}. \tag{2}
$$

Solving for Gvv, the estimate is

$$
\hat{G}_{vv} = \frac{v_{v,H} - v_{v,C}}{T_H - T_C} \tag{3}
$$

as in [1, eq. (21)]. T1 is estimated by solving the same equations for T1, yielding

$$
\hat{T}_1 = \frac{T_H v_{v,C} - T_C v_{v,H}}{v_{v,H} - v_{v,C}}. \tag{4}
$$

Estimation of Ghh and T2 from vh,C and vh,H is similar.

The gains Gpv, Gph, and GpU are found from [1, eq. (41)], which is reproduced here (with o_p expanded and the minor correction that Gpm is replaced by Gph):

$$
\begin{bmatrix} v_{p,C} \\ v_{p,H} \\ v_{p,CH} \\ v_{p,CN} \end{bmatrix}
=
\begin{bmatrix}
T_C & T_C & 0 & 1 \\
T_H & T_H & 0 & 1 \\
T_C & T_H & 0 & 1 \\
T_C+\tfrac12 T_{CN} & T_C+\tfrac12 T_{CN} & T_{CN} & 1
\end{bmatrix}
\begin{bmatrix} G_{pv} \\ G_{ph} \\ G_{pU} \\ G_{pv}T_1 + G_{ph}T_2 \end{bmatrix}. \tag{5}
$$

Note that this also follows from (1). To obtain Gpv, Gph, and GpU, both sides of (5) are multiplied on the left by the inverse of the 4 × 4 matrix in (5). Gmv, Gmh, and GmU are found similarly, using the measurements vm,C, vm,H, vm,CH, and vm,CN.

Note that both the 4 × 4 matrix in (5) and its inverse have a condition number (for the two-norm) of about 3000 when we use the following load temperatures typical of NASA's near-future Aquarius radiometer: TC = 288 K and TH = TCN = 800 K [2]. This suggests that noise in the voltage measurements can disturb the estimates significantly. This may explain why our optimal estimates of the last six gains are the most improved compared with the algebraic estimates (see Table II, as will be described later).
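To make the algebra concrete, here is a minimal pure-Python sketch of the hot/cold estimates (3) and (4); the gain and noise-temperature values are hypothetical examples, not hardware values from the paper:

```python
# Sketch of the algebraic (hot/cold) calibration estimates (3) and (4).
# All gains and noise temperatures below are hypothetical example values.
TC, TH = 288.0, 800.0          # load temperatures (K)
Gvv, T1 = 25.0, 310.0          # "true" v-channel gain and noise temperature

# Noise-free v-channel voltages from the forward model (1):
vv_C = Gvv * (TC + T1)
vv_H = Gvv * (TH + T1)

# Eq. (3): gain from the hot/cold voltage pair.
Gvv_hat = (vv_H - vv_C) / (TH - TC)
# Eq. (4): receiver noise temperature from the same pair.
T1_hat = (TH * vv_C - TC * vv_H) / (vv_H - vv_C)

print(Gvv_hat, T1_hat)  # recovers 25.0 and 310.0 in the noise-free case
```

The p- and m-channel gains follow the same pattern: build the 4 × 4 matrix of (5) from TC, TH, and TCN and multiply the measured voltage vector by its inverse.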
Also, note that four of the available measurements, vv,CH, vv,CN, vh,CH, and vh,CN, are not used in this method. Nevertheless, the method of Piepmeier [1] works well (evidence for this is given by the first row of numbers in Table II, as will be described later).

IV. BAYESIAN ESTIMATION

The algebraic estimation method sketched in the previous section is not unique. For example, Gvv could also be estimated using

$$
\hat{G}_{vv} = \frac{v_{v,H} - v_{v,CH}}{T_H - T_C}. \tag{6}
$$

A better estimate would be an average of (3) and (6) because the noise in each estimate is somewhat different and is therefore reduced by averaging. Information on Gvv is also contained in the measurement vv,CN, and similarly for the other nine calibration parameters. How can we combine all of the information in the voltages to get the most accurate estimates of the ten calibration parameters? The answer is to approach the problem using probability theory rather than algebra. By using Bayes' theorem, a pdf can express all the information on a parameter that is available from various measurements, from the noise-free forward model, and from a probabilistic description of the noise [3].

Let v be the vector of measurements (voltages) in (1), and let m be the vector of calibration parameters (the eight gains plus T1 and T2). Then, Bayes' theorem tells us that the pdf for the parameters, given the voltages, is

$$
p(\mathbf{m}|\mathbf{v}) = \frac{p(\mathbf{v}|\mathbf{m})\,p(\mathbf{m})}{p(\mathbf{v})}. \tag{7}
$$

The pdf p(m) represents prior or external information on the model parameters, and similarly for p(v). In this paper, we assume that no prior or external information is available. The absence of information can be represented by a constant pdf, so that p(m) and p(v) are both constants. Then, p(m)/p(v) is another constant, c′, and

$$
p(\mathbf{m}|\mathbf{v}) = c'\,p(\mathbf{v}|\mathbf{m}). \tag{8}
$$

The remainder of this section finds p(m|v) explicitly.

A. Pdf for the Voltages, Given the Parameters, p(v|m)

We first elaborate the probability distribution for the voltages, given a set of calibration parameters, p(v|m).
1) Noise Model: Equation (1) is a noise-free forward model for the 16 voltages from the calibration parameters. However, the temperature inputs in the last matrix of (1) are only mean values; actual thermal emissions fluctuate. These random fluctuations can be treated as Gaussian noise (commonly called NEΔT) that is added to each true (mean-value) temperature in the forward model. Therefore, the forward model with noise included is

$$
\begin{bmatrix}
v_{v,C} & v_{v,H} & v_{v,CH} & v_{v,CN} \\
v_{h,C} & v_{h,H} & v_{h,CH} & v_{h,CN} \\
v_{p,C} & v_{p,H} & v_{p,CH} & v_{p,CN} \\
v_{m,C} & v_{m,H} & v_{m,CH} & v_{m,CN}
\end{bmatrix}
=
\begin{bmatrix}
G_{vv} & 0 & 0 \\
0 & G_{hh} & 0 \\
G_{pv} & G_{ph} & G_{pU} \\
G_{mv} & G_{mh} & G_{mU}
\end{bmatrix}
\begin{bmatrix}
T_C+T_1+n_1 & T_H+T_1+n_3 & T_C+T_1+n_5 & T_C+\tfrac12 T_{CN}+T_1+n_7 \\
T_C+T_2+n_2 & T_H+T_2+n_4 & T_H+T_2+n_6 & T_C+\tfrac12 T_{CN}+T_2+n_8 \\
0 & 0 & 0 & T_{CN}+n_9
\end{bmatrix} \tag{9}
$$

where the n_i terms are added noise. The noise term n1 represents the total fluctuation away from TC + T1 during the first calibration subcycle. It is zero-mean Gaussian noise with a standard deviation (STD) equal to the mean temperature divided by the square root of the time-bandwidth product, (TC + T1)/√(Bτc), where B is the sensor bandwidth and τc is the calibration integration time [4, eq. (6.51)]. The terms n2 through n9 are similar. The noise terms n1 through n6 are all independent of one another (and of n7 through n9) for one or both of the following reasons: 1) they originate from different sources (the v-channel hot and cold inputs, as well as the amplifiers producing T1, are separate from the h-channel inputs and the amplifiers producing T2; see [1, Fig. 1]), and 2) they are realized during different calibration subcycles (i.e., they are different realizations of the noise, and the rapidity of the fluctuations means that the realization during one interval is independent of the realization during the next interval [5]).
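The noise model just described can be simulated directly. A sketch, assuming a hypothetical bandwidth B = 25 MHz and the 9-ms integration time quoted later; for simplicity it draws every subcycle fluctuation independently, so the correlated terms n7 through n9 of (9) are not modeled here:

```python
import math
import random

random.seed(0)
B, tau = 25e6, 9e-3            # assumed bandwidth (Hz) and integration time (s)
TC, TH, TCN = 288.0, 800.0, 800.0
T1, T2 = 310.0, 310.0

def nedt(T):
    """STD of the thermal fluctuation about a mean temperature T (NEdT)."""
    return T / math.sqrt(B * tau)

# Mean temperatures seen by the v- and h-channel rows of (9), per subcycle
# (cold, hot, mixed, cold + correlated noise).
row_v = [TC + T1, TH + T1, TC + T1, TC + 0.5 * TCN + T1]
row_h = [TC + T2, TH + T2, TH + T2, TC + 0.5 * TCN + T2]

# One realization: add zero-mean Gaussian noise n_i to each mean temperature.
# (Independence of all terms is a simplification; see the text for n7-n9.)
noisy_v = [T + random.gauss(0.0, nedt(T)) for T in row_v]
noisy_h = [T + random.gauss(0.0, nedt(T)) for T in row_h]
```

Multiplying these noisy temperature columns by the gain matrix of (9) then yields one realization of the 16 voltages.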
The noise terms n7 through n9 are not independent of one another because they all originate (at least in part) from the correlated calibration source; they are treated in Appendix A2. Even though n1 through n6 are independent of one another, there is correlation among the voltages on the left side of (9). For example, vp,C and vm,C are both functions of n1 and n2 and are therefore correlated. All the correlations that exist among the voltages can be summarized in a covariance matrix C, which is derived in Appendix A. As a result, the probability distribution for v, given m, is a 16-dimensional Gaussian pdf whose mean is the right-hand side of (1), denoted g(m):

$$
p(\mathbf{v}|\mathbf{m}) = \frac{1}{\sqrt{(2\pi)^{16}\,|\mathbf{C}|}}\; e^{-\frac12 (\mathbf{v}-\mathbf{g}(\mathbf{m}))^T \mathbf{C}^{-1} (\mathbf{v}-\mathbf{g}(\mathbf{m}))}. \tag{10}
$$

2) Eigendecomposition of Singular C: The covariance matrix C is a function of the calibration parameters m. Numerical calculations using arbitrary values for m show that, although C is 16 × 16, its rank r is consistently only nine. From a theoretical standpoint, this corresponds to the nine noise sources on the right-hand side of (9). Because C is not full rank, it cannot be inverted. It also has 16 − r = 7 eigenvalues equal to zero, so that |C| = 0. How do we evaluate (10) in this situation? Consider the eigendecomposition of C,

$$
\mathbf{C} = \mathbf{V}\mathbf{D}\mathbf{V}^T = [\mathbf{V}_1\ \mathbf{V}_2]\;\mathrm{diag}([\boldsymbol{\lambda}\ \mathbf{0}]) \begin{bmatrix} \mathbf{V}_1^T \\ \mathbf{V}_2^T \end{bmatrix} \tag{11}
$$

where the r = 9 columns of V1 are eigenvectors with nonzero eigenvalues λ, the seven columns of V2 are eigenvectors with eigenvalues equal to zero, and diag(·) is a diagonal matrix with the elements of the argument along its diagonal. Matrix theory tells us that, because C is symmetric, V ≡ [V1 V2] is unitary (because V is real, it is also an orthogonal matrix). Therefore, V^T v is a rotation of the voltages. Consider a partition of these rotated voltages into V1^T v and V2^T v.
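Why this partition helps can be seen in a toy 2-D analog (a hypothetical example, not the radiometer covariance): if two measurements are driven by a single noise source, v = (n, 2n), then Cov(v) has rank one, the zero-eigenvalue eigenvector is V2 ∝ (2, −1), and the component V2^T v is the same for every realization:

```python
import random

random.seed(1)
# Toy degenerate case: two "voltages" driven by one noise source n,
# v = (n, 2n).  Cov(v) = [[1, 2], [2, 4]] has rank 1.
# The eigenvector with zero eigenvalue is V2 = (2, -1)/sqrt(5).
samples = []
for _ in range(5):
    n = random.gauss(0.0, 1.0)
    v = (n, 2.0 * n)
    # Component of v along V2 (unnormalized): 2*v1 - v2 = 2n - 2n = 0,
    # i.e. V2^T v is constant over realizations.
    samples.append(2.0 * v[0] - 1.0 * v[1])

print(samples)  # five (numerically) identical values
```

In the calibration problem the same thing happens in 16 dimensions: the seven components along V2 carry no randomness, only a deterministic constraint.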
The covariance of V2^T v is

$$
\mathrm{Cov}\!\left(\mathbf{V}_2^T\mathbf{v}\right) = \left\langle \left(\mathbf{V}_2^T\mathbf{v} - \langle\mathbf{V}_2^T\mathbf{v}\rangle\right)\left(\mathbf{V}_2^T\mathbf{v} - \langle\mathbf{V}_2^T\mathbf{v}\rangle\right)^T \right\rangle \tag{12}
$$
$$
= \mathbf{V}_2^T \left\langle (\mathbf{v}-\langle\mathbf{v}\rangle)(\mathbf{v}-\langle\mathbf{v}\rangle)^T \right\rangle \mathbf{V}_2 \tag{13}
$$
$$
= \mathbf{V}_2^T\,\mathrm{Cov}(\mathbf{v})\,\mathbf{V}_2 \tag{14}
$$
$$
= \mathbf{V}_2^T \mathbf{C}\, \mathbf{V}_2 = [\mathbf{0}] \tag{15}
$$

where the last step follows from the fact that the columns of V2 are eigenvectors of C with eigenvalues equal to zero. Because the variances of the elements of V2^T v (the diagonal elements of Cov(V2^T v)) are zero, V2^T v is a constant vector. That is, for a given m, V2^T v will be the same for any realization of the voltages. With m given, we can find the particular v = g(m) corresponding to the voltages obtained when all nine noise sources happen to be zero. Then, because V2^T v is constant for a given m as just shown, we can write V2^T v = V2^T g(m) for any realization of v. This last expression is a constraint when evaluating p(v|m). It says that the pdf is concentrated on a manifold (in the 16-dimensional space of voltages) defined by V2^T v = V2^T g(m). For any v which does not satisfy this constraint (i.e., does not lie on the manifold), p(v|m) = 0.

Next, consider the rotated voltages V1^T v. By the same steps as in (12)-(15), we find that

$$
\mathrm{Cov}\!\left(\mathbf{V}_1^T\mathbf{v}\right) = \mathbf{V}_1^T\mathbf{C}\mathbf{V}_1
= \mathbf{V}_1^T [\mathbf{V}_1\ \mathbf{V}_2]\;\mathrm{diag}([\boldsymbol{\lambda}\ \mathbf{0}]) \begin{bmatrix}\mathbf{V}_1^T\\ \mathbf{V}_2^T\end{bmatrix} \mathbf{V}_1
= \mathbf{V}_1^T\mathbf{V}_1\,\mathrm{diag}(\boldsymbol{\lambda})\,\mathbf{V}_1^T\mathbf{V}_1 = \mathrm{diag}(\boldsymbol{\lambda}). \tag{16}
$$

This covariance matrix is invertible, and its determinant is the product of the elements of λ. We can eliminate the singularities in (10) by reducing the dimensionality of the Gaussian pdf p(v|m) from 16 to r = 9, using V1^T v and its nonsingular covariance matrix in place of v and C (also, note that the mean of V1^T v is V1^T g(m)). In summary, to deal with the singularities in (10), we rotate and partition our measurements v into measurements (V2^T v) which provide a constraint and measurements (V1^T v) which provide a smaller dimensional pdf, so that (10) can be rewritten as (17).
$$
p(\mathbf{v}|\mathbf{m}) = \begin{cases}
\dfrac{1}{\sqrt{(2\pi)^{9}\,|\mathrm{diag}(\boldsymbol{\lambda})|}}\;
e^{-\frac12\left[\mathbf{V}_1^T\mathbf{v} - \mathbf{V}_1^T\mathbf{g}(\mathbf{m})\right]^T \mathrm{diag}(\boldsymbol{\lambda})^{-1} \left[\mathbf{V}_1^T\mathbf{v} - \mathbf{V}_1^T\mathbf{g}(\mathbf{m})\right]}, & \text{if } \mathbf{V}_2^T\mathbf{v} = \mathbf{V}_2^T\mathbf{g}(\mathbf{m}) \\[2mm]
0, & \text{if } \mathbf{V}_2^T\mathbf{v} \neq \mathbf{V}_2^T\mathbf{g}(\mathbf{m})
\end{cases} \tag{17}
$$

In Appendix B, we derive V2 analytically and show that the constraint V2^T v = V2^T g(m) reduces to

$$
G_{pv} = G_{vv}\,\frac{v_{p,C}v_{h,H} - v_{h,C}v_{p,H}}{v_{v,C}v_{h,H} - v_{h,C}v_{v,H}} \qquad
G_{ph} = G_{hh}\,\frac{v_{p,C}v_{v,H} - v_{v,C}v_{p,H}}{v_{h,C}v_{v,H} - v_{v,C}v_{h,H}} \tag{18}
$$

$$
G_{mv} = G_{vv}\,\frac{v_{m,C}v_{h,H} - v_{h,C}v_{m,H}}{v_{v,C}v_{h,H} - v_{h,C}v_{v,H}} \qquad
G_{mh} = G_{hh}\,\frac{v_{m,C}v_{v,H} - v_{v,C}v_{m,H}}{v_{h,C}v_{v,H} - v_{v,C}v_{h,H}} \tag{19}
$$

$$
G_{mU} = G_{pU}\,\frac{G_{mv}G_{hh}v_{v,CN} + G_{mh}G_{vv}v_{h,CN} - G_{vv}G_{hh}v_{m,CN}}{G_{pv}G_{hh}v_{v,CN} + G_{ph}G_{vv}v_{h,CN} - G_{vv}G_{hh}v_{p,CN}}. \tag{20}
$$

Fig. 2. Dots showing the Gvv and T1 coordinates of samples of p(m|v). The density of dots illustrates the 2-D joint pdf p(Gvv, T1|v). The samples are from the same simulation that produced Fig. 1.

B. Posterior Pdf p(m|v)

We are now ready to present a key result of this paper, p(m|v), which is the pdf for the calibration parameters, given a set of measured voltages. As shown in (8), we obtain it from (17) by simply reversing the roles of input and output and multiplying by a normalizing constant (we will not attempt to find the constant because it is not necessary for finding estimates and because the pdf can be displayed in unnormalized form). Explicitly, p(m|v) is given in (21), where we make explicit that V1 and λ are functions of the unknown m (through C) and where c is an unknown constant.

V. MAP ESTIMATION

A. Theory

In general, the most complete answer to an estimation problem is a joint pdf on the variables of interest [6], in our case, (21). A joint pdf usually contains much more information than merely reporting numbers and STDs for the variables (this is illustrated in Section VI-B2).
The increase in computing power over the last few decades enables us to begin using pdfs, such as p(m|v), as inputs and outputs of algorithms, rather than simple estimates and their uncertainties. It is our hope that science and engineering will move in that direction. In this section, however, we follow tradition by reporting simple numerical estimates and their uncertainties (rmse).

From the joint pdf p(m|v), what should we extract and report as estimates of the calibration parameters? Of all possible estimates of m from v, the minimum-mean-square-error (mmse) estimate is the mean of p(m|v), i.e., its expected value with respect to m [7].

Fig. 3. Depiction of the 2-D joint pdf implied by reporting only marginal means and variances, using the means and variances of the samples in Fig. 2.

Fig. 4. Dots showing the Gvv and Gpv coordinates of samples of p(m|v). The density of dots illustrates the 2-D joint pdf p(Gvv, Gpv|v).

Finding the mean of p(m|v) is difficult, but a shortcut is available. Marginal 1- and 2-D pdfs for one or two of the parameters are obtained by integrating p(m|v) with respect to the remaining parameters. Typical examples of these pdfs are shown in Figs. 1-4 (the generation of such figures is explained in Section VI). From such examples, it appears that p(m|v) is unimodal and symmetric in most (if not all) cases. This is confirmed by the Gaussian structure of (21). These properties signify that the mean of p(m|v) is the same as its mode (at least to a good approximation, if not exactly). Finding the mode is easier than finding the mean; therefore, we use the mode as our primary estimate. We examine the properties of the mode in detail in this section and return to the mean in Section VI-B3.

$$
p(\mathbf{m}|\mathbf{v}) = \begin{cases}
c\,|\mathrm{diag}(\boldsymbol{\lambda}(\mathbf{m}))|^{-\frac12}\;
e^{-\frac12\left[\mathbf{V}_1^T(\mathbf{m})\mathbf{v} - \mathbf{V}_1^T(\mathbf{m})\mathbf{g}(\mathbf{m})\right]^T \mathrm{diag}(\boldsymbol{\lambda}(\mathbf{m}))^{-1} \left[\mathbf{V}_1^T(\mathbf{m})\mathbf{v} - \mathbf{V}_1^T(\mathbf{m})\mathbf{g}(\mathbf{m})\right]}, & \text{if (18)-(20) are satisfied} \\[2mm]
0, & \text{otherwise}
\end{cases} \tag{21}
$$
TABLE I. GAIN-RELATED PARAMETERS TYPICAL OF AN L-BAND RADIOMETER AND USED IN SIMULATIONS.

p(m|v) is often called the posterior distribution because it is the distribution for m after measuring the data v. The mode of p(m|v) is the set of parameters m which maximizes the posterior distribution; it is therefore referred to as the maximum a posteriori (MAP) estimate. Because our p(m|v) is equal to a constant times p(v|m) (due to the absence of prior information), the MAP estimate is equivalent to the maximum-likelihood estimate, which is obtained by considering p(v|m) to be a function of m and finding the m which maximizes it.

Finding the value of m which maximizes p(m|v) can be cast as a standard multidimensional optimization problem. It is readily accomplished by a black-box minimization algorithm such as MATLAB's fminsearch (although more advanced techniques could find it with less computation). The search can be initialized using the algebraic estimate of m (see Section III). To limit the search to m which satisfies the constraint, we search over only the five parameters Gvv, Ghh, GpU, T1, and T2. When the other five are needed, they are generated from the constraint equations (18)-(20).

Note that it may be possible to find a closed-form expression for the mean or the mode of p(m|v). If found, the MAP estimate would require much less computation than is required by the brute-force optimization method of the previous paragraph. However, our attempts to find a closed-form expression suggest that the task is difficult.

B. Simulation and Results

To compare MAP estimates with algebraic estimates, we simulate the estimation process. Typical values of radiometer hardware parameters defined in [1], obtained from [1, Table I] and [2], are shown in Table I. These are used in [1, eq. 17] to calculate typical values for the eight gains Gxx, which are also shown in Table I. For typical T1 and T2, we use 310 K; these, plus the eight gains Gxx in Table I, are considered the true values in our simulations (mtrue). For TC, TH, and TCN, we use 288, 800, and 800 K, respectively. For τc, we use 9 ms (these T1, T2, TC, TH, TCN, and τc are the anticipated values for NASA's upcoming Aquarius radiometer [2]).

From the aforementioned parameters, we next generate simulated voltages. As discussed in Section IV-A2, we can find V2^T v as V2^T g(mtrue). We also need realizations of V1^T v. Because V1^T v is a linear combination of v, it is Gaussian, as is v. We know that V1^T v has a mean of V1^T g(m) and a covariance matrix of diag(λ), so we can readily generate random realizations of it by adding the mean to zero-mean Gaussian noise with that covariance. Then, with V2^T v and V1^T v in hand, we can form v by derotating (left-multiplying by V) because

$$
\mathbf{V} \begin{bmatrix} \mathbf{V}_1^T\mathbf{v} \\ \mathbf{V}_2^T\mathbf{v} \end{bmatrix} = \mathbf{V}(\mathbf{V}^T\mathbf{v}) = (\mathbf{V}\mathbf{V}^T)\mathbf{v} = \mathbf{v}. \tag{22}
$$

With the simulated voltages, we then use the method of Piepmeier [1], as summarized in Section III, to algebraically estimate the ten calibration parameters. We repeat this for 10^6 different realizations of the voltages (the means stay the same, but the noise changes). The bias of these 10^6 estimates is computed as the difference between their mean and mtrue. The STD of these 10^6 estimates is also computed. Finally, the rmse of these 10^6 estimates is computed as the root sum square of the bias and STD (this is equivalent to rmse = √⟨(m̂ − mtrue)²⟩, where the averaging is over the 10^6 realizations of noise). This process is repeated for MAP estimates, and the results are compared. For both estimation methods, and for all ten parameters, the bias is less than 0.01% of the true parameter values. The STD is therefore the same as the rmse, to four significant digits.
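The Monte Carlo bookkeeping described above (bias, STD, and rmse over repeated noisy realizations) can be sketched in a few lines for the simple hot/cold estimate (3) alone; the bandwidth is an assumed placeholder and only 20 000 trials are used rather than 10^6:

```python
import math
import random

random.seed(2)
B, tau = 25e6, 9e-3                 # assumed bandwidth and integration time
TC, TH = 288.0, 800.0
Gvv_true, T1_true = 25.0, 310.0     # hypothetical true values
sigma = lambda T: T / math.sqrt(B * tau)   # NEdT of each subcycle

est = []
for _ in range(20000):
    # Noisy voltages: gain times (mean temperature + thermal fluctuation).
    vC = Gvv_true * (TC + T1_true + random.gauss(0, sigma(TC + T1_true)))
    vH = Gvv_true * (TH + T1_true + random.gauss(0, sigma(TH + T1_true)))
    est.append((vH - vC) / (TH - TC))       # algebraic estimate (3)

mean = sum(est) / len(est)
bias = mean - Gvv_true                      # difference of mean from truth
std = math.sqrt(sum((e - mean) ** 2 for e in est) / len(est))
rmse = math.sqrt(bias ** 2 + std ** 2)      # root sum square of bias and STD
print(bias, std, rmse)
```

Repeating the same loop with MAP estimates in place of (3) gives the comparison reported in Table II.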
In the first two rows of Table II, we report the rmse for each method and for each of the ten parameters, as percentages of the true parameter values. In the third row, we report the factor by which the rmse of the MAP estimates is lower than the rmse of the algebraic estimates. To summarize our results with a single number, we take the average of these ten improvement factors, which is 2.04. That is, the rmse of the MAP estimates is about two times smaller than the rmse of the algebraic estimates. To establish the accuracy of this number, the entire procedure of the last four paragraphs is repeated 100 times to provide 100 estimates of the average improvement factor. The average of these 100 numbers is 2.041, with an STD of 0.001.

Note that this improvement is independent of the available integration time τc. For example, if τc is quadrupled (equivalent to averaging four contiguous calibration looks), then the STD (and hence the rmse) of both methods drops by a factor of two. The improvement factors reported in Table II remain the same. We verified this for seven sets of 10^6 estimates.

The cost of the increased accuracy of MAP estimates is an increase in computation. On average, a MAP estimate requires 40 000 times more computation than an algebraic estimate. This could be reduced if pains were taken to increase the efficiency of the search algorithm or of the eigendecomposition. Even so, the computation is still quite tractable: a 2.4-GHz workstation finds the MAP estimate of all ten parameters in about 0.06 s.

1) Improvement as a Function of mtrue: The improvement in accuracy of MAP estimates (over algebraic estimates) is a weak function of mtrue. If we repeat the earlier procedure for 100 different values of mtrue, chosen randomly within
expected ranges,¹ we find that the rmse of the MAP estimates is consistently the same as the values in Table II. The rmse of the algebraic estimates is slightly lower than the values in Table II, resulting in an average improvement factor that ranges from 1.86 to 2.03, with a mean of 1.90.

TABLE II. RMSE RESULTS FOR 10^6 ESTIMATES FROM VOLTAGES GENERATED BY TYPICAL mtrue. RMSE IS GIVEN AS A PERCENT OF THE TRUE PARAMETER VALUE.

VI. SAMPLING THE POSTERIOR PDF

We now turn to exploration of the more complete answer to the calibration problem, the posterior pdf p(m|v). If samples of p(m|v) are available, they can be used to visualize the posterior pdf, to report it, or to calculate its mean, m_mmse. In this section, we describe how to generate such samples and then use them for the aforementioned purposes.

A. Sampling p(m|v) by the Rejection Method

Samples of the posterior pdf p(m|v) can be generated by the well-known rejection method [6], [7]. First, samples of Gvv, Ghh, and GpU are proposed from independent uniform distributions. These distributions are centered on an initial guess, such as the MAP estimate. With the proposed Gvv, Ghh, and GpU coordinates, we next find the proposed Gph, Gmh, Gpv, Gmv, and GmU coordinates from the constraint equations (18)-(20). Also, T1 and T2 coordinates are proposed from independent uniform distributions centered on the initial guess. Each set of ten proposed coordinates now comprises one proposed sample of p(m|v), which we denote by mprop. Because the constraint equations were used, each mprop is a sample from a uniform distribution over a region of the constraint manifold. Let the constant value of this uniform distribution be denoted by k. In order to correctly generate samples of p(m|v), we must accept each mprop with probability

$$
P = \frac{p(\mathbf{m}_{\mathrm{prop}}|\mathbf{v})/k}{\max_{\mathbf{m}}\left(p(\mathbf{m}|\mathbf{v})/k\right)} = \frac{p(\mathbf{m}_{\mathrm{prop}}|\mathbf{v})}{\max_{\mathbf{m}}\left(p(\mathbf{m}|\mathbf{v})\right)}. \tag{23}
$$

The numerator of (23) is readily calculated using (21), with V1 and λ found from a numerical eigendecomposition of C and with C calculated from mprop. The denominator of (23) is the peak value of p(m|v). The search for this peak is the same search that finds the MAP estimate of m, i.e., the denominator is simply p(mMAP|v).²

¹ cv, ch, cp, and cm are chosen independently from normal distributions with a mean of 450 and an STD of 17 mV/mW. G1 is chosen from a normal distribution with a mean of 1.8 × 10^7 and an STD of 2.7 × 10^6 W/W; 10 log10(g) is chosen from a normal distribution with a mean of zero and an STD of 1 dB; s is chosen from a normal distribution with a mean of 1/√2 and an STD of 0.02/√2; e is chosen from a normal distribution with a mean of 0.934 and an STD of 0.01; and T1 and T2 are chosen independently from normal distributions with a mean of 310 K and an STD of 1 K. Also, TC is chosen from a normal distribution with a mean of 288 K and an STD of 0.5 K, whereas TH and TCN are chosen independently from normal distributions with a mean of 800 K and an STD of 2 K.

B. Using the Samples

1) Marginal Posterior Pdfs: Once we have a number of samples of p(m|v), we can immediately obtain plots of the marginal probability distribution for the ith parameter by simply binning the ith coordinate values of the samples (we believe that this follows from [6]). Some marginal pdfs obtained in this manner are shown in Fig. 1.

Fig. 1 shows that the marginal pdfs are symmetric. This verifies the claim made earlier that our MAP estimates (modes of the marginal pdfs) are the same as the mmse estimates (means of the marginal pdfs). As further evidence, when mmse estimates (see hereafter) are made, they have the same error statistics as the MAP estimates, as reported in Table II and Section V-B. Fig. 1 also shows that the marginal pdfs are Gaussian (or at least very nearly so). The Gaussians that are plotted have STDs from the second line of Table II.
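The accept/reject step (23) can be illustrated on a toy 1-D target (a hypothetical Gaussian standing in for the 10-D posterior): propose from a uniform box and accept with probability equal to the target divided by its peak value:

```python
import math
import random

random.seed(3)
mu, s = 310.0, 1.5          # toy 1-D "posterior" (Gaussian), e.g. for T1
target = lambda m: math.exp(-0.5 * ((m - mu) / s) ** 2)  # unnormalized pdf
pmax = target(mu)           # peak value, playing the role of p(m_MAP | v)

samples = []
while len(samples) < 5000:
    m_prop = random.uniform(mu - 6 * s, mu + 6 * s)   # uniform proposal
    if random.random() < target(m_prop) / pmax:       # accept, as in (23)
        samples.append(m_prop)

# Binning `samples` gives the marginal pdf; their average approximates
# the mmse estimate (here, mu).
mean = sum(samples) / len(samples)
print(mean)
```

In the actual problem the proposal is a uniform box over (Gvv, Ghh, GpU, T1, T2), the remaining five gains are filled in from (18)-(20), and the target is (21).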
Hence, each marginal pdf can be completely characterized by its mode (the MAP estimate) and STD (= rmse of MAP estimates, given in Table II). 2) Additional Information in Joint Pdfs: Joint pdfs can convey many times more information than marginal pdfs. For example, consider Gvv and T1. The 2-D joint pdf for these two parameters (made from the same samples as Fig. 1) is shown in Fig. 2. This joint pdf contains significant correlation information. For example, it shows that there is a fair chance that Gvv 24.5 and T1 311 but almost no chance that Gvv 24.5 and T1 304. If we had reported only the 1-D marginal pdfs in Fig. 1, both possibilities would have appeared equally likely. The current practice of reporting only the mean and variance for each parameter is equivalent, in effect, to reporting independent 1-D marginal Gaussian pdfs for the parameters [6]. The only joint pdf that can logically be reconstructed from marginal pdfs is the product of the marginal pdfs (this follows from [6]). An example of such a reconstruction is shown in Fig. 3. All correlation information is lost, as well as any non-Gaussian characteristics of the posterior pdf. The situation is even more pronounced for the parameters whose estimation is 100% correlated, due to the constraint equations. For example, the 2-D joint pdf for Gvv and Gpv is shown in Fig. 4. The pdf is completely concentrated along a 1-D line in the 2-D space. If only means and variances were reported, the appearance would be similar to Fig. 3. We have demonstrated that the most complete answer to an estimation problem is a joint pdf, but how can a 10-D posterior pdf be reported? We can report the equation for it, such as (21). 2Numerical note: The sampling process is much faster when all the Gxx (or voltages) are scaled by an appropriate factor. A factor of 107 was used in generating the figures. Authorized licensed use limited to: Brigham Young University. Downloaded on February 2, 2009 at 12:02 from IEEE Xplore. 
A numerical alternative is to simply report a large number of samples, because most, if not all, calculations done with a posterior pdf can be done using these samples [6]. A very succinct alternative is available when a multidimensional pdf is sufficiently Gaussian, as in the present case. In such cases, all of the information can be conveyed by a vector of means and a covariance matrix.

3) MMSE Estimation: Another use for the samples of a distribution is in calculating its mean. For example, to find the ith coordinate of the mean, we simply average the ith coordinates of the samples, which converges to the mean as the number of samples increases [6]. The means found by this method are shown in Fig. 1, from which it is also seen that the MAP and MMSE estimates extracted from p(m|v) are equivalent.

VII. INFORMATION ON HARDWARE PARAMETERS

As a tangential but useful extension of the previous results in this paper, we can obtain information on the radiometer hardware parameters that comprise the eight gains Gxx. The definition of these gains in terms of hardware parameters, found by comparing (1) and [1, eq. 17], is reproduced in (24), shown at the bottom of the page. By simply replacing the eight gains Gxx (see footnote 3) in (21) with their component definitions on the right side of (24), we immediately obtain the joint posterior pdf for the eight hardware parameters G1, G2, e, cv, ch, cp, cm, and s, plus T1 and T2 (the bandwidth B could also be considered a parameter, but for this paper, we treat it as a known constant; the parameter k is Boltzmann's constant). For brevity, we simply summarize our discoveries about this pdf as follows. First, the transformed constraint equations leave the following six hardware parameters unconstrained: G1, G2, e, cv, T1, and T2.
Next, the constraint equations dictate that the parameter s is completely determined by the voltages

s = √[(AD + √(AD·BC)) / (AD - BC)]   (25)

where

A ≡ vp,C vh,H - vh,C vp,H      D ≡ vm,C vv,H - vv,C vm,H
B ≡ vp,C vv,H - vv,C vp,H      C ≡ vm,C vh,H - vh,C vm,H.   (26)

There is no uncertainty in this estimate; it is not affected by NEΔT (in simulation, therefore, this estimate is the exact value of the true s; however, in real practice, the uncertainties not captured by our model in (9), such as imperfect knowledge of TC and TH, will cause this estimate to have some error). The constraint also dictates that ch, cp, and cm are constrained as follows:

ch = cv √(AC/(BD))   (27)
cp = cv √A (√(AD) + √(BC)) / (E √D)   (28)
cm = cv √C (√(AD) + √(BC)) / (E √B)   (29)

where E ≡ vv,C vh,H - vh,C vv,H. One interpretation of these equations is that the ratio of any two of the cx is perfectly resolved: each such ratio equals a function of the voltages. Pdfs (2-D) for the unconstrained hardware parameters, which are obtained using the technique of Section VI with the necessary modifications, are shown in Fig. 5. Numerical simulation verifies that s is perfectly resolved; therefore, pdfs for s are omitted. Pdfs for ch, cp, and cm are not shown either, because such pdfs are simply multiples of the pdfs involving cv, due to (27)-(29). Intuition suggests that the ability to perfectly resolve s must be compensated by a lack of ability to resolve other parameters. This is indeed the case: as shown by Fig. 5, cv, G1, and G2 cannot be separately resolved. We can only resolve their pairwise products (for example, cvG1 is a constant times Gvv, which was well resolved in the earlier sections of this paper). Marginal pdfs for cv, G1, and G2 are essentially uniform (see footnote 4). Our final observation is that e is resolved well. The average rmse of MAP estimates of e is 0.33%, whether using the typical values given in Table I or the randomized values for the true hardware parameters. T1 and T2 are resolved as accurately as before (see Table II and Fig. 1).

Footnote 3: The Gxx are found both in m and in the constraint equations.
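The non-identifiability of cv and G1 noted above follows from (24), in which the vertical-channel gain depends on them only through their product. A minimal sketch, with illustrative parameter values (not the paper's):

```python
# Why cv and G1 cannot be separately resolved: the measured channel gain
# depends on them only through the product cv*G1 (per (24), Gvv = k*B*cv*G1).
# Numerical values here are illustrative assumptions, not the paper's.
k = 1.380649e-23          # Boltzmann's constant [J/K]
B = 20e6                  # assumed bandwidth [Hz]

def gvv(cv, G1):
    # Vertical-channel gain per (24): Gvv = k*B*cv*G1.
    return k * B * cv * G1

g_a = gvv(cv=450.0, G1=1.8e7)        # nominal parameters
g_b = gvv(cv=225.0, G1=3.6e7)        # cv halved, G1 doubled

# Identical gains: calibration measurements cannot tell the two cases apart,
# so only the product cv*G1 is resolvable.
print(g_a, g_b)
```

Any rescaling cv → α·cv, G1 → G1/α leaves every measured voltage unchanged, which is exactly why the marginal pdfs of cv and G1 in Fig. 5 are limited only by the priors.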
VIII. CONCLUSION

In this paper, we have demonstrated optimal estimation of calibration parameters in polarimetric microwave radiometers that use hybrid coupler-based correlators to measure the third Stokes parameter, such as NASA's upcoming Aquarius radiometer. By exploiting statistical knowledge of the measurement noise using Bayesian estimation, the rmse is reduced by a factor of two compared to estimation without such knowledge. Most of the principles that we have employed are well known in estimation theory; however, this paper is their first published application to microwave radiometer calibration. Many extensions of this paper can be made by expanding the forward model to include other unknowns and/or other measurements.

Footnote 4: These uniform marginal pdfs are limited only by prior knowledge. However, if we have tighter prior knowledge of one parameter, for example, of cv, then good resolution of a product such as cvG1 can place tighter bounds on the other parameter, G1. This effect can be seen in several subplots of Fig. 5.

[ Gvv  0    0
  0    Ghh  0
  Gpv  Gph  GpU
  Gmv  Gmh  GmU ]  =  kB [ cv G1          0               0
                           0              ch G2           0
                           cp s² G1       cp (1-s²) G2     cp s √(1-s²) e √(G1 G2)
                           cm (1-s²) G1   cm s² G2        -cm s √(1-s²) e √(G1 G2) ]   (24)

Fig. 5. Two-dimensional pdfs for the four unconstrained radiometer hardware parameters and two noise temperatures. The density of dots illustrates the pdfs, whereas asterisks show the location of the true parameter values. The bounding boxes show the bounds of the uniform pdfs from which samples were proposed. These plots illustrate that it is only possible to obtain useful estimates of the hardware parameters e, T1, and T2 (in addition to s, which is not illustrated).
This paper has also illustrated the fact that much more information can be conveyed by a posterior probability distribution for a set of parameters than by simple estimates comprised only of marginal means and variances. Finally, we have demonstrated the acquisition of information on the eight hardware parameters that comprise the overall channel gains in the class of radiometer analyzed in this paper. Two of these hardware parameters can be accurately estimated from calibration measurements; for the other six, only products or ratios of pairs of hardware parameters can be resolved. We note that the forward model (1) is based on a number of stated assumptions (see Section II), and our simulations incorporate those assumptions. In future work, we hope to assess the validity of these assumptions and the consequent predictions of this paper by testing the proposed method on real radiometer data. Future work can also assess the improvement in scene brightness temperature accuracy that can be achieved by using the methodology of this paper rather than algebraic estimation.

APPENDIX A

A. Derivation of Covariance Matrix C

Using the information stated in Section IV-A1 about the nine noise variables, we derive here the variances and covariances of the voltages in (9). Each column of voltages is independent of the voltages in the other columns, so we proceed column by column.

1) First Three Columns: Consider the first column of voltages in (9). There are just two noise terms, and they are independent of one another because they are from different sources. Therefore, vv,C and vh,C are independent. All other relationships have nonzero correlation

Cov(vv,C, vp,C) = Cov(Gvv n1, Gpv n1) = E[Gvv Gpv n1²] = Gvv Gpv (TC + T1)²/(Bτc).   (30)

Similarly, we have

Cov(vh,C, vp,C) = Ghh Gph (TC + T2)²/(Bτc)   (31)
Cov(vv,C, vm,C) = Gvv Gmv (TC + T1)²/(Bτc)   (32)
Cov(vh,C, vm,C) = Ghh Gmh (TC + T2)²/(Bτc).
(33)

For the final covariance for this column

Cov(vp,C, vm,C) = E[(Gpv n1 + Gph n2)(Gmv n1 + Gmh n2)] = E[Gpv Gmv n1² + Gph Gmh n2²] = Gpv Gmv (TC + T1)²/(Bτc) + Gph Gmh (TC + T2)²/(Bτc).   (34)

By inspection, the four variances (diagonal terms) are Gvv²(TC + T1)²/(Bτc), Ghh²(TC + T2)²/(Bτc), Gpv²(TC + T1)²/(Bτc) + Gph²(TC + T2)²/(Bτc), and Gmv²(TC + T1)²/(Bτc) + Gmh²(TC + T2)²/(Bτc). The variances and covariances of the second column of voltages in (9) are the same as those of the first column, except with (TC + T1) and (TC + T2) replaced by (TH + T1) and (TH + T2), respectively. The variances and covariances of the third column are the same as those of the first one, except with (TC + T2) replaced by (TH + T2).

2) Fourth Column: The variances and covariances of the fourth column are quite different because TCN is a component of all three inputs. We first rewrite the calibration inputs (TC for both the vertical and horizontal channels, TCN/2, T1, and T2) in terms of electric fields. The v-channel cold load emits an electric field, which we denote by c1. Its second moment (defined as ⟨c1²⟩, where ⟨·⟩ denotes the ensemble average) is TC (here and hereafter, we ignore the constant that converts the product of two electric fields to a brightness temperature). The v-channel low-noise-amplifier (LNA) noise is another source, whose equivalent electric field (referred to the input of the first LNA so that it is on the same level as c1) is denoted by r1. Its second moment is T1. Similarly, the h-channel cold load outputs c2, with second moment TC, and the h-channel LNA noise is r2, whose second moment is T2. The correlated calibration source (shown in [1, Fig. 1]) emits an electric field n with second moment TCN.
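The first-column covariance structure in (30)-(34) can be spot-checked by Monte Carlo simulation. The sketch below, using illustrative gains and temperatures (assumptions, not the paper's values), models n1 as the zero-mean fluctuation of an averaged noise power with Nc = 2Bτc independent samples, so that Var(n1) = 2(TC + T1)²/Nc = (TC + T1)²/(Bτc):

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative values (not the paper's): temperatures in kelvins, gains arbitrary.
TC, T1 = 288.0, 310.0
Gvv, Gpv = 2.0, 1.3
N_c = 100            # N_c = 2*B*tau_c independent samples per integration
trials = 20_000

# n1: zero-mean fluctuation of the averaged power of a Gaussian field whose
# mean power is (TC + T1).
x = rng.normal(0.0, np.sqrt(TC + T1), size=(trials, N_c))
n1 = (x ** 2).mean(axis=1) - (TC + T1)

cov_sim = np.cov(Gvv * n1, Gpv * n1)[0, 1]
cov_thy = Gvv * Gpv * 2 * (TC + T1) ** 2 / N_c   # = Gvv*Gpv*(TC+T1)^2/(B*tau_c)

print(cov_sim, cov_thy)
```

The simulated covariance of the two gain-weighted copies of the same noise term matches (30) to within Monte Carlo error.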
When the energy from this source is split between the vertical and horizontal channels, the electric field in each channel is then n/√2, whose second moment is TCN/2. The five electric fields just described (c1, r1, c2, r2, and n) are independent of one another because of their distinct origins. They are all zero-mean normal random variables. The voltages in the fourth column on the left side of (1) are found by summing these electric fields, squaring, integrating, and multiplying by a channel gain

vv,CN = Gvv I   (35)
vh,CN = Ghh J   (36)
vp,CN = Gpv I + Gph J + GpU K   (37)
vm,CN = Gmv I + Gmh J + GmU K   (38)

where

I ≡ (1/τc) ∫₀^τc (c1 + n/√2 + r1)² dt   (39)
J ≡ (1/τc) ∫₀^τc (c2 + n/√2 + r2)² dt   (40)
K ≡ (1/τc) ∫₀^τc n² dt.   (41)

Note that K arises from the correlation of the inputs to the hybrid coupler. Because all these voltages are expressed in terms of I, J, and K, all the variances and covariances can also be expressed in terms of the variances and covariances of I, J, and K. As shown in [5], I, J, and K can be rewritten as sums of independent samples

I = (1/Nc) Σᵢ₌₁^Nc (c1,i + nᵢ/√2 + r1,i)²   (42)

where Nc = 2Bτc, and similarly for J and K. In the remainder of this section, we use (42) to find the means, variances, and covariances of I, J, and K and then of vv,CN, vh,CN, vp,CN, and vm,CN. These derivations are similar to those in [8, Appendix].

Means and variances of I, J, and K: First, using the independence of each of the Nc samples from one another and of the five electric fields from one another, the means of I, J, and K are

⟨I⟩ = ⟨(c1 + n/√2 + r1)²⟩ = ⟨c1²⟩ + ⟨n²⟩/2 + ⟨r1²⟩ + 2⟨c1 n⟩/√2 + 2⟨c1 r1⟩ + 2⟨n r1⟩/√2 = TC + TCN/2 + T1 ≡ TT1   (43)
⟨J⟩ = ⟨(c2 + n/√2 + r2)²⟩ = TC + TCN/2 + T2 ≡ TT2   (44)
⟨K⟩ = ⟨n²⟩ = TCN.   (45)

These means coincide with the final column of the temperature matrix in (1), verifying our formulation of the problem in terms of electric fields. To determine the variance

Var(I) = ⟨[(1/Nc) Σᵢ (c1,i + nᵢ/√2 + r1,i)²][(1/Nc) Σⱼ (c1,j + nⱼ/√2 + r1,j)²]⟩ - TT1²   (46)
= (1/Nc²) Σᵢ Σⱼ ⟨(c1,i + nᵢ/√2 + r1,i)² (c1,j + nⱼ/√2 + r1,j)²⟩ - TT1².
(47)

Separating the expected value operation into terms for which i ≠ j and for which i = j

Var(I) = (1/Nc²)[ Σᵢ Σⱼ(≠i) ⟨(c1,i + nᵢ/√2 + r1,i)² (c1,j + nⱼ/√2 + r1,j)²⟩ + Σᵢ ⟨(c1,i + nᵢ/√2 + r1,i)⁴⟩ ] - TT1².   (48)

Using the independence of samples i and j, the independence of c1, n, and r1 from everything but themselves, and the known fourth moment of zero-mean normal random variables

Var(I) = (1/Nc²)[ Σᵢ ⟨(c1,i + nᵢ/√2 + r1,i)²⟩ Σⱼ(≠i) ⟨(c1,j + nⱼ/√2 + r1,j)²⟩ + Σᵢ 3⟨(c1,i + nᵢ/√2 + r1,i)²⟩² ] - TT1².   (49)

Then, using (43)

Var(I) = (1/Nc²)[Nc TT1 (Nc - 1) TT1 + 3 Nc TT1²] - TT1² = (2/Nc) TT1²   (50)

and we finally arrive at the ensemble variance

Var(I) = TT1²/(Bτc)   and similarly   Var(J) = TT2²/(Bτc).   (51)

Also

Var(K) = ⟨[(1/Nc) Σᵢ nᵢ²][(1/Nc) Σⱼ nⱼ²]⟩ - TCN² = (1/Nc²) Σᵢ Σⱼ ⟨nᵢ² nⱼ²⟩ - TCN²   (52)
= (1/Nc²)[ Σᵢ Σⱼ(≠i) ⟨nᵢ² nⱼ²⟩ + Σᵢ ⟨nᵢ⁴⟩ ] - TCN²   (53)
= (1/Nc²)[ Σᵢ ⟨nᵢ²⟩ Σⱼ(≠i) ⟨nⱼ²⟩ + Σᵢ 3⟨nᵢ²⟩² ] - TCN²   (54)
= (1/Nc²)[Nc (Nc - 1) TCN² + 3 Nc TCN²] - TCN² = TCN²/(Bτc).   (55)

Covariances of I, J, and K: Derivation of the covariances of I, J, and K is similar

Cov(I, J) = ⟨[(1/Nc) Σᵢ (c1,i + nᵢ/√2 + r1,i)²][(1/Nc) Σⱼ (c2,j + nⱼ/√2 + r2,j)²]⟩ - TT1·TT2   (56)
= (1/Nc²) Σᵢ Σⱼ ⟨(c1,i + nᵢ/√2 + r1,i)² (c2,j + nⱼ/√2 + r2,j)²⟩ - TT1·TT2.   (57)

The expected value operation separates into terms for which i ≠ j and for which i = j

Σᵢ Σⱼ(≠i) ⟨(c1,i + nᵢ/√2 + r1,i)² (c2,j + nⱼ/√2 + r2,j)²⟩ + Σᵢ ⟨(c1,i + nᵢ/√2 + r1,i)² (c2,i + nᵢ/√2 + r2,i)²⟩.   (58)

Using the independence of samples i and j, the independence of c1, c2, r1, and r2 from everything but themselves, and the known fourth moment of zero-mean normal random variables, this becomes

Nc TT1 (Nc - 1) TT2 + Nc (TT1·TT2 + TCN²/2) = Nc² TT1·TT2 + Nc TCN²/2.   (59)

Putting this back into (57)

Cov(I, J) = TCN²/(4Bτc).   (60)
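The moments of I, J, and K derived above can likewise be verified by Monte Carlo simulation. In the sketch below (illustrative temperatures, not the paper's Table I values), the shared field n induces the predicted covariances between the averaged powers:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative temperatures in kelvins (not the paper's Table I values).
TC, TCN, T1, T2 = 288.0, 800.0, 310.0, 312.0
TT1 = TC + TCN / 2 + T1          # mean of I, per (43)
TT2 = TC + TCN / 2 + T2          # mean of J, per (44)
N_c = 100                        # N_c = 2*B*tau_c independent samples
trials = 20_000

# Independent zero-mean normal fields with the stated second moments.
c1 = rng.normal(0.0, np.sqrt(TC), (trials, N_c))
c2 = rng.normal(0.0, np.sqrt(TC), (trials, N_c))
r1 = rng.normal(0.0, np.sqrt(T1), (trials, N_c))
r2 = rng.normal(0.0, np.sqrt(T2), (trials, N_c))
n = rng.normal(0.0, np.sqrt(TCN), (trials, N_c))

# Averaged powers per (42); the correlated source n is shared by I, J, and K.
I = ((c1 + n / np.sqrt(2) + r1) ** 2).mean(axis=1)
J = ((c2 + n / np.sqrt(2) + r2) ** 2).mean(axis=1)
K = (n ** 2).mean(axis=1)

var_I = I.var()
cov_IJ = np.cov(I, J)[0, 1]
cov_IK = np.cov(I, K)[0, 1]

# Theory, using 1/(B*tau_c) = 2/N_c: Var(I) from (50), Cov(I,J) from (60),
# and Cov(I,K) from (65).
print(var_I, 2 * TT1 ** 2 / N_c)
print(cov_IJ, TCN ** 2 / (2 * N_c))
print(cov_IK, TCN ** 2 / N_c)
```

The simulated mean of I also reproduces TT1 from (43), confirming the electric-field formulation numerically.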
Similarly, we have

Cov(I, K) = ⟨[(1/Nc) Σᵢ (c1,i + nᵢ/√2 + r1,i)²][(1/Nc) Σⱼ nⱼ²]⟩ - TT1·TCN   (61)
= (1/Nc²) Σᵢ Σⱼ ⟨(c1,i + nᵢ/√2 + r1,i)² nⱼ²⟩ - TT1·TCN   (62)
= (1/Nc²)[ Σᵢ ⟨(c1,i + nᵢ/√2 + r1,i)²⟩ Σⱼ(≠i) ⟨nⱼ²⟩ + Σᵢ ⟨(c1,i + nᵢ/√2 + r1,i)² nᵢ²⟩ ] - TT1·TCN
= (1/Nc²)[ Nc TT1 (Nc - 1) TCN + Nc (TC·TCN + ⟨n⁴⟩/2 + T1·TCN) ] - TT1·TCN   (63)
= (1/Nc²)[ -Nc TT1·TCN + Nc TCN (TC + (3/2)TCN + T1) ] = -TT1·TCN/Nc + (TT1 + TCN)TCN/Nc   (64)

leaving

Cov(I, K) = TCN²/(2Bτc)   and similarly   Cov(J, K) = TCN²/(2Bτc).   (65)

Variances of the voltages: Now that the means, variances, and covariances of I, J, and K are known, we can find the means, variances, and covariances of the voltages, given in (66)-(70). vp,CN is identical to vm,CN (the same realizations of noise) except for multiplication by different Gxx. Therefore, the variances and covariances of the first will be identical to those of the second if we simply replace Gpx with Gmx, as indicated in (71).

Covariances of the voltages: The covariances of vv,CN with the other voltages are found as follows:

Cov(vv,CN, vh,CN) = Gvv Ghh Cov(I, J) = Gvv Ghh TCN²/(4Bτc)   (72)
Cov(vv,CN, vp,CN) = Cov(Gvv I, Gpv I + Gph J + GpU K)   (73)
= Gvv [Gpv Var(I) + Gph Cov(I, J) + GpU Cov(I, K)]   (74)
= Gvv [Gpv TT1²/(Bτc) + Gph TCN²/(4Bτc) + GpU TCN²/(2Bτc)]   (75)
= Gvv [4 Gpv TT1² + Gph TCN² + 2 GpU TCN²]/(4Bτc)   (76)

and similarly

Cov(vv,CN, vm,CN) = Gvv [4 Gmv TT1² + Gmh TCN² + 2 GmU TCN²]/(4Bτc).   (77)

Likewise, the covariances of vh,CN with vp,CN and vm,CN are

Cov(vh,CN, vp,CN) = Cov(Ghh J, Gpv I + Gph J + GpU K)   (78)
= Ghh [Gpv Cov(I, J) + Gph Var(J) + GpU Cov(J, K)]   (79)
= Ghh [Gpv TCN²/(4Bτc) + Gph TT2²/(Bτc) + GpU TCN²/(2Bτc)]   (80)
= Ghh [Gpv TCN² + 4 Gph TT2² + 2 GpU TCN²]/(4Bτc)   (81)
Cov(vh,CN, vm,CN) = Ghh [Gmv TCN² + 4 Gmh TT2² + 2 GmU TCN²]/(4Bτc).
(82)

The final covariance is longer

Cov(vp,CN, vm,CN) = Cov(Gpv I + Gph J + GpU K, Gmv I + Gmh J + GmU K)   (83)
= Gpv Gmv Var(I) + Gph Gmh Var(J) + GpU GmU Var(K) + (Gpv Gmh + Gph Gmv) Cov(I, J) + (Gpv GmU + GpU Gmv) Cov(I, K) + (Gph GmU + GpU Gmh) Cov(J, K)   (84)

which expands and then simplifies to

Cov(vp,CN, vm,CN) = (1/(4Bτc))[ 4 Gpv Gmv TT1² + 4 Gph Gmh TT2² + (4 GpU GmU + Gpv Gmh + Gph Gmv + 2 Gpv GmU + 2 GpU Gmv + 2 Gph GmU + 2 GpU Gmh) TCN² ].   (85)

The voltage variances are

Var(vv,CN) = Gvv² Var(I) = Gvv² TT1²/(Bτc)   (66)
Var(vh,CN) = Ghh² Var(J) = Ghh² TT2²/(Bτc)   (67)
Var(vp,CN) = Var(Gpv I + Gph J + GpU K) = Var(Gpv I) + Var(Gph J) + Var(GpU K) + 2[Gpv Gph Cov(I, J) + Gpv GpU Cov(I, K) + Gph GpU Cov(J, K)]   (68)
= Gpv² TT1²/(Bτc) + Gph² TT2²/(Bτc) + GpU² TCN²/(Bτc) + 2 Gpv Gph TCN²/(4Bτc) + 2 Gpv GpU TCN²/(2Bτc) + 2 Gph GpU TCN²/(2Bτc)   (69)
= [Gpv² TT1² + Gph² TT2² + (GpU² + Gpv Gph/2 + Gpv GpU + Gph GpU) TCN²]/(Bτc)   (70)
Var(vm,CN) = [Gmv² TT1² + Gmh² TT2² + (GmU² + Gmv Gmh/2 + Gmv GmU + Gmh GmU) TCN²]/(Bτc).   (71)

3) Summary (Entire C): To recapitulate

C = [ CC   0    0    0
      0    CH   0    0
      0    0    CCH  0
      0    0    0    CCN ]   (86)

where CC, CH, CCH, and CCN are given in (87)-(90) (note that the matrices in these equations are all symmetric; to make them fit, only the upper triangular elements are given), where

P ≡ Gpv² TT1² + Gph² TT2² + (GpU² + Gpv Gph/2 + Gpv GpU + Gph GpU) TCN²
M ≡ Gmv² TT1² + Gmh² TT2² + (GmU² + Gmv Gmh/2 + Gmv GmU + Gmh GmU) TCN²
Y ≡ Gpv Gmv TT1² + Gph Gmh TT2² + (GpU GmU + Gpv Gmh/4 + Gph Gmv/4 + Gpv GmU/2 + GpU Gmv/2 + Gph GmU/2 + GpU Gmh/2) TCN².
B. Derivation of the Constraint Equations

1) Finding V2 (Eigenvectors of C for Which λ = 0): Due to the block-diagonal structure of C, it has four eigenvectors of the form [a b c d 0 0 0 0 0 0 0 0 0 0 0 0]T, where [a b c d]T is an eigenvector of CC; four of the form [0 0 0 0 a b c d 0 0 0 0 0 0 0 0]T, where [a b c d]T is an eigenvector of CH; four of the form [0 0 0 0 0 0 0 0 a b c d 0 0 0 0]T, where [a b c d]T is an eigenvector of CCH; and four of the form [0 0 0 0 0 0 0 0 0 0 0 0 a b c d]T, where [a b c d]T is an eigenvector of CCN. Each of CC, CH, and CCH has two eigenvalues (λ) that are equal to zero, whereas CCN has only one. This is easily confirmed numerically; theoretically, it is because the first three columns of (9) have two noise sources each, whereas the last one has three.

The blocks are (upper triangular elements only)

CC = (1/(Bτc)) ×
[ Gvv²(TC+T1)²   0              GvvGpv(TC+T1)²                  GvvGmv(TC+T1)²
                 Ghh²(TC+T2)²   GhhGph(TC+T2)²                  GhhGmh(TC+T2)²
                                Gpv²(TC+T1)² + Gph²(TC+T2)²     GpvGmv(TC+T1)² + GphGmh(TC+T2)²
                                                                Gmv²(TC+T1)² + Gmh²(TC+T2)² ]   (87)

CH = (1/(Bτc)) ×
[ Gvv²(TH+T1)²   0              GvvGpv(TH+T1)²                  GvvGmv(TH+T1)²
                 Ghh²(TH+T2)²   GhhGph(TH+T2)²                  GhhGmh(TH+T2)²
                                Gpv²(TH+T1)² + Gph²(TH+T2)²     GpvGmv(TH+T1)² + GphGmh(TH+T2)²
                                                                Gmv²(TH+T1)² + Gmh²(TH+T2)² ]   (88)

CCH = (1/(Bτc)) ×
[ Gvv²(TC+T1)²   0              GvvGpv(TC+T1)²                  GvvGmv(TC+T1)²
                 Ghh²(TH+T2)²   GhhGph(TH+T2)²                  GhhGmh(TH+T2)²
                                Gpv²(TC+T1)² + Gph²(TH+T2)²     GpvGmv(TC+T1)² + GphGmh(TH+T2)²
                                                                Gmv²(TC+T1)² + Gmh²(TH+T2)² ]   (89)

CCN = (1/(Bτc)) ×
[ Gvv²TT1²   GvvGhh TCN²/4   Gvv[4GpvTT1² + GphTCN² + 2GpUTCN²]/4   Gvv[4GmvTT1² + GmhTCN² + 2GmUTCN²]/4
             Ghh²TT2²        Ghh[GpvTCN² + 4GphTT2² + 2GpUTCN²]/4   Ghh[GmvTCN² + 4GmhTT2² + 2GmUTCN²]/4
                             P                                      Y
                                                                    M ]   (90)
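The claimed null space of CC can be confirmed numerically. The sketch below, using illustrative gains and temperatures (assumptions, not the paper's values), builds CC per (87) and checks that it has exactly two zero eigenvalues and that vectors of the form given later in (96) are null vectors:

```python
import numpy as np

# Illustrative gains and temperatures (not the paper's values).
Gvv, Ghh, Gpv, Gph, Gmv, Gmh = 2.0, 2.1, 1.3, 0.7, 0.6, 1.4
TC, T1, T2 = 288.0, 310.0, 312.0
Btau = 1e6                      # B * tau_c, arbitrary positive scale

Tv = (TC + T1) ** 2             # "T" in the abbreviated form (91)
Uh = (TC + T2) ** 2             # "U" in the abbreviated form (91)

# CC per (87) (symmetric 4x4).
CC = np.array([
    [Gvv**2 * Tv,  0.0,          Gvv*Gpv * Tv,               Gvv*Gmv * Tv],
    [0.0,          Ghh**2 * Uh,  Ghh*Gph * Uh,               Ghh*Gmh * Uh],
    [Gvv*Gpv*Tv,   Ghh*Gph*Uh,   Gpv**2*Tv + Gph**2*Uh,      Gpv*Gmv*Tv + Gph*Gmh*Uh],
    [Gvv*Gmv*Tv,   Ghh*Gmh*Uh,   Gpv*Gmv*Tv + Gph*Gmh*Uh,    Gmv**2*Tv + Gmh**2*Uh],
]) / Btau

eigvals = np.linalg.eigvalsh(CC)
n_zero = int(np.sum(np.abs(eigvals) < 1e-9 * np.max(np.abs(eigvals))))

# Candidate null vectors of the form (96); note they are independent of
# TC, T1, and T2, so the same vectors annihilate CH and CCH.
e1 = np.array([-Ghh*Gpv, -Gvv*Gph, Gvv*Ghh, 0.0])
e2 = np.array([-Ghh*Gmv, -Gvv*Gmh, 0.0, Gvv*Ghh])

print(n_zero)                                    # two zero eigenvalues
print(np.abs(CC @ e1).max(), np.abs(CC @ e2).max())
```

Because CC is a rank-2 sum of two outer products (one per noise source), exactly two eigenvalues vanish, which is the theoretical argument stated above.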
Eigenvectors of CC, CH, and CCH for which λ = 0: CC, CH, and CCH can all be written in the abbreviated form

(1/(Bτc)) [ v²T   0     vpT         vmT
            0     h²U   hqU         hnU
            vpT   hqU   p²T + q²U   pmT + qnU
            vmT   hnU   pmT + qnU   m²T + n²U ]   (91)

where the only difference between CC, CH, and CCH is whether T and U are defined using TC or TH. The eigenvectors of this matrix are found by using the defining equation of an eigenvector with λ = 0

[ v²T   0     vpT         vmT
  0     h²U   hqU         hnU
  vpT   hqU   p²T + q²U   pmT + qnU
  vmT   hnU   pmT + qnU   m²T + n²U ] [a b c d]T = [0 0 0 0]T.   (92)

From the first two of the four equations in (92)

a = -(cp + dm)/v      b = -(cq + dn)/h.   (93)

Using these to substitute for a and b in the third and fourth equations in (92), these equations reduce to

0·c + 0·d = 0
0·c + 0·d = 0.   (94)

Any c and d will satisfy these equations. We choose simple but nontrivial values: c = 1, d = 0 and c = 0, d = 1. Then, using (93), the eigenvectors are

[-p/v  -q/h  1  0]T   and   [-m/v  -n/h  0  1]T.   (95)

Eigenvectors can be scaled arbitrarily. Scaling by vh and undoing the abbreviations, the eigenvectors with λ = 0 are

[-GhhGpv  -GvvGph  GvvGhh  0]T   and   [-GhhGmv  -GvvGmh  0  GvvGhh]T.   (96)

Note that these are exactly the same for CC, CH, and CCH because they do not depend on T and U.

Eigenvector of CCN for which λ = 0: A process similar to that of the previous subsection, although more tedious, leads to

[(GpvGmU - GpUGmv)/Gvv   (GphGmU - GpUGmh)/Ghh   -GmU   GpU]T.   (97)

2) Forming Constraint Equations From V2: For fixed v and unknown m, the constraint is

V2T(m) v = V2T(m) g(m)   (98)

where we explicitly show that V2 is formed from the unknown m (through C).

First six constraint equations: Having found the eigenvectors that form V2, we can expand the first six of the seven equations in (98) and simplify them to obtain

GhhGpv vv,C + GvvGph vh,C = GvvGhh vp,C   (99)
GhhGpv vv,H + GvvGph vh,H = GvvGhh vp,H   (100)
GhhGpv vv,CH + GvvGph vh,CH = GvvGhh vp,CH   (101)
GhhGmv vv,C + GvvGmh vh,C = GvvGhh vm,C   (102)
GhhGmv vv,H + GvvGmh vh,H = GvvGhh vm,H   (103)
GhhGmv vv,CH + GvvGmh vh,CH = GvvGhh vm,CH.
(104)

Multiplying (99) by vv,H/vv,C and subtracting (100) leaves

Gph (vh,C vv,H/vv,C - vh,H) = Ghh (vp,C vv,H/vv,C - vp,H).   (105)

Similarly, from (99) and (101), we obtain

Gph (vh,C vv,CH/vv,C - vh,CH) = Ghh (vp,C vv,CH/vv,C - vp,CH).   (106)

Solve (106) for

Ghh = Gph (vh,C vv,CH/vv,C - vh,CH)/(vp,C vv,CH/vv,C - vp,CH)   (107)

then substitute it into (105) to obtain

Gph (vh,C vv,H/vv,C - vh,H) = Gph [(vh,C vv,CH/vv,C - vh,CH)/(vp,C vv,CH/vv,C - vp,CH)] (vp,C vv,H/vv,C - vp,H).   (108)

This last equation cannot be solved for Gph; it merely reveals redundancy in the data and that (105) and (106) are redundant constraint equations. Similarly, from (102) and (103), we obtain

Gmh (vh,C vv,H/vv,C - vh,H) = Ghh (vm,C vv,H/vv,C - vm,H)   (109)

or an expression that is numerically equivalent from (102) and (104). A similar procedure produces constraint equations for Gpv and Gmv in terms of Gvv. In summary, we end up with the four constraint equations

Gpv = Gvv (vp,C vh,H - vh,C vp,H)/(vv,C vh,H - vh,C vv,H)      Gph = Ghh (vp,C vv,H - vv,C vp,H)/(vh,C vv,H - vv,C vh,H)   (110)
Gmv = Gvv (vm,C vh,H - vh,C vm,H)/(vv,C vh,H - vh,C vv,H)      Gmh = Ghh (vm,C vv,H - vv,C vm,H)/(vh,C vv,H - vv,C vh,H)   (111)

or two alternative (but numerically equivalent) sets obtained by finding, for example, the equation relating Gph and Ghh from (99) and (101) or from (100) and (101). Thus, of (99)-(101), any one is completely redundant, and the same goes for (102)-(104). Hence, rather than six equations in six unknowns, we only have four equations in six unknowns.

[(GpvGmU - GpUGmv)/Gvv] vv,CN + [(GphGmU - GpUGmh)/Ghh] vh,CN - GmU vp,CN + GpU vm,CN
= [(GpvGmU - GpUGmv)/Gvv] Gvv TT1 + [(GphGmU - GpUGmh)/Ghh] Ghh TT2 - GmU (Gpv TT1 + Gph TT2 + GpU TCN) + GpU (Gmv TT1 + Gmh TT2 + GmU TCN)   (112)
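The constraint equations (110) can be checked on noiseless voltages generated from the forward model. A minimal sketch with illustrative gains and temperatures (assumptions, not the paper's values):

```python
import numpy as np

# Illustrative gains and temperatures (not the paper's values).
Gvv, Ghh, Gpv, Gph = 2.0, 2.1, 1.3, 0.7
TC, TH, T1, T2 = 288.0, 800.0, 310.0, 312.0

# Noiseless voltages for the cold (C) and hot (H) calibration columns:
# the v- and h-channels see (Tx + T1) and (Tx + T2), and the p-channel is
# a gain-weighted sum of both, per the forward model.
def column(Tx_v, Tx_h):
    vv = Gvv * Tx_v
    vh = Ghh * Tx_h
    vp = Gpv * Tx_v + Gph * Tx_h
    return vv, vh, vp

vvC, vhC, vpC = column(TC + T1, TC + T2)
vvH, vhH, vpH = column(TH + T1, TH + T2)

# Constraints (110): Gpv and Gph are recovered from Gvv, Ghh, and the
# voltages alone.
Gpv_est = Gvv * (vpC * vhH - vhC * vpH) / (vvC * vhH - vhC * vvH)
Gph_est = Ghh * (vpC * vvH - vvC * vpH) / (vhC * vvH - vvC * vhH)
print(Gpv_est, Gph_est)
```

In the noiseless limit the recovered gains equal the true Gpv and Gph exactly, illustrating why these ratios are "perfectly" constrained while Gvv and Ghh themselves remain free parameters.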
Last constraint equation: Using the last eigenvector with λ = 0, as derived in Appendix B1, the final constraint equation is expressed in (112). The entire right side cancels itself out. Then, multiplying both sides by GvvGhh

(GpvGmU - GpUGmv) Ghh vv,CN + (GphGmU - GpUGmh) Gvv vh,CN - GmU GvvGhh vp,CN + GpU GvvGhh vm,CN = 0.   (113)

Solving this for GmU

GmU = GpU (GmvGhh vv,CN + GmhGvv vh,CN - GvvGhh vm,CN)/(GpvGhh vv,CN + GphGvv vh,CN - GvvGhh vp,CN).   (114)

REFERENCES

[1] J. R. Piepmeier, "Calibration of passive microwave polarimeters that use hybrid coupler-based correlators," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 2, pp. 391-400, Feb. 2004.
[2] J. R. Piepmeier, private communication, 2006.
[3] E. T. Jaynes, Probability Theory: The Logic of Science. Cambridge, U.K.: Cambridge Univ. Press, 2003.
[4] F. T. Ulaby, R. K. Moore, and A. K. Fung, Microwave Remote Sensing: Active and Passive, vol. 1. Norwood, MA: Artech House, 1981.
[5] F. T. Ulaby, R. K. Moore, and A. K. Fung, Microwave Remote Sensing: Active and Passive, vol. 2. Norwood, MA: Artech House, 1986, ch. 7, pp. 476-494.
[6] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation. Philadelphia, PA: SIAM, 2005, pp. 44-49, 173, and 179-180. [Online]. Available: http://www.ipgp.jussieu.fr/%7Etarantola/Files/Professional/SIAM/index.html
[7] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. New York: McGraw-Hill, 1991, pp. 175 and 229.
[8] D. L. Hudson, J. R. Piepmeier, and D. G. Long, "Polarization rotation correction in radiometry: An error analysis," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 10, pp. 3212-3223, Oct. 2007.

Derek Hudson received the B.S. and M.S. degrees in electrical engineering from Brigham Young University (BYU), Provo, UT, in 2003, where he is currently working toward the Ph.D. degree. He has been with the Microwave Earth Remote Sensing (MERS) research group, BYU, since 1999.
He is also currently with the NASA Goddard Space Flight Center, Greenbelt, MD. His research experience is in the application of inverse problem theory and in radar hardware design.

David G. Long (S'80, M'82, SM'98, F'07) received the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, in 1989. From 1983 to 1990, he was with the Jet Propulsion Laboratory (JPL), National Aeronautics and Space Administration (NASA), where he developed advanced radar remote sensing systems. While at JPL, he was the Project Engineer on the NASA Scatterometer (NSCAT) project, which flew from 1996 to 1997. He also managed the SCANSCAT project, the precursor to SeaWinds, which was launched in 1999 and 2002. He is currently a Professor with the Department of Electrical and Computer Engineering, Brigham Young University, Provo, UT, where he teaches upper-division and graduate courses in communications, microwave remote sensing, radar, and signal processing, and is the Director of the BYU Center for Remote Sensing. He is the Principal Investigator on several NASA-sponsored research projects in remote sensing. He is the author or coauthor of more than 275 publications in signal processing and radar scatterometry. His research interests include microwave remote sensing, radar theory, space-based sensing, estimation theory, signal processing, and mesoscale atmospheric dynamics. Dr. Long is an Associate Editor of the IEEE GEOSCIENCE AND REMOTE SENSING LETTERS. He has received the NASA Certificate of Recognition several times.