Hotelling-Williams T-test (1)

Recently I have been trying to compare the performance of two measures. The problem turns out to be one of comparing two correlation coefficients, \rho_{12} and \rho_{13}, where subscript 1 denotes the observations and subscripts 2 and 3 denote the two measures. Because both correlations share variable 1 and are computed on the same sample, they are dependent. To be honest, I had no idea how to approach this at first, because the mathematical statistics I had studied before never covered this kind of problem. Many thanks to my supervisor, Dr. Dennis Cheung, who sent me a set of slides on correlation coefficients that included the Hotelling-Williams T-test [Steiger].

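The Hotelling-Williams test statistic, which is compared against a t distribution with N-3 degrees of freedom under the null hypothesis \rho_{12} = \rho_{13}, is:

t_{(N-3)} = (r_{12}-r_{13})\sqrt{\frac{(N-1)(1+r_{23})}{2(N-1)|R|/(N-3)+\bar{r}^{2}(1-r_{23})^{3}}}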

  • N = number of observations
  • r_{12} = sample correlation between the observations and measure 2
  • r_{13} = sample correlation between the observations and measure 3
  • r_{23} = sample correlation between measures 2 and 3
  • |R| = 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2(r_{12})(r_{13})(r_{23}), the determinant of the 3x3 sample correlation matrix
  • \bar{r} = (r_{12} + r_{13})/2
  • \rho denotes a population correlation and r denotes a sample correlation
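
To make the formula concrete, here is a minimal Python sketch of the statistic. The function name `williams_t` and the example correlations are my own assumptions for illustration, not something taken from the slides. Note that the sketch raises (1 - r_{23}) to the third power, which is the form given in Steiger (1980).

```python
import math


def williams_t(r12, r13, r23, n):
    """Hotelling-Williams statistic for H0: rho_12 = rho_13.

    Hypothetical helper for illustration. Returns (t, df), where df = n - 3.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    # |R|: determinant of the 3x3 sample correlation matrix
    det_r = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2
    # Denominator under the square root; the cube follows Steiger (1980)
    denom = 2 * (n - 1) / (n - 3) * det_r + r_bar**2 * (1 - r23) ** 3
    if denom <= 0:
        raise ValueError("correlations do not form a valid correlation matrix")
    t = (r12 - r13) * math.sqrt((n - 1) * (1 + r23) / denom)
    return t, n - 3


# Made-up example values, not results from my data
t, df = williams_t(r12=0.6, r13=0.4, r23=0.5, n=103)
print(t, df)
```

The two-sided p-value then comes from the t distribution with `df` degrees of freedom; if SciPy is available, `2 * scipy.stats.t.sf(abs(t), df)` should give it.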

The Hotelling-Williams T-test performed well in my hypothesis testing. It showed a significant difference between the two measures, which explains the phenomena I had observed. The relationship is linear in my case, but I am not sure whether the Hotelling-Williams T-test is appropriate for a non-linear relationship, such as a logarithmic one.

While searching the literature, I found a post on the [crr] blog about a similar problem: the correlations between word frequency measures and word processing time. The post is very detailed and introduces two more tests of the same kind. One is the Vuong test [Vuong], which is suggested for non-linear problems such as word processing time against log frequency. It requires a non-linear regression model and is based on a comparison of log-likelihoods. The other was developed by Clarke (2007) [Clarke], who argued that the Vuong test is conservative for small N. However, after running simulations, the [crr] authors concluded that the Hotelling-Williams T-test is the best of the three, followed by the Vuong test. The Vuong test is suggested unless the correlation between the variables is very small.
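
For reference, the two likelihood-based tests can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the function names are hypothetical, the inputs are the per-observation log-likelihoods of the two fitted models, and both models are assumed to have the same number of parameters, so the parameter-count correction is left out. It is not the code used in the [crr] simulations.

```python
import math


def vuong_z(ll1, ll2):
    """Vuong z statistic from per-observation log-likelihoods of two models."""
    n = len(ll1)
    m = [a - b for a, b in zip(ll1, ll2)]  # pointwise log-likelihood ratios
    mean_m = sum(m) / n
    var_m = sum((x - mean_m) ** 2 for x in m) / n
    # Positive z favours model 1, negative z favours model 2
    return math.sqrt(n) * mean_m / math.sqrt(var_m)


def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def clarke_sign_test(ll1, ll2):
    """Clarke's test: a sign test on the pointwise log-likelihood differences.

    Returns (b, p): b counts observations where model 1 fits better, and p is
    the exact two-sided binomial p-value under Binomial(n, 0.5).
    Ties are counted against model 1 in this sketch.
    """
    m = [a - b for a, b in zip(ll1, ll2)]
    n = len(m)
    b = sum(1 for x in m if x > 0)
    k = min(b, n - b)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    return b, min(1.0, 2 * tail)


# Made-up log-likelihoods for four observations, for illustration only
ll_model_1 = [-1.0, -1.2, -0.8, -1.1]
ll_model_2 = [-1.5, -1.3, -1.4, -1.0]
z = vuong_z(ll_model_1, ll_model_2)
print(z, normal_two_sided_p(z))
print(clarke_sign_test(ll_model_1, ll_model_2))
```

The Vuong test uses the size of the log-likelihood differences, while Clarke's test only uses their signs, which is why it is distribution-free.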

The core idea behind the Hotelling-Williams T-test is not yet clear to me; I will cover it in the next post.

  1. [crr] http://crr.ugent.be/archives/546
  2. [Vuong] Vuong, Q.H. (1989). Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses. Econometrica, 57, 307-333.
  3. [Clarke] Clarke, K.A. (2007). A Simple Distribution-Free Test for Nonnested Model Selection. Political Analysis, 15, 347-363.
  4. [Steiger] Steiger, J.H. (1980). Tests for Comparing Elements of a Correlation Matrix. Psychological Bulletin, 87, 245-251.

