We begin with a list of values with associated weights .
If for any we have that , then we replace both values and by a single value which is their weighted average (using the weights and . This new value takes the combined weight of the two values it replaces, . The list is now one shorter, so becomes .
We repeat this process until either or we have a monotone increasing sequence of numbers.
We can recreate the two ordering process examples shown on pages 187 (section 3) and 189 (section 5) of Chacko (1966) in this package.
## This is permChacko 1.0.1
## To learn more about the package, please read the documentation available in the commands below:
## help("permChacko-package")
## browseVignettes("permChacko")
reduceVector(chacko66_sec3)
##
## Original vector: 10 16 14 12 18
## Reduced vector : 10 14 18
reduceVector(chacko66_sec5)
##
## Original vector: 12 14 18 16 22 20 18 24 26 30
## Reduced vector : 12 14 17 20 24 26 30
Notice that Chacko was entirely comfortable with this ordering process ending with a single value. If you look at their table on page 188 then he suggests that under the null hypothesis, if you start with a list of 5 values, then you have a 20% chance that the order process results in a single value.
The aforementioned table is available both in Chacko (1963) and Chacko (1966), as well as in this package:
chacko63_tab1
## k = 3 k = 4 k = 5 k = 6 k = 7 k = 8 k = 9 k = 10
## m = 1 0.333333 0.250000 0.200000 0.166667 0.142857 0.125000 0.111111 0.100000
## m = 2 0.500000 0.458333 0.416667 0.380556 0.350000 0.324107 0.301984 0.282897
## m = 3 0.166667 0.250000 0.291667 0.312500 0.322222 0.325694 0.325519 0.323165
## m = 4 NA 0.041667 0.083333 0.118055 0.145833 0.167882 0.185417 0.199427
## m = 5 NA NA 0.008333 0.020833 0.034722 0.048611 0.061863 0.074219
## m = 6 NA NA NA 0.001389 0.004167 0.007986 0.012500 0.017436
## m = 7 NA NA NA NA 0.000198 0.000694 0.001505 0.002604
## m = 8 NA NA NA NA NA 0.000025 0.000099 0.000240
## m = 9 NA NA NA NA NA NA 0.000003 0.000012
## m = 10 NA NA NA NA NA NA NA 0.000000
Even if the outcome is a single value, the test statistic can be calculated from equation 5 on page 188, i.e.:
In the equation above, are the values in the reduced vector, , is the number of elements in the final reduced vector . The statistic is asymptotically distributed as .
The question then becomes how to obtain a -value associated with this test statistic.
Given that
,
an analytical
-value
can be calculated for the test statistic for any
.
Moreover, for certain combinations of
and
the
-value
can be calculated using the table shown below. Finally, a numeric
-value
can be calculated through computer simulation, which will be explained
later. All these p-values are calculated using the
permChacko()
function.
permChacko(chacko66_sec3)
##
## Chacko Test for Order-restriction with Permutation Test
##
## Null hypothesis : p1 == p2 == p3 == p4 == p5
## Alternative hypothesis : p1 <= p2 <= p3 <= p4 <= p5
##
## Test statistic (chisq_bar): 2.285714
## p-values:
## Analytic p-value : 0.318907
## Numeric p-value : 0.195000 (1000 permutations)
## Numeric mid-p value : 0.192500 (1000 permutations)
## Tabular p-value : 0.196052
permChacko(chacko66_sec5)
##
## Chacko Test for Order-restriction with Permutation Test
##
## Null hypothesis : p1 == p2 == ... == p9 == p10
## Alternative hypothesis : p1 <= p2 <= ... <= p9 <= p10
##
## Test statistic (chisq_bar): 13.500000
## p-values:
## Analytic p-value : 0.035748
## Numeric p-value : 0.003000 (1000 permutations)
## Numeric mid-p value : 0.003000 (1000 permutations)
## Tabular p-value : 0.002294
One omission from Chacko (1966) is the event where . As discussed above, they clearly expect that this will happen sometimes, but provide no explanation as to how to calculate the -value in this case. In this package, we suggest an approach to calculate this value using a permutation test.
If you look at Chacko’s final example on how they evaluate significance, they say that you reject the null hypothesis if the observed calculated value obtained from the equation for above is greater than the value obtained from equation (6) in the paper (reproduced below).
This approach does work when , but seems very cumbersome, requiring calculation of the probability values , which in turn requires consultation or partial recreation of the Chacko table. With the computing power available to us, 60 years into the future from Chacko’s seminal work, we can simply obtain a close-enough -value using a permutation test.
Imagine that the sum of our original values is . Let a permutation be a stochastic distribution of objects (independently) across the categories (with each one being equally likely under the null hypothesis). For each such permutation we can go through Chacko’s ordering procedure and calculate the test statistic according to equation 5.
The -value in this case is simply the fraction of such permutations that yield a test statistic equal to or greater than the one we originally observed.