CHITEST
float chitest(float array[rows][cols] actual_values, float array[rows][cols] expected_values)
Returns the χ² (chi-squared) confidence level of a hypothesis predicting expected_values, given the measured actual_values.
The number of degrees of freedom is taken to be (number of rows - 1) * (number of columns - 1), where rows and columns are those of the argument arrays.
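The rule for the degrees of freedom can be illustrated with a short Python sketch. Python is used here only for illustration and is not part of the expression language; the function name degrees_of_freedom is hypothetical.

```python
def degrees_of_freedom(values):
    """Degrees of freedom as (rows - 1) * (cols - 1) for a rectangular 2-D list."""
    rows = len(values)
    cols = len(values[0]) if rows else 0
    if rows == 0 or cols == 0:
        raise ValueError("the array must have at least one row and one column")
    if any(len(row) != cols for row in values):
        raise ValueError("all rows must have the same number of columns")
    return (rows - 1) * (cols - 1)


# A 6-row, 2-column array gives (6 - 1) * (2 - 1) = 5 degrees of freedom.
print(degrees_of_freedom([[1, 8], [2, 7], [3, 13], [4, 9], [5, 12], [6, 11]]))
```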
The two arrays actual_values and expected_values must have the same dimensions. All array elements are converted to float. If conversion fails, the default value 0.0 is used.
The function result is the value of the one-tailed (upper-tail) χ² CDF (cumulative distribution function; see chidist) for the χ² statistic of actual_values and expected_values, i.e. for the deviation of the actual values from the expected values (see chi2). It is therefore the probability that a random variable with a χ² distribution is >= the measured χ² deviation.
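The calculation described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the product's implementation: the names chi2_statistic, chi2_upper_tail and chitest_sketch are hypothetical, the statistic is assumed to be the usual sum of (actual - expected)² / expected over all cells, all expected values are assumed to be non-zero, and the upper-tail probability uses the standard closed-form series for integer degrees of freedom.

```python
import math


def chi2_statistic(actual, expected):
    """Sum of (actual - expected)^2 / expected over all cells of two 2-D lists."""
    total = 0.0
    for actual_row, expected_row in zip(actual, expected):
        for a, e in zip(actual_row, expected_row):
            total += (a - e) ** 2 / e  # assumes e != 0
    return total


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-squared variable X with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2x/pi) * exp(-x/2) * sum_{r=1}^{(df-1)/2} x^(r-1) / (1*3*...*(2r-1))
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


def chitest_sketch(actual, expected):
    """Upper-tail probability of the statistic, with df = (rows - 1) * (cols - 1)."""
    rows, cols = len(actual), len(actual[0])
    df = (rows - 1) * (cols - 1)
    return chi2_upper_tail(chi2_statistic(actual, expected), df)
```

A cell whose actual and expected values are equal contributes nothing to the statistic, which is why a label column that is identical in both arrays does not change the result.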
It can be shown that for large numbers of values, this is approximately the confidence level of the hypothesis predicting expected_values. There is no strict definition of "large" in this context. A common rule of thumb is that no possible outcome of a test should have an expected absolute frequency of less than 5 in the complete series of tests (i.e. the corresponding value should be expected to occur at least 5 times in the series).
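The rule of thumb can be checked mechanically before running the test. The following Python sketch is illustrative only; the function name meets_rule_of_thumb and the minimum parameter are hypothetical.

```python
def meets_rule_of_thumb(expected_counts, minimum=5):
    """True if every expected absolute frequency is at least `minimum`."""
    return all(count >= minimum for count in expected_counts)


print(meets_rule_of_thumb([10, 10, 10, 10, 10, 10]))  # True
print(meets_rule_of_thumb([10, 4, 10]))               # False
```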
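Common confidence levels required to consider a test result "significant" are 95% (often denoted "significant*"), 99% ("significant**") and 99.9% ("significant***"). Since the function result is the probability of a deviation at least as large as the one measured, a deviation from the hypothesis is significant at these levels when the result is at most 5%, 1% and 0.1% respectively.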
Example
We want to test the hypothesis that a 6-sided die is evenly balanced. We throw the die 60 times and record the following results:
_actual = {{1, 8}, {2, 7}, {3, 13}, {4, 9}, {5, 12}, {6, 11}}
Here, the first column contains the score and the second column contains the number of rolls yielding that score (e.g. 9 rolls scoring a 4). Of course, if the die is evenly balanced, the expected result is
_expected = {{1, 10}, {2, 10}, {3, 10}, {4, 10}, {5, 10}, {6, 10}}
satisfying the rule of thumb for applicability of the χ² test (at least 5 expected occurrences of each possible outcome; here we expect 10 occurrences of each).
Given the actual results, the confidence level of our hypothesis is
=chitest(array(_actual), array(_expected))
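≈ 73%. Loosely speaking, a fair die would deviate from the expected counts at least this much in about 73% of such series, so the result gives no reason to doubt that the die is evenly balanced.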
We could have obtained the same result using the expression
=chidist(chi2({8, 7, 13, 9, 12, 11}, {10, 10, 10, 10, 10, 10}), 5)
The number of degrees of freedom is 5 since there are 6 possible outcomes and picking one excludes all the others. Generally speaking, the number of degrees of freedom in a test with mutually exclusive outcomes is the number of possible outcomes minus one.
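The arithmetic of this example can be checked independently of the product. The following Python sketch assumes the usual χ² statistic, the sum of (actual - expected)² / expected, and uses the standard closed form of the upper-tail probability for 5 degrees of freedom; the variable names are illustrative.

```python
import math

actual = [8, 7, 13, 9, 12, 11]
expected = [10, 10, 10, 10, 10, 10]

# Chi-squared statistic: (4 + 9 + 9 + 1 + 4 + 1) / 10 = 2.8
chi2_value = sum((a - e) ** 2 / e for a, e in zip(actual, expected))

# Upper-tail probability for 5 degrees of freedom (closed form valid for df = 5 only):
# erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * (1 + x/3)
x = chi2_value
p = math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3)

print(round(chi2_value, 4))  # 2.8
print(round(p, 2))           # 0.73
```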