CHITEST
Updated 2023-10-30 14:10:49.537000
Syntax
SELECT [westclintech].[wct].[CHITEST] (
<@Actual_range_TableName, nvarchar(4000),>
,<@ColumnNames, nvarchar(4000),>
,<@GroupedColumnName, nvarchar(4000),>
,<@GroupedColumnValue, sql_variant,>)
Description
Use the scalar function CHITEST to calculate the Pearson chi-square test for independence. CHITEST returns the value from the chi-square (χ2) distribution for the statistic and the appropriate degrees of freedom. Calculate the chi-square statistic (χ2) directly using the CHISQ or the CHISQ_q function.
The chi-square statistic is calculated by finding the difference between each observed and theoretical frequency for each possible outcome, squaring them, dividing each by the theoretical frequency, and taking the sum of the results. A second important part of determining the test statistic is to define the degrees of freedom of the test: this is essentially the number of squares errors involving the observed frequencies adjusted for the effect of using some of those observations to define the expected frequencies.
Given the test statistic and the degrees of freedom, the test value is returned by the regularized gamma function Q(a, x) where:
a is the degrees of freedom divided by 2x is χ2 statistic divided by 2
CHITEST automatically calculates the expected results and the degrees of freedom
The value of the test statistic is
\chi^2=\sum_{i=1}^r\sum_{j=1}^c\frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}
Where
{
"columns": [
{
"field": "column 1",
"maxWidth": 100
},
{
"field": "column 2",
"maxWidth": 225
}
],
"rows": [
{
"column 1": "r",
"column 2": "is the number of columns"
},
{
"column 1": "c",
"column 2": "is the number of columns"
},
{
"column 1": "O",
"column 2": "is the Observed result"
},
{
"column 1": "E",
"column 2": "is the Expected result"
} ]
}
Arguments
@GroupedColumnName
the name, as text, of the column in the table or view specified by @Actual_range_TableName which will be used for grouping the results.
@Actual_range_TableName
the name, as text, of the table or view that contains the actual, or observed, results to be used in the CHITEST calculation.
@ColumnNames
the names, as text, of the columns in the de-normalized table or view specified by @Actual_range_TableName that contains the values to be used in the CHITEST calculation.
@GroupedColumnValue
the column value to do the grouping on.
Return Type
float
Remarks
CHITEST is designed for de-normalized tables. For normalized tables, use the CHITESTN function.
CHITEST automatically calculates the expected values.
For queries that are more complex, consider using the CHITEST_q function.
CHITEST = CHIDIST(χ2, df), where df = (r-1)(c-1), r>1, c>1.
To calculate the test statistic, use the CHISQ function
CHITEST automatically calculates the expected values. If you want to enter the expected values use the CHITEST2 function.
No GROUP BY is required for this function even though it produces aggregated results.
Examples
In this hypothetical situation, we want to determine if there is an association between population density and the preference for a sport from among baseball, football, and basketball. We will use the CHITEST function to perform the chi-squared test.
CREATE TABLE #chi
(
[Sport] [varchar](20) NOT NULL,
[Rural] [float] NOT NULL,
[Suburban] [float] NOT NULL,
[Urban] [float] NOT NULL
);
INSERT INTO #CHI
VALUES
('Basketball', 28, 35, 54);
INSERT INTO #CHI
VALUES
('Baseball', 60, 43, 35);
INSERT INTO #CHI
VALUES
('Football', 52, 48, 28);
SELECT wct.CHITEST('#chi', 'Rural,Suburban,Urban', '', NULL);
This produces the following result
----------------------
0.000162912223138266