Logo

CHISQ2_q

Updated 2024-02-13 20:06:51.750000

Syntax

SELECT [westclintech].[wct].[CHISQ2_q] (
  <@Actual_range_RangeQuery, nvarchar(max),>
 ,<@Expected_range_RangeQuery, nvarchar(max),>)

Description

Use the scalar function CHISQ2_q to calculate the chi-square (χ2) statistic. This function calculates the chi-square statistic by finding the difference between each observed and theoretical frequency for each possible outcome, squaring them, dividing each by the theoretical frequency, and taking the sum of the results. A second important part of determining the test statistic is to define the degrees of freedom of the test: this is essentially the number of squares errors involving the observed frequencies adjusted for the effect of using some of those observations to define the expected frequencies.

CHISQ2_q requires the expected results as input to the function.

The value of the chi-square statistic is:

\chi^2=\sum_{i=1}^r\sum_{j=1}^c\frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}

Where

{
    "columns": [
        {
            "field": "column 1",
            "maxWidth": 100
        },
        {
            "field": "column 2",
            "maxWidth": 225
        }
    ],
    "rows": [
        {
            "column 1": "r",
            "column 2": "is the number of columns"
        },
        {
            "column 1": "c",
            "column 2": "is the number of columns"
        },
        {
            "column 1": "O",
            "column 2": "is the Observed result"
        },
        {
            "column 1": "E",
            "column 2": "is the Expected result"
        }   ]
}

Arguments

@Actual_range_RangeQuery

the select statement, as text, used to determine the observed, or actual, results to be used in the calculation.

@Expected_range_RangeQuery

the select statement, as text, used to determine the expected results to be used in the calculation.

Return Type

float

Remarks

Use CHISQ2 for de-normalized tables. Use the CHISQN2 function for normalized tables,.

CHISQ2 requires expected values as input. If you want the expected values calculated automatically, use the CHISQ function.

For queries that are more complex, consider using the CHISQ2_q function.

No GROUP BY is required for this function even though it produces aggregated results.

Examples

In this hypothetical situation, we want to determine if there is an association between population density and the preference for a sport from among baseball, football, and basketball. We will use the CHISQ2_q function to calculate the chi-squared statistic.

CREATE TABLE #O

(

    [Sport] [varchar](20) NOT NULL,

    [Rural] [float] NOT NULL,

    [Suburban] [float] NOT NULL,

    [Urban] [float] NOT NULL

);

INSERT INTO #O

VALUES

('Basketball', 28, 35, 54);

INSERT INTO #O

VALUES

('Baseball', 60, 43, 35);

INSERT INTO #O

VALUES

('Football', 52, 48, 28);

CREATE TABLE #E

(

    [Sport] [varchar](20) NOT NULL,

    [Rural] [float] NOT NULL,

    [Suburban] [float] NOT NULL,

    [Urban] [float] NOT NULL

);

INSERT INTO #E

VALUES

('Basketball', 42.77, 38.49, 35.74);

INSERT INTO #E

VALUES

('Baseball', 50.44, 45.4, 42.16);

INSERT INTO #E

VALUES

('Football', 46.79, 42.11, 39.1);

SELECT wct.CHISQ2_q('SELECT Rural, Suburban, Urban from #O', 'SELECT Rural, Suburban,

          Urban from #E');

This produces the following result

{"columns":[{"field":"column 1","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"column 1":"22.4562079299414"}]}

We could have also invoked the function by passing the data directly to it.

SELECT wct.CHISQ2_q('SELECT 28,35,54 UNION ALL

 SELECT 60,43,35 UNION ALL

 SELECT 52,48,28', 'SELECT 42.77,38.49,35.74 UNION ALL

 SELECT 50.44,45.4,42.16 UNION ALL

 SELECT 46.79,42.11,39.1');

This produces the following result

{"columns":[{"field":"column 1","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"column 1":"22.4562079299414"}]}