KSTEST_q

Updated 2023-10-26 14:11:00.240000

Syntax

SELECT [westclintech].[wct].[KSTEST_q](
  <@XValues_RangeQuery, nvarchar(max),>
 ,<@Statistic, nvarchar(4000),>)

Description

Use the scalar function KSTEST_q to compare a sample with a reference probability distribution. The Kolmogorov-Smirnov statistic qualifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference sample.

Arguments

@Statistic

identifies the return value as either the Kolmogorov-Smirnov statistic (k) or the p-value (p).

@XValues_RangeQuery

a string which contains the SQL to select the sample values and the cumulative distiribution function. The resultant table must contain two columns containing the following values.

the sample values. @x is an expression of type float or of a type that can be implicitly converted to float.

@cdf

the values of the cumulative distribution function. @cdf is an expression of type float or of a type that can be implicitly converted to float.

Return Type

float

Remarks

If @x is null then @x and the corresponding @cdf are not included in the calculation.

If @cdf is null then @cdf and the corresponding @x are not included in the calculation.

To evaluate two samples, use KSTEST2 or KSTEST2_q.

Examples

In this example, we compare the sample (-2,-1, 0, 1, 2, 3, 4) to the standard normal distribution.

/*Put data into a temp table*/

SELECT *

INTO #n

FROM

(

    VALUES

        (-2),

        (-1),

        (0),

        (1),

        (2),

        (3),

        (4)

) n (x);

/*Calcuate the statistic and the p-value*/

SELECT wct.KSTEST_q('SELECT x,wct.NORMSDIST(x) FROM #n', 'k') as k_observed,

       wct.KSTEST_q('SELECT x,wct.NORMSDIST(x) FROM #n', 'p') as p_value;

/*Clean up the temp table*/

DROP TABLE #n;

This produces the following result.

{"columns":[{"field":"k_observed","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"p_value","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"k_observed":"0.412773317497114","p_value":"0.135890127950776"}]}

In this example, we compare the sample x, consisting of 1,000 randomly generated numbers in the standard normal distribution, to the standard normal distribution.

/*Put data into a temp table*/

SELECT k.seriesvalue as x

INTO #n

FROM wctMath.wct.SeriesFloat(0, 1, NULL, 1000, 'N') k;

/*Calcuate the statistic and the p-value*/

SELECT wct.KSTEST_q('SELECT x,wct.NORMSDIST(x) FROM #n', 'k') as k_observed,

       wct.KSTEST_q('SELECT x,wct.NORMSDIST(x) FROM #n', 'p') as p_value;

/*Clean up the temp table*/

DROP TABLE #n;

This produces the following result. Your results will be different.

{"columns":[{"field":"k_observed","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"p_value","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"k_observed":"0.0207079400233719","p_value":"0.776329498203758"}]}

In this example, we compare the uniformly random numbers to the gamma distribution with shape 1 and scale 1.

/*Put data into a temp table*/

SELECT k.seriesvalue as x

INTO #n

FROM wctMath.wct.SeriesFloat(0, 1, NULL, 1000, 'R') k;

/*Calcuate the statistic and the p-value*/

SELECT wct.KSTEST_q('SELECT x,wct.GAMMADIST(x,1,1,1) FROM #n', 'k') as k_observed,

       wct.KSTEST_q('SELECT x,wct.GAMMADIST(x,1,1,1) FROM #n', 'p') as p_value;

/*Clean up the temp table*/

DROP TABLE #n;

This produces the following result. Your results will be different.

{"columns":[{"field":"k_observed","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"p_value","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"k_observed":"0.367902930881325","p_value":"6.82167636320844E-122"}]}