Logo

COVARIANCE_S

Updated 2023-10-23 19:33:31.550000

Syntax

SELECT [westclintech].[wct].[COVARIANCE_S] (
  <@known_y, float,>
 ,<@known_x, float,>)

Description

Use the aggregate function COVARIANCE_S to calculate the sample covariance for a set of x- and y-values. The formula for the sample covariance is

s_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n-1}

Arguments

@known_y

the y-values to be used in the calculation. @Known_y is an expression of type float or of a type that can be implicitly converted to float.

@known_x

the x-values to be used in the calculation. @Known_x is an expression of type float or of a type that can be implicitly converted to float.

Return Type

float

Remarks

COVARIANCE_S is an AGGREGATE function and follows the same conventions as all other AGGREGATE functions in SQL Server.

Use COVARIANCE_P to calculate the population covariance.

Examples

In this example, we calculate the sample covariance for a single set of x- and y-values

SELECT wct.COVARIANCE_S(y, x) as COVARIANCE_S

FROM

(

    SELECT 0.75,

           1

    UNION ALL

    SELECT 2.5,

           2

    UNION ALL

    SELECT 6.75,

           3

    UNION ALL

    SELECT 10,

           4

) n(x, y);

This produces the following result

{"columns":[{"field":"COVARIANCE_S","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"COVARIANCE_S":"5.33333333333333"}]}

In this example, we will populate some temporary table with some historical financial information and then calculate the sample covariance. First, create the table and put some data in it:

CREATE TABLE #c
(
    SYM NVARCHAR(5),
    YE BIGINT,
    REV FLOAT,
    GPROF FLOAT,
    OPINC FLOAT,
    NETINC FLOAT
);
INSERT INTO #c
VALUES
('YHOO', 2009, 6460.32, 3588.57, 386.69, 597.99);
INSERT INTO #c
VALUES
('YHOO', 2008, 72.5, 4185.14, 12.96, 418.92);
INSERT INTO #c
VALUES
('YHOO', 2007, 6969.27, 4130.52, 695.41, 639.16);
INSERT INTO #c
VALUES
('YHOO', 2006, 6425.68, 3749.96, 940.97, 751.39);
INSERT INTO #c
VALUES
('YHOO', 2005, 5257.67, 3161.47, 1107.73, 1896.23);
INSERT INTO #c
VALUES
('GOOG', 2009, 23650.56, 14806.45, 8312.19, 6520.45);
INSERT INTO #c
VALUES
('GOOG', 2008, 21795.55, 13174.04, 5537.21, 4226.86);
INSERT INTO #c
VALUES
('GOOG', 2007, 16593.99, 9944.9, 54.44, 4203.72);
INSERT INTO #c
VALUES
('GOOG', 2006, 10604.92, 6379.89, 3550, 3077.45);
INSERT INTO #c
VALUES
('GOOG', 2005, 6138.56, 3561.47, 2017.28, 1465.4);
INSERT INTO #c
VALUES
('MSFT', 2010, 62484, 509, 24167, 18760);
INSERT INTO #c
VALUES
('MSFT', 2009, 58437, 46282, 21225, 14569);
INSERT INTO #c
VALUES
('MSFT', 2008, 60420, 48822, 22271, 17681);
INSERT INTO #c
VALUES
('MSFT', 2007, 51122, 40429, 18438, 14065);
INSERT INTO #c
VALUES
('MSFT', 2006, 44282, 36632, 16064, 12599);
INSERT INTO #c
VALUES
('ORCL', 2010, 26820, 21056, 9062, 6135);
INSERT INTO #c
VALUES
('ORCL', 2009, 23252, 18458, 8321, 5593);
INSERT INTO #c
VALUES
('ORCL', 2008, 22430, 17449, 7844, 5521);
INSERT INTO #c
VALUES
('ORCL', 2007, 17996, 13805, 5974, 4274);
INSERT INTO #c
VALUES
('ORCL', 2006, 14380, 11145, 4736, 3381);
INSERT INTO #c
VALUES
('SAP', 2009, 10672, 6980, 2588, 1748);
INSERT INTO #c
VALUES
('SAP', 2008, 11575, 7370, 2701, 1847);
INSERT INTO #c
VALUES
('SAP', 2007, 10256, 6631, 2698, 1906);
INSERT INTO #c
VALUES
('SAP', 2006, 9393, 6064, 2578, 1871);
INSERT INTO #c
VALUES
('SAP', 2005, 8509, 5460, 2337, 1496);

Now, calculate the population covariance of the revenue (REV) againt the year (YE) for each company (SYM)

SELECT #c.SYM
,wct.COVARIANCE_S(REV,YE) as COVARIANCE_S
FROM #c
GROUP BY SYM;

This produces the following result.

SYM             COVARIANCE_S
----- ----------------------
GOOG              11553.6575
MSFT                10929.75
ORCL                    7534
SAP                     1627
YHOO                 -986.97

Let’s say we wanted to perform the same analysis as above, but we only want to return the results where the covariance is positive.

SELECT #c.SYM,

       wct.COVARIANCE_S(REV, YE) as COVARIANCE_S

FROM #c

GROUP BY SYM

HAVING wct.COVARIANCE_S(REV, YE) > 0;

This produces the following result.

SYM             COVARIANCE_S
----- ----------------------
GOOG              11553.6575
MSFT                10929.75
ORCL                    7534
SAP                     1627

In this example, we will calculate the correlation of the operating income (OPINC) against the revenue (REV)

SELECT #c.SYM,

       wct.COVARIANCE_S(OPINC, REV) as COVARIANCE_S

FROM #c

GROUP BY SYM;

This produces the following result.

{"columns":[{"field":"SYM"},{"field":"COVARIANCE_S","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"SYM":"GOOG","COVARIANCE_S":"15351903.708095"},{"SYM":"MSFT","COVARIANCE_S":"23798218"},{"SYM":"ORCL","COVARIANCE_S":"8568824.7"},{"SYM":"SAP","COVARIANCE_S":"147381"},{"SYM":"YHOO","COVARIANCE_S":"845158.97983"}]}