COVARIANCE_S
Updated 2023-10-23 19:33:31.550000
Syntax
SELECT [westclintech].[wct].[COVARIANCE_S] (
<@known_y, float,>
,<@known_x, float,>)
Description
Use the aggregate function COVARIANCE_S to calculate the sample covariance for a set of x- and y-values. The formula for the sample covariance is
s_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n-1}
Arguments
@known_y
the y-values to be used in the calculation. @Known_y is an expression of type float or of a type that can be implicitly converted to float.
@known_x
the x-values to be used in the calculation. @Known_x is an expression of type float or of a type that can be implicitly converted to float.
Return Type
float
Remarks
COVARIANCE_S is an AGGREGATE function and follows the same conventions as all other AGGREGATE functions in SQL Server.
Use COVARIANCE_P to calculate the population covariance.
Examples
In this example, we calculate the sample covariance for a single set of x- and y-values
SELECT wct.COVARIANCE_S(y, x) as COVARIANCE_S
FROM
(
SELECT 0.75,
1
UNION ALL
SELECT 2.5,
2
UNION ALL
SELECT 6.75,
3
UNION ALL
SELECT 10,
4
) n(x, y);
This produces the following result
{"columns":[{"field":"COVARIANCE_S","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"COVARIANCE_S":"5.33333333333333"}]}
In this example, we will populate some temporary table with some historical financial information and then calculate the sample covariance. First, create the table and put some data in it:
CREATE TABLE #c
(
SYM NVARCHAR(5),
YE BIGINT,
REV FLOAT,
GPROF FLOAT,
OPINC FLOAT,
NETINC FLOAT
);
INSERT INTO #c
VALUES
('YHOO', 2009, 6460.32, 3588.57, 386.69, 597.99);
INSERT INTO #c
VALUES
('YHOO', 2008, 72.5, 4185.14, 12.96, 418.92);
INSERT INTO #c
VALUES
('YHOO', 2007, 6969.27, 4130.52, 695.41, 639.16);
INSERT INTO #c
VALUES
('YHOO', 2006, 6425.68, 3749.96, 940.97, 751.39);
INSERT INTO #c
VALUES
('YHOO', 2005, 5257.67, 3161.47, 1107.73, 1896.23);
INSERT INTO #c
VALUES
('GOOG', 2009, 23650.56, 14806.45, 8312.19, 6520.45);
INSERT INTO #c
VALUES
('GOOG', 2008, 21795.55, 13174.04, 5537.21, 4226.86);
INSERT INTO #c
VALUES
('GOOG', 2007, 16593.99, 9944.9, 54.44, 4203.72);
INSERT INTO #c
VALUES
('GOOG', 2006, 10604.92, 6379.89, 3550, 3077.45);
INSERT INTO #c
VALUES
('GOOG', 2005, 6138.56, 3561.47, 2017.28, 1465.4);
INSERT INTO #c
VALUES
('MSFT', 2010, 62484, 509, 24167, 18760);
INSERT INTO #c
VALUES
('MSFT', 2009, 58437, 46282, 21225, 14569);
INSERT INTO #c
VALUES
('MSFT', 2008, 60420, 48822, 22271, 17681);
INSERT INTO #c
VALUES
('MSFT', 2007, 51122, 40429, 18438, 14065);
INSERT INTO #c
VALUES
('MSFT', 2006, 44282, 36632, 16064, 12599);
INSERT INTO #c
VALUES
('ORCL', 2010, 26820, 21056, 9062, 6135);
INSERT INTO #c
VALUES
('ORCL', 2009, 23252, 18458, 8321, 5593);
INSERT INTO #c
VALUES
('ORCL', 2008, 22430, 17449, 7844, 5521);
INSERT INTO #c
VALUES
('ORCL', 2007, 17996, 13805, 5974, 4274);
INSERT INTO #c
VALUES
('ORCL', 2006, 14380, 11145, 4736, 3381);
INSERT INTO #c
VALUES
('SAP', 2009, 10672, 6980, 2588, 1748);
INSERT INTO #c
VALUES
('SAP', 2008, 11575, 7370, 2701, 1847);
INSERT INTO #c
VALUES
('SAP', 2007, 10256, 6631, 2698, 1906);
INSERT INTO #c
VALUES
('SAP', 2006, 9393, 6064, 2578, 1871);
INSERT INTO #c
VALUES
('SAP', 2005, 8509, 5460, 2337, 1496);
Now, calculate the population covariance of the revenue (REV) againt the year (YE) for each company (SYM)
SELECT #c.SYM
,wct.COVARIANCE_S(REV,YE) as COVARIANCE_S
FROM #c
GROUP BY SYM;
This produces the following result.
SYM COVARIANCE_S
----- ----------------------
GOOG 11553.6575
MSFT 10929.75
ORCL 7534
SAP 1627
YHOO -986.97
Let’s say we wanted to perform the same analysis as above, but we only want to return the results where the covariance is positive.
SELECT #c.SYM,
wct.COVARIANCE_S(REV, YE) as COVARIANCE_S
FROM #c
GROUP BY SYM
HAVING wct.COVARIANCE_S(REV, YE) > 0;
This produces the following result.
SYM COVARIANCE_S
----- ----------------------
GOOG 11553.6575
MSFT 10929.75
ORCL 7534
SAP 1627
In this example, we will calculate the correlation of the operating income (OPINC) against the revenue (REV)
SELECT #c.SYM,
wct.COVARIANCE_S(OPINC, REV) as COVARIANCE_S
FROM #c
GROUP BY SYM;
This produces the following result.
{"columns":[{"field":"SYM"},{"field":"COVARIANCE_S","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"SYM":"GOOG","COVARIANCE_S":"15351903.708095"},{"SYM":"MSFT","COVARIANCE_S":"23798218"},{"SYM":"ORCL","COVARIANCE_S":"8568824.7"},{"SYM":"SAP","COVARIANCE_S":"147381"},{"SYM":"YHOO","COVARIANCE_S":"845158.97983"}]}