HYPGEOM_DIST
Updated 2023-11-03 14:17:48.713000
Syntax
SELECT [westclintech].[wct].[HYPGEOM_DIST](
<@Sample_s, float,>
,<@Number_sample, float,>
,<@Population_s, float,>
,<@Number_population, float,>
,<@Cumulative, bit,>)
Description
Use the scalar function HYPGEOM_DIST to calculate the probability mass function or the cumulative distributive function of the hypergeometric distribution. The hypergeometric distribution is generally described using colored balls in an urn. Given an urn containing N balls of which K are white (and N-K are black), the hypergeometric distribution calculates the probability of drawing k white balls from a sample of n without replacement (meaning that once a ball is removed from the urn it is not put back).
Arguments
@Number_population
the size of the population. @Number_population must be of a type float or of a type that intrinsically converts to float.
@Cumulative
a bit value identifying whether the probability mass function ( 'False') or the cumulative distribution function ( 'True') is to be returned. @Cumulative is of a type bit or a type that implicitly converts to bit.
@Population_s
the number of successes in the population. @Population_s must be of a type float or of type that intrinsically converts to float.
@Sample_s
the number of successes in the sample. @Sample_s must be of a type float or of type that intrinsically converts to float.
@Number_sample
the size of the sample. @Number_sample must be of a type float or of type that intrinsically converts to float.
Return Type
float
Remarks
@Sample_s, @Number_sample, @Population_s, and @Number_population are truncated; only the integer part is used.
0 ≤ @Sample_s
0 ≤ @Population_s
0 < @Number_sample
0 < @Number_population
@Population_s ≤ @Number_population.
Examples
In an urn with 1000 balls of which 300 are white we want to calculate the probability of having exactly 40 white balls when we randomly draw 150 from the urn.
SELECT wct.HYPGEOM_DIST( 40, --@Sample_s
150, --@Number_sample
300, --@Population_s
1000, --@Number_population
'False'
) as pmf;
This produces the following result.
{"columns":[{"field":"pmf","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"pmf":"0.0492662218155091"}]}
If we wanted to calculate the probability of having up to 40 balls we would use the following SQL.
SELECT wct.HYPGEOM_DIST( 40, --@Sample_s
150, --@Number_sample
300, --@Population_s
1000, --@Number_population
'True'
) as cdf;
This produces the following result.
cdf
----------------------
0.192865931527232
One way to look at the cumulative distribution function is as the sum of probability mass functions, as shown in this example.
SELECT *,
SUM(hgeom) OVER (ORDER BY sample_s) as cumulative
FROM
(
SELECT seq - 1 as sample_s,
wct.HYPGEOM_DIST(SeriesValue, 150, 300, 1000, 'False') hgeom
FROM wct.SeriesInt(0, 40, NULL, NULL, NULL)
) n;
This produces the following result.
{"columns":[{"field":"sample_s","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"hgeom","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"cumulative","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"sample_s":"0","hgeom":"2.46777221211099E-26","cumulative":"2.46777221211099E-26"},{"sample_s":"1","hgeom":"2.01542195181487E-24","cumulative":"2.04009967393598E-24"},{"sample_s":"2","hgeom":"8.13306733471617E-23","cumulative":"8.33707730210977E-23"},{"sample_s":"3","hgeom":"2.16215070361945E-21","cumulative":"2.24552147664055E-21"},{"sample_s":"4","hgeom":"4.25980765204366E-20","cumulative":"4.48435979970771E-20"},{"sample_s":"5","hgeom":"6.63394045011636E-19","cumulative":"7.08237643008713E-19"},{"sample_s":"6","hgeom":"8.50619912331109E-18","cumulative":"9.2144367663198E-18"},{"sample_s":"7","hgeom":"9.23617455975085E-17","cumulative":"1.01576182363828E-16"},{"sample_s":"8","hgeom":"8.66905192381243E-16","cumulative":"9.68481374745071E-16"},{"sample_s":"9","hgeom":"7.14477378192844E-15","cumulative":"8.11325515667351E-15"},{"sample_s":"10","hgeom":"5.2349502329707E-14","cumulative":"6.04627574863805E-14"},{"sample_s":"11","hgeom":"3.44415782626197E-13","cumulative":"4.04878540112577E-13"},{"sample_s":"12","hgeom":"2.05153119867592E-12","cumulative":"2.4564097387885E-12"},{"sample_s":"13","hgeom":"1.11403273616817E-11","cumulative":"1.35967371004702E-11"},{"sample_s":"14","hgeom":"5.54744847434262E-11","cumulative":"6.90712218438964E-11"},{"sample_s":"15","hgeom":"2.54600065909309E-10","cumulative":"3.23671287753205E-10"},{"sample_s":"16","hgeom":"1.08168479857148E-09","cumulative":"1.40535608632469E-09"},{"sample_s":"17","hgeom":"4.27062939043597E-09","cumulative":"5.67598547676066E-09"},{"sample_s":"18","hgeom":"1.57220480855479E-08","cumulative":"2.13980335623086E-08"},{"sample_s":"19","hgeom":"5.4133523072449E-08","cumulative":"7.55315566347576E-08"},{"sample_s":"20","hgeom":"1.74799045422862E-07","cumulative":"2.5033060205762E-07"},{"sample_s":"21","hgeom":"5.30621737419239E-07","cumulative":"7.80952339476859E-07"},{"sample_s":"22","hgeom":"1.51761021546781E-06","cumulative":"2.29856255494467E-06"},{"sample_s":"23","hgeom":"4.09762818933358E-06","cumulative":"6.39619074427825E-06"},{"sample_s":"24","hgeom":"1.04638837160682E-05","cumulative":"1.68600744603464E-05"},{"sample_s":"25","hgeom":"2.53142274859012E-05","cumulative":"4.21743019462477E-05"},{"sample_s":"26","hgeom":"5.81047389040933E-05","cumulative":"0.000100279040850341"},{"sample_s":"27","hgeom":"0.000126719725849272","cumulative":"0.000226998766699613"},{"sample_s":"28","hgeom":"0.000262921507309171","cumulative":"0.000489920274008785"},{"sample_s":"29","hgeom":"0.000519610940298606","cumulative":"0.00100953121430739"},{"sample_s":"30","hgeom":"0.000979227720881422","cumulative":"0.00198875893518881"},{"sample_s":"31","hgeom":"0.00176153340494886","cumulative":"0.00375029234013768"},{"sample_s":"32","hgeom":"0.00302773012381043","cumulative":"0.00677802246394811"},{"sample_s":"33","hgeom":"0.00497681466996138","cumulative":"0.0117548371339095"},{"sample_s":"34","hgeom":"0.00782991103318369","cumulative":"0.0195847481670932"},{"sample_s":"35","hgeom":"0.0117997428493288","cumulative":"0.031384491016422"},{"sample_s":"36","hgeom":"0.017045751011242","cumulative":"0.0484302420276639"},{"sample_s":"37","hgeom":"0.0236202828138613","cumulative":"0.0720505248415252"},{"sample_s":"38","hgeom":"0.0314165406796045","cumulative":"0.10346706552113"},{"sample_s":"39","hgeom":"0.0401326441906407","cumulative":"0.14359970971177"},{"sample_s":"40","hgeom":"0.0492662218155091","cumulative":"0.192865931527279"}]}