Logo

HYPGEOM_DIST

Updated 2023-11-03 14:17:48.713000

Syntax

SELECT [westclintech].[wct].[HYPGEOM_DIST](
  <@Sample_s, float,>
 ,<@Number_sample, float,>
 ,<@Population_s, float,>
 ,<@Number_population, float,>
 ,<@Cumulative, bit,>)

Description

Use the scalar function HYPGEOM_DIST to calculate the probability mass function or the cumulative distributive function of the hypergeometric distribution. The hypergeometric distribution is generally described using colored balls in an urn. Given an urn containing N balls of which K are white (and N-K are black), the hypergeometric distribution calculates the probability of drawing k white balls from a sample of n without replacement (meaning that once a ball is removed from the urn it is not put back).

Arguments

@Number_population

the size of the population. @Number_population must be of a type float or of a type that intrinsically converts to float.

@Cumulative

a bit value identifying whether the probability mass function ( 'False') or the cumulative distribution function ( 'True') is to be returned. @Cumulative is of a type bit or a type that implicitly converts to bit.

@Population_s

the number of successes in the population. @Population_s must be of a type float or of type that intrinsically converts to float.

@Sample_s

the number of successes in the sample. @Sample_s must be of a type float or of type that intrinsically converts to float.

@Number_sample

the size of the sample. @Number_sample must be of a type float or of type that intrinsically converts to float.

Return Type

float

Remarks

@Sample_s, @Number_sample, @Population_s, and @Number_population are truncated; only the integer part is used.

0 ≤ @Sample_s

0 ≤ @Population_s

0 < @Number_sample

0 < @Number_population

@Population_s ≤ @Number_population.

Examples

In an urn with 1000 balls of which 300 are white we want to calculate the probability of having exactly 40 white balls when we randomly draw 150 from the urn.

SELECT wct.HYPGEOM_DIST(   40,   --@Sample_s

                           150,  --@Number_sample

                           300,  --@Population_s

                           1000, --@Number_population

                           'False'

                       ) as pmf;

This produces the following result.

{"columns":[{"field":"pmf","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"pmf":"0.0492662218155091"}]}

If we wanted to calculate the probability of having up to 40 balls we would use the following SQL.

SELECT wct.HYPGEOM_DIST(   40,   --@Sample_s

                           150,  --@Number_sample

                           300,  --@Population_s

                           1000, --@Number_population

                           'True'

                       ) as cdf;

This produces the following result.

                    cdf 
 ---------------------- 
      0.192865931527232 

One way to look at the cumulative distribution function is as the sum of probability mass functions, as shown in this example.

SELECT *,

       SUM(hgeom) OVER (ORDER BY sample_s) as cumulative

FROM

(

    SELECT seq - 1 as sample_s,

           wct.HYPGEOM_DIST(SeriesValue, 150, 300, 1000, 'False') hgeom

    FROM wct.SeriesInt(0, 40, NULL, NULL, NULL)

) n;

This produces the following result.

{"columns":[{"field":"sample_s","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"hgeom","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"},{"field":"cumulative","headerClass":"ag-right-aligned-header","cellClass":"ag-right-aligned-cell"}],"rows":[{"sample_s":"0","hgeom":"2.46777221211099E-26","cumulative":"2.46777221211099E-26"},{"sample_s":"1","hgeom":"2.01542195181487E-24","cumulative":"2.04009967393598E-24"},{"sample_s":"2","hgeom":"8.13306733471617E-23","cumulative":"8.33707730210977E-23"},{"sample_s":"3","hgeom":"2.16215070361945E-21","cumulative":"2.24552147664055E-21"},{"sample_s":"4","hgeom":"4.25980765204366E-20","cumulative":"4.48435979970771E-20"},{"sample_s":"5","hgeom":"6.63394045011636E-19","cumulative":"7.08237643008713E-19"},{"sample_s":"6","hgeom":"8.50619912331109E-18","cumulative":"9.2144367663198E-18"},{"sample_s":"7","hgeom":"9.23617455975085E-17","cumulative":"1.01576182363828E-16"},{"sample_s":"8","hgeom":"8.66905192381243E-16","cumulative":"9.68481374745071E-16"},{"sample_s":"9","hgeom":"7.14477378192844E-15","cumulative":"8.11325515667351E-15"},{"sample_s":"10","hgeom":"5.2349502329707E-14","cumulative":"6.04627574863805E-14"},{"sample_s":"11","hgeom":"3.44415782626197E-13","cumulative":"4.04878540112577E-13"},{"sample_s":"12","hgeom":"2.05153119867592E-12","cumulative":"2.4564097387885E-12"},{"sample_s":"13","hgeom":"1.11403273616817E-11","cumulative":"1.35967371004702E-11"},{"sample_s":"14","hgeom":"5.54744847434262E-11","cumulative":"6.90712218438964E-11"},{"sample_s":"15","hgeom":"2.54600065909309E-10","cumulative":"3.23671287753205E-10"},{"sample_s":"16","hgeom":"1.08168479857148E-09","cumulative":"1.40535608632469E-09"},{"sample_s":"17","hgeom":"4.27062939043597E-09","cumulative":"5.67598547676066E-09"},{"sample_s":"18","hgeom":"1.57220480855479E-08","cumulative":"2.13980335623086E-08"},{"sample_s":"19","hgeom":"5.4133523072449E-08","cumulative":"7.55315566347576E-08"},{"sample_s":"20","hgeom":"1.74799045422862E-07","cumulative":"2.5033060205762E-07"},{"sample_s":"21","hgeom":"5.30621737419239E-07","cumulative":"7.80952339476859E-07"},{"sample_s":"22","hgeom":"1.51761021546781E-06","cumulative":"2.29856255494467E-06"},{"sample_s":"23","hgeom":"4.09762818933358E-06","cumulative":"6.39619074427825E-06"},{"sample_s":"24","hgeom":"1.04638837160682E-05","cumulative":"1.68600744603464E-05"},{"sample_s":"25","hgeom":"2.53142274859012E-05","cumulative":"4.21743019462477E-05"},{"sample_s":"26","hgeom":"5.81047389040933E-05","cumulative":"0.000100279040850341"},{"sample_s":"27","hgeom":"0.000126719725849272","cumulative":"0.000226998766699613"},{"sample_s":"28","hgeom":"0.000262921507309171","cumulative":"0.000489920274008785"},{"sample_s":"29","hgeom":"0.000519610940298606","cumulative":"0.00100953121430739"},{"sample_s":"30","hgeom":"0.000979227720881422","cumulative":"0.00198875893518881"},{"sample_s":"31","hgeom":"0.00176153340494886","cumulative":"0.00375029234013768"},{"sample_s":"32","hgeom":"0.00302773012381043","cumulative":"0.00677802246394811"},{"sample_s":"33","hgeom":"0.00497681466996138","cumulative":"0.0117548371339095"},{"sample_s":"34","hgeom":"0.00782991103318369","cumulative":"0.0195847481670932"},{"sample_s":"35","hgeom":"0.0117997428493288","cumulative":"0.031384491016422"},{"sample_s":"36","hgeom":"0.017045751011242","cumulative":"0.0484302420276639"},{"sample_s":"37","hgeom":"0.0236202828138613","cumulative":"0.0720505248415252"},{"sample_s":"38","hgeom":"0.0314165406796045","cumulative":"0.10346706552113"},{"sample_s":"39","hgeom":"0.0401326441906407","cumulative":"0.14359970971177"},{"sample_s":"40","hgeom":"0.0492662218155091","cumulative":"0.192865931527279"}]}

See Also

HYPGEOM_INV - Inverse of the hypergeometric distribution

NEGBINOM_DIST - Negative binomial distribution