Choosing the Right Probability Distribution

A structured decision framework for selecting probability distributions based on data characteristics, domain knowledge, and modeling goals.

Start With Variable Type and Support

The first split is discrete vs continuous. Then consider the support: is it bounded (like [0, 1] for proportions), non-negative (like lifetimes), or the full real line (like measurement errors)?

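Support constraints immediately rule out many candidates. A normal distribution assigns positive probability to values outside [0, 1], so it fits a proportion poorly unless nearly all of its mass sits well away from both bounds; the beta distribution is the usual bounded alternative. Likewise, a count variable should not be given a continuous model without justification, such as counts large enough that a continuous approximation is adequate.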
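To make the support argument concrete, here is a minimal sketch that measures how much probability an untruncated normal places outside [0, 1]. The mean and standard deviation are hypothetical values chosen for illustration, not taken from any dataset.

```python
from statistics import NormalDist

# Hypothetical proportion data summarized by mean 0.10 and standard deviation 0.10,
# modeled here with an untruncated normal purely to show the problem.
model = NormalDist(mu=0.10, sigma=0.10)

# Probability mass the normal places outside the valid support [0, 1].
mass_below_zero = model.cdf(0.0)
mass_above_one = 1.0 - model.cdf(1.0)
mass_outside = mass_below_zero + mass_above_one

print(f"P(X < 0) = {mass_below_zero:.4f}")
print(f"P(X > 1) = {mass_above_one:.2e}")
print(f"Total mass outside [0, 1] = {mass_outside:.4f}")
```

With these assumed values, roughly 16% of the fitted normal lies below zero, which is impossible for a proportion. The same check applied to a mean of 0.5 with a small standard deviation would show negligible leakage, which is when a normal approximation becomes defensible.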

Symmetric vs Skewed

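If your data are roughly symmetric and unbounded, the normal distribution is the natural starting point. For heavy-tailed symmetric data, consider the Student t distribution or its one-degree-of-freedom special case, the Cauchy, whose tails are so heavy that its mean and variance are undefined.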

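Right-skewed positive data suggest exponential, gamma, Weibull, or lognormal. Left-skewed data are rarer but can sometimes be modeled by reflecting a right-skewed distribution, that is, by fitting the right-skewed model to c - x for a constant c at or above the largest possible value.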
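One way to put numbers on "roughly symmetric" and "heavy-tailed" is to compute moment-based sample skewness and excess kurtosis. The sketch below uses simple population-moment formulas and two small hypothetical samples; the function names are illustrative, and the estimates are noisy on samples this small.

```python
from statistics import fmean

def central_moment(data, order):
    """Average of (x - mean)**order over the sample."""
    mean = fmean(data)
    return fmean([(x - mean) ** order for x in data])

def sample_skewness(data):
    """Moment-based skewness m3 / m2**1.5: near 0 if symmetric, > 0 if right-skewed."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis m4 / m2**2 - 3: near 0 for normal-like tails."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

# Hypothetical samples for illustration only.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [1.0, 1.0, 2.0, 2.0, 3.0, 10.0]

for name, sample in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    print(name, round(sample_skewness(sample), 3), round(excess_kurtosis(sample), 3))
```

A clearly positive skewness on positive data points toward the exponential, gamma, Weibull, or lognormal family, while near-zero skewness with large positive excess kurtosis points toward the Student t.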

Count Data Decision Tree

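The number of successes in a fixed number of independent binary trials with a common success probability follows the binomial. Counts of rare events with no fixed upper bound suggest Poisson, which forces the variance to equal the mean. If the sample variance clearly exceeds the mean (overdispersion), the negative binomial is a better choice.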
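The overdispersion check can be sketched as a variance-to-mean ratio. The counts below are hypothetical, and the 1.5 cutoff is an arbitrary illustrative threshold rather than a standard rule; in practice the ratio should be read alongside the sample size.

```python
from statistics import fmean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson-like data."""
    return variance(counts) / fmean(counts)

def suggest_count_model(counts, threshold=1.5):
    """Suggest a starting model; the threshold is an assumed, illustrative cutoff."""
    if dispersion_index(counts) > threshold:
        return "negative binomial"
    return "Poisson"

# Hypothetical event counts per period.
steady_counts = [1, 3, 2, 4, 0, 2, 3, 1, 5, 2, 2, 3]
bursty_counts = [0, 0, 1, 0, 12, 0, 3, 0, 9, 0]

for name, counts in (("steady", steady_counts), ("bursty", bursty_counts)):
    print(name, round(dispersion_index(counts), 2), suggest_count_model(counts))
```

The bursty sample has a variance several times its mean, so a Poisson fit would understate how often extreme counts occur.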

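The geometric distribution models the number of trials until the first success and is the special case of the negative binomial with a single required success (r = 1). Some references instead count the failures before the first success, which shifts the support from 1, 2, ... to 0, 1, ..., so check which convention your software uses.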
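The special-case relationship can be checked numerically. This sketch uses the "number of trials" convention, where the negative binomial gives the probability that the r-th success lands on trial n; the function names and the value p = 0.3 are illustrative choices.

```python
from math import comb, isclose

def negative_binomial_pmf(n, r, p):
    """P(the r-th success occurs on trial n), defined for n >= r."""
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

def geometric_pmf(n, p):
    """P(the first success occurs on trial n), defined for n >= 1."""
    return (1 - p) ** (n - 1) * p

p = 0.3  # assumed success probability for the demonstration

# With r = 1 the two probability mass functions agree at every trial count.
matches = all(
    isclose(negative_binomial_pmf(n, 1, p), geometric_pmf(n, p))
    for n in range(1, 20)
)
print("geometric equals negative binomial with r = 1:", matches)
```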

Lifetime and Waiting Time Models

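For constant failure rate, use the exponential. If the failure rate changes over time, the Weibull is the most common first choice: a shape parameter below 1 gives a decreasing failure rate, a shape above 1 gives an increasing one, and a shape of exactly 1 recovers the exponential. The gamma generalizes the exponential for aggregate waiting times, since the sum of k independent exponential waits with the same rate follows a gamma distribution with shape k.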
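The link between the Weibull shape parameter and the failure rate can be seen by evaluating the hazard function h(t) = (k / s) * (t / s)^(k - 1) for shape k and scale s. The sketch below uses a scale of 1 and three illustrative shape values.

```python
def weibull_hazard(t, shape, scale=1.0):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

times = [0.5, 1.0, 2.0]

# shape < 1: decreasing hazard (early failures dominate)
# shape = 1: constant hazard (the exponential case)
# shape > 1: increasing hazard (wear-out)
for shape in (0.5, 1.0, 2.0):
    rates = [round(weibull_hazard(t, shape), 3) for t in times]
    print(f"shape={shape}: hazard at t={times} is {rates}")
```

If an empirical hazard estimate is neither monotone nor constant, for example a bathtub curve, a single Weibull is unlikely to fit well and a mixture or piecewise model may be needed.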

The lognormal is appropriate when failure results from multiplicative degradation, and the Pareto models heavy-tailed phenomena like income and file sizes.
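The multiplicative argument for the lognormal can be illustrated with a small simulation: when a quantity is the product of many independent positive factors, its logarithm is a sum and so tends toward a normal shape. Everything below is an assumed toy model (50 steps, factors drawn uniformly from 0.9 to 1.1, a fixed seed), not a description of any real degradation process.

```python
import math
import random
from statistics import fmean

def skewness(values):
    """Moment-based sample skewness m3 / m2**1.5."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    m3 = fmean([(v - mean) ** 3 for v in values])
    return m3 / m2 ** 1.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def degraded_strength(steps=50):
    """Multiply a starting strength of 1 by a random factor near 1 at each step."""
    return math.prod(rng.uniform(0.9, 1.1) for _ in range(steps))

strengths = [degraded_strength() for _ in range(5000)]
log_strengths = [math.log(s) for s in strengths]

skew_raw = skewness(strengths)      # expected to be clearly positive
skew_log = skewness(log_strengths)  # expected to be close to zero

print(f"skewness of strengths:     {skew_raw:.2f}")
print(f"skewness of log-strengths: {skew_log:.2f}")
```

Right-skewed raw values whose logarithms look symmetric are the practical signature that a lognormal is worth trying.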

Validating Your Choice

After selecting a candidate, fit the parameters and check the fit visually with histograms overlaid on the theoretical density. QQ plots reveal tail deviations that density overlays can miss.
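A QQ plot is a scatter of sorted observations against the quantiles the fitted model predicts for them. The sketch below computes those pairs for a normal candidate using only the standard library; the data are hypothetical, and the plotting positions (i - 0.5) / n are one common convention among several.

```python
from statistics import NormalDist, fmean, stdev

def normal_qq_pairs(data):
    """Return (theoretical quantile, observed value) pairs for a normal QQ plot."""
    ordered = sorted(data)
    n = len(ordered)
    # Fit the normal by matching the sample mean and standard deviation.
    fitted = NormalDist(mu=fmean(ordered), sigma=stdev(ordered))
    return [
        (fitted.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]

# Hypothetical measurements with one unusually large value.
measurements = [4.1, 4.9, 5.0, 5.2, 5.3, 5.8, 6.0, 9.5]

pairs = normal_qq_pairs(measurements)
for theoretical, observed in pairs:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

Points close to the line y = x indicate agreement. Here the largest observation sits well above its theoretical quantile, the pattern a heavier right tail than the normal would produce.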

Formal tests like Kolmogorov-Smirnov or Anderson-Darling quantify the discrepancy but can be overly sensitive with large samples. Prioritize practical fit for your use case over p-values.
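The Kolmogorov-Smirnov statistic itself is simple: it is the largest vertical gap between the empirical CDF and the candidate CDF. The following sketch computes only the statistic, not a p-value. Note that standard KS critical values assume the candidate's parameters were not estimated from the same data, so a p-value computed after fitting would need an adjusted procedure. The function name and data are illustrative.

```python
from statistics import NormalDist, fmean, stdev

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF of data and the candidate cdf."""
    ordered = sorted(data)
    n = len(ordered)
    largest_gap = 0.0
    for i, x in enumerate(ordered, start=1):
        model_value = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        largest_gap = max(largest_gap, i / n - model_value, model_value - (i - 1) / n)
    return largest_gap

# Hypothetical measurements, compared against a normal fitted by mean and sd.
measurements = [4.1, 4.9, 5.0, 5.2, 5.3, 5.8, 6.0, 9.5]
fitted = NormalDist(mu=fmean(measurements), sigma=stdev(measurements))

d_stat = ks_statistic(measurements, fitted.cdf)
print(f"KS statistic D = {d_stat:.3f}")
```

Because D measures an absolute gap in probability, it can also be read directly as a practical quantity: it bounds how far any cumulative probability from the model can be from the observed proportion.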

Related Distributions

Normal Distribution

Binomial Distribution

Poisson Distribution

Exponential Distribution

Gamma Distribution

Weibull Distribution

Beta Distribution
