Weibull distribution

From Infogalactic: the planetary knowledge core

Weibull (2-Parameter)
Probability density function (plot)
Cumulative distribution function (plot)
Parameters \lambda\in (0, +\infty)\, scale
k\in (0, +\infty)\, shape
Support x \in [0, +\infty)\,
PDF f(x)=\begin{cases}
\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^{k}} & x\geq0\\
0 & x<0\end{cases}
CDF \begin{cases}1- e^{-(x/\lambda)^k} & x\geq0\\ 0 & x<0\end{cases}
Mean \lambda \, \Gamma(1+1/k)\,
Median \lambda(\ln(2))^{1/k}\,
Mode \begin{cases}
\lambda \left(\frac{k-1}{k} \right)^{\frac{1}{k}}\, &k>1\\
0 &k\leq1\end{cases}
Variance \lambda^2\left[\Gamma\left(1+\frac{2}{k}\right) - \left(\Gamma\left(1+\frac{1}{k}\right)\right)^2\right]\,
Skewness \frac{\Gamma(1+3/k)\lambda^3-3\mu\sigma^2-\mu^3}{\sigma^3}
Ex. kurtosis (see text)
Entropy \gamma(1-1/k)+\ln(\lambda/k)+1 \,
MGF \sum_{n=0}^\infty \frac{t^n\lambda^n}{n!}\Gamma(1+n/k), \ k\geq1
CF \sum_{n=0}^\infty \frac{(it)^n\lambda^n}{n!}\Gamma(1+n/k)

In probability theory and statistics, the Weibull distribution /ˈveɪbʊl/ is a continuous probability distribution. It is named after Swedish mathematician Waloddi Weibull, who described it in detail in 1951, although it was first identified by Fréchet (1927) and first applied by Rosin & Rammler (1933) to describe a particle size distribution.

Definition

The probability density function of a Weibull random variable is:[1]


f(x;\lambda,k) =
\begin{cases}
\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^{k}} & x\geq0 ,\\
0 & x<0,
\end{cases}

where k > 0 is the shape parameter and λ > 0 is the scale parameter of the distribution. Its complementary cumulative distribution function is a stretched exponential function. The Weibull distribution is related to a number of other probability distributions; in particular, it interpolates between the exponential distribution (k = 1) and the Rayleigh distribution (k = 2 and \lambda = \sqrt{2}\sigma [2]).
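
As a quick check of the two special cases, here is a worked substitution into the density above (σ denotes the Rayleigh scale parameter, as in the text). Setting k = 1 gives

f(x;\lambda,1) = \frac{1}{\lambda}\, e^{-x/\lambda}, \qquad x \geq 0,

which is the exponential density with rate 1/λ. Setting k = 2 and \lambda = \sqrt{2}\sigma gives

f(x;\sqrt{2}\sigma,2) = \frac{2}{\sqrt{2}\sigma}\cdot\frac{x}{\sqrt{2}\sigma}\, e^{-x^2/(2\sigma^2)} = \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)}, \qquad x \geq 0,

which is the Rayleigh density.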

If the quantity X is a "time-to-failure", the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter, k, is that power plus one, and so this parameter can be interpreted directly as follows:

  • A value of k < 1 indicates that the failure rate decreases over time. This happens if there is significant "infant mortality": defective items fail early, and the failure rate decreases over time as they are weeded out of the population.
  • A value of k = 1 indicates that the failure rate is constant over time. This might suggest random external events are causing mortality, or failure.
  • A value of k > 1 indicates that the failure rate increases with time. This happens if there is an "aging" process, or parts that are more likely to fail as time goes on.

In the field of materials science, the shape parameter k of a distribution of strengths is known as the Weibull modulus.

Properties

Density function

The form of the density function of the Weibull distribution changes drastically with the value of k. For 0 < k < 1, the density function tends to ∞ as x approaches zero from above and is strictly decreasing. For k = 1, the density function tends to 1/λ as x approaches zero from above and is strictly decreasing. For k > 1, the density function tends to zero as x approaches zero from above, increases until its mode and decreases after it. The slope of the density function at x = 0 is infinite and negative if 0 < k < 1, infinite and positive if 1 < k < 2, finite and positive if k = 2, and zero if k > 2. As k goes to infinity, the Weibull distribution converges to a Dirac delta distribution centered at x = λ. Moreover, the skewness and coefficient of variation depend only on the shape parameter.
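
The location of the mode for k > 1 can be recovered from the density by setting the derivative of its logarithm to zero. The following is a short derivation sketch, valid for x > 0 and k > 1:

\begin{align}
\ln f(x;\lambda,k) &= \ln\frac{k}{\lambda} + (k-1)\ln\frac{x}{\lambda} - \left(\frac{x}{\lambda}\right)^k\\
\frac{d}{dx}\ln f(x;\lambda,k) &= \frac{k-1}{x} - \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} = 0\\
\left(\frac{x}{\lambda}\right)^k &= \frac{k-1}{k} \quad\Longrightarrow\quad x = \lambda\left(\frac{k-1}{k}\right)^{1/k}
\end{align}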

Distribution function

The cumulative distribution function for the Weibull distribution is

F(x;k,\lambda) = 1- e^{-(x/\lambda)^k}\,

for x ≥ 0, and F(x; k, λ) = 0 for x < 0.

The quantile (inverse cumulative distribution) function for the Weibull distribution is

Q(p;k,\lambda) = \lambda {(-\ln(1-p))}^{1/k}

for 0 ≤ p < 1.

The failure rate h (or hazard function) is given by

 h(x;k,\lambda) = {k \over \lambda} \left({x \over \lambda}\right)^{k-1}.
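
The density, distribution, quantile and hazard functions above translate almost line for line into code. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of any particular package.

```python
import math

def weibull_pdf(x, lam, k):
    """Density f(x; lam, k); zero for x < 0."""
    if x < 0:
        return 0.0
    if x == 0:
        # Limit as x -> 0+: infinite for k < 1, 1/lam for k = 1, zero for k > 1.
        if k < 1:
            return math.inf
        return 1.0 / lam if k == 1 else 0.0
    z = x / lam
    return (k / lam) * z ** (k - 1) * math.exp(-(z ** k))

def weibull_cdf(x, lam, k):
    """Distribution function F(x) = 1 - exp(-(x/lam)^k) for x >= 0."""
    if x < 0:
        return 0.0
    return -math.expm1(-((x / lam) ** k))

def weibull_quantile(p, lam, k):
    """Inverse of the distribution function for 0 <= p < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    if p == 0:
        return 0.0
    return lam * (-math.log1p(-p)) ** (1.0 / k)

def weibull_hazard(x, lam, k):
    """Failure rate h(x) = f(x) / (1 - F(x)), evaluated for x > 0."""
    return (k / lam) * (x / lam) ** (k - 1)
```

For example, weibull_quantile(0.5, lam, k) returns the median λ(ln 2)^{1/k}, and with k = 1 the hazard reduces to the constant 1/λ.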

Moments

The moment generating function of the logarithm of a Weibull distributed random variable is given by[3]

E\left[e^{t\log X}\right] = \lambda^t\Gamma\left(\frac{t}{k}+1\right)

where Γ is the gamma function. Similarly, the characteristic function of log X is given by

E\left[e^{it\log X}\right] = \lambda^{it}\Gamma\left(\frac{it}{k}+1\right).

In particular, the nth raw moment of X is given by

m_n = \lambda^n \Gamma\left(1+\frac{n}{k}\right).
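
This expression can be checked directly from the density with the substitution u = (x/λ)^k. A short derivation sketch:

\begin{align}
m_n &= \int_0^\infty x^n\, \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^k}\,dx\\
&= \lambda^n \int_0^\infty u^{n/k} e^{-u}\,du \qquad \left(u = (x/\lambda)^k,\quad du = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}dx\right)\\
&= \lambda^n\, \Gamma\left(1+\frac{n}{k}\right)
\end{align}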

The mean and variance of a Weibull random variable can be expressed as

\mathrm{E}(X) = \lambda \Gamma\left(1+\frac{1}{k}\right)\,

and

\textrm{var}(X) = \lambda^2\left[\Gamma\left(1+\frac{2}{k}\right) - \left(\Gamma\left(1+\frac{1}{k}\right)\right)^2\right]\,.

The skewness is given by

\gamma_1=\frac{\Gamma\left(1+\frac{3}{k}\right)\lambda^3-3\mu\sigma^2-\mu^3}{\sigma^3}

where the mean is denoted by μ and the standard deviation is denoted by σ.

The excess kurtosis is given by

\gamma_2=\frac{-6\Gamma_1^4+12\Gamma_1^2\Gamma_2-3\Gamma_2^2
-4\Gamma_1\Gamma_3+\Gamma_4}{[\Gamma_2-\Gamma_1^2]^2}

where \Gamma_i=\Gamma(1+i/k). The kurtosis excess may also be written as:

\gamma_{2}=\frac{\lambda^4\Gamma(1+\frac{4}{k})-4\gamma_{1}\sigma^3\mu-6\mu^2\sigma^2-\mu^4}{\sigma^4}-3
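
The moment formulas above can be evaluated numerically with the gamma function. The sketch below uses Python's math.gamma; the function name weibull_moments is illustrative, and the excess kurtosis uses the first of the two expressions given above.

```python
import math

def weibull_moments(lam, k):
    """Return (mean, variance, skewness, excess kurtosis) of Weibull(lam, k)."""
    # g[i] = Gamma(1 + i/k), matching the notation Gamma_i in the text.
    g = [math.gamma(1.0 + i / k) for i in range(5)]
    mean = lam * g[1]
    var = lam ** 2 * (g[2] - g[1] ** 2)
    sd = math.sqrt(var)
    skew = (lam ** 3 * g[3] - 3.0 * mean * var - mean ** 3) / sd ** 3
    exkurt = (
        -6.0 * g[1] ** 4 + 12.0 * g[1] ** 2 * g[2] - 3.0 * g[2] ** 2
        - 4.0 * g[1] * g[3] + g[4]
    ) / (g[2] - g[1] ** 2) ** 2
    return mean, var, skew, exkurt
```

With k = 1 and λ = 1 this reproduces the known values for the standard exponential distribution: mean 1, variance 1, skewness 2 and excess kurtosis 6. Changing λ rescales the mean and variance but leaves the skewness and excess kurtosis unchanged.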

Moment generating function

A variety of expressions are available for the moment generating function of X itself. As a power series, since the raw moments are already known, one has

E\left[e^{tX}\right] = \sum_{n=0}^\infty \frac{t^n\lambda^n}{n!}\Gamma\left(1+\frac{n}{k}\right).

Alternatively, one can attempt to deal directly with the integral

E\left[e^{tX}\right] = \int_0^\infty e^{tx} \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^k}\,dx.

If the parameter k is assumed to be a rational number, expressed as k = p/q where p and q are integers, then this integral can be evaluated analytically.[4] With t replaced by −t, one finds

 E\left[e^{-tX}\right] = \frac1{ \lambda^k\, t^k} \, \frac{ p^k \, \sqrt{q/p}} {(\sqrt{2 \pi})^{q+p-2}} \, G_{p,q}^{\,q,p} \!\left( \left. \begin{matrix} \frac{1-k}{p}, \frac{2-k}{p}, \dots, \frac{p-k}{p} \\ \frac{0}{q}, \frac{1}{q}, \dots, \frac{q-1}{q} \end{matrix} \; \right| \, \frac {p^p} {\left( q \, \lambda^k \, t^k \right)^q} \right)

where G is the Meijer G-function.

The characteristic function has also been obtained by Muraleedharan et al. (2007). The characteristic function and moment generating function of 3-parameter Weibull distribution have also been derived by Muraleedharan & Soares (2014) by a direct approach.

Shannon entropy

The entropy is given by


H(\lambda,k) = \gamma\left(1\!-\!\frac{1}{k}\right) + \ln\left(\frac{\lambda}{k}\right) + 1

where \gamma is the Euler–Mascheroni constant.
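
As a small numerical illustration of the entropy formula, the sketch below evaluates it in Python. The constant and function names are illustrative; the result is the differential entropy in nats.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def weibull_entropy(lam, k):
    """Entropy H(lam, k) = gamma*(1 - 1/k) + ln(lam/k) + 1, in nats."""
    return EULER_GAMMA * (1.0 - 1.0 / k) + math.log(lam / k) + 1.0
```

For k = 1 the first term vanishes and the result is 1 + ln λ, the entropy of an exponential distribution with mean λ.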

Parameter estimation

Maximum likelihood

The maximum likelihood estimator for the \lambda parameter given k is

\hat \lambda^k = \frac{1}{n} \sum_{i=1}^n x_i^k

The maximum likelihood estimator for k is the solution of


  \hat k^{-1} = \frac{\sum_{i=1}^n x_i^k \ln x_i }
                       {\sum_{i=1}^n x_i^k }
                  - \frac{1}{n} \sum_{i=1}^n \ln x_i

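Because k appears on both sides of this equation (through the sums on the right-hand side), it defines the estimate only implicitly, and one must generally solve for k by numerical means.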
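
One simple way to solve the implicit equation for k is bisection, followed by the closed-form expression for λ. The sketch below is one possible approach, not a reference implementation. It assumes complete (uncensored) data with strictly positive observations that are not all equal, and the function name weibull_mle is illustrative.

```python
import math

def weibull_mle(xs, tol=1e-10, max_iter=200):
    """Return (lam_hat, k_hat) for strictly positive observations xs."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def weights(k):
        # Proportional to x_i^k; dividing by max(x)^k avoids overflow.
        return [math.exp(k * (l - max_log)) for l in logs]

    def score(k):
        # Right-hand side minus left-hand side of the equation for 1/k.
        # This function is increasing in k, so bisection applies.
        w = weights(k)
        weighted = sum(wi * l for wi, l in zip(w, logs)) / sum(w)
        return weighted - mean_log - 1.0 / k

    lo, hi = 1e-6, 1.0
    while score(hi) < 0:          # expand until the root is bracketed
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("no root found; are all observations equal?")
    for _ in range(max_iter):     # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    k = 0.5 * (lo + hi)
    # lam^k = mean(x_i^k), evaluated in log space for stability.
    lam = math.exp(max_log + math.log(sum(weights(k)) / n) / k)
    return lam, k
```

Bisection is slower than Newton's method but needs no derivative and cannot overshoot, which makes it a reasonable default for a sketch like this.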

When x_1 > x_2 > ... > x_N are the N largest observed samples from a dataset of more than N samples, then the maximum likelihood estimator for the \lambda parameter given k is[5]

\hat \lambda^k = \frac{1}{N} \sum_{i=1}^N (x_i^k - x_N^k)

Under the same condition, the maximum likelihood estimator for k is the solution of


  \hat k^{-1} = \frac{\sum_{i=1}^N (x_i^k \ln x_i -  x_N^k \ln x_N)}
                       {\sum_{i=1}^N (x_i^k - x_N^k)}
                  - \frac{1}{N} \sum_{i=1}^N \ln x_i

Again, because this equation is implicit in k, one must generally solve for k by numerical means.

Weibull plot

The fit of data to a Weibull distribution can be visually assessed using a Weibull plot.[6] The Weibull plot is a plot of the empirical cumulative distribution function \hat F(x) of data on special axes in a type of Q-Q plot. The axes are \ln(-\ln(1-\hat F(x))) versus \ln(x). The reason for this change of variables is that the cumulative distribution function can be linearized:

\begin{align}
F(x) &= 1-e^{-(x/\lambda)^k}\\
-\ln(1-F(x)) &= (x/\lambda)^k\\
\underbrace{\ln(-\ln(1-F(x)))}_{\textrm{'y'}} &= \underbrace{k\ln x}_{\textrm{'mx'}} - \underbrace{k\ln \lambda}_{\textrm{'c'}}
\end{align}

which can be seen to be in the standard form of a straight line. Therefore if the data came from a Weibull distribution then a straight line is expected on a Weibull plot.

There are various approaches to obtaining the empirical distribution function from data: one method is to obtain the vertical coordinate for each point using \hat F = \frac{i-0.3}{n+0.4} where i is the rank of the data point and n is the number of data points.[7]

Linear regression can also be used to numerically assess goodness of fit and estimate the parameters of the Weibull distribution. The slope of the fitted line estimates the shape parameter k directly, and the scale parameter \lambda follows from the intercept c = -k\ln\lambda, that is, \lambda = e^{-c/k}.
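
The regression on Weibull-plot axes can be sketched in a few lines. The example below uses the plotting positions (i − 0.3)/(n + 0.4) mentioned above and ordinary least squares. The function name weibull_plot_fit is illustrative, and the data are assumed to be strictly positive, with at least two distinct values.

```python
import math

def weibull_plot_fit(data):
    """Estimate (lam, k) by least squares on Weibull-plot axes."""
    xs = sorted(data)
    n = len(xs)
    # Horizontal axis: ln(x). Vertical axis: ln(-ln(1 - F_hat)).
    u = [math.log(x) for x in xs]
    v = [
        math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
        for i in range(1, n + 1)
    ]
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxy = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    sxx = sum((a - u_bar) ** 2 for a in u)
    slope = sxy / sxx
    intercept = v_bar - slope * u_bar
    k = slope                            # gradient of the line
    lam = math.exp(-intercept / slope)   # from intercept = -k * ln(lam)
    return lam, k
```

Estimates obtained this way generally differ from the maximum likelihood estimates, since the regression weights all plotted points equally.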

Applications

The Weibull distribution is used:[citation needed]

  • In describing the size of particles generated by grinding, milling and crushing operations, where the 2-parameter Weibull distribution is sometimes known as the Rosin-Rammler distribution.[citation needed] In this context it predicts fewer fine particles than the log-normal distribution and it is generally most accurate for narrow particle size distributions.[citation needed] The interpretation of the cumulative distribution function is that F(x; k, λ) is the mass fraction of particles with diameter smaller than x, where λ is a characteristic particle size (the size below which a mass fraction 1 − e^{−1} ≈ 0.632 of the particles lies) and k is a measure of the spread of particle sizes.

Figure: Fitted cumulative Weibull distribution to maximum one-day rainfalls using CumFreq; see also distribution fitting.

Related distributions

  • A 3-parameter Weibull distribution adds a location parameter θ. Its probability density function is
f(x;k,\lambda, \theta)={k \over \lambda} \left({x - \theta \over \lambda}\right)^{k-1} e^{-({x-\theta \over \lambda})^k}\,

for x \geq \theta and f(x; k, λ, θ) = 0 for x < θ, where k > 0 is the shape parameter, \lambda > 0 is the scale parameter and \theta is the location parameter of the distribution. When θ = 0, this reduces to the 2-parameter distribution.

  • The Weibull distribution can be characterized as the distribution of a random variable W such that the random variable
X = \left(\frac{W}{\lambda}\right)^k

has the standard exponential distribution with intensity 1.[3]

  • This implies that the Weibull distribution can also be characterized in terms of a uniform distribution: if U is uniformly distributed on (0,1), then the random variable W = \lambda(-\ln(U))^{1/k}\, is Weibull distributed with parameters k and λ. (Note that -\ln(U) here is equivalent to X just above.) This leads to an easily implemented numerical scheme for simulating a Weibull distribution.
  • The Weibull distribution interpolates between the exponential distribution with intensity 1/λ when k = 1 and a Rayleigh distribution of mode \sigma = \lambda/\sqrt{2} when k = 2.
  • The closely related Fréchet distribution has the probability density function
f_{\rm{Frechet}}(x;k,\lambda)=\frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{-1-k} e^{-(x/\lambda)^{-k}} = -f_{\rm{Weibull}}(x;-k,\lambda).
  • The distribution of a random variable that is defined as the minimum of several random variables, each having a different Weibull distribution, is a poly-Weibull distribution.
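  • In the particle-size setting to which Rosin & Rammler applied it, the cumulative distribution can be written in terms of its 80th percentile as
f(x;P_{\rm{80}},m) =  \begin{cases}
1-e^{\ln\left(0.2\right)\left(\frac{x}{P_{\rm{80}}}\right)^m} & x\geq0 ,\\
0 & x<0 ,\end{cases}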

where

x: Particle size
P_{\rm{80}}: 80th percentile of the particle size distribution
m: Parameter describing the spread of the distribution
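
The characterization in terms of a uniform random variable given earlier in this section, W = λ(−ln U)^{1/k}, leads to a very short sampler. The sketch below uses Python's standard random module; the function name sample_weibull and the rng argument are illustrative.

```python
import math
import random

def sample_weibull(lam, k, rng=random):
    """Draw one Weibull(lam, k) variate by inverse-transform sampling."""
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm is always finite.
    u = 1.0 - rng.random()
    return lam * (-math.log(u)) ** (1.0 / k)
```

Passing a seeded random.Random instance as rng makes the draws reproducible. Python's standard library also provides random.weibullvariate(alpha, beta), where alpha is the scale and beta is the shape.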

References

  1. [reference missing]
  2. http://www.mathworks.com.au/help/stats/rayleigh-distribution.html
  3. Johnson, Kotz & Balakrishnan 1994
  4. See (Cheng, Tellambura & Beaulieu 2004) for the case when k is an integer, and (Sagias & Karagiannidis 2005) for the rational case.
  5. [reference missing]
  6. The Weibull plot
  7. Wayne Nelson (2004) Applied Life Data Analysis. Wiley-Blackwell ISBN 0-471-64462-5
  8. Survival/Failure Time Analysis
  9. Wind Speed Distribution Weibull
