| Title: | Estimators and Algorithms for Heavy-Tailed Distributions |
|---|---|
| Description: | Implements the estimators and algorithms described in Chapters 8 and 9 of the book "The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation" by Nair et al. (2022, ISBN:9781009053730). These include the Hill estimator, Moments estimator, Pickands estimator, Peaks-over-Threshold (POT) method, Power-law fit, and the Double Bootstrap algorithm. |
| Authors: | Farid Rohan [aut, cre] |
| Maintainer: | Farid Rohan <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-26 06:23:33 UTC |
| Source: | https://github.com/0diraf/heavytails |
This function implements the Double Bootstrap algorithm as described in Chapter 9 of Nair et al. It applies bootstrapping to two samples of different sizes to choose the value of $k$ that minimizes the mean squared error.
doublebootstrap(data, n1 = -1, n2 = -1, r = 50, k_max_prop = 0.5, kvalues = 20, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| n1 | A numeric scalar specifying the first bootstrap sample size. Nair et al. describe this as $n_1$. |
| n2 | A numeric scalar specifying the second bootstrap sample size. |
| r | A numeric scalar specifying the number of bootstraps. |
| k_max_prop | A numeric scalar. The maximum k as a proportion of the sample size. Considering all possible k values from the data can be computationally expensive. Furthermore, lower k values can be noisy, while higher values can be biased. Hence, k is limited to a specific proportion (by default 50%) of the data. |
| kvalues | An integer specifying the length of the sequence of candidate k values. |
| na.rm | Logical. If |
Chapter 9 of Nair et al. specifically describes the Double Bootstrap algorithm for the Hill estimator.
The Hill Double Bootstrap method uses the Hill estimator as the first estimator and a second, moments-based estimator, and works with the difference between the two.
The Hill bootstrap method selects $k$ in a way that minimizes the mean squared error of this difference by going through bootstrap samples of two different sizes, $n_1$ and $n_2$.
The minimization is first carried out with bootstrap samples of size $n_1$ to obtain $k_1$, and the process is then repeated to determine $k_2$ with bootstrap samples of size $n_2$. The final $k$ is computed from $k_1$ and $k_2$.
A named list containing the final results of the Double Bootstrap algorithm:
k: The optimal number of top-order statistics selected by minimizing the MSE.
alpha: The estimated tail index (Hill estimator) corresponding to the selected $k$.
Danielsson, J., de Haan, L., Peng, L., & de Vries, C. G. (2001). Using a bootstrap method to choose the sample fraction in tail index estimation. Journal of Multivariate Analysis, 76(2), 226–248. doi:10.1006/jmva.2000.1903
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 229-233) doi:10.1017/9781009053730
# Pareto(alpha, xmin) sample via the inverse CDF
xmin <- 1
alpha <- 2
r <- runif(800, 0, 1)
x <- xmin * r^(-1 / alpha)
db_kalpha <- doublebootstrap(data = x, n1 = -1, n2 = -1, r = 5, k_max_prop = 0.5, kvalues = 20)
Computes the probability density function of the Pareto($\alpha$, $x_m$) distribution:
$$f(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha + 1}}, \quad x \ge x_m.$$
dpareto(x, alpha, xm)
| x | A numeric vector of quantiles. |
|---|---|
| alpha | A positive numeric scalar: tail index. |
| xm | A positive numeric scalar: scale parameter (lower bound). |
A numeric vector of density values (zero for $x < x_m$).
dpareto(x = c(1, 2, 5), alpha = 2, xm = 1)
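The density and its relation to the Pareto CDF can be checked with a few lines of base R. This is a minimal sketch, not the package's implementation: `pareto_pdf_sketch` and `pareto_cdf_sketch` are hypothetical helper names introduced only for illustration.

```r
# Hypothetical helpers (illustration only, not the package's dpareto()).
pareto_pdf_sketch <- function(x, alpha, xm) {
  # alpha * xm^alpha / x^(alpha + 1) on x >= xm, zero below xm
  ifelse(x >= xm, alpha * xm^alpha / x^(alpha + 1), 0)
}

pareto_cdf_sketch <- function(x, alpha, xm) {
  # 1 - (xm / x)^alpha on x >= xm, zero below xm
  ifelse(x >= xm, 1 - (xm / x)^alpha, 0)
}

# Integrating the density over [xm, 5] should reproduce the CDF at 5.
area <- integrate(pareto_pdf_sketch, lower = 1, upper = 5, alpha = 2, xm = 1)$value
c(area = area, cdf = pareto_cdf_sketch(5, alpha = 2, xm = 1))
```

Both numbers should agree up to the tolerance of the numerical integration.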
Computes the Hill estimator of the tail index ($\alpha$) of the input data.
hill_estimator(data, k, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| k | An integer specifying the number of top order statistics to use (the size of the tail). Must be strictly less than the sample size. |
| na.rm | Logical. If |
The estimator is computed from the order statistics $X_{(1)} \ge X_{(2)} \ge \dots \ge X_{(n)}$ of the data (descending order).
A single numeric scalar: the Hill estimate of the tail index $\alpha$.
Hill, B. M. (1975). A Simple General Approach to Inference About the Tail of a Distribution. The Annals of Statistics, 3(5), 1163–1174. http://www.jstor.org/stable/2958370
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 203-205) doi:10.1017/9781009053730
# Pareto(alpha, xmin) sample via the inverse CDF
xmin <- 1
alpha <- 2
r <- runif(800, 0, 1)
x <- xmin * r^(-1 / alpha)
hill <- hill_estimator(data = x, k = 5)
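For readers who want to see the computation spelled out, here is a minimal base-R sketch of a Hill estimate. It assumes the common convention in which the top $k$ order statistics are compared against $X_{(k+1)}$; the package may use a slightly different indexing, and `hill_sketch` is a hypothetical name.

```r
# Hypothetical helper (illustration only, not the package's hill_estimator()).
hill_sketch <- function(x, k) {
  stopifnot(is.numeric(x), k >= 1, k < length(x))
  xs <- sort(x, decreasing = TRUE)          # X_(1) >= X_(2) >= ... >= X_(n)
  # Assumed convention: X_(k+1) acts as the threshold for the top k values.
  1 / mean(log(xs[1:k] / xs[k + 1]))
}

set.seed(42)
x <- runif(5000)^(-1 / 2)                   # Pareto(alpha = 2, xm = 1) sample
hill_sketch(x, k = 500)                     # should land near the true alpha = 2
```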
Plots the Hill estimator $\hat{\alpha}$ of the tail index as a function of the number of top order statistics $k$. A stable plateau in this plot is used to visually select a suitable value of $k$.
hill_plot(data, k_range = NULL, alpha_true = NULL, na.rm = FALSE, ...)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| k_range | An integer vector specifying which values of |
| alpha_true | Optional numeric scalar. If supplied, a horizontal reference line at the true |
| na.rm | Logical. If |
| ... | Additional arguments passed to |
A data.frame with columns k and alpha_hat,
returned invisibly. Users who prefer ggplot2 can capture this output
and re-plot.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(800, alpha = 2, xm = 1)
result <- hill_plot(x)
Tests whether a Pareto($\alpha$, $x_m$) distribution is a good fit for the data by computing a bootstrap p-value for the Kolmogorov-Smirnov (KS) statistic (Step 2 of the Clauset et al. pipeline, §8.5).
ks_gof(data, alpha, xm, n_boot = 1000, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| alpha | A positive numeric scalar: the Pareto tail index. Typically obtained from |
| xm | A positive numeric scalar: the lower bound. Only |
| n_boot | A positive integer: number of bootstrap replicates. Default 1000. |
| na.rm | Logical. If |
The p-value is the fraction of bootstrap KS statistics that exceed the observed KS statistic. A large p-value (e.g., > 0.1) means the Pareto hypothesis cannot be rejected.
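The bootstrap p-value described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper names are hypothetical, and the sketch refits the tail index on every synthetic sample (as in the Clauset et al. procedure) while keeping $x_m$ fixed.

```r
# Hypothetical helpers (illustration only, not the package's ks_gof()).
ks_pareto_sketch <- function(x, alpha, xm) {
  x <- sort(x)
  n <- length(x)
  cdf <- 1 - (xm / x)^alpha                  # fitted Pareto CDF at the data points
  # Largest gap between the ECDF (just before / at each point) and the model CDF
  max(pmax(seq_len(n) / n - cdf, cdf - (seq_len(n) - 1) / n))
}

boot_pvalue_sketch <- function(x, alpha, xm, n_boot = 200) {
  x <- x[x >= xm]
  n <- length(x)
  ks_obs <- ks_pareto_sketch(x, alpha, xm)
  ks_boot <- replicate(n_boot, {
    xb <- xm * runif(n)^(-1 / alpha)         # synthetic Pareto(alpha, xm) sample
    alpha_b <- n / sum(log(xb / xm))         # refit the tail index (assumption)
    ks_pareto_sketch(xb, alpha_b, xm)
  })
  mean(ks_boot > ks_obs)                     # fraction exceeding the observed KS
}

set.seed(1)
x <- runif(500)^(-1 / 2)                     # data that really are Pareto(2, 1)
boot_pvalue_sketch(x, alpha = length(x) / sum(log(x)), xm = 1)
```

For data that are clearly not Pareto (for example, uniform on [1, 2]), the same function should return a p-value close to zero.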
A named list with elements:
ks_statistic: Observed KS distance.
p_value: Bootstrap p-value.
n_boot: Number of bootstrap replicates used.
n: Number of observations used (those $\ge x_m$).
Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661-703. doi:10.1137/070710111
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 194-196) doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(n = 500, alpha = 2, xm = 1)
fit <- mle_pareto(x)
ks_gof(x, alpha = fit$alpha, xm = fit$xm, n_boot = 100)
Estimates the lower bound $x_m$ of a power-law regime by finding the order statistic that minimizes the Kolmogorov-Smirnov distance between the empirical distribution and the fitted Pareto (Step 1 of the Clauset et al. pipeline, §8.5).
ks_xmin(data, kmax = -1, kmin = 2, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| kmax | Maximum number of top order statistics to consider. If |
| kmin | Minimum number of top order statistics. Default 2. |
| na.rm | Logical. If |
This function extracts and exposes the core loop from plfit, allowing $x_m$ estimation as a standalone step, which is useful as input to mle_pareto, wls_pareto, or ks_gof.
A named list with elements:
xm: Estimated lower bound $\hat{x}_m$.
ks_distance: Minimum KS distance achieved.
k_hat: The optimal $k$.
Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661-703. doi:10.1137/070710111
set.seed(1)
x <- rpareto(n = 500, alpha = 2, xm = 1)
ks_xmin(x)
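The scan over candidate lower bounds can be sketched in base R as below. This is a rough illustration, not the package's loop: `ks_scan_sketch` is a hypothetical name, the candidate lower bound for the top $k$ values is assumed to be $X_{(k+1)}$, and the tail index for each candidate is fitted by maximum likelihood on those $k$ values.

```r
# Hypothetical helper (illustration only, not the package's ks_xmin()).
ks_scan_sketch <- function(x, kmin = 2, kmax = length(x) - 1) {
  xs <- sort(x, decreasing = TRUE)
  best <- list(xm = NA_real_, alpha = NA_real_, ks_distance = Inf, k_hat = NA_integer_)
  for (k in kmin:kmax) {
    xm <- xs[k + 1]                          # candidate lower bound (assumed convention)
    tail_x <- sort(xs[1:k])                  # top k values, ascending
    alpha <- k / sum(log(tail_x / xm))       # tail index fitted on the top k values
    cdf <- 1 - (xm / tail_x)^alpha
    d <- max(pmax(seq_len(k) / k - cdf, cdf - (seq_len(k) - 1) / k))
    if (d < best$ks_distance) {
      best <- list(xm = xm, alpha = alpha, ks_distance = d, k_hat = k)
    }
  }
  best
}

set.seed(1)
x <- runif(500)^(-1 / 2)                     # Pareto(alpha = 2, xm = 1) sample
str(ks_scan_sketch(x))
```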
Compares the Pareto distribution fit against one or more alternative distributions using the Vuong likelihood ratio test for non-nested models (§8.5, Step 3; Clauset et al. 2009, §3.3).
lr_test_pareto(data, xm = NULL, alternatives = c("exponential", "lognormal", "weibull"), na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| xm | A positive numeric scalar: lower bound. Only |
| alternatives | A character vector naming the distributions to compare against. Supported: |
| na.rm | Logical. If |
For each alternative, the log-likelihood ratio
$$R = \sum_{i=1}^{n} \left[ \log f_{\text{Pareto}}(x_i) - \log f_{\text{alt}}(x_i) \right]$$
is computed.
The Vuong test statistic checks whether the mean per-observation log-likelihood ratio is significantly different from zero. A positive $R$ with a small p-value indicates the Pareto is preferred; a negative $R$ with a small p-value indicates the alternative is preferred.
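A minimal base-R sketch of this comparison, for the exponential alternative only, is shown below. It is an illustration under assumptions: `vuong_pareto_vs_exp_sketch` is a hypothetical name, the exponential is taken to be shifted to start at the lower bound, and the standard deviation uses R's `sd()` (divisor $n - 1$), which may differ slightly from the package's normalization.

```r
# Hypothetical helper (illustration only, not the package's lr_test_pareto()).
vuong_pareto_vs_exp_sketch <- function(x, xm) {
  x <- x[x >= xm]
  n <- length(x)
  alpha <- n / sum(log(x / xm))               # Pareto MLE
  rate <- 1 / mean(x - xm)                    # MLE of an exponential shifted to xm (assumption)
  ll_par <- log(alpha) + alpha * log(xm) - (alpha + 1) * log(x)
  ll_exp <- log(rate) - rate * (x - xm)
  d <- ll_par - ll_exp                        # per-observation log-likelihood ratios
  z <- sum(d) / (sd(d) * sqrt(n))             # normalized statistic
  list(lr_statistic = z, p_value = 2 * pnorm(-abs(z)))
}

set.seed(1)
x_exp <- 1 + rexp(2000)                       # data that are truly (shifted) exponential
vuong_pareto_vs_exp_sketch(x_exp, xm = 1)     # negative statistic: alternative preferred
```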
A data.frame with one row per alternative and columns:
alternative: Name of the alternative distribution.
ll_pareto: Pareto log-likelihood.
ll_alternative: Alternative log-likelihood.
lr_statistic: Vuong test statistic (z-score).
p_value: Two-sided p-value.
preferred: "pareto" or the alternative name.
Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661-703. doi:10.1137/070710111
Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57(2), 307-333.
set.seed(1)
x <- rpareto(n = 500, alpha = 2, xm = 1)
lr_test_pareto(x, xm = 1)
Estimates the tail index $\alpha$ of a Pareto($\alpha$, $x_m$) distribution via maximum likelihood (Theorem 8.1 of Nair et al.).
mle_pareto(data, xm = NULL, bias_corrected = TRUE, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| xm | Optional positive numeric scalar. Lower bound of the Pareto support. If |
| bias_corrected | Logical. If |
| na.rm | Logical. If |
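The MLE is:
$$\hat{\alpha} = \frac{n}{\sum_{i=1}^{n} \log(x_i / x_m)}$$
Unlike the Hill estimator (which uses only the top $k$ order statistics), this estimator uses all observations and assumes the entire sample follows a Pareto distribution with known lower bound $x_m$.
A finite-sample bias-corrected version (§8.3) uses $n - 1$ in the numerator:
$$\hat{\alpha}_{\text{bc}} = \frac{n - 1}{\sum_{i=1}^{n} \log(x_i / x_m)}$$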
A named list with elements:
alpha: Estimated tail index.
xm: The lower bound used.
n: Number of observations used (those $\ge x_m$).
bias_corrected: Logical indicating whether bias correction
was applied.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 162-167) doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(n = 1000, alpha = 2, xm = 1)
mle_pareto(x)
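The maximum likelihood computation is short enough to write out by hand. The sketch below is illustrative only: `pareto_mle_sketch` is a hypothetical name, the lower bound is treated as known, and the bias correction is assumed to replace $n$ by $n - 1$ in the numerator (the standard unbiased form for a known lower bound); the package's exact correction may differ.

```r
# Hypothetical helper (illustration only, not the package's mle_pareto()).
pareto_mle_sketch <- function(x, xm, bias_corrected = TRUE) {
  x <- x[x >= xm]                             # keep observations in the Pareto support
  n <- length(x)
  s <- sum(log(x / xm))
  # Assumed correction: n - 1 instead of n in the numerator.
  if (bias_corrected) (n - 1) / s else n / s
}

x_small <- exp(1:3)                           # log(x / 1) = 1, 2, 3, so the sum is 6
pareto_mle_sketch(x_small, xm = 1, bias_corrected = FALSE)   # 3 / 6
pareto_mle_sketch(x_small, xm = 1, bias_corrected = TRUE)    # 2 / 6
```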
Moments estimator to calculate the shape parameter $\xi$ for the input data.
moments_estimator(data, k, na.rm = FALSE, eps = 1e-12)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| k | An integer specifying the number of top order statistics to use (the size of the tail). Must be strictly less than the sample size. |
| na.rm | Logical. If |
| eps | Numeric. Small constant added to the denominator to avoid division by zero. Default 1e-12. |
A single numeric scalar: the Moments estimate of the shape parameter $\xi$.
Dekkers, A. L. M., Einmahl, J. H. J., & De Haan, L. (1989). A Moment Estimator for the Index of an Extreme-Value Distribution. The Annals of Statistics, 17(4), 1833–1855. http://www.jstor.org/stable/2241667
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 216-219) doi:10.1017/9781009053730
# Pareto(alpha, xmin) sample via the inverse CDF
xmin <- 1
alpha <- 2
r <- runif(800, 0, 1)
x <- xmin * r^(-1 / alpha)
moments <- moments_estimator(data = x, k = 5)
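As a rough illustration of the moment estimator of Dekkers, Einmahl and de Haan (1989), the base-R sketch below computes the first two empirical moments of the log-excesses over $X_{(k+1)}$ and combines them. `moments_sketch` is a hypothetical name, the $X_{(k+1)}$ indexing is an assumption, and the `eps` safeguard of the package function is omitted here.

```r
# Hypothetical helper (illustration only, not the package's moments_estimator()).
moments_sketch <- function(x, k) {
  stopifnot(is.numeric(x), k >= 1, k < length(x))
  xs <- sort(x, decreasing = TRUE)
  l <- log(xs[1:k]) - log(xs[k + 1])          # log-excesses over X_(k+1) (assumed convention)
  m1 <- mean(l)                               # first moment (the Hill statistic)
  m2 <- mean(l^2)                             # second moment
  m1 + 1 - 0.5 / (1 - m1^2 / m2)
}

set.seed(42)
x <- runif(5000)^(-1 / 2)                     # Pareto(alpha = 2): true xi = 1 / alpha = 0.5
moments_sketch(x, k = 500)
```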
Plots the Moments estimator $\hat{\xi}$ of the shape parameter as a function of the number of top order statistics $k$. A stable plateau indicates a suitable choice of $k$.
moments_plot(data, k_range = NULL, xi_true = NULL, na.rm = FALSE, ...)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| k_range | An integer vector specifying which values of |
| xi_true | Optional numeric scalar. If supplied, a horizontal reference line at the true |
| na.rm | Logical. If |
| ... | Additional arguments passed to |
A data.frame with columns k and xi_hat,
returned invisibly.
Dekkers, A. L. M., Einmahl, J. H. J., & De Haan, L. (1989). A Moment Estimator for the Index of an Extreme-Value Distribution. The Annals of Statistics, 17(4), 1833–1855.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(800, alpha = 2, xm = 1)
moments_plot(x)
Computes the cumulative distribution function of the Pareto($\alpha$, $x_{\min}$) distribution:
$$F(x) = 1 - \left(\frac{x_{\min}}{x}\right)^{\alpha}, \quad x \ge x_{\min}.$$
pareto_cdf(x, xmin, alpha)
| x | A numeric vector of quantiles. |
|---|---|
| xmin | A positive numeric scalar: scale parameter (lower bound). |
| alpha | A positive numeric scalar: tail index. |
A numeric vector of CDF values in $[0, 1]$.
pareto_cdf(x = c(1, 2, 5), xmin = 1, alpha = 2)
Pickands estimator to calculate the shape parameter $\xi$ for the input data.
pickands_estimator(data, k, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| k | An integer specifying the number of top order statistics to use (the size of the tail). Must be strictly less than the sample size. |
| na.rm | Logical. If |
A single numeric scalar: the Pickands estimate of the shape parameter $\xi$.
Pickands, J. (1975). Statistical Inference Using Extreme Order Statistics. The Annals of Statistics, 3(1), 119–131. http://www.jstor.org/stable/2958083
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 219-221) doi:10.1017/9781009053730
# Pareto(alpha, xmin) sample via the inverse CDF
xmin <- 1
alpha <- 2
r <- runif(800, 0, 1)
x <- xmin * r^(-1 / alpha)
pickands <- pickands_estimator(data = x, k = 5)
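The classical Pickands (1975) estimator compares the spacings between the $k$-th, $2k$-th and $4k$-th largest observations. The following base-R sketch illustrates that formula; `pickands_sketch` is a hypothetical name and the package's own indexing and input checks may differ.

```r
# Hypothetical helper (illustration only, not the package's pickands_estimator()).
pickands_sketch <- function(x, k) {
  stopifnot(is.numeric(x), k >= 1, 4 * k <= length(x))
  xs <- sort(x, decreasing = TRUE)            # X_(1) >= X_(2) >= ... >= X_(n)
  # Ratio of spacings between the k-th, 2k-th and 4k-th largest values
  log((xs[k] - xs[2 * k]) / (xs[2 * k] - xs[4 * k])) / log(2)
}

# Sorted descending: 8, 4, 3, 2, so (8 - 4) / (4 - 2) = 2 and the estimate is 1.
pickands_sketch(c(2, 3, 4, 8), k = 1)
```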
Plots the Pickands estimator $\hat{\xi}$ of the shape parameter as a function of the number of top order statistics $k$. A stable plateau indicates a suitable choice of $k$.
pickands_plot(data, k_range = NULL, xi_true = NULL, na.rm = FALSE, ...)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| k_range | An integer vector specifying which values of |
| xi_true | Optional numeric scalar. If supplied, a horizontal reference line at the true |
| na.rm | Logical. If |
| ... | Additional arguments passed to |
The Pickands estimator uses order statistics up to index $4k$, so the default k_range upper bound is floor(n/4) - 1.
A data.frame with columns k and xi_hat,
returned invisibly.
Pickands, J. (1975). Statistical Inference Using Extreme Order Statistics. The Annals of Statistics, 3(1), 119–131.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(800, alpha = 2, xm = 1)
pickands_plot(x)
This function implements the PLFIT algorithm as described by Clauset et al. to determine the value of $k$. It minimizes the Kolmogorov-Smirnov (KS) distance between the empirical cumulative distribution function and the fitted power law.
plfit(data, kmax = -1, kmin = 2, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| kmax | Maximum number of top-order statistics. If kmax = -1, then kmax = n - 1, where n is the length of the dataset. |
| kmin | Minimum number of top-order statistics to start with. |
| na.rm | Logical. If |
The KS criterion described by Nair et al. is implemented in this function with the empirical CDF instead of the empirical survival function, which is mathematically equivalent because the two are complements of each other.
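The equivalence between the two formulations can be checked numerically. The short base-R sketch below (illustrative variable names, true parameters plugged in rather than fitted) computes the largest gap once with CDFs and once with survival functions.

```r
set.seed(7)
x <- sort(runif(200)^(-1 / 2))                # Pareto(alpha = 2, xm = 1) sample, ascending
n <- length(x)

ecdf_vals <- seq_len(n) / n                   # empirical CDF at each sorted point
surv_vals <- 1 - ecdf_vals                    # empirical survival function
model_cdf <- 1 - x^(-2)                       # Pareto CDF with alpha = 2, xm = 1
model_surv <- x^(-2)                          # Pareto survival function

d_cdf <- max(abs(ecdf_vals - model_cdf))
d_surv <- max(abs(surv_vals - model_surv))
c(d_cdf = d_cdf, d_surv = d_surv)             # identical up to rounding
```

Because each survival function is one minus the corresponding CDF, the pointwise differences only change sign, so their maximum absolute value is unchanged.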
A named list containing the results of the PLFIT algorithm:
k_hat: The optimal number of top-order statistics $\hat{k}$.
alpha_hat: The estimated power-law exponent corresponding to $\hat{k}$.
xmin_hat: The minimum value above which the power law is fitted.
ks_distance: The minimum Kolmogorov-Smirnov distance found.
Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661-703. doi:10.1137/070710111
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 227-229) doi:10.1017/9781009053730
# Pareto(alpha, xmin) sample via the inverse CDF
xmin <- 1
alpha <- 2
r <- runif(800, 0, 1)
x <- xmin * r^(-1 / alpha)
plfit_values <- plfit(data = x, kmax = -1, kmin = 2)
This function chooses the $\xi$ and $\beta$ that minimize the negative log likelihood of the Generalized Pareto Distribution (GPD).
pot_estimator(data, u, start_xi = 0.1, start_beta = NULL, na.rm = FALSE)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| u | A numeric scalar that specifies the threshold value used to calculate excesses. |
| start_xi | Initial value of $\xi$ |
| start_beta | Initial value of $\beta$ |
| na.rm | Logical. If |
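The PDF of an excess data point $y = x - u$ is given by:
$$f(y; \xi, \beta) = \frac{1}{\beta}\left(1 + \frac{\xi y}{\beta}\right)^{-(1/\xi + 1)}$$
If we apply $\log$ to the above equation we get:
$$\log f(y; \xi, \beta) = -\log\beta - \left(\frac{1}{\xi} + 1\right)\log\left(1 + \frac{\xi y}{\beta}\right)$$
For all $n_u$ excess data points $y_1, \dots, y_{n_u}$:
$$\ell(\xi, \beta) = -n_u \log\beta - \left(\frac{1}{\xi} + 1\right)\sum_{i=1}^{n_u}\log\left(1 + \frac{\xi y_i}{\beta}\right)$$
We can thus minimize $-\ell(\xi, \beta)$. The parameters $\xi$ and $\beta$ that minimize the negative log likelihood are the same ones that maximize the log likelihood. Hence, by using the excesses, we are able to determine the $\xi$ and $\beta$ that best fit the tail of the data.
There is also the case to consider when $\xi = 0$, which results in an exponential distribution. The total log likelihood in such a case is:
$$\ell(0, \beta) = -n_u \log\beta - \frac{1}{\beta}\sum_{i=1}^{n_u} y_i$$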
An unnamed numeric vector of length 2 containing the estimated Generalized Pareto Distribution (GPD) parameters that minimize the negative log likelihood: $\xi$ (shape/tail index) and $\beta$ (scale parameter).
Davison, A. C., & Smith, R. L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society: Series B (Methodological), 52(3), 393-425. doi:10.1111/j.2517-6161.1990.tb01796.x
Balkema, A. A., & de Haan, L. (1974). Residual life time at great age. The Annals of Probability, 2(5), 792-804. doi:10.1214/aop/1176996548
Pickands, J. (1975). Statistical Inference Using Extreme Order Statistics. The Annals of Statistics, 3(1), 119–131. http://www.jstor.org/stable/2958083
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 221-226) doi:10.1017/9781009053730
x <- rweibull(n = 800, shape = 0.8, scale = 1)
values <- pot_estimator(data = x, u = 2, start_xi = 0.1, start_beta = NULL)
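The fitting step can be sketched in base R by writing the GPD negative log likelihood and handing it to `optim()`. This is an illustration under assumptions, not the package's code: the helper names are hypothetical, the starting value for the scale is taken to be the mean excess, and invalid parameter values are penalized with a large constant.

```r
# Hypothetical helpers (illustration only, not the package's pot_estimator()).
gpd_nll_sketch <- function(par, y) {
  xi <- par[1]
  beta <- par[2]
  if (beta <= 0) return(1e10)                 # scale must be positive
  if (abs(xi) < 1e-8) {                       # exponential limit as xi -> 0
    return(length(y) * log(beta) + sum(y) / beta)
  }
  z <- 1 + xi * y / beta
  if (any(z <= 0)) return(1e10)               # outside the GPD support
  length(y) * log(beta) + (1 + 1 / xi) * sum(log(z))
}

pot_sketch <- function(x, u, start_xi = 0.1) {
  y <- x[x > u] - u                           # excesses over the threshold
  # Assumed starting scale: the mean excess. Default method is Nelder-Mead.
  optim(par = c(start_xi, mean(y)), fn = gpd_nll_sketch, y = y)$par
}

set.seed(3)
x_exp <- rexp(5000, rate = 0.5)               # excesses are exponential with scale 2
pot_sketch(x_exp, u = 1)                      # expect xi near 0 and beta near 2
```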
Produces a QQ plot comparing the empirical quantiles of the data (filtered to $x \ge x_m$) against the theoretical quantiles of a Pareto($\alpha$, $x_m$) distribution. Points falling close to the 45-degree reference line indicate a good Pareto fit.
qq_pareto(data, alpha, xm = NULL, na.rm = FALSE, ...)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| alpha | A positive numeric scalar: the Pareto tail index (as returned by |
| xm | Optional numeric scalar. Lower threshold; only data |
| na.rm | Logical. If |
| ... | Additional arguments passed to |
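The theoretical quantile for the $i$-th order statistic is the Pareto quantile function $x_m (1 - p_i)^{-1/\alpha}$ evaluated at the plotting position $p_i$ of that order statistic.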
A data.frame with columns empirical and
theoretical, returned invisibly.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 191-194) doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(800, alpha = 2, xm = 1)
qq_pareto(x, alpha = 2, xm = 1)
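The pairs of points behind such a QQ plot can be built by hand in base R. The sketch below is illustrative: `pareto_quantile_sketch` is a hypothetical name, and the plotting positions $i / (n + 1)$ are an assumption, since the package may use a different convention.

```r
# Hypothetical helper (illustration only): Pareto quantile function.
pareto_quantile_sketch <- function(p, alpha, xm) xm * (1 - p)^(-1 / alpha)

set.seed(1)
x_sorted <- sort(runif(800)^(-1 / 2))         # Pareto(alpha = 2, xm = 1) sample, ascending
n <- length(x_sorted)
p <- seq_len(n) / (n + 1)                     # assumed plotting positions

qq <- data.frame(
  empirical = x_sorted,
  theoretical = pareto_quantile_sketch(p, alpha = 2, xm = 1)
)
head(qq)
```

Plotting `qq$theoretical` against `qq$empirical` and adding the line through the origin with slope 1 gives the kind of display described above.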
Plots the empirical complementary CDF (CCDF) of the data on a log-log scale.
A power-law distribution appears as a straight line on this plot. If a fitted
plfit() result is supplied, the theoretical Pareto CCDF is overlaid.
rank_plot(data, fit = NULL, log_scale = TRUE, na.rm = FALSE, ...)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| fit | Optional. A list returned by |
| log_scale | Logical. If |
| na.rm | Logical. If |
| ... | Additional arguments passed to |
A data.frame with columns x and ccdf,
returned invisibly.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 176-179) doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(800, alpha = 2, xm = 1)
fit <- plfit(x)
rank_plot(x, fit = fit)
Generates random samples from a Pareto($\alpha$, $x_m$) distribution via inverse CDF: $X = x_m U^{-1/\alpha}$, where $U \sim \mathrm{Uniform}(0, 1)$.
rpareto(n, alpha, xm)
| n | A non-negative integer: number of samples to generate. |
|---|---|
| alpha | A positive numeric scalar: tail index. |
| xm | A positive numeric scalar: scale parameter (lower bound). |
A numeric vector of length n.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. doi:10.1017/9781009053730
x <- rpareto(n = 500, alpha = 2, xm = 1)
Estimates the Pareto tail index $\alpha$ via weighted least squares (WLS) regression on the log-log rank plot (Theorem 8.5 of Nair et al.). The WLS weights downweight noisy tail observations relative to OLS, recovering the MLE asymptotically.
wls_pareto(data, xm = NULL, plot = TRUE, na.rm = FALSE, ...)
| data | A numeric vector of i.i.d. observations. |
|---|---|
| xm | Optional positive numeric scalar. Lower bound. If |
| plot | Logical. If |
| na.rm | Logical. If |
| ... | Additional graphical arguments passed to |
The WLS estimate is:
If plot = TRUE, the rank plot is drawn with both the WLS and OLS
fitted lines, reproducing Figure 8.9 of Nair et al.
A named list with elements:
alpha_wls: WLS estimate of the tail index.
alpha_ols: OLS estimate (unweighted) for comparison.
xm: The lower bound used.
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 167-173) doi:10.1017/9781009053730
set.seed(1)
x <- rpareto(n = 500, alpha = 2, xm = 1)
wls_pareto(x)