Economic efficiency analyses consider two separate measures of efficiency:
Technical efficiency: How effectively inputs are transformed into outputs
Allocative efficiency: How effectively the production plan maximizes profit
Traditionally, data envelopment analysis (DEA) and stochastic frontier analysis (SFA) have been used to estimate technical efficiency. Both methods estimate a production frontier that quantifies the maximum output producible from a given bundle of inputs. The observed output of a decision-making unit (DMU) is then compared to the frontier to obtain an estimate of efficiency. SFA assumes efficiency is randomly determined and follows a distribution common across DMUs (Aigner et al. 1977). A standard stochastic frontier model takes the form $y_i = f(X_i) - \delta_i + \varepsilon_i$, where $i$ indexes the DMU, $y_i$ is log-output, $f$ is a transformation function, $X_i$ are inputs, $\delta_i > 0$ is log-inefficiency (so technical efficiency is $\exp(-\delta_i)$), and $\varepsilon_i$ is observational error. The term $\delta_i$ may be assumed to follow a half-normal, exponential, or other distribution with positive support. Traditionally, SFA uses a parametric approximation for $f$.
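To make the composed-error structure concrete, the following base R sketch simulates data from a model of this form. The frontier `f`, the sample size, and the standard deviations are illustrative assumptions, not values used by the package.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
n     <- 500
x     <- runif(n, 1, 10)                  # single input (simulated, hypothetical data)
f     <- function(x) 1 + 0.5 * log(x)     # assumed log-frontier, chosen for illustration
delta <- abs(rnorm(n, sd = 0.3))          # half-normal log-inefficiency
eps   <- rnorm(n, sd = 0.1)               # symmetric observational error
y     <- f(x) - delta + eps               # observed log-output

efficiency <- exp(-delta)                 # technical efficiency, between 0 and 1

# OLS ignores the one-sided term: the slope is recovered, but the intercept
# is pulled below the frontier by roughly E[delta] = 0.3 * sqrt(2 / pi)
ols <- lm(y ~ log(x))
coef(ols)
```

The downward shift in the OLS intercept is the reason SFA models the one-sided term explicitly instead of fitting an average production function.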
DEA, on the other hand, assumes the production frontier is deterministic and defined by the highest observed output (conditional on inputs used). The estimated frontier is piecewise linear, connecting the maximum observed outputs conditional on inputs while enforcing monotonicity and concavity constraints (as well as constraints on returns to scale, if specified). An example DEA frontier using variable returns to scale is shown below:
library(ggplot2)
library(snfa)  # provides the univariate data and the dea() method used below
data(univariate)
# Output-oriented DEA with variable returns to scale
dea.fit <- dea(univariate$x, univariate$y,
               univariate$x, univariate$y,
               model = "output",
               RTS = "variable")
# Implied frontier output: observed output divided by the estimated score
univariate$frontier <- univariate$y / dea.fit$thetaOpt
ggplot(univariate, aes(x, y)) +
  geom_point() +
  geom_line(aes(y = frontier), color = "red")
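For intuition about what the variable-returns-to-scale frontier computes, here is a minimal base R sketch for the one-input, one-output case. It is a brute-force stand-in, not the linear-programming routine used by `dea`. The helper name `dea_frontier_1d` and the four data points are hypothetical. It relies on the fact that the underlying linear program has two constraints, so an optimal solution mixes at most two observed DMUs.

```r
# Output-oriented VRS frontier value at input level x0 (one input, one output)
dea_frontier_1d <- function(x0, x, y) {
  feasible <- x <= x0
  if (!any(feasible)) return(NA_real_)     # no observed DMU uses this little input
  best <- max(y[feasible])                 # candidates that use a single DMU
  for (i in which(feasible)) {
    for (j in which(x > x0)) {
      lambda <- (x0 - x[i]) / (x[j] - x[i])  # weight on DMU j so inputs mix to x0
      best <- max(best, (1 - lambda) * y[i] + lambda * y[j])
    }
  }
  best
}

x <- c(1, 2, 3, 4)       # hypothetical inputs
y <- c(1, 2, 4, 4.5)     # hypothetical outputs
frontier   <- vapply(x, function(x0) dea_frontier_1d(x0, x, y), numeric(1))
efficiency <- y / frontier   # equals 1 for DMUs on the frontier
```

In this toy data set only the second DMU lies below the frontier, because a mix of the first and third DMUs produces more output from the same input.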
Allocative efficiency is evaluated by checking whether each DMU's first-order conditions for profit maximization are satisfied. Specifically, if DMUs are price-takers in input and output markets, the first-order conditions are $$\frac{\partial f(X_i)}{\partial x^j} = \frac{w_i^j}{p_i},$$ where superscripts index inputs, $w_i^j$ is the cost of input $j$ to DMU $i$, and $p_i$ is the price of output for DMU $i$. If efficiency is multiplicative, so that $f(X_i) = p(X_i)\delta_i$ for a frontier function $p$ (with $\delta_i$ here denoting efficiency in levels rather than logs), then the first-order condition can be expressed as $$\delta_i\frac{\partial p(X_i)}{\partial x^j} = \frac{w_i^j}{p_i}.$$ Empirically, log-overallocation of input $j$ by DMU $i$ can be estimated by $$\log\left(w_i^j\right) - \log\left(p_i\right) - \log\left(\delta_i\frac{\partial p(X_i)}{\partial x^j}\right).$$ If this quantity is positive, the real price of input $j$ exceeds its marginal product, so DMU $i$ used more of input $j$ than would be profit maximizing.
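A small worked example may help. The sketch below assumes a single input, a frontier $p(x) = \sqrt{x}$, and made-up prices and efficiency. It measures log-overallocation as the log of the input-to-output price ratio minus the log of the efficiency-scaled marginal product, which is the gap between the two sides of the first-order condition.

```r
# Marginal product of the assumed frontier p(x) = sqrt(x)
frontier_mp <- function(x) 0.5 * x^(-0.5)

# Gap between log(w / price) and the log marginal product scaled by efficiency
log_overallocation <- function(x, w, price, delta) {
  log(w / price) - log(delta * frontier_mp(x))
}

w     <- 1      # hypothetical input price
price <- 10     # hypothetical output price
delta <- 0.8    # hypothetical multiplicative efficiency

# Solving delta * 0.5 * x^(-0.5) = w / price gives the profit-maximizing input
x_star <- (0.5 * delta * price / w)^2

at_optimum <- log_overallocation(x_star, w, price, delta)  # approximately zero
too_much   <- log_overallocation(25, w, price, delta)      # positive: input overused
too_little <- log_overallocation(9, w, price, delta)       # negative: input underused
```

Because the assumed frontier is concave, using more input lowers its marginal product, which is why a positive value signals overuse.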
As detailed above, estimating allocative inefficiency requires estimates of marginal input productivities, which are derivatives of the production frontier. Since the DEA frontier is piecewise linear, its derivative is undefined at the kinks where linear segments meet. Using the previous example, the points in blue in the following plot have undefined derivatives.
Thus, DEA is inappropriate for estimating allocative inefficiency.
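The problem can be seen numerically with a toy piecewise-linear frontier. The three vertices below are hypothetical. At the interior vertex the one-sided finite-difference slopes disagree, so no single marginal product can be assigned there.

```r
# Hypothetical frontier vertices joined by straight segments
vx <- c(1, 3, 4)
vy <- c(1, 4, 4.5)
frontier <- approxfun(vx, vy)   # linear interpolation between vertices

h <- 1e-6
left_slope  <- (frontier(3) - frontier(3 - h)) / h   # slope of the segment to the left
right_slope <- (frontier(3 + h) - frontier(3)) / h   # slope of the segment to the right

c(left = left_slope, right = right_slope)
```

Any DMU that sits exactly on such a vertex, which is where efficient DMUs tend to be, has an ambiguous marginal product.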
Smooth non-parametric frontier analysis (SNFA) is a smooth analogue of DEA (Racine et al. 2009). It uses constrained kernel smoothing to ensure that the estimated frontier lies above observed output for each observation, and it can additionally impose monotonicity or concavity constraints, if specified. An increasing, concave boundary is fit below:
X <- as.matrix(univariate$x)
y <- univariate$y
N.fit <- 100
X.fit <- as.matrix(seq(min(X), max(X), length.out = N.fit))
#Reflect data for fitting
reflected.data <- reflect.data(X, y)
X.eval <- reflected.data$X
y.eval <- reflected.data$y
frontier.mc <- fit.boundary(X.eval, y.eval,
                            X.bounded = X, y.bounded = y,
                            X.constrained = X.fit,
                            X.fit = X.fit,
                            method = "mc")
frontier.df <- data.frame(x = X.fit,
                          y = frontier.mc$y.fit)
ggplot(univariate, aes(x, y)) +
  geom_point() +
  geom_line(data = frontier.df, color = "red")
slope.df <- data.frame(x = X.fit,
                       slope = frontier.mc$gradient.fit)
ggplot(slope.df, aes(x, slope)) +
  geom_line()
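To see why a kernel-based frontier has a gradient everywhere, consider the following base R sketch of a plain Nadaraya-Watson smoother with a Gaussian kernel. This is a swapped-in, unconstrained estimator used only for illustration. It does not impose the boundary, monotonicity, or concavity constraints described above, and it is not the routine implemented in fit.boundary. The function names, bandwidth, and simulated data are assumptions.

```r
# Kernel-weighted average of y at each evaluation point in x0
nw_fit <- function(x0, x, y, h) {
  sapply(x0, function(z) {
    k <- dnorm((z - x) / h)
    sum(k * y) / sum(k)
  })
}

# Analytic derivative of the smoother, via the quotient rule
nw_gradient <- function(x0, x, y, h) {
  sapply(x0, function(z) {
    u  <- (z - x) / h
    k  <- dnorm(u)
    dk <- -u * k / h                    # derivative of each kernel weight in z
    (sum(dk * y) * sum(k) - sum(k * y) * sum(dk)) / sum(k)^2
  })
}

set.seed(2)                             # arbitrary seed
x  <- seq(1, 10, length.out = 50)       # simulated inputs
y  <- log(x) + rnorm(50, sd = 0.1)      # simulated outputs
x0 <- c(3, 5, 7)

grad <- nw_gradient(x0, x, y, h = 1)
# Central finite differences agree with the analytic gradient
fd <- (nw_fit(x0 + 1e-5, x, y, h = 1) - nw_fit(x0 - 1e-5, x, y, h = 1)) / 2e-5
```

Since the Gaussian kernel is smooth, the fitted curve is differentiable at every input level, in contrast to the kinked DEA frontier.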
A more in-depth example examining different constraints can be found in the example for fit.boundary.
The function allocative.efficiency uses SNFA to fit a production frontier, then uses the frontier to derive estimates of marginal input productivities. Those marginal productivities are compared to the ratio of input to output prices to arrive at an estimate of overallocation. The example in allocative.efficiency estimates overallocation of labor and capital in the U.S. using macroeconomic data. First, the data are loaded and cleaned:
data(USMacro)
USMacro <- USMacro[complete.cases(USMacro),]
#Extract data
X <- as.matrix(USMacro[,c("K", "L")])
y <- USMacro$Y
X.price <- as.matrix(USMacro[,c("K.price", "L.price")])
y.price <- rep(1e9, nrow(USMacro)) #Price of $1 billion of output is $1 billion
Then, the model is fit with allocative.efficiency:
#Run model
efficiency.model <- allocative.efficiency(X, y,
                                          X.price, y.price,
                                          X.constrained = X,
                                          model = "br",
                                          method = "mc")
Finally, results are plotted and average overallocation is estimated:
#Plot technical/allocative efficiency over time
library(ggplot2)
technical.df <- data.frame(Year = USMacro$Year,
                           Efficiency = efficiency.model$technical.efficiency)
ggplot(technical.df, aes(Year, Efficiency)) +
  geom_line()
allocative.df <- data.frame(Year = rep(USMacro$Year, times = 2),
                            log.overallocation = c(efficiency.model$log.overallocation[,1],
                                                   efficiency.model$log.overallocation[,2]),
                            Variable = rep(c("K", "L"), each = nrow(USMacro)))
ggplot(allocative.df, aes(Year, log.overallocation)) +
  geom_line(aes(color = Variable))
#Estimate average overallocation across sample period
lm.model <- lm(log.overallocation ~ 0 + Variable, allocative.df)
summary(lm.model)
#>
#> Call:
#> lm(formula = log.overallocation ~ 0 + Variable, data = allocative.df)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -0.43704 -0.18195 -0.08572  0.14338  0.61385
#>
#> Coefficients:
#>           Estimate Std. Error t value Pr(>|t|)
#> VariableK   2.0297     0.0465   43.65   <2e-16 ***
#> VariableL  -0.8625     0.0465  -18.55   <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.279 on 70 degrees of freedom
#> Multiple R-squared: 0.9698, Adjusted R-squared: 0.9689
#> F-statistic: 1124 on 2 and 70 DF, p-value: < 2.2e-16
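As a reading aid for the regression output, the sketch below first checks on hypothetical toy data that a no-intercept regression on a factor returns group means, then exponentiates the two reported coefficients. Under the definition of log-overallocation given earlier, the exponentiated value can be read as the geometric-mean ratio of an input's real price to its estimated marginal product. That interpretation is offered here as an assumption, not a result stated by the package.

```r
# Toy data (hypothetical): with "0 +", lm() on a factor estimates group means
toy <- data.frame(value = c(1, 3, 10, 14),
                  group = c("a", "a", "b", "b"))
toy.fit   <- lm(value ~ 0 + group, toy)
toy.means <- tapply(toy$value, toy$group, mean)

# Coefficients copied from the summary output, converted from logs to ratios
log.over <- c(K = 2.0297, L = -0.8625)
ratio    <- exp(log.over)
ratio    # above 1 suggests overallocation, below 1 suggests underallocation
```

On this reading, capital appears overallocated and labor underallocated on average over the sample period.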
Aigner D, Lovell CK, Schmidt P (1977). “Formulation and estimation of stochastic frontier production function models.” Journal of Econometrics, 6(1), 21-37.
Racine JS, Parmeter CF, Du P (2009). “Constrained nonparametric kernel regression: Estimation and inference.” Working paper.