LASSO variable selection in data envelopment analysis with small datasets
Journal
Omega (United Kingdom)
Journal Volume
91
Date Issued
2020
Author(s)
Cai J.-Y.
Abstract
The curse of dimensionality arises when a limited number of observations is used to estimate a high-dimensional frontier, in particular by data envelopment analysis (DEA). The study uses a data generating process (DGP) to argue that the typical “rule of thumb” used in DEA, e.g. that the number of observations should be at least twice the number of inputs and outputs, is ambiguous and produces large deviations in the estimated technical efficiency. To address this issue, we propose a Least Absolute Shrinkage and Selection Operator (LASSO) variable selection technique, commonly used in data science to extract significant factors, and combine it with sign-constrained convex nonparametric least squares (SCNLS), which can be regarded as a DEA estimator. Simulation results demonstrate that the proposed LASSO-SCNLS method and its variants provide useful guidelines for DEA with small datasets. © 2018 Elsevier Ltd
Subjects
Convex nonparametric least squares; Data envelopment analysis; Efficiency estimation; Feature selection; Lasso
Other Subjects
article; human; least square analysis; practice guideline; simulation
Type
journal article
