Embedding Boosted Regression Trees approach to variable selection and cross-validation in parametric regression to predict diameter distribution after thinning
Journal
Forest Ecology and Management
Journal Volume
499
Date Issued
2021
Author(s)
Abstract
Modeling stand diameter distributions is useful for many purposes, such as predicting the future product range. The Parameter Prediction Method (PPM) is commonly used and has the advantage that its parameters can be interpreted in terms of silvicultural practices and stand dynamics. Past PPM studies have seldom applied variable selection and cross-validation. The goal of this study was to embed a machine learning technique, Boosted Regression Trees (BRT), into PPM to address this knowledge gap. The PPM-BRT framework first applies BRT models to cross-validate and select the stand attributes that influence the parameters of a probability density function (PDF), and then uses Seemingly Unrelated Regression (SUR) linear models to quantify the relationships. The framework was tested on Taiwania cryptomerioides thinning experiments. The three-parameter Weibull PDF had the best overall fit to the diameter distributions after thinning. The fitted BRT models explained about 76.9–86.8% of the variation in the shape, scale, and location parameters. The final set of predictors selected by BRT that strongly influenced the three parameters included the number of years since thinning and the three moments of the residual diameter distribution immediately after thinning. The SUR models showed that the shape and scale parameters were negatively associated with the skewness of the residual diameter distribution, whereas the location parameter was positively associated with it. In addition, all three parameters were positively associated with the number of years since thinning. This suggests that an intensive thinning from below results in a post-harvest diameter distribution that is more positively skewed, with less variation in residual diameters and a larger minimum diameter. The diameter distribution would become less skewed and more heterogeneous over time, likely due to stem exclusion. Our study shows that BRT is more robust than stepwise regression.
Future work could explore partially linear models for better integration of machine learning and parametric models. © 2021 Elsevier B.V.
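As an illustration of how predicted shape, scale, and location parameters translate into a diameter distribution, the following is a minimal sketch of the cumulative distribution function that corresponds to a three-parameter Weibull PDF, used here to obtain the proportion of stems in each diameter class. The parameter values, the 5 cm class width, and the function names are hypothetical and are not taken from the study.

```python
import math


def weibull3_cdf(d, shape, scale, location):
    """CDF of the three-parameter Weibull distribution at diameter d.

    The location parameter acts as the minimum diameter, so the CDF is
    zero at or below it.
    """
    if d <= location:
        return 0.0
    z = (d - location) / scale
    return 1.0 - math.exp(-(z ** shape))


def class_proportions(edges, shape, scale, location):
    """Proportion of stems in each diameter class given class edges."""
    return [
        weibull3_cdf(hi, shape, scale, location)
        - weibull3_cdf(lo, shape, scale, location)
        for lo, hi in zip(edges, edges[1:])
    ]


# Hypothetical parameter values (illustrative only, not from the paper).
SHAPE, SCALE, LOCATION = 2.5, 12.0, 10.0

# Hypothetical 5 cm diameter classes from 10 cm to 50 cm.
EDGES = list(range(10, 51, 5))

proportions = class_proportions(EDGES, SHAPE, SCALE, LOCATION)

for (lo, hi), p in zip(zip(EDGES, EDGES[1:]), proportions):
    print(f"{lo:2d}-{hi:2d} cm: {p:.3f}")
```

In a PPM workflow, the three parameters would be supplied by the fitted regression models rather than fixed by hand; multiplying each proportion by the stand's stem density would give the expected number of stems per class.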
Subjects
Machine learning
Parameter Prediction Method
Seemingly Unrelated Regression
Silviculture
Stand dynamics
Forestry
Probability density function
Regression analysis
Weibull distribution
Boosted regression trees
Cross validation
Diameter distributions
Machine-learning
Parameter prediction methods
Seemingly unrelated regression
Thinnings
Variables selections
Forecasting
forestry practice
model validation
probability density function
regression analysis
silviculture
thinning
Forecasts
Regression Analysis
Statistical Distribution
Taiwania cryptomerioides
Type
journal article
