Artificial neural networks, quantile regression, and linear regression for site index prediction in the presence of outliers

Abstract: The objective of this work was to compare methods of obtaining the site index for eucalyptus (Eucalyptus spp.) stands, and to evaluate their impact on the stability of this index in databases with and without outliers. Three methods were tested: linear regression, quantile regression, and an artificial neural network. Data came from 22 permanent plots of a continuous forest inventory, with measurements taken on trees aged 23 to 83 months. Outliers were identified using boxplots. The artificial neural network showed better results than the linear and quantile regressions for both dominant height and site index estimates. The stability of the site index classification obtained with the artificial neural network was also better than that obtained with the other methods, regardless of the presence or absence of outliers in the database. This indicates that the artificial neural network is a robust modelling technique in the presence of outliers. When the cause of the outliers in the database is not known, they can be kept if techniques such as artificial neural networks or quantile regression are used.
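As an illustration of boxplot-based outlier identification, the following is a minimal sketch in Python. It assumes the conventional boxplot rule, under which points lying more than 1.5 times the interquartile range beyond the quartiles are flagged. The abstract does not state which rule or quartile convention was used, so the multiplier `k`, the interpolation method, the function names, and the height values are all hypothetical.

```python
def quartile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list.

    This is one of several common quantile conventions (an assumption here).
    """
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def boxplot_outliers(values, k=1.5):
    """Return the values outside the boxplot fences Q1 - k*IQR and Q3 + k*IQR."""
    s = sorted(values)
    q1 = quartile(s, 0.25)
    q3 = quartile(s, 0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical dominant heights (m) for one age class, not data from the study.
heights = [18.2, 19.1, 19.5, 20.0, 20.3, 20.8, 21.4, 35.0]
flagged = boxplot_outliers(heights)
print(flagged)
```

In a workflow like the one described, the flagged observations would define the database "with outliers", and removing them would give the database "without outliers" on which each method is refitted.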