Estimation of Monthly Mean Reference Evapotranspiration by Using Artificial Neural Network Models in Basrah City, South of Iraq

The main objective of this study is to compare the performance of three artificial neural network techniques, radial basis function (RBF), multilayer perceptron (MLP), and group method of data handling (GMDH), with the Penman-Monteith (PM) method for determining the reference evapotranspiration (ET0) on a monthly basis in Basrah City, south of Iraq. The climate data cover 22 years (1991-2012) and consist of monthly records of maximum temperature (Tmax), mean temperature (Tmean), minimum temperature (Tmin), wind speed (U), and relative humidity (RH). The architecture of each artificial neural network model is determined during the training process. The efficiency of each trained model is checked with testing data that are not used in training. Model performance is evaluated by cross-validation, in which the rows of each validation fold are selected randomly after stratification on the target variable ET0. Various combinations of climate input variables are used to create nine artificial neural network models. According to the evaluation criteria, the models with two predictor variables (Tmean and U) simulate ET0 efficiently. The results of all artificial neural network models improve significantly when three input variables (Tmean, U, and RH) are used instead of two. Artificial neural network models, especially RBF, MLP, and GMDH, are efficient and powerful techniques for simulating ET0.


Introduction
Evaporation and transpiration occur simultaneously, and there is no easy way of distinguishing between the two processes. Apart from the availability of water in the topsoil, evaporation from a cropped soil is mainly determined by the fraction of solar radiation that reaches the soil surface [1]. This fraction decreases over the growing period as the crop develops and its canopy shades more and more of the ground area. When the crop is small, water is lost mostly by soil evaporation, but once the crop is well developed and covers the soil completely, transpiration becomes the main process [2]. The main weather factors affecting evapotranspiration are air temperature, humidity, radiation, and wind speed. The reference evapotranspiration (ET0) is the rate of evapotranspiration from a reference surface that is not short of water. The reference surface can be expressed as a hypothetical grass crop with certain properties [2]. The estimation of evapotranspiration (ET) is one of the main tasks in calculating the water budget, in which ET is the second largest element after precipitation [2]; therefore, estimating the quantity of ET is a key factor in the management of scarce water resources. The importance of ET in hydrological and agricultural studies has led to the development of different methodologies and techniques for estimating it. ET0 can be measured directly with a field lysimeter or a water balance approach, or estimated indirectly from climate data [3]. However, the high operating cost is a drawback of the lysimeter. Moreover, numerous errors affect the accuracy of the measurements: differences in the thermal, wind, and radiation regimes between the lysimeter and its surroundings [4], as well as the way the lysimeter is managed, can affect the measurements.
Because of these difficulties in measuring ET0, indirect estimation methods that rely primarily on easily obtained meteorological data have become more common. In the past few decades, many methods have been developed to estimate ET0; they are classified as temperature-based, radiation-based, evaporation-based, and combination-type methods [5]. The Penman-Monteith equation (FAO-56 PM) is adopted by the Food and Agriculture Organization of the United Nations (FAO) to provide a valid global standard for estimating ET0, for crop varieties development, and for calibrating and evaluating other ET0 methods when lysimeter measurements are not available [1]. In addition to traditional modelling techniques for estimating evapotranspiration, artificial neural networks (ANNs) have proven useful in recent years for modelling complex nonlinear problems [6][7][8][9][10][11][12][13]. Neural networks have been demonstrated to be a competitive alternative to traditional models for modelling nonlinear systems. ANNs have been used in many theoretical and practical applications related to meteorology, environmental processes, and water resources engineering; many of these applications relate to classification, prediction, and estimation problems [14][15][16][17][18][19][20]. ANN applications have gained wide popularity because of their flexible functional properties, lower data requirements, and long-term prediction capability, which give them considerable advantages over traditional analytical methods. The use of ANNs in hydrological modelling and water resources engineering has grown rapidly. The main objective of this research is to compare the performance of ANN-based approaches with the PM method for estimating ET0 on a monthly basis in Basrah City, south of Iraq.

Study area and data preparation
Basrah lies in the south of Iraq and is bordered by Iran, Kuwait, and Saudi Arabia. The province consists of a vast desert plain intersected by the Shatt al-Arab waterway, which is formed by the confluence of the Tigris and Euphrates rivers at Qurna and flows into the Arabian Gulf. A number of lakes can be found around Qurna, while the marshes extend from the north of the province into the neighbouring provinces of Dhi Qar and Maysan. Basrah Province is Iraq's only gateway to the sea. Like the surrounding area, Basrah has a hot and dry climate, and its summer temperatures are among the highest recorded in the world. It is located between longitudes 47° 30' and 48° 30' and latitudes 30° 00' and 31° 00', as shown in Figure 1. Climate data from the Hai Al-Hussain meteorological station in the middle of Basrah are used in this study. The data cover 22 years (1991-2012) and consist of monthly records of maximum temperature (Tmax), mean temperature (Tmean), minimum temperature (Tmin), wind speed (U), and relative humidity (RH). Table 1 presents the statistical summary of the climate variables.

Penman-Monteith method
The FAO-56 PM method for estimating ET0 is an unambiguous physical method that integrates both aerodynamic and physiological parameters [21]. It is considered the most universally accepted and consistent method for estimating ET0 under different types of climate. The standard form of the PM method for estimating ET0 is as follows [1]:
$$ ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \qquad (1) $$
Where:
$ET_0$: Reference evapotranspiration (mm day$^{-1}$)
$R_n$: Net radiation at the crop surface (MJ m$^{-2}$ day$^{-1}$)
$G$: Soil heat flux density (MJ m$^{-2}$ day$^{-1}$)
$T$: Mean air temperature at 2 m height (°C)
$u_2$: Wind speed at 2 m height (m s$^{-1}$)
$(e_s - e_a)$: Saturation vapor pressure deficit (kPa)
$\Delta$: Slope of the vapor pressure curve (kPa °C$^{-1}$)
$\gamma$: Psychrometric constant (kPa °C$^{-1}$)
The FAO-56 PM method is applied in this research as the reference method for evaluating the performance of the artificial neural network techniques. The FAO-56 PM method requires measurements of relative humidity, wind speed, temperature, and solar radiation. This demand for data is the main obstacle to its use in locations where climate data are limited [22][23][24][25]. This is a major problem for developing countries [26][27][28] and especially for tropical regions [29]. To overcome this lack of required data, many researchers have applied artificial intelligence (AI) techniques such as ANNs to model ET0, with highly accurate results [12, 13, 15].
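As an illustration of how the FAO-56 PM equation is evaluated, the following minimal sketch computes ET0 from the standard FAO-56 expressions for saturation vapor pressure, slope of the vapor pressure curve, and psychrometric constant. The function names, the default atmospheric pressure, and the input values are assumptions chosen for illustration; they are not data or code from this study.

```python
import math


def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure e0(T) in kPa for air temperature in deg C."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def slope_vapor_pressure_curve(temp_c):
    """Slope of the saturation vapor pressure curve in kPa per deg C."""
    return 4098.0 * saturation_vapor_pressure(temp_c) / (temp_c + 237.3) ** 2


def fao56_pm_et0(rn, g, t_mean, u2, es, ea, pressure_kpa=101.3):
    """Reference evapotranspiration (mm/day) from the FAO-56 PM equation.

    rn, g        : net radiation and soil heat flux (MJ m-2 day-1)
    t_mean       : mean air temperature at 2 m (deg C)
    u2           : wind speed at 2 m (m/s)
    es, ea       : saturation and actual vapor pressure (kPa)
    pressure_kpa : atmospheric pressure (assumed default: sea level)
    """
    delta = slope_vapor_pressure_curve(t_mean)
    gamma = 0.000665 * pressure_kpa  # psychrometric constant
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative monthly-mean inputs (hypothetical, not from the Basrah record).
t_max, t_min = 34.8, 25.6
es = 0.5 * (saturation_vapor_pressure(t_max) + saturation_vapor_pressure(t_min))
et0 = fao56_pm_et0(rn=14.33, g=0.14, t_mean=0.5 * (t_max + t_min),
                   u2=2.0, es=es, ea=2.85)
print(f"ET0 = {et0:.2f} mm/day")
```

The sketch takes net radiation and actual vapor pressure as given; in practice both must be derived from the measured variables, which is the data demand that motivates the neural network alternatives.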

Radial basis function network
A radial basis function (RBF) network is a type of artificial neural network that uses radial basis functions as activation functions. The network output is a linear combination of radial basis functions of the inputs and the neuron parameters. RBF networks have many applications in different disciplines, including function approximation, time series prediction, system control, and classification. An RBF network consists of three layers: an input layer, a hidden layer with a non-linear activation function, and an output layer. Figure 2 presents the architecture of an RBF network. The input vector ($x$) is used as input to all radial basis functions, each with different parameters. The output of the network is a linear combination of the outputs of the radial basis functions. The RBF neuron activation is given by:
$$ \phi(x) = \exp\left(-\beta\,\lVert x - \mu \rVert^{2}\right) \qquad (2) $$
Where:
$x$: The input
$\mu$: The mean
$\beta$: Coefficient that controls the width of the bell curve
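The forward pass described above can be sketched as follows, assuming a Gaussian activation with a centre (mean) and a width coefficient for each hidden neuron. The function names and all numeric values are hypothetical and only illustrate the structure.

```python
import math


def gaussian_rbf(x, mu, beta):
    """Gaussian activation: exp(-beta * squared distance between x and mu)."""
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-beta * sq_dist)


def rbf_network_output(x, centres, betas, weights, bias=0.0):
    """Linear combination of the hidden-layer activations."""
    activations = [gaussian_rbf(x, mu, b) for mu, b in zip(centres, betas)]
    return bias + sum(w * a for w, a in zip(weights, activations))


# Hypothetical network with two hidden neurons and two inputs.
centres = [[0.0, 0.0], [1.0, 1.0]]
betas = [1.0, 1.0]
weights = [2.0, -1.0]
output = rbf_network_output([0.0, 0.0], centres, betas, weights)
print(output)
```

A larger width coefficient makes each neuron respond only to inputs very close to its centre, while a smaller one gives a broader bell curve.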

Multilayer perceptron neural network
A multilayer perceptron (MLP) is a layered arrangement of non-linear processing elements (PEs), as presented in Figure 3. Each connection between PEs is weighted by a scalar weight (w), which is adapted during the training process.

Fig. 3: Architecture of an MLP with one hidden layer
The connection weights are modified during training to reduce the squared difference between the PE response and the desired output. The inverse of the input autocorrelation matrix ($R^{-1}$) and the cross-correlation vector ($P$) between the desired response and the input yield the optimal weights, $w_{opt} = R^{-1} P$. The analytical solution of this problem is equivalent to a search technique that finds the minimum of the quadratic performance surface, $J(w_i)$, by gradient descent, adjusting the weights at each epoch [30]:
$$ w_i(k+1) = w_i(k) - \eta\,\nabla J_i(k) \qquad (3) $$
Where:
$\eta$: Learning rate coefficient
$\nabla J_i(k)$: Gradient vector of the performance surface for the ith input node at iteration k
Equation (4) is applied for calculating the performance surface ($J$):
$$ J = \sum_{p} (d_p - y_p)^{2} \qquad (4) $$
Where:
$d_p$: pth element of the target vector
$y_p$: Output of the pth output neuron
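The gradient-descent search on a quadratic performance surface can be illustrated with a single linear PE. This is a minimal sketch under simplifying assumptions (one input, one output, batch updates, and a sum-of-squared-errors surface); the function name, learning rate, and data are hypothetical and do not reproduce the training software used in this study.

```python
def train_linear_pe(inputs, targets, eta=0.02, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on J = sum((d - y)^2)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, d in zip(inputs, targets):
            err = d - (w * x + b)
            grad_w += -2.0 * err * x  # dJ/dw contribution of this sample
            grad_b += -2.0 * err      # dJ/db contribution of this sample
        # Move against the gradient, scaled by the learning rate.
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b


# Synthetic data generated from d = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ds = [2.0 * x + 1.0 for x in xs]
w, b = train_linear_pe(xs, ds)
print(w, b)
```

Because the surface is quadratic, the search converges to the same weights as the analytical solution, provided the learning rate is small enough for the updates to remain stable.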

Group method of data handling polynomial network
The group method of data handling (GMDH) was first presented by Ivakhnenko [31] for the identification and modelling of complex systems. This type of network is used to circumvent the difficulty of requiring prior knowledge of the process under study. The basic goal of GMDH is to construct an analytical function in a feedforward network based on a quadratic node transfer function [32], whose coefficients are calculated by a regression technique. The architecture of a GMDH network is formed during the training process. The node activation function is based on elementary polynomials of arbitrary order. The method solves the multidimensional problem of model improvement through a selection procedure that chooses models from a set of candidates according to a given criterion. A general link between input and output can be expressed by the Volterra functional series, whose discrete analogue is the Kolmogorov-Gabor polynomial [33], Equation (5):
$$ y = a_0 + \sum_{i} a_i x_i + \sum_{i}\sum_{j} a_{ij}\, x_i x_j + \sum_{i}\sum_{j}\sum_{k} a_{ijk}\, x_i x_j x_k + \cdots \qquad (5) $$
Where:
$x_i, x_j, x_k$: Inputs
$a_0, a_i, a_{ij}, a_{ijk}$: Polynomial coefficients
$y$: The node output
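A single GMDH node with a quadratic transfer function of two inputs can be fitted by ordinary least squares, as in the sketch below. The ordering of the six coefficients, the helper names, and the synthetic data are assumptions for illustration; a full GMDH network would fit many such candidate nodes and keep the best ones according to a selection criterion.

```python
def quadratic_features(x1, x2):
    """Terms of a two-variable quadratic polynomial node."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_gmdh_node(x1s, x2s, ys):
    """Least-squares coefficients of one quadratic node (normal equations)."""
    rows = [quadratic_features(p, q) for p, q in zip(x1s, x2s)]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)


# Synthetic data on a 5 x 5 grid generated from known coefficients.
true_coeffs = [1.0, 0.5, -0.3, 0.2, 0.1, -0.05]
grid = [(float(p), float(q)) for p in range(-2, 3) for q in range(-2, 3)]
x1s = [p for p, _ in grid]
x2s = [q for _, q in grid]
ys = [sum(c * f for c, f in zip(true_coeffs, quadratic_features(p, q)))
      for p, q in grid]
coeffs = fit_gmdh_node(x1s, x2s, ys)
print([round(c, 6) for c in coeffs])
```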

Methodology
Three artificial neural network techniques (radial basis function, multilayer perceptron, and group method of data handling polynomial networks) are used in this study. Five input parameters (Tmean, Tmax, Tmin, RH, and U) are used for constructing the models. Different input combinations are used to create nine artificial neural network models. The models and their input variables are detailed in Table 2. The data are divided into two subsets, training and testing. The architectures of the artificial intelligence models are determined during the training process. The efficiency of the trained models is checked with the testing data, which are not used in training. When fixed percentages of the data are assigned to training and testing, different percentages lead to different results and conclusions. To solve this problem, cross-validation is used in this paper [34]. When cross-validation is used to assess the performance of an artificial neural model, the rows of each validation fold are selected randomly after stratification on the target variable ET0.
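One way to assign rows to folds randomly after stratification on a continuous target is to sort the rows by the target, cut the sorted list into consecutive groups with one row per fold, and shuffle the fold labels within each group. The sketch below follows that assumed scheme; the function name, the number of folds, and the synthetic target values are illustrative and may differ from the procedure of the software used in this study.

```python
import random


def stratified_folds(targets, n_folds, seed=0):
    """Return a fold index per row, balanced across the target's range."""
    rng = random.Random(seed)
    # Rows ordered by target value, so each group spans a narrow target range.
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    fold_of = [None] * len(targets)
    for start in range(0, len(order), n_folds):
        group = order[start:start + n_folds]
        labels = list(range(n_folds))
        rng.shuffle(labels)  # random assignment within the stratum
        for row, fold in zip(group, labels):
            fold_of[row] = fold
    return fold_of


# Synthetic monthly targets: 22 years x 12 months = 264 rows.
synthetic_et0 = [((i * 37) % 101) / 10.0 for i in range(264)]
folds = stratified_folds(synthetic_et0, n_folds=4)
fold_sizes = [folds.count(f) for f in range(4)]
print(fold_sizes)
```

Each fold then contains low, medium, and high target values in similar proportions, so the validation error does not depend on an unlucky split.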

Table 2: Artificial neural network models with input combinations
The performance of the ANN models in the training and testing phases is assessed with the maximum error (ME), root mean squared error (RMSE), mean squared error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
The RBF network consists of three layers. Every predictor variable has one neuron in the input layer. The input neurons standardize the range of values by subtracting the median and dividing by the interquartile range. The values processed by the input neurons pass to the neurons of the hidden layer. The training process calculates the optimum number of neurons in the hidden layer. An RBF has as many dimensions as there are predictor variables, and the radius may differ from one dimension to another. The spreads (radii) and centres are calculated during training. The last layer is the summation layer: the values produced in the hidden layer are multiplied by weights and passed to the summation layer, which adds the weighted values to give the network output. The training algorithm presented in [35] is used here. This algorithm stops adding neurons to the network when the estimated leave-one-out (LOO) error increases because of overfitting. Ridge regression is used to compute the optimum weights between the neurons of the hidden layer and the summation layer.
The MLP network used in this paper has an input layer, one hidden layer, and an output layer. The input neurons standardize the values of each predictor variable to the range -1 to 1. The values of the predictor variables are distributed to the neurons of the hidden layer. A constant input, called the bias (see Figure 3), is also fed to the hidden layer after being multiplied by a weight.
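The two input scalings mentioned above can be sketched as follows. The function names and sample values are hypothetical, and the quartile convention is an assumption: `statistics.quantiles` uses its default "exclusive" method here, which may differ slightly from the quartile definition of the software used in this study.

```python
import statistics


def robust_standardize(values):
    """Subtract the median and divide by the interquartile range (RBF inputs)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]


def scale_to_unit_range(values):
    """Map values linearly onto the range [-1, 1] (MLP inputs)."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Hypothetical monthly mean temperatures in deg C.
t_mean = [12.0, 15.0, 20.0, 26.0, 32.0, 36.0, 38.0]
print(robust_standardize(t_mean))
print(scale_to_unit_range(t_mean))
```

Median and interquartile range are less sensitive to extreme months than mean and standard deviation, which is a common reason for preferring them.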
The optimum number of neurons in the hidden layer is determined automatically after the minimum and maximum numbers are specified. Many models with varying numbers of neurons are constructed, and the performance of each is checked through cross-validation during training. The conjugate gradient algorithm is used to modify the weight values by using the gradient, which is obtained by backward propagation of the errors through the network. The user does not need to specify the momentum and learning rate parameters. In this study, the hidden layer and output layer activation functions are logistic and linear, respectively.
GMDH networks are self-organizing. At first, the network starts with input neurons only. During training, neurons are added to the hidden layers from a candidate pool. The number of network layers is selected automatically to achieve accurate results without overfitting. Each layer can be connected to the previous layer and to the original input variables. A quadratic polynomial of two variables is used in this research as the transfer function. The maximum number of network layers and the maximum polynomial order are 20 and 16, respectively. Control data are used to stop the construction process when overfitting occurs. Least squares regression is applied to calculate the optimal parameters of the function in each candidate neuron so that it best fits the training data [36].
The importance of each variable in creating the model, and the amount of its contribution, is calculated through the variable importance score. Importance scores are calculated using information about how variables are used as primary splitters and as alternate splitters.
Clearly, a variable chosen as the primary splitter in the tree is important. Alternate splitters that closely mimic the primary splitter are also important, because they may be of the same quality as the primary splitter for producing the tree. If the primary splitter is only slightly better than the alternate, the primary splitter may mask the importance of the other variable. This feature makes it possible to identify the most important variables and to measure both the potential and the actual value of a predictor. The most important predictor is scaled to 100.00, while less important variables take lower values.
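The five evaluation criteria named in this section can be computed as in the following sketch, which uses their common textbook definitions (ME taken as the largest absolute error). The function name and the sample values are hypothetical and are not results from this study.

```python
import math


def error_metrics(reference, estimated):
    """Return ME, MSE, RMSE, MAE, and MAPE (%) of estimates against a reference."""
    errors = [e - r for r, e in zip(reference, estimated)]
    n = len(errors)
    mse = sum(err ** 2 for err in errors) / n
    return {
        "ME": max(abs(err) for err in errors),
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(err) for err in errors) / n,
        # MAPE assumes every reference value is non-zero.
        "MAPE": 100.0 * sum(abs(err / r) for err, r in zip(errors, reference)) / n,
    }


# Hypothetical ET0 values in mm/day: PM reference versus model estimate.
et0_pm = [2.0, 4.0, 5.0, 8.0]
et0_ann = [2.5, 4.0, 4.0, 8.5]
metrics = error_metrics(et0_pm, et0_ann)
print(metrics)
```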

Results and discussion
The RBF network parameters of the training and validation processes for models 1, 4, and 7 are shown in Table 3. The minimum radius (spread) of the neurons is the same for all models. The maximum radius is determined during the training process: if the validation error is worse than the training error, the maximum radius is increased; if the training and validation errors are close but greater than the specified value, the maximum radius is reduced. Table 3 shows that as the number of predictors increases, the maximum radius and the number of neurons also increase.

Table 3: RBF network parameters
Network size evaluation was performed using 4-fold cross-validation for the MLP models (Table 4). The optimal number of neurons in the hidden layer is found by an automatic search and is fixed once the search is completed. The number of neurons does not necessarily increase with the number of predictors, as shown in Table 4 for models 7, 8, and 9 compared with models 1, 2, and 3 for the validation data. Artificial intelligence techniques, especially RBF, MLP, and GMDH, are efficient and powerful techniques for simulating the reference evapotranspiration (ET0). The importance of each predictor variable in constructing the model, and the amount of its contribution, is calculated through the variable importance score, as shown in Table 6. The most important variable for estimating ET0 is the maximum temperature, followed by relative humidity and then wind speed. Figures 4 to 12 compare, for the training and validation phases, the ET0 given by the FAO-56 PM method with the ET0 estimated by the artificial intelligence models.

Conclusions
Three artificial neural network techniques (RBF, MLP, and GMDH) are used for estimating ET0. The FAO-56 PM method is used in this research as the reference method for evaluating the performance of the artificial neural network techniques. Five input parameters (Tmean, Tmax, Tmin, RH, and U) are used for constructing the models. The FAO-56 PM method requires many climate variables, and this demand for data is the main obstacle to its use in locations where climate data are limited, especially in developing countries and tropical regions. The performance of the artificial neural network models in both the training and validation phases is evaluated with ME, RMSE, MSE, MAE, and MAPE. According to these statistical errors, the three artificial neural network models with two predictor variables (Tmean and U) simulate ET0 efficiently. The results of all artificial neural network models improve significantly when three input variables (Tmean, U, and RH) are used instead of two. Artificial neural network models, especially RBF, MLP, and GMDH, are efficient and powerful techniques for simulating ET0. The importance of each predictor variable in constructing the model, and the amount of its contribution, is determined through the variable importance score. The most important climate variable for estimating ET0 is the maximum temperature, followed by relative humidity and then wind speed.