Using Artificial Neural Networks for Evaluation of Collapse Potential of Some Iraqi Gypseous Soils

In this research, Artificial Neural Networks (ANNs) are used to predict the collapse potential of gypseous soils. Two models are built: one for the collapse potential obtained from the single oedometer test and the other for the collapse potential obtained from the double oedometer test. A database of laboratory measurements of collapse potential is used. Six parameters are considered to have the most significant impact on the magnitude of collapse potential and are used as inputs to the models: gypsum content, initial void ratio, total unit weight, initial water content, dry unit weight and soaking pressure. The model output is the corresponding collapse potential. Multi-layer perceptrons trained with the back-propagation algorithm are used in this work. A number of issues related to ANN construction, such as the effect of ANN geometry and internal parameters on the performance of the ANN models, are investigated. Information on the relative importance of the factors affecting collapse potential is presented, and practical equations for predicting the collapse potential from the single and double oedometer tests in gypseous soils are developed. It was found that ANNs are able to predict the collapse potential from the single oedometer test and the double oedometer test in gypseous soil samples with a good degree of accuracy. The ANN models developed to study the impact of the internal network parameters on model performance indicate that ANN performance is sensitive to the number of hidden layer nodes, momentum term, learning rate and transfer functions. The sensitivity analysis indicates that the initial void ratio and gypsum content have the most significant effect on the predicted collapse potential.


Introduction
Gypseous soil is a term used to denote soil that contains gypsum. Such soils are found in many regions of the world, mainly in arid and semi-arid regions. Gypseous soils cover about 30% of the surface area of Iraq, with the gypsum content differing from one area to another, Nashat [1,2]. Gypseous soils are usually stiff when dry, but a great loss in strength and a sudden increase in compressibility occur upon wetting.
Several investigators have studied the collapsibility behaviour of gypseous soils and agreed to adopt the term "Collapse Potential", proposed by Jennings and Knight [3], as a guide in the design of foundations on gypseous soils, [3,4,5,6,7,8,9]. This quantity can be measured by testing an oedometer sample after a simple alteration of the test procedure.
Over the last few years, the use of ANNs has increased in many areas of engineering. In particular, ANNs have been applied to many geotechnical engineering problems and have demonstrated some degree of success, [10,11,12,13].
The scope of this paper is to explore the use of Artificial Neural Network (ANN) models for predicting the "Collapse Potential (CP)" from the Single Oedometer Test and the Double Oedometer Test under different conditions, to provide a mathematical equation for predicting the "Collapse Potential (CP)" for each of the two tests based on the ANN technique, and to carry out a sensitivity analysis to identify which of the input variables have the most significant impact on the "Collapse Potential (CP)" predictions of the two models.

Brief Overview of Artificial Neural Networks
An artificial neural network is an attempt to simulate the manner in which the brain interprets information, as understood from current knowledge. Artificial neural networks behave in much the same manner as biological neural networks. Many authors have described the structure and operation of ANNs, e.g. Zurada [14]. ANNs consist of a number of artificial neurons variously known as processing elements (PEs), "nodes" or "units". For multilayer perceptrons (MLPs), which are the most commonly used ANNs in geotechnical engineering, processing elements are usually arranged in layers: an input layer, an output layer and one or more intermediate layers called hidden layers. Each processing element in a specific layer is fully or partially connected to many other processing elements via weighted connections. An individual processing element receives its weighted inputs from many other processing elements; these are summed, and a bias unit or threshold is added or subtracted. The bias unit is used to scale the input to a useful range to improve the convergence properties of the neural network. The result of this combined summation is passed through a transfer function (e.g. logistic sigmoid or hyperbolic tangent) to produce the output of the processing element. For node j, this process is summarized in Equations (1) and (2).
Summation: $$I_j = \theta_j + \sum_{i=0}^{n} w_{ij} x_i \qquad (1)$$
Transfer: $$y_j = f(I_j) \qquad (2)$$
where I_j = the activation level of node j; w_ij = the connection weight between nodes i and j; x_i = the input from node i, i = 0, 1, ..., n; θ_j = the bias or threshold for node j; y_j = the output of node j; and f(.) = the transfer (activation) function.
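The computation performed by a single processing element can be illustrated with a short sketch. It assumes a logistic sigmoid transfer function; the function names and the numerical values are hypothetical and are not taken from the database used in this work.

```python
import math

def logistic(z):
    """Logistic sigmoid transfer function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias, transfer=logistic):
    """Return y_j = f(I_j), where I_j = sum_i(w_ij * x_i) + theta_j."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(activation)

# Hypothetical inputs, weights and bias, for illustration only.
x = [0.5, 0.2, 0.8]
w = [0.4, -0.6, 0.1]
theta = 0.05

y = node_output(x, w, theta)                       # logistic sigmoid
y_tanh = node_output(x, w, theta, transfer=math.tanh)  # hyperbolic tangent
```

Passing a different `transfer` argument is one simple way to compare the transfer functions mentioned in the text.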

Development of ANNs Models
Over the years, several investigators have studied the collapsibility behaviour of gypseous soils and derived many empirical relationships that depend on the factors affecting it. They agreed to adopt the term "Collapse Potential (CP)", proposed by Jennings and Knight [3], as a guide in the design of foundations on gypseous soils.
The data used to calibrate and validate the neural network models are obtained from the literature and include laboratory measurements of collapse potential as well as corresponding information regarding the soil properties, apparatus used and testing conditions. The data cover a range of soil types. The database comprises a total of 345 case records, which can be found in the literature. The steps for developing ANN models include the determination of model inputs and outputs, pre-processing and division of the available data, scaling of the data, determination of an appropriate network architecture, and optimization of the connection weights. A PC-based commercial software system called Neuframe Version 4.0 (Neusciences [15]) is used, in which the optimal network architecture is determined by trial and error.

Models Inputs and Outputs
It is generally accepted that six parameters have the most significant impact on the collapse potential of gypseous soils, and these are therefore used as the ANN model inputs:
Gypsum content (GC), %
Initial void ratio (e_o)
Initial total unit weight (γ_t)
Initial water content (w_o)
Initial dry unit weight (γ_d)
Soaking pressure (P_so), kPa
The output of each model is the collapse potential from the Single Oedometer Test or from the Double Oedometer Test, respectively.

Pre-processing and Data Division
Data pre-processing is very important for using neural networks successfully. It determines what information is presented to the model during the training phase. It can take the form of data scaling, normalization or transformation. Transforming the input data into some known form (e.g. logarithmic, exponential, etc.) may help to improve ANN performance. The next step in the development of ANN models is to divide the available data into three subsets: training, testing and validation. The training set is used to adjust the connection weights of the neural network. The testing set is used to check the performance of the network at various stages of learning, and training is stopped once the error in the testing set increases. The validation set is used to evaluate the performance of the model once training has been successfully accomplished, Shahin [13].
In total, 80% of the data are used for training and 20% are used for validation. The training data are further divided into 70% for the training set and 30% for the testing set. The subsets are divided in such a way that they are statistically consistent and thus represent the same statistical population. To achieve this, several random combinations of the training, testing and validation sets are tried until three data sets that are as statistically consistent as possible are obtained. To examine how representative the training, testing and validation sets are with respect to each other, a t-test and an F-test are carried out. The t-test examines the null hypothesis of no difference in the means of two data sets, and the F-test examines the null hypothesis of no difference in the variances of the two sets. For a given level of significance, test statistics can be calculated to test the null hypotheses for the t-test and F-test, respectively. Traditionally, a level of significance of 0.05 is selected.
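The stopping rule described above, in which training ends once the testing-set error begins to rise, can be sketched as follows. This is an illustrative sketch only: the `patience` parameter and the error values are assumptions, and the software used in this work may implement its stopping criterion differently.

```python
def early_stopping_epoch(test_errors, patience=1):
    """Return the index of the epoch with the lowest testing-set error
    seen before training is stopped.

    Training stops once the testing-set error has failed to improve for
    `patience` consecutive epochs (an assumed parameter).
    """
    best_epoch = 0
    best_error = float("inf")
    waited = 0
    for epoch, err in enumerate(test_errors):
        if err < best_error:
            best_epoch, best_error, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Hypothetical testing-set errors recorded at successive stages of learning.
errors = [0.30, 0.22, 0.18, 0.17, 0.19, 0.21]
stop_at = early_stopping_epoch(errors)
```

A larger `patience` tolerates short-lived increases in the testing error before training is halted.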
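The data division and the consistency checks can be sketched as below. The sketch assumes a simple random shuffle, a pooled-variance two-sample t statistic and a variance-ratio F statistic; the function names, the seed and the synthetic data are hypothetical and stand in for the 345 case records.

```python
import random
import statistics

def split_data(records, seed=0):
    """Split 80/20 into calibration/validation, then 70/30 into training/testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_cal = round(0.8 * len(shuffled))
    calibration, validation = shuffled[:n_cal], shuffled[n_cal:]
    n_train = round(0.7 * len(calibration))
    return calibration[:n_train], calibration[n_train:], validation

def t_statistic(a, b):
    """Pooled-variance two-sample t statistic (null hypothesis: equal means)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

def f_statistic(a, b):
    """Variance ratio for the F-test, with the larger variance in the numerator."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)

# Synthetic stand-in for one input variable (e.g. gypsum content, %).
rng = random.Random(42)
gypsum = [rng.uniform(10, 80) for _ in range(345)]

train, test, valid = split_data(gypsum)
t_train_test = t_statistic(train, test)
f_train_test = f_statistic(train, test)
```

Each statistic would then be compared with its critical value at the 0.05 level of significance, and a different seed tried if a null hypothesis is rejected. Critical values are not computed here; a statistics package such as SciPy could supply them.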

Scaling of Data
The input and output variables are pre-processed by scaling them to eliminate their dimensions and to ensure that all variables receive equal attention during training. Scaling has to be commensurate with the limits of the transfer functions used in the hidden and output layers. A simple linear mapping of the variables' extremes to the neural network's practical extremes is adopted for scaling, as it is the most commonly used method, Shahin [13]. As part of this method, for each variable x with minimum and maximum values of x_min and x_max, respectively, the scaled value x_n is calculated as follows
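A minimal sketch of such a linear mapping is given here, assuming target limits of 0 and 1. The limits `lo` and `hi`, the function names and the example range are assumptions for illustration, not values taken from this work.

```python
def scale(x, x_min, x_max, lo=0.0, hi=1.0):
    """Linearly map x from [x_min, x_max] to [lo, hi]."""
    if x_max == x_min:
        raise ValueError("x_max must differ from x_min")
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)

def unscale(x_n, x_min, x_max, lo=0.0, hi=1.0):
    """Inverse mapping: recover the original value from a scaled value."""
    return x_min + (x_n - lo) * (x_max - x_min) / (hi - lo)

# Hypothetical range for gypsum content (%), for illustration only.
gc_min, gc_max = 10.0, 80.0
gc_scaled = scale(45.0, gc_min, gc_max)
gc_back = unscale(gc_scaled, gc_min, gc_max)
```

With the default limits the mapping reduces to x_n = (x - x_min) / (x_max - x_min). The inverse mapping is needed to convert a scaled model output back to a collapse potential in its original units.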

Model Architecture, Optimization and Stopping Criteria
One of the most important and difficult tasks in the development of ANN models is determining the model architecture (i.e. the number and connectivity of the hidden layer nodes). A network with one hidden layer can approximate any continuous function, provided that sufficient connection weights are used, Shahin [13]. Consequently, one hidden layer is used in this research. The general strategy adopted for finding the optimal network architecture and the internal parameters that control the training process is as follows: a number of trials are carried out using the default parameters of the software, with one hidden layer and 1, 2, 3, ..., 13 hidden layer nodes. It should be noted that 13 (i.e. 2I + 1, where I = 6 is the number of inputs) is the upper limit for the number of hidden layer nodes needed to map any continuous function for a network with 6 inputs, Caudill [16], and is consequently used as the upper limit in this work.
The network that performs best with respect to the testing set is retrained with different combinations of momentum terms, learning rates and transfer functions in an attempt to improve model performance, since the back-propagation algorithm uses a first-order gradient descent technique starting po Consequ function is occurs.

Do
The optim equal 0.15 error differ

ANN
The small network to

Sin
Using the oedometer where

Do
Using the oedometer

Sensitivity Analysis of the ANN Model Inputs
To identify which of the input variables have the most significant impact on the predicted collapse potential from the single and double oedometer tests, a sensitivity analysis is carried out on the ANN models. A simple and innovative technique proposed by Garson [17] is used to interpret the relative importance of the input variables by examining the connection weights of the trained network.
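The weight-partitioning idea behind this technique can be sketched for a network with one hidden layer and a single output. The sketch follows the commonly cited form of Garson's procedure, which uses absolute weight values; the function name and the weights are hypothetical and are not the weights of the models developed in this work.

```python
def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance (%) of each input from the connection weights.

    w_input_hidden[i][j]: weight from input i to hidden node j.
    w_hidden_output[j]:   weight from hidden node j to the output node.
    """
    n_inputs = len(w_input_hidden)
    n_hidden = len(w_hidden_output)
    # Contribution of input i to the output through hidden node j.
    contrib = [[abs(w_input_hidden[i][j]) * abs(w_hidden_output[j])
                for j in range(n_hidden)] for i in range(n_inputs)]
    # Normalise the contributions within each hidden node.
    col_sums = [sum(contrib[i][j] for i in range(n_inputs))
                for j in range(n_hidden)]
    shares = [sum(contrib[i][j] / col_sums[j] for j in range(n_hidden))
              for i in range(n_inputs)]
    total = sum(shares)
    return [100.0 * s / total for s in shares]

# Hypothetical weights for a network with 3 inputs and 2 hidden nodes.
w_ih = [[0.5, -1.0],
        [1.5, 1.0],
        [-2.0, 2.0]]
w_ho = [1.0, -0.5]

importance = garson_importance(w_ih, w_ho)
```

Because absolute values are used, the result indicates the magnitude of each input's influence but not its direction.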

Single Oedometer Test
The results indicate that the gypsum content and initial void ratio have the most significant effect on the predicted collapse potential from the single oedometer test, with relative importances of 27.1% and 26.7%, respectively, followed by dry unit weight, initial water content, soaking pressure and total unit weight, with relative importances of 13.9%, 12.9%, 10.1% and 9.11%, respectively. The results are also presented in Figure 3.

Double Oedometer Test
The results indicate that the initial void ratio has the most significant effect on the predicted collapse potential, followed by the initial water content, with relative importances of 24.6% and 19.1%, respectively. The results also indicate that soaking pressure, gypsum content and dry unit weight have a moderate impact on the collapse potential, with relative importances of 17.4%, 15.5% and 14.4%, respectively, while the total unit weight has the smallest impact, with a relative importance of 9.1%. The results are also presented in Figure 4.

Rob
The mode correlation criteria tha ANNs h oedometer developing

Val
To assess t (CPS) and to predict show that t