

Artificial Neural Network

(This feature is only available in GenEx Enterprise)

 

Theory

An artificial neural network (ANN) is a multivariate model that takes the expression profiles of one or several genes from the training samples as input and the data in one or several classification columns as output. The idea behind an ANN is to transform the original data into one or more classification types through a series of nodes in a network. These classifications are defined in a classification column in the Data editor. Because training an ANN involves many random events, every network will be different even for the same data, but the features and classification potential of the ANNs are typically preserved. If the data has been transposed, exchange all instances of "gene" for "sample", and vice versa, in the following description.
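To illustrate how data passes through a series of nodes, and why two networks built from the same data differ, here is a minimal sketch in Python. It is not GenEx's implementation: the layer sizes, the sigmoid activation, the seed values, and the names make_layer, forward and build_network are all assumptions chosen for illustration.

```python
import math
import random

def make_layer(n_inputs, n_nodes, rng):
    """Create one layer: each node gets n_inputs random weights plus a random bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
            for _ in range(n_nodes)]

def forward(layers, inputs):
    """Propagate one expression profile through every layer of nodes."""
    values = list(inputs)
    for layer in layers:
        next_values = []
        for weights in layer:
            # Weighted sum of the incoming values plus the bias (last weight).
            total = weights[-1] + sum(w * v for w, v in zip(weights, values))
            # Sigmoid activation (an assumption) maps the sum into (0, 1).
            next_values.append(1.0 / (1.0 + math.exp(-total)))
        values = next_values
    return values

def build_network(seed, n_genes=3, n_hidden=4, n_outputs=1):
    """Build an untrained two-layer network; the seed stands in for the random events."""
    rng = random.Random(seed)
    return [make_layer(n_genes, n_hidden, rng),
            make_layer(n_hidden, n_outputs, rng)]

profile = [0.2, 1.5, -0.7]      # hypothetical expression values for three genes
net_a = build_network(seed=1)
net_b = build_network(seed=2)   # same design, different random start
out_a = forward(net_a, profile)
out_b = forward(net_b, profile)
print(out_a, out_b)
```

The two networks share the same design but start from different random weights, so their internal values differ even though the input profile is identical.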

 

How to

Open the Networks tab among the analysis tabs at the top of the main window, and press the Artificial Neural Network button to load the analysis into the Control panel. There are buttons to Save network and to Load network, as well as to start (Run) and Stop the building of a network. The Convergence value is the threshold used in the training process: when the sum of the total error for all training data in the model falls below this value, training is complete. The lower the sum of the errors, the better. There is also a Settings button, which opens a new window where all the necessary parameters and settings for building an ANN can be changed.
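The role of the Convergence value can be sketched with a small training loop. This is only an illustration under stated assumptions: it fits a one-input linear model by gradient descent rather than a full ANN, it uses the sum of squared errors as the total error, and the function name, learning rate, and data are hypothetical.

```python
def train_until_converged(samples, targets, convergence=0.01,
                          learning_rate=0.02, max_epochs=10000):
    """Fit y = w*x + b; stop once the summed squared error is below `convergence`."""
    w, b = 0.0, 0.0
    history = []                      # total error recorded at every epoch
    for _ in range(max_epochs):
        total_error = 0.0
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(samples, targets):
            diff = (w * x + b) - y
            total_error += diff * diff
            grad_w += 2.0 * diff * x
            grad_b += 2.0 * diff
        history.append(total_error)
        if total_error < convergence:
            break                     # threshold reached: training is complete
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history

samples = [0.0, 1.0, 2.0, 3.0]        # hypothetical input values
targets = [1.0, 1.5, 2.0, 2.5]        # hypothetical outputs (y = 0.5*x + 1)
w, b, history = train_until_converged(samples, targets)
print(len(history), history[-1])
```

A lower convergence value demands a closer fit to the training data and therefore takes more epochs to reach.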

 

    

 

In the Advanced Settings window there are three tabs. Under the Network design tab, several parameters used in building an ANN can be modified if desired. The Nr of layers drop-down list defines how many layers of nodes there should be, including the output layer. Select a layer in the upper box to change its settings, and don't forget to press Update to apply the changes. The settings that define the framework of the model include the following.

 

 

These settings define the framework of the model. If you have several classification columns in the active data file, make sure that the correct ones are selected in the Output classifications box. The parameters that relate to the training of the model are the following.

 

 

    

 

There are also tabs in the Advanced Settings window that let you define the number of models that will be built each time the Run button is pressed, as well as additional information about the data that the model is based on.

 

    

 

    

 

The result of the training process is presented as two plots showing the Total error and the Mean error of the model as a function of training time. The error should decrease over time. If the Use cross validation check box was ticked, the error for the randomly selected validation data (not used to build the model) is plotted separately in green. If there are test samples defined in the Data manager, a table is shown with all input data of both training and test samples, the output data for the training samples, and the predicted classification for the test samples. The classifications are calculated as floating-point numbers, while classes are typically defined by integers (classA=1, classB=2,...). The predicted classification can be rounded to obtain the predicted class. The difference between the calculated value and the predicted class (integer) indicates the accuracy of the model.
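The rounding step described above can be sketched as follows. The predicted values are hypothetical, and the function name classify is an assumption for illustration, not part of GenEx.

```python
def classify(predicted_values):
    """Round each calculated value to the nearest integer class and report the deviation."""
    results = []
    for value in predicted_values:
        # Note: Python rounds exact halves (e.g. 2.5) to the nearest even integer.
        predicted_class = int(round(value))
        # A small deviation means the network's output lies close to the class value.
        deviation = abs(value - predicted_class)
        results.append((predicted_class, deviation))
    return results

predictions = [1.08, 1.93, 2.46, 0.97]   # hypothetical calculated values for four test samples
results = classify(predictions)
for value, (predicted_class, deviation) in zip(predictions, results):
    print(f"calculated={value:.2f}  class={predicted_class}  deviation={deviation:.2f}")
```

In this example the third sample (2.46) rounds to class 2 but lies almost halfway to class 3, so its classification is the least certain of the four.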

 

    

 

    

 

    

 

If you want to use an ANN to classify test samples, press the Load network button and select the network file. The result is presented in a table with the input data and classification results for both the training samples (top) and the test samples (bottom), just like the one described above.

 

Warning: If you train a successful ANN, don't forget to save it! You will never get exactly the same network back from a new training run.
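The reason for this warning can be illustrated with a short sketch. It does not reflect GenEx's network file format, which is not described here: the JSON file, the function names, and the stand-in "training" (drawing random weights) are all assumptions.

```python
import json
import os
import random
import tempfile

def train_network(n_weights=5):
    """Stand-in for training (hypothetical): the result depends on random events."""
    return {"weights": [random.uniform(-1.0, 1.0) for _ in range(n_weights)]}

def save_network(network, path):
    """Write the network to disk so it can be reused later."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(network, handle)

def load_network(path):
    """Read a previously saved network back from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)

trained = train_network()
path = os.path.join(tempfile.mkdtemp(), "network.json")  # hypothetical file name
save_network(trained, path)

restored = load_network(path)    # identical to the network that was saved
retrained = train_network()      # a new training run gives different weights

print(restored == trained, retrained == trained)
```

Only the saved file reproduces the original network; retraining on the same data starts from new random values.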