Summary in another language
In this dissertation, the problem of training feedforward artificial neural networks and its applications are considered. The presentation of the topics and results is organized as follows:
In the first chapter, artificial neural networks are introduced. Initially, the benefits of using artificial neural networks are presented. Subsequently, their structure and functionality are described. More specifically, the derivation of artificial neurons from biological ones is presented, followed by the architecture of feedforward neural networks. Historical notes and the use of neural networks in real-world problems conclude the first chapter.
In Chapter 2, the existing training algorithms for feedforward neural networks are considered. First, a summary of the training problem and its mathematical formulation, which corresponds to the unconstrained minimization of a cost function, is given. Subsequently, training algorithms based on the steepest descent, Newton, variable metric and conjugate gradient methods are presented. Furthermore, the weight space, the error surface and the techniques for initializing the weights are described, and their influence on the training procedure is discussed.
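To make the formulation concrete, the following is a minimal sketch of training viewed as unconstrained minimization of a cost function, with a steepest descent update on the weights. The single-hidden-layer architecture, the sigmoid activations, the sum-of-squared-errors cost and the fixed learning rate eta are illustrative assumptions, not details taken from the dissertation.

```python
# Training as unconstrained minimization of a cost E(w), with the steepest
# descent update w_{k+1} = w_k - eta * grad E(w_k).
# Assumed setup (not from the dissertation): one hidden layer, sigmoid units,
# sum-of-squared-errors cost, fixed learning rate.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, X):
    """Propagate the inputs through the network; w = [W1, b1, W2, b2]."""
    W1, b1, W2, b2 = w
    h = sigmoid(X @ W1 + b1)          # hidden-layer activations
    return sigmoid(h @ W2 + b2), h    # network outputs and hidden state

def cost_and_grad(w, X, T):
    """Sum-of-squared-errors cost E(w) and its gradient via backpropagation."""
    W2 = w[2]                         # only W2 is needed to backpropagate the error
    Y, h = forward(w, X)
    err = Y - T
    E = 0.5 * np.sum(err ** 2)
    delta2 = err * Y * (1.0 - Y)                  # output-layer error signal
    delta1 = (delta2 @ W2.T) * h * (1.0 - h)      # hidden-layer error signal
    grad = [X.T @ delta1, delta1.sum(0), h.T @ delta2, delta2.sum(0)]
    return E, grad

def train_gd(w, X, T, eta=0.1, iters=1000):
    """Steepest descent: move each weight array along its negative gradient."""
    for _ in range(iters):
        _, grad = cost_and_grad(w, X, T)
        w = [wi - eta * gi for wi, gi in zip(w, grad)]
    return w
```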
In Chapter 3, a new training algorithm for feedforward neural networks, based on the backpropagation algorithm and the automatic two-point step size (learning rate), is presented [81]. The algorithm uses the steepest descent search direction, while the learning rate parameter is calculated by minimizing the standard secant equation [5]. Furthermore, a new learning rate parameter is derived by minimizing the modified secant equation introduced in [90], which uses both gradient and function value information. Subsequently, a switching mechanism is incorporated into the algorithm so that the appropriate stepsize is chosen according to the status of the current iterative point. Finally, the global convergence of the proposed algorithm is studied and the results of some numerical experiments are presented.
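The two-point stepsize follows the Barzilai-Borwein idea: the learning rate is computed from the most recent step s = w_k - w_{k-1} and gradient change y = g_k - g_{k-1} so that it approximately satisfies the secant equation. The sketch below illustrates that idea only; the flattened weight vector, the fallback stepsize eta0 and the function names are assumptions, and neither the modified secant equation of [90] nor the switching mechanism is reproduced.

```python
# Steepest descent with a two-point (Barzilai-Borwein-type) stepsize:
# eta_k = (s^T s) / (s^T y), with s = w_k - w_{k-1} and y = g_k - g_{k-1}.
# (The alternative two-point value (s^T y)/(y^T y) is also common.)
# Assumptions: weights are a flat numpy array and cost_and_grad(w, X, T)
# returns the cost together with a gradient vector of the same shape.
import numpy as np

def train_two_point(w, X, T, cost_and_grad, eta0=0.01, iters=1000):
    _, g = cost_and_grad(w, X, T)
    w_prev, g_prev = None, None
    for _ in range(iters):
        if w_prev is None:
            eta = eta0                          # no previous point yet
        else:
            s = w - w_prev                      # step between successive iterates
            y = g - g_prev                      # change in the gradient
            denom = s @ y
            eta = (s @ s) / denom if denom > 0 else eta0
        w_prev, g_prev = w, g
        w = w - eta * g                         # steepest descent step
        _, g = cost_and_grad(w, X, T)
    return w
```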
In Chapter 4, some efficient training algorithms based on conjugate gradient optimization methods are presented [43]. In addition to the existing conjugate gradient training algorithms, we introduce Perry's conjugate gradient method as a training algorithm [60]. Furthermore, a new class of conjugate gradient methods is proposed,
called self-scaled conjugate gradient methods, which are derived from the principles of the Hestenes-Stiefel, Fletcher-Reeves, Polak-Ribière and Perry methods. This class is based on the spectral scaling parameter introduced in [5]. Furthermore, we incorporate into the conjugate gradient training algorithms an efficient line search technique based on the Wolfe conditions and on safeguarded cubic interpolation [77]. In addition, the initial learning rate parameter, fed to the line search technique, is automatically adapted at each iteration by a closed formula proposed in [77] and [80]. Finally, an efficient restarting procedure is employed in order to further improve the effectiveness of the conjugate gradient training algorithms and to prove their global convergence. Experimental results show that, in general, the new class of methods performs better, with a much lower computational cost and a higher success rate.
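For orientation, the following sketch shows one conjugate gradient iteration with a spectrally scaled search direction. The beta formulas are the standard Hestenes-Stiefel and Polak-Ribière choices, and the scaling shown is the common spectral value s^T s / s^T y; how these ingredients are actually combined in the dissertation's self-scaled methods may differ, so this is illustrative only.

```python
# One conjugate gradient direction update with spectral scaling and a simple
# restart safeguard.  Formulas for beta are the textbook Hestenes-Stiefel and
# Polak-Ribiere rules; the scaling theta is an assumed (common) spectral form.
import numpy as np

def beta_hestenes_stiefel(g_new, y, d):
    return (g_new @ y) / (d @ y)

def beta_polak_ribiere(g_new, g_old, y):
    return (g_new @ y) / (g_old @ g_old)

def scaled_cg_direction(g_new, g_old, d_old, s):
    """Spectrally scaled conjugate gradient search direction."""
    y = g_new - g_old                               # change in the gradient
    theta = (s @ s) / (s @ y) if (s @ y) > 0 else 1.0   # spectral scaling (assumed form)
    beta = beta_hestenes_stiefel(g_new, y, d_old)
    d_new = -theta * g_new + beta * d_old
    if g_new @ d_new >= 0:                          # restart if not a descent direction
        d_new = -g_new
    return d_new
```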
In the last chapter of this dissertation, the Perry self-scaled conjugate gradient training algorithm presented in the previous chapter is isolated and modified. More specifically, the main characteristics of the training algorithm are maintained, but in this case a different line search strategy, based on the nonmonotone Wolfe conditions, is utilized. Furthermore, a new initial learning rate parameter is introduced [41] for use in conjunction with the self-scaled conjugate gradient training algorithm; it appears to be more effective than the initial learning rate parameter proposed by Shanno in [77] when used with the nonmonotone line search technique. Subsequently, the experimental results for different training problems are presented. Finally, a feedforward neural network is trained with the proposed algorithm for the problem of brain astrocytoma grading, and the results are compared with those achieved by a probabilistic neural network [42].
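The distinguishing point of the nonmonotone line search is that the sufficient-decrease test is measured against the maximum of the last few cost values rather than the current one, so the cost is allowed to increase occasionally. A minimal sketch of such an acceptance test follows; the memory length and the constants c1, c2 are illustrative defaults, not values taken from the dissertation.

```python
# Nonmonotone Wolfe-type acceptance test (Grippo-Lampariello-Lucidi style):
# sufficient decrease is checked against max of the last M cost values,
# together with the standard curvature condition.  Vectors are numpy arrays.
def nonmonotone_wolfe_ok(f_new, g_new, recent_f, g, d, alpha, c1=1e-4, c2=0.9):
    """Return True if the trial stepsize alpha is acceptable.

    f_new, g_new : cost and gradient at w + alpha * d
    recent_f     : last M cost values (including the current one)
    g, d         : current gradient and search direction
    """
    f_ref = max(recent_f)                            # nonmonotone reference value
    decrease_ok = f_new <= f_ref + c1 * alpha * (g @ d)
    curvature_ok = (g_new @ d) >= c2 * (g @ d)
    return decrease_ok and curvature_ok
```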
The dissertation concludes with Appendix A, where the training problems used for the evaluation of the proposed training algorithms are presented.