Evolutionary Optimization of Deep Neural Networks

Abstract

Convolutional neural networks are a widely used class of powerful models from the deep learning domain. Their specific architecture allows them to successfully tackle many tasks, such as image and speech recognition and video analysis. Convolutional architectures have a number of (hyper-)parameters that influence the final recognition error rate. Although convolutional networks attract ever-increasing interest within the research community, the search for good hyper-parameter values is still carried out mostly by hand, which takes an extremely long time and is prone to overlooking promising values. The subject of this study is the design of algorithms that automatically search for hyper-parameters of convolutional architectures. These algorithms should encompass existing knowledge about convolutional networks and, thanks to an appropriate search strategy, yield very good hyper-parameter values in a relatively short time. In this paper, three such algorithms, based on the well-known metaheuristics of evolutionary algorithms and local search, are presented, adapted to the use case of convolutional architectures, and compared. Furthermore, it is shown which algorithm produces the most successful architectures under which circumstances, using several image-recognition datasets.
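To make the abstract's central idea concrete, the following is a minimal, illustrative sketch of an evolutionary hyper-parameter search over a convolutional architecture. The search space, operator choices, and the surrogate fitness function are assumptions made for illustration only; in the paper's actual setting the fitness would be the validation accuracy of a trained convolutional network, not the toy surrogate used here.

```python
import random

# Hypothetical search space for CNN hyper-parameters (illustrative values only).
SEARCH_SPACE = {
    "num_filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "num_layers": [2, 3, 4, 5],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def random_individual(rng):
    """Sample one hyper-parameter configuration uniformly at random."""
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}

def mutate(individual, rng):
    """Point mutation: re-sample one randomly chosen hyper-parameter."""
    child = dict(individual)
    key = rng.choice(list(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child

def evolve(fitness, generations=10, population_size=8, seed=0):
    """Elitist loop: keep the best half, refill by mutating parents."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = [mutate(rng.choice(parents), rng) for _ in parents]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(ind):
    """Stand-in for validation accuracy of a trained network."""
    return ind["num_filters"] / 128 + ind["num_layers"] / 5

best = evolve(toy_fitness)
print(best)
```

Because the best half of each generation is carried over unchanged (elitism), the best configuration found never degrades across generations; local search, the other metaheuristic family compared in the paper, would instead mutate a single current configuration and accept only improvements.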