Peralta B.(a), Saavedra A. (b), Caro L.(a)
(a) Escuela de Informática, Universidad Católica de Temuco, Temuco, Chile
(b) Departamento de Ciencias de Computación, Pontificia Universidad Católica de Chile, Santiago, Chile
XLIII LATIN AMERICAN COMPUTER CONFERENCE (CLEI)
Volume: 2017 Pages: 1-9
DOI: https://doi.org/10.1109/CLEI.2017.8226425
Publication date: December 18, 2017
Abstract
There is growing interest in pattern recognition for tasks such as weather event prediction, best-route recommendation, intrusion detection, and face detection. Each of these tasks can be modelled as a classification problem, for which a common approach is to use an ensemble of classifiers. A well-known example is the Mixture-of-Experts model, a probabilistic artificial neural network consisting of local expert classifiers weighted by a gate network, whose combination creates competition among the experts as they seek to capture patterns in the data source. We observe that this architecture assumes that one gate influences only one data point; consequently, training can be misguided on real datasets where the data are better explained by multiple experts. In this work, we present a variant of the regular Mixture-of-Experts model that maximizes the entropy of the gate network in addition to minimizing the classification cost. The results show the advantage of our approach on multiple datasets in terms of accuracy. As future work, we plan to apply this idea to Mixture-of-Experts with embedded feature selection.
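The abstract does not state the objective explicitly, so the following is only a minimal sketch of one plausible reading: the usual Mixture-of-Experts negative log-likelihood minus a weighted entropy term on the gate outputs, so that minimizing the loss also pushes the gate toward spreading weight over several experts. The function names, the single-example formulation, and the trade-off weight `lam` are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw gate scores into mixing weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_entropy(gate_probs):
    """Shannon entropy (in nats) of the gate's distribution over experts."""
    return -sum(g * math.log(g) for g in gate_probs if g > 0.0)


def regularized_loss(gate_scores, expert_probs, label, lam=0.1):
    """Hypothetical per-example objective: classification cost minus lam * gate entropy.

    gate_scores  : K raw gate scores, one per expert
    expert_probs : K lists, each holding one expert's class probabilities
    label        : index of the true class
    lam          : assumed trade-off weight (not specified in the abstract)
    """
    gate = softmax(gate_scores)
    # Mixture probability of the true class: gate-weighted sum over experts.
    mixture = sum(g * p[label] for g, p in zip(gate, expert_probs))
    nll = -math.log(mixture)
    # Subtracting the entropy means that minimizing the loss maximizes it.
    return nll - lam * gate_entropy(gate)


experts = [[0.7, 0.3], [0.6, 0.4]]  # two experts, two classes
print(regularized_loss([0.0, 0.0], experts, label=0))  # gate spread evenly
print(regularized_loss([5.0, 0.0], experts, label=0))  # gate favors expert 0
```

With `lam = 0` the expression reduces to the standard Mixture-of-Experts classification cost; larger values penalize gates that commit to a single expert, which matches the motivation that some data points are better explained by multiple experts.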