Discrete Neighborhood Representations and Modified Stacked Generalization Methods for Distributed Regression
Héctor Allende-Cid (Pontificia Universidad Católica de Valparaíso, Chile)
Héctor Allende (Universidad Técnica Federico Santa María, Chile)
Raúl Monge (Universidad Técnica Federico Santa María, Chile)
Claudio Moraga (European Centre for Soft Computing, Spain)
Abstract: When distributed data sources have different contexts, the problem of Distributed Regression becomes severe. The context of a source is constituted by its underlying law of probability. A new Distributed Regression System is presented, which makes use of a discrete representation of the probability density functions (pdfs). Neighborhoods of similar datasets are detected by comparing their approximated pdfs. This information supports an ensemble-based approach and the improvement of a second-level unit, as is the case in stacked generalization. Two synthetic and six real datasets are used to compare the proposed method with other state-of-the-art models. The obtained results are positive for most datasets.
Keywords: context-aware regression, distributed machine learning, similarity representation
Categories: G.3, I.2.11