Mining the ESROM: A study of breeding value classification in Manchego sheep by means of attribute selection and construction

Abstract

Manchego sheep breeding is an important part of the economy of Castilla-La Mancha, Spain. For this reason, the selection scheme for Manchego sheep (ESROM) was created to improve milk production in ewes of the Manchego breed. The scheme relies on several tools that depend on ewes’ genetic merit, which is calculated with a sophisticated linear regression model. This paper studies how data mining techniques can help to approximate the genetic merit of a ewe before the official 6-month assessment is carried out, and with less input data. The study focuses on two well-known data mining tasks: pre-processing and classification. In the pre-processing stage, state-of-the-art algorithms and new proposals are used to identify relevant subsets of features by means of selection and construction. These subsets of highly predictive variables are then used to train different classifiers, which in turn are used to assess the genetic merit of any given ewe. As a result, relevant original and constructed variables have been identified for the target problem, which is a valuable result in itself. Furthermore, simulated tests show reliable classification rates when the identified classifiers are applied to ESROM tasks.

Publication
Computers and Electronics in Agriculture 60(2)
Juan L. Mateo
Associate Professor

My research interests include Machine Learning and Bioinformatics.