Application of machine-learning algorithms to predict calving difficulty in Holstein dairy cattle

Avizheh, Mahdieh; Dadpasand, Mohammad; Dehnavi, Elena; Keshavarzi, Hamideh

doi:10.1071/AN22461

Publication Date

2023-05-15

Abstract

Context. An ability to predict calving difficulty could help farmers make better farm-management decisions, thereby improving dairy farm profitability and welfare. Aims. This study aimed to predict calving difficulty in Iranian dairy herds using machine-learning (ML) algorithms and to evaluate sampling methods for dealing with imbalanced datasets. Methods. The history records of cows that calved between 2011 and 2021 on two commercial dairy farms were used. Using WEKA software, four commonly used ML algorithms, namely naïve Bayes, random forest, decision trees, and logistic regression, were applied to the dataset. Calving difficulty was treated as a binary trait, with 0 denoting normal or unassisted calving and 1 denoting difficult calving, i.e. receiving any help during parturition, ranging from assistance by farm personnel to surgical intervention. The average rate of difficult calving was 18.7%, so the dataset was imbalanced. Down-sampling and cost-sensitive techniques were therefore implemented to tackle this problem. Models were evaluated on the basis of F-measure and the area under the curve. Key results. Sampling techniques improved the predictive model (P = 0.07 and P = 0.03 for down-sampling and cost-sensitive techniques respectively). With the balanced dataset, F-measure ranged from 0.387 (decision tree) to 0.426 (logistic regression). However, when applied to the original imbalanced dataset, naïve Bayes had the best performance, with an F-measure of up to 0.388. Conclusions. Overall, sampling techniques improved the prediction model compared with the original imbalanced dataset. Although the prediction models performed worse than expected (owing to the imbalanced dataset and missing values), the implementation of ML algorithms can still lead to an effective method of predicting calving difficulty. Implications. This research indicated the capability of ML algorithms to predict the incidence of calving difficulty within a balanced dataset, but more explanatory variables (e.g. genetic information) are required to improve prediction based on the unbalanced original dataset.
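
The two ideas the abstract relies on, down-sampling the majority class and scoring with F-measure, can be illustrated with a minimal Python sketch. This is not the authors' WEKA workflow: the function names, the random seed, and the toy dataset of 1000 records (187 labelled difficult, to mirror the reported 18.7% rate) are assumptions made purely for illustration, and the study's exact down-sampling procedure is not described in the abstract.

```python
import random


def f_measure(y_true, y_pred, positive=1):
    """F-measure (F1) for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def down_sample(records, labels, seed=0):
    """Randomly discard majority-class rows until both classes have equal size.

    Hypothetical helper: one simple form of down-sampling, not necessarily
    the variant used in the study.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [records[i] for i in kept], [labels[i] for i in kept]


# Toy data only: 1 = difficult calving, 0 = normal calving.
toy_labels = [1] * 187 + [0] * 813
toy_records = list(range(len(toy_labels)))  # stand-ins for cow history records

balanced_records, balanced_labels = down_sample(toy_records, toy_labels)
print(len(balanced_labels), sum(balanced_labels))
```

After down-sampling, the toy dataset keeps every difficult-calving record and an equally sized random subset of normal calvings, so a classifier trained on it no longer sees the original class ratio. F-measure is computed on the difficult-calving class, which is why it is more informative than plain accuracy when that class is rare.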

Citation

Animal Production Science, 63(10-11), p. 1095-1104

ISSN

1836-5787

1836-0939

Link

https://hdl.handle.net/1959.11/61878

Language

en

Publisher

CSIRO Publishing

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International

Type of document

Journal Article

Entity Type

Publication
