Content area
Background
Reproduction is vital to welfare, health, and economics in animal husbandry and breeding. Health and reproduction are increasingly being considered because of the observed genetic correlations between reproduction, health, conformation, and performance traits in dairy cattle. Understanding the detailed genetic architecture underlying these traits would represent a major step in comprehending their interplay. Identifying known, putative or novel associations in genomics could improve animal health, welfare, and performance while allowing further adjustments in animal breeding.
Results
We conducted genome-wide association studies for 25 different traits belonging to four different complexes, namely reproduction (n = 13), conformation (n = 6), production (n = 3), and metabolism (n = 3), using a cohort of over 235,000 dairy cows. As a result, we identified genome-wide significant signals for all the studied traits. The obtained summary statistics collected served as the input for a Mendelian randomisation approach (GSMR) to infer causal associations between putative exposure and reproduction traits. The study considered conformation, production, and metabolism as exposure and reproduction as outcome. A range of 139 to 252 genome-wide significant SNPs per combination were identified as instrumental variables (IVs). Out of 156 trait combinations, 135 demonstrated statistically significant effects, thereby enabling the identification of the responsible IVs. Combinations of traits related to metabolism (38 out of 39), conformation (68 out of 78), or production (29 out of 39) were found to have significant effects on reproduction. These relationships were partially non-linear. Moreover, a separate variance component estimation supported these findings, strongly correlating with the GSMR results and offering suggestions for improvement. Downstream analyses of selected representative traits per complex resulted in identifying and investigating potential physiological mechanisms. Notably, we identified both trait-specific SNPs and genes that appeared to influence specific traits per complex, as well as more general SNPs that were common between exposure and outcome traits.
Conclusions
Our study confirms the known genetic associations between reproduction traits and the three complexes tested. It provides new insights into causality, indicating a non-linear relationship between conformation and reproduction. In addition, the downstream analyses have identified several clustered genes that may mediate this association.
Introduction
Female fertility is of fundamental importance in dairy cattle. Despite this, over the decades, a decrease in reproductive performance has been observed [1, 2], while production traits like milk kg (MKG), fat kg (FKG) or protein kg (PKG) have improved and been the central focus of breeding programmes in dairy production systems [3]. Various factors can contribute to these circumstances, including an extremely low heritability for reproductive performance compared to production traits, a negative genetic correlation between these trait pairs, and management deficiencies [4, 5–6]. To consider these factors and improve functional traits, the genetic correlation between the traits is crucial [7, 8] and remains a persistent challenge in dairy cattle breeding programmes [9]. The modifications needed for enhancement are of further interest to fulfil society’s increasing awareness of animal welfare and the ongoing changes in husbandry conditions, such as climate change [10, 11]. Moreover, genetic correlation can arise for distinctive reasons, including but not limited to linkage between causal variants or forms of pleiotropy [12, 13], i.e., more than one trait is affected by a genetic variant.
Furthermore, it is important to differentiate between genetic correlation and causation and identify the underlying causal mechanisms [12, 13–14]. The extent and direction of genetic correlation may vary depending on the specific traits under consideration [9, 15]. Nevertheless, a detailed and direct inference about causality or causal associations from genetic correlations alone is not feasible [13, 16, 17]. It is necessary to move beyond global correlation to gain detailed genomic correlation and causality analysis and determine the connection between quantitative traits in dairy cattle breeding [18, 19].
Several established approaches and methods can be used to reveal potential causal associations between traits [14]. Randomised controlled trials (RCTs) are generally considered the gold standard for causality testing of possible exposure (e.g. treatment) against outcomes (e.g. disease) [20]. Nevertheless, RCTs are restricted in their ability to ascertain the causal relationships between quantitative traits due to the control challenges for potential confounding factors [21]. Moreover, the small sample size and controlled experimental conditions may not accurately reflect the conditions present on commercial dairy farms. It is important to note that only observational data is available, which can limit the reliability of the findings. One way to address this problem is to use single nucleotide polymorphisms (SNPs) as instrumental variables (IVs) for achieving randomisation, a method known as the Mendelian randomisation (MR) approach [12, 22, 23]. This method allows for the inference of potential causal association effects between two traits. The design of the method defines the case and control groups, revealing the potential association effects between exposure and outcome.
Burgess et al. [12] provided guidelines for MR analyses and compared the various available methods. In general, three key assumptions must be fulfilled by the IVs [24, 25]: (1) the IV is associated with the exposure, (2) the IV does not affect the outcome except potentially via the exposure, and (3) the IV is not associated with the outcome due to confounding pathways. This study used summary statistics from genome-wide association studies (GWAS) and performed MR using the generalised summary-data-based Mendelian randomisation (GSMR) approach [17]. This method identified and quantified genetic effects between traits and narrowed down potential mediators for causal association effects, such as pleiotropy [17, 26].
Briefly, trait interrelationship may arise due to varying reasons, simplified here into three major categories: (1) real (biological) pleiotropy, indicating one variant is affecting two traits similar to one variant is causative for both traits (2) mediated (vertical/indirect) pleiotropy, whereby one variant affects one trait through another, and (3) supposed real (horizontal/direct) pleiotropy, where causal variants for two traits are in linkage disequilibrium (LD) with a variant associated with both traits but may even fall in distinct loci [12, 27].
Notably, the second category, mediated pleiotropy, is particularly interesting, as it encompasses the genetic correlation employed to ascertain the underlying causal association [13]. Consequently, the principal objective of the GSMR method is to infer the causal association between traits mediated by the IVs [12, 17]. Moreover, this method has additional advantages, including identifying horizontal pleiotropic variants that affect both traits and their removal from further analyses, as well as approximating the causal effect of an exposure on an outcome [17, 26]. The most notable distinction between MR studies in livestock species and humans is the discrepancy in the availability of potential exposure data and the observational setting. While voluntary and self-determined treatments for dairy cattle are scarcely available, human studies commonly focus on self-determined behaviours such as smoking [22], alcohol consumption [28] or physical activities [29].
To better understand the interconnections between various trait complexes, we analysed traits related to reproduction, production, conformation, and metabolism to assess their potential influences on one another. The traits associated with the complexes of conformation, production and metabolism were selected for their putative exposures on the reproduction traits as the outcome, based on known or presumed genetic connections (e.g. body condition score and calving interval [30] or negative energy balance and hypothalamo-hypophyseal axis [31]).
Ketosis (KET), displaced abomasum (LMV) and milk fever (MIF) were used as potential indicators of metabolism. In addition to the body condition score (BCS), milk type (MTY) and several overall scores for body (OBS), udder (OUS), feet and legs (OFL), or conformation (OCS) were used to outline conformation. Here, we used an approach based on GWAS summary statistics for 25 traits, including 235,164 German Holstein Friesian dairy cows. The identified IVs were then used for downstream analyses to identify potential candidate genes responsible for the detected effects. Additionally, we performed genetic restricted maximum likelihood (GREML) analyses to estimate variance components within a smaller dataset of 48,118 animals. The study combined information from GSMR and GREML to investigate the overall interrelationship between traits based on SNPs from GREML and IVs identified from GSMR.
Material and methods
Animals and phenotypes
Phenotypic and genotypic data were provided by the national computing centre VIT (Vereinigte Informationssysteme Tierhaltung w.V., Verden, Germany) and part of the ‘KuhVision’ dataset based on German Holstein cows, which represent the genomic pattern of the German Holstein Friesian reference population [8].
The number of animals with observational data per trait ranged from 110,629 to 192,188 (Table 1), representing 235,164 animals in total. We estimated the variance components using a subset of 48,118 animals that simultaneously had complete observations for all traits. To ensure clarity, we have employed the terms and abbreviations defined by VIT [32]. A total of 25 traits related to conformation, production, metabolism, or reproduction were analysed.
Table 1. Complex, trait, abbreviation and number of animals (No.) used for GWAS and subsequent GSMR analyses
Complex | Trait | Abbreviation | No. | h2 (SE) |
|---|---|---|---|---|
Exposure | ||||
Conformation | Body condition score | BCS | 163,127 | 0.318 (0.007) |
Milk type | MTY | 163,127 | 0.298 (0.007) | |
Overall conformation score | OCS | 163,127 | 0.281 (0.007) | |
Overall body score | OBS | 163,127 | 0.339 (0.007) | |
Overall udder score | OUS | 163,127 | 0.286 (0.007) | |
Overall feet and legs score | OFL | 163,127 | 0.108 (0.005) | |
Metabolism | Ketosis | KET | 148,515 | 0.070 (0.004) |
Displaced abomasum | LMV | 133,525 | 0.102 (0.005) | |
Milk fever | MIF | 145,802 | 0.068 (0.004) | |
Production | Milk kg | MKG | 180,217 | 0.439 (0.007) |
Fat kg | FKG | 180,217 | 0.372 (0.007) | |
Protein kg | PKG | 180,217 | 0.324 (0.007) | |
Outcome | ||||
Reproduction | Non-return rate cows | NRc | 163,663 | 0.063 (0.004) |
Non-return rate heifers | NRh | 187,004 | 0.106 (0.005) | |
First to last insemination cows | FSc | 149,289 | 0.146 (0.005) | |
First to last insemination heifers | FSh | 176,091 | 0.091 (0.004) | |
Calving to first insemination cow | CFc | 165,212 | 0.109 (0.005) | |
Days open cow | DOc | 165,212 | 0.167 (0.006) | |
Calving ease direct | CEd | 130,495 | 0.095 (0.005) | |
Calving ease maternal | CEm | 185,153 | 0.097 (0.005) | |
Stillbirth direct | SBd | 138,592 | 0.103 (0.005) | |
Stillbirth maternal | SBm | 192,188 | 0.132 (0.005) | |
Retained placenta | NGV | 142,395 | 0.068 (0.004) | |
Metritis/Endometritis | MET | 126,842 | 0.054 (0.003) | |
Ovary cycle disturbances | ZYS | 110,629 | 0.070 (0.004) | |
h2 = SNP-based heritability estimates of the traits with their standard errors (SE)
The 13 reproduction traits reflect the calving-related traits with calving ease direct/maternal (CEd/CEm), endometritis/metritis (MET), retained placenta (NGV), stillbirth direct/maternal (SBd/SBm) or ovary cycle disturbances (ZYS). Additionally, they also include fertility indicators such as calving to first insemination cow (CFc), days open cow (DOc), first to last insemination cow/heifer (FSc/FSh) or non-return rate cows/heifers (NRc/NRh).
For this analysis, we used phenotypic data as deregressed proofs (DRPs) obtained from VIT’s 2021 breeding value estimation. The calculations were made using the deregression method by Jairath et al. [33], which aligned with its application described by Liu and Masuda [34]. However, it is important to highlight the informative value of the DRPs, which were corrected for environmental factors and the population mean. Therefore, they were taken from different steps of the national breeding value estimation in Germany. For example, a lower value for MET implies a higher susceptibility, while a lower value for CFc indicates a shorter period than the population mean.
Genotyping data, quality control
The genotypes comprising 45,613 SNPs were obtained through routine genetic evaluation. The study mainly included animals imputed to the 50K level using various versions of the EuroGenomics low-density chips (Eurogenomics, Amsterdam, NL), with a minority genotyped using various versions of the Illumina 50K chips (Illumina Inc., San Diego, CA) and EuroGenomics medium-density chips (Eurogenomics, Amsterdam, NL).
The imputation procedure described in Segelke et al. was implemented [35]. To perform further analyses, we filtered the data using PLINK v1.90 [36], removing SNPs with a minor allele frequency below 1%. The genotype dataset used in the GCTA tool version 1.93.2 beta contained 44,994 SNPs on Bos taurus (BTA) autosomes and the X chromosome (BTAX) [37].
Variance components and genetic correlations
Variance components and genetic correlations between traits were estimated using GREML [38] as implemented in GCTA [37]. The analysis was conducted on a subset of 48,118 animals. The trait variances were partitioned into an additive genetic and a residual component. This partition was based on the genetic relationship between the individuals using the autosomal SNPs (44,136) [38] in a univariate analysis. The SNP heritability (the proportion of phenotypic variance explained by the additive effects of common SNPs) was estimated. A bivariate GREML (biREML) analysis was conducted to assess the genetic correlations between traits, as described by Lee et al. [39].
Genome-wide association studies
Each trait was split into two equal cohorts to accommodate the large number of animals per trait and meet the computational demand. GWAS was conducted separately for each cohort. To identify the genomic regions associated with each one, single-trait GWAS was performed for all 25 using GCTA [37] with a mixed-linear model approach (MLMA) using a single SNP regression [40], as implemented with the model below:
In this case, the phenotype vector represented as DRPs is denoted by , while is a vector of ones and µ signifies the mean term. The additive (fixed) effect of the individual candidate SNP being tested for association is represented by b, and the vector stores the SNP genotype indicator variables coded as 0, 1 or 2 for all animals for one of the 44,994 SNPs on the autosomes and BTAX. The genetic relationship matrix (GRM) captures the polygenic effect, represented by , while the residual is represented by . To account for the kinship structure of the study population, the GRM was constructed using all autosomal SNPs (44,136). Subsequently, meta-analyses were conducted using the METAL program (version 2020-05-05) [41] to combine both cohorts per trait and obtain a joint test statistic. METAL utilises GWAS test statistics obtained from MLMA, which include estimated effects and standard errors per SNP. Furthermore, METAL considers individual cohorts’ sample sizes to combine p-values and convert them into merged signed Z-scores [41].
In addition to the default settings, the flag for genomic control correction was set to restrict the genomic inflation factor (lambda, λ) of every test statistic to a maximum value of one. To ensure accurate results, we conducted a genomic control parameter [42] estimation for each input file. We then applied genomic control correction to input statistics before performing a meta-analysis and creating the joint summary statistic.
The meta-analyses were performed twice for each trait [41]. First, the separate cohorts were combined to obtain a joint test statistic. Second, the obtained combined test statistic was re-processed to correct any putative genomic inflation arising from the combination. Therefore, to prevent p-value inflation, the λ values for genomic inflation were restricted to a maximum of one (λ ≤ 1). Subsequently, the Z-scores obtained from METAL were transformed back to beta values using the method described by Zhu et al. [26]. The basic principle of transformation relies on bzx (effect of the single SNP (z) on the trait, here exposure (x)) that could be interpreted in standard deviation (SD) units, i.e. bzx = with p being the allele frequency and n being the sample size [26].
For considering genetic variants as associated, the GWAS threshold for significance was Bonferroni-corrected on a genome-wide level to account for multiple SNP testing. Therefore, a type-I error value of α = 0.05 was used, divided by the number of tests (p = 1.11 * 10−6 [(0.05/44,994), − log10 (p) ≈ 5.95]). The Manhattan plots for graphical representation were generated using the R package ggplot2 [43] in R version 4.2.0 [44].
Mendelian randomisation
Mendelian randomisation was performed using summary statistics from the above GWAS and the GSMR approach implemented in GCTA [17]. This method extends the summary-data-based MR (SMR) method [26]. It integrates the estimates of SNP effects and the distinction between causality; in this case, the influence of the SNP on the outcome is mediated by the exposure. Furthermore, it incorporates horizontal pleiotropy; in this case, the SNP has a different impact on exposure and outcome, thus accounting for LD structure and variance in both [17].
Zhu et al. provide a comprehensive explanation of the method [17]. However, briefly, the following filter settings were used: the GWAS threshold p-value was set to 1.11 * 10−6 after applying Bonferroni correction. The clumping r2 threshold was set to 0.05, with a minimum of ten genome-wide significant and quasi-independent SNPs used as the default for analysis. The fundamental concept is that if an exposure (x) affects an outcome (y), any instrumental variables (IVs, SNPs, z) that are causally associated with x will also affect y. Moreover, the effect of x on y (bxy) is expected to be identical for any SNPs without pleiotropy [17].
In this study, the MR estimate of the GSMR method, which indicates the causal effect of a specific exposure on a given outcome trait, is calculated for the individual variant (i) as bxy(i) = . For each combination of individual exposure against the individual outcome, the overall bxy equals the sum of all the single IV effects estimated, following a normal distribution given ~ N(bxy,V). In this context, V represents the variance–covariance matrix of bxy, which includes the LD correlation between the IVs [17]. Using the HEIDI (heterogeneity in dependent instruments) outlier filtering [17], we crucially removed horizontal pleiotropic variants to avoid any possible violation of the MR assumptions outlined above (e.g. [25]). The fundamental concept was to test every IV for a significant deviation (di) of the individual bxy(i) from a target SNP that had a strong association with the exposure tested (bxy(top)).
Interestingly, the capacity to identify heterogeneity is enhanced as the correlation between the SNP and the exposure in question strengthens. However, simply selecting the most exposure-associated SNP is not feasible as there is a possibility it may have a markedly strong pleiotropic effect [17]. Therefore, to increase robustness, the distribution of bxy is ordered as a function of − log10(p-value) for bzx, and the SNP with the strongest association is then selected in the third quintile of the distribution itself. Given the approximation of di = bxy(i) − bxy(top) and var(di) = var(bxy(i) − bxy(top)), considering the LD among IVs and filtering for LD that was not removed by the clumping filter beforehand [17].
Each potential exposure trait was tested against all reproduction-associated traits as separate outcomes. For every trait combination, we obtained the instrumental variables (SNPGSMR) that represented them, the estimated mediated effect of the exposure on the outcome (bxy), both overall and per individual IV, as well as the corresponding p-value (pGSMR) after HEIDI filtering for the removal of pleiotropic SNPs [17]. To correct for the multiple testing on 13 different reproduction traits jointly against each single potential exposure trait and to determine the significance threshold for each combination (single exposure vs. single outcome), we used the remaining number of SNPs after applying the GWAS threshold and LD-clumping (= SNPINDEX) for correction. As a result, we used a threshold of pGSMR < (0.05/SNPINDEX) for significance.
Downstream analysis
The study used the Ensembl Variant Predictor (VEP) release 94 [45] with markers identified by the GSMR method and the results for SNPGSMR per trait combination as the data input. The Bos taurus genome (assembly ARS-UCD1.2) was screened for putative genes within a 1000 base pair (bp) range downstream and upstream of each identified marker. Only known genes with confirmed symbols approved by the gene nomenclature [46] were considered. As a result of the GSMR analysis, four traits from three complexes were selected: BCS and OBS for conformation, MKG for production, and KET for metabolism.
For the conformation traits, we chose a direct trait and a linear score trait to compare. Finally, the R package VennDiagram [47] was used to generate a Venn diagram showing shared candidate genes between the four selected traits.
Results
Variance components and genetic correlations
Our results show that heritability estimates were low for traits representing reproduction, metabolism, and partly conformation (e.g. OFL h2 = 0.108, Table 1). Regarding reproductive and metabolic traits, MET had the lowest heritability (h2 = 0.054) and DOc the highest (h2 = 0.167). Moderate estimates were obtained for the other conformation traits (like BCS h2 = 0.318) and for production, demonstrating even higher heritability estimates (MKG h2 = 0.439).
We used the biREML option to estimate genetic correlations, which involved all 44,136 autosomal SNPs in the panel. For this study, these estimates are presented at three different levels: (1) between all exposure and all outcome combinations, (2) within exposure only, and (3) within outcome traits only. The results of the exposure-outcome analysis are presented in Table 2, which includes SNP-based correlation estimates and their corresponding standard errors (SE).
Table 2. SNP-based genetic correlation estimates (rG) between exposure (vertical) and outcome traits (horizontal)
Trait | CEd | CEm | CFc | DOc | FSc | FSh | MET | NGV | NRc | NRh | SBd | SBm | ZYS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BCS | 0.005 (0.028) | 0.003 (0.028) | − 0.373 (0.024) | − 0.293 (0.023) | − 0.194 (0.024) | − 0.152 (0.028) | − 0.019 (0.033) | 0.118 (0.031) | 0.081 (0.031) | 0.040 (0.027) | 0.007 (0.027) | − 0.020 (0.026) | 0.166 (0.030) |
FKG | − 0.032 (0.027) | − 0.033 (0.027) | 0.283 (0.024) | 0.288 (0.022) | 0.259 (0.023) | 0.215 (0.026) | 0.066 (0.032) | 0.036 (0.030) | − 0.208 (0.029) | − 0.131 (0.026) | − 0.039 (0.027) | 0.020 (0.025) | − 0.140 (0.029) |
KET | 0.163 (0.038) | 0.145 (0.038) | − 0.160 (0.037) | − 0.248 (0.033) | − 0.264 (0.033) | − 0.238 (0.038) | 0.260 (0.043) | 0.275 (0.040) | 0.211 (0.042) | 0.119 (0.037) | 0.120 (0.038) | 0.088 (0.036) | 0.277 (0.040) |
OBS | − 0.167 (0.027) | 0.074 (0.028) | 0.210 (0.026) | 0.229 (0.023) | 0.207 (0.024) | 0.128 (0.028) | − 0.165 (0.032) | − 0.101 (0.031) | − 0.159 (0.031) | − 0.095 (0.027) | − 0.089 (0.027) | 0.005 (0.025) | − 0.157 (0.030) |
LMV | 0.192 (0.035) | 0.107 (0.035) | − 0.144 (0.034) | − 0.198 (0.030) | − 0.193 (0.031) | − 0.162 (0.035) | 0.333 (0.038) | 0.245 (0.037) | 0.134 (0.039) | 0.088 (0.034) | 0.152 (0.034) | 0.107 (0.032) | 0.285 (0.036) |
MIF | 0.108 (0.039) | 0.121 (0.039) | − 0.185 (0.037) | − 0.184 (0.034) | − 0.159 (0.035) | − 0.159 (0.035) | 0.190 (0.044) | 0.138 (0.042) | 0.063 (0.043) | 0.027 (0.038) | 0.088 (0.039) | 0.110 (0.036) | 0.258 (0.036) |
MKG | 0.010 (0.026) | − 0.011 (0.026) | 0.456 (0.022) | 0.385 (0.020) | 0.287 (0.022) | 0.239 (0.025) | 0.009 (0.031) | − 0.087 (0.029) | − 0.152 (0.029) | − 0.100 (0.025) | 0.007 (0.026) | 0.005 (0.024) | − 0.171 (0.028) |
MTY | − 0.044 (0.028) | − 0.006 (0.028) | 0.388 (0.024) | 0.319 (0.023) | 0.217 (0.024) | 0.154 (0.028) | − 0.030 (0.033) | − 0.111 (0.031) | − 0.090 (0.031) | − 0.049 (0.027) | − 0.026 (0.028) | 0.002 (0.026) | − 0.166 (0.030) |
OCS | 0.013 (0.029) | 0.087 (0.028) | 0.021 (0.028) | 0.004 (0.025) | − 0.013 (0.025) | − 0.016 (0.029) | 0.067 (0.033) | − 0.005 (0.031) | 0.015 (0.032) | 0.005 (0.028) | 0.020 (0.028) | 0.050 (0.026) | 0.049 (0.031) |
OFL | 0.136 (0.035) | 0.028 (0.035) | − 0.119 (0.033) | − 0.088 (0.030) | − 0.050 (0.031) | − 0.049 (0.035) | 0.139 (0.040) | 0.086 (0.038) | 0.025 (0.039) | 0.004 (0.034) | 0.135 (0.034) | 0.020 (0.032) | 0.146 (0.037) |
OUS | 0.014 (0.028) | 0.034 (0.028) | − 0.088 (0.027) | − 0.110 (0.024) | − 0.126 (0.025) | − 0.150 (0.029) | 0.090 (0.030) | 0.050 (0.031) | 0.129 (0.032) | 0.079 (0.028) | 0.017 (0.028) | − 0.014 (0.026) | 0.209 (0.030) |
PKG | 0.035 (0.028) | − 0.009 (0.028) | 0.456 (0.023) | 0.393 (0.021) | 0.302 (0.023) | 0.254 (0.027) | 0.057 (0.033) | − 0.066 (0.031) | − 0.173 (0.030) | − 0.123 (0.027) | 0.024 (0.027) | 0.022 (0.025) | − 0.170 (0.030) |
Values for rG > |0.200| are displayed in italic. The standard error (SE) for each combination is displayed in parentheses. Trait abbreviations are explained in Table 1
Overall, the genetic correlations (rG) varied between rG = − 0.373 ± 0.024 for BCS against CFc (BCS‑CFc) and rG = 0.456 ± 0.023 for PKG‑CFc. For most combinations, rG values were low, with absolute values of the estimates close to the standard errors. Moderate negative genetic correlations were found for BCS‑CFc (rG = − 0.373 ± 0.024) as well as BCS‑DOc (rG = − 0.293 ± 0.023), along with FKG‑NRc (rG = − 0.208 ± 0.029).
In contrast, moderate positive genetic correlations were found for several traits, such as FKG‑CFc (rG = 0.283 ± 0.024), FKG‑DOc (rG = 0.288 ± 0.022), MKG‑FSc (rG = 0.287 ± 0.022) and PKG‑FSh (rG = 0.254 ± 0.027).
For the exposure traits (Additional file 1, Table S1), the genetic correlations ranged from rG = -0.346 ± 0.019 for BCS‑PKG to rG = 0.831 ± 0.005 for MKG‑PKG. For BCS‑MTY, the genetic correlation estimation failed; however, testing another model excluding the residual covariance revealed an rG close to 1 between both traits (rG = − 0.988 ± 0.001).
The genetic correlations for most traits (45 out of 65 combinations) were found to be low (rG < |0.200|). Within the different fields, moderate (KET‑MIF) up to high genetic correlations (MKG‑PKG) were observed with rG = 0.582 ± 0.028 and rG = 0.831 ± 0.005, respectively.
The outcome traits (Additional file 1, Table S2) showed a range of rG values, from rG = − 0.786 ± 0.015 for FSh‑NRh to rG = 0.912 ± 0.006 for DOc‑FSc. Of 76 trait combinations, 31 had a low genetic correlation (rG < |0.200|). The estimation for two combinations, FSc‑NRc and FSh‑NRc, also failed. However, dropping the residual covariance revealed a high genetic correlation of rG = − 0.985 ± 0.002 and − 0.994 ± 0.002, respectively.
Genome-wide association studies
For all 25 traits (12 exposure, 13 outcome), genetic variants were found that passed the genome-wide significance threshold (p = 1.11 * 10 −6). Additional file 2, Tables S3 to S4, lists all genome-wide significant variants per trait and their corresponding p-values. Manhattan plots for all traits not shown in this section are displayed in Additional file 3, Figures S1 to S4.
Production traits
Signals for all three production traits were found on BTA1, 2, 3, 5, 6, 11, 14, 15, 19, 20, 26, 27, 28, 29 and BTAX. On BTA14, a region between 1,463,676 and 2,909,929 bp contained a prominent peak for all three traits, including SNP ARS-BFGL-NGS-4939 as the most significant hit.
Metabolic traits
The metabolic traits analysed were KET, LMV and MIF. Several SNPs reached the genome-wide significance threshold, with peaks on BTA6 and BTAX (KET and MIF in Fig. 1). The region between 88,592,295 and 88,891,318 bp on BTA6 contained several significant variants, three of which were common to all traits. SNP BTB-01654826 was the most significant hit across all traits in this region.
Fig. 1 [Images not available. See PDF.]
Manhattan plots for exemplary results of GWAS for metabolic traits. Ketosis (KET) and milk fever (MIF). Negative decadic logarithm of p-value of each SNP is shown on the y-axes, and on the x-axes, the 29 autosomes and X chromosome are shown. The red line represents the significance threshold on genome-wide level p = 1.11 * 10–6
Conformation traits
For conformation, the traits BCS, OBS, MTY, OCS, OFL, and OUS were analysed, and peaks on BTA2, 4, 5, 6, 7, 8, 9, 11, 13, 14, 18, 19, 20, 21, 25, 26, 28, 29 and BTAX were found. The strongest signals were on BTA5, 6, 8 and 11 for BCS, OBS and MTY, including ARS-BFGL-NGS-118182 on BTA6 for BCS and MTY as well as ARS-BFGL-NGS-11105 on BTA11 for OBS. The results for BCS and OBS are displayed as examples in Fig. 2.
Fig. 2 [Images not available. See PDF.]
Manhattan plots for exemplary results of GWAS for conformation traits. Body condition score (BCS) and overall body score (OBS). Negative decadic logarithm of p-value of each SNP is shown on the y-axes, and on the x-axes, the 29 autosomes and X chromosome are shown. The red line represents the significance threshold on genome-wide level p = 1.11 * 10−6
Reproduction traits
Genome-wide significant signals were identified for the 13 reproduction traits (CEm and CFc in Fig. 3) on BTA5, 6, 7, 9, 12, 14, 15, 17, 18, 20, 21, 23, 26, 28, 29, and BTAX. In total, two to 33 genome-wide significant SNPs were identified per trait. Furthermore, within the context of all chromosomes, BTAX revealed the highest number of significant associations, with a total of 12 distinct traits. Of all autosomes, BTA6 and BTA18 followed, including significant signals for a maximum of seven different traits concurently. The strongest signal for both autosomes was found for BTB-00277427 on BTA6 and ARS-BFGL-NGS-109285 on BTA18.
Fig. 3 [Images not available. See PDF.]
Manhattan plots for exemplary results of GWAS for reproduction traits. Calving ease maternal (CEm) and calving to first insemination (CFc). Negative decadic logarithm of p-value of each SNP is shown on the y-axes, and on the x-axes, the 29 autosomes and X chromosome are shown. The red line represents the significance threshold on genome-wide level p = 1.11 * 10−6
Mendelian randomisation
For GSMR analyses, 135 out of 156 trait combinations surpassed the pGSMR threshold for significance (pGSMR < (0.05/number of SNPINDEX)). Following the homogeneity filtering process, the number of identified SNPGSMR as instrumental variables ranged from three to 31. Furthermore, the number of SNPs tested for the 13 reproduction-associated outcome traits (SNPOUT) was constant. However, the number of SNPs tested per exposure (SNPEXP) varied depending on the trait. An overview of the outcomes is shown in Table 3, while Additional file 4, Table S5 displays detailed results for the number of identified SNPGSMR and the corresponding pGSMR value for each combination.
Table 3. Overview of GSMR result table for all exposures and outcomes
Exposure | SNPEXP | SNPOUT | SNPINDEX | Significant traits | SNPGSMR | Range bxy |
|---|---|---|---|---|---|---|
Conformation | ||||||
BCS | 101 | 135 | 31 | 12 | 10–17 | − 0.499–0.386 |
MTY | 75 | 135 | 30 | 13 | 09–18 | − 0.400–0.514 |
OCS | 5 | 135 | 21 | 8 | 04–09 | − 0.323–0.262 |
OBS | 68 | 135 | 40 | 12 | 09–20 | − 0.307–0.298 |
OUS | 32 | 135 | 30 | 13 | 08–15 | − 0.526–0.320 |
OFL | 4 | 135 | 25 | 10 | 04–12 | − 0.933–1.335 |
Metabolism | ||||||
KET | 8 | 135 | 29 | 13 | 03–16 | − 0.909–1.543 |
LMV | 5 | 135 | 26 | 12 | 04–14 | − 0.822–0.883 |
MIF | 6 | 135 | 27 | 13 | 10–22 | − 1.496–1.244 |
Production | ||||||
FKG | 117 | 135 | 43 | 10 | 10–18 | − 0.208–0.415 |
MKG | 110 | 135 | 56 | 10 | 16–31 | − 0.207–0.349 |
PKG | 72 | 135 | 41 | 9 | 08–14 | − 0.231–0.557 |
Results for different exposure traits from the different complexes tested. SNPEXP represents the number of genome-wide significant (p = 1.11 * 10 −6) SNPs for exposure traits, and SNPOUT all genome-wide significant (p = 1.11 * 10 −6) SNPs for reproduction traits, respectively. SNPINDEX equals the number of SNPs after the additional clumping filter. Significant traits indicate how many of the 13 tested reproduction traits reached the pGSMR significance threshold. At the same time, SNPGSRM and range bxy represent the number of SNPs and effect size for each of the remaining significant combinations, respectively. Trait abbreviations are explained in Table 1
Out of the 39 combinations tested for the three analysed production traits (FKG, MKG, PKG), ten were below the threshold for statistical significance, ranging from eight to 31 SNPGSMR and a bxy between bxy = − 0.231 for PKG as exposure for ZYS (PKG → ZYS) and bxy = 0.557 (PKG → FSc). Additionally, MKG had the highest SNPINDEX (n = 56) within this group and across all production traits, followed by FKG with a SNPINDEX of 43 and PKG with a SNPINDEX of 41.
All combinations except one (LMV → SBd), KET, LMV, and MIF, representing metabolic traits, exceeded the corresponding significance threshold. The combination of KET → CFc had the lowest number of SNPGSMR, with three different SNPGSMR, while MIF → ZYS had the highest number, with 22 different SNPGSMR. The range of bxy for metabolism was higher than all combinations, varying between bxy = − 1.496 (MIF → FSc) and bxy = 1.543 (KET → NGV). The lowest number of SNPINDEX was observed for LMV (n = 26), followed by MIF with 27 different SNPGSMR. The highest number was observed for KET, with 29 different SNPGSMR.
There were 78 combinations of the six traits and conformation scores tested, of which 68 exceeded the significance threshold. Within this group, we identified four to 20 different SNPGSMR, each with a corresponding bxy value. These values ranged from bxy = − 0.933 (OFL → NRh) to bxy = 1.335 (OFL → FSh). The highest number of SNPINDEX was obtained for OBS, including 40 different SNPGSMR, while the lowest amount was observed for OCS, with only 21. Additional file 4, Table S5 provides further details on all six traits, and Table 4 displays the results for the detected effect sizes bxy.
Table 4. bxy results for all combinations of exposure and outcome
Trait | CEd | CEm | CFc | DOc | FSc | FSh | MET | NGV | NRc | NRh | SBd | SBm | ZYS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BCS | 0.224 (sign) | 0.351 (sign) | − 0.431 (sign) | − 0.499 (sign) | − 0.435 (sign) | − 0.291 (sign) | 0.196 (sign) | 0.242 (sign) | 0.278 (sign) | – | 0.226 (sign) | 0.318 (sign) | 0.386 (sign) |
MTY | − 0.310 (sign) | − 0.365 (sign) | 0.351 (sign) | 0.203 (sign) | 0.514 (sign) | 0.367 (sign) | − 0.227 (sign) | − 0.293 (sign) | − 0.356 (sign) | − 0.206 (sign) | − 0.299 (sign) | − 0.177 (sign) | − 0.400 (sign) |
OCS | – | − 0.315 (sign) | − 0.242 (sign) | – | – | − 0.253 (sign) | 0.262 (sign) | 0.217 (sign) | − 0.297 (sign) | 0.210 (sign) | – | − 0.323 (sign) | – |
OBS | − 0.285 (sign) | – | 0.298 (sign) | 0.153 (sign) | − 0.211 (sign) | 0.208 (sign) | − 0.209 (sign) | 0.100 (sign) | − 0.307 (sign) | − 0.085 (sign) | − 0.171 (sign) | 0.231 (sign) | 0.254 (sign) |
OUS | − 0.514 (sign) | 0.240 (sign) | − 0.321 (sign) | − 0.335 (sign) | − 0.227 (sign) | − 0.316 (sign) | 0.222 (sign) | 0.282 (sign) | 0.320 (sign) | − 0.144 (sign) | − 0.526 (sign) | 0.114 (sign) | 0.290 (sign) |
OFL | – | – | − 0.464 (sign) | 0.743 (sign) | 0.581 (sign) | 1.335 (sign) | 0.405 (sign) | 0.374 (sign) | − 0.225 (sign) | − 0.933 (sign) | 0.207 (sign) | – | 0.468 (sign) |
KET | 0.430 (sign) | 0.561 (sign) | − 0.225 (sign) | − 0.836 (sign) | − 0.909 (sign) | − 0.547 (sign) | 0.806 (sign) | 1.543 (sign) | 0.575 (sign) | 0.786 (sign) | 0.389 (sign) | 0.541 (sign) | 0.617 (sign) |
LMV | − 0.409 (sign) | 0.585 (sign) | − 0.591 (sign) | − 0.822 (sign) | − 0.723 (sign) | − 0.470 (sign) | 0.857 (sign) | 0.395 (sign) | 0.883 (sign) | 0.619 (sign) | – | 0.539 (sign) | 0.678 (sign) |
MIF | 0.404 (sign) | 0.517 (sign) | − 0.977 (sign) | − 1.371 (sign) | − 1.496 (sign) | − 0.645 (sign) | 0.497 (sign) | 1.244 (sign) | 0.502 (sign) | 0.769 (sign) | 0.409 (sign) | 0.578 (sign) | 0.789 (sign) |
FKG | – | − 0.094 (sign) | 0.250 (sign) | 0.377 (sign) | 0.415 (sign) | 0.319 (sign) | − 0.084 (sign) | 0.209 (sign) | − 0.177 (sign) | − 0.127 (sign) | – | – | − 0.208 (sign) |
MKG | – | 0.077 (sign) | 0.296 (sign) | 0.350 (sign) | 0.267 (sign) | 0.199 (sign) | − 0.207 (sign) | − 0.161 (sign) | 0.103 (sign) | 0.048 (sign) | – | – | − 0.120 (sign) |
PKG | – | − 0.215 (sign) | 0.438 (sign) | 0.353 (sign) | 0.557 (sign) | 0.358 (sign) | – | − 0.175 (sign) | − 0.143 (sign) | − 0.091 (sign) | – | – | − 0.231 (sign) |
Exposure traits are listed vertically, and outcome traits are listed horizontally. The exposure traits are grouped by complex (conformation, metabolism, production). Values for bxy greater than |0.700| are shown in italics; non-significant combinations are left empty. Trait abbreviations are explained in Table 1. Detailed results and corresponding p-values in Additional file 4, Table S5 and Additional file 5, Figures S5 to S7
The exposures KET, MIF, MTY, and OUS significantly affected all outcomes. Conversely, all tested exposures displayed significant interrelationships with CFc, FSh, NGV, and NRc. The traits related to production had the lowest effect size, ranging from bxy = − 0.231 to 0.557, followed by conformation traits ranging from bxy = − 0.933 to 1.335. The complex that exhibited the largest variation in effect size was associated with metabolic traits, with a range of bxy values from bxy = − 1.496 to 1.244. In terms of conformation, OFL displayed a notably wider range of variation compared to the other conformation and production traits, which had a relatively similar variance (Table 3).
Downstream analysis
The number of SNPGSMR to be tested for their proximity to known genes per exposure exceeded the range previously presented as SNPEXP. This outcome was due to the different individual counts of SNPGSMR for each unique trait combination. Furthermore, for each exposure, we identified both common and trait-specific SNPs. We included any SNPGSMR in the data set associated with at least one of the significant reproduction traits per exposure. Of the four selected exposures, we used 128 different SNPGSMR that exceeded the significance threshold per complex. The results of this selection were 47 genes being identified.
Thirty-two unique SNPGSMR were analysed for BCS, and 12 nearby genes were identified, including GYS2 on BTA5, KCNQ1 on BTA29, and NPFFR2 on BTA6. Similarly, 28 SNPGSMR were identified for KET analysis, enclosing ten genes, including ST8SIA1 on BTA5 and NPFFR2 on BTA6. Additionally, for OBS, we obtained 40 different SNPGSMR representing 14 genes, including LIN28A on BTA2 and INSR on BTA7. Finally, 53 different SNPGSMR were detected for MKG, leading to the identification of 22 genes, including PAAF1 on BTA15 and GHR on BTA20.
Of the four selected exposures, 18 SNPGSMR were present in at least two different exposure traits involving nine genes. These genes included NPFFR2, AFF1 (both BTA6), DGAT1 (BTA14), NF2 (BTA17), and on BTAX GPC3, AFF2, TRPC5, NHS, and PUDP. All identified genes and overlaps between exposures are shown in Fig. 4.
Fig. 4 [Images not available. See PDF.]
Venn diagram of the selected traits per complex for exposures. Results for the 47 different genes that were identified within the four selected exposure traits as having a significant effect on the outcome. Overall body score (OBS), ketosis (KET), body condition score (BCS) and milk kg (MKG)
Discussion
Genome-wide association studies
We conducted a GWAS on several traits using 235,164 different German Holstein dairy cows as input data for the chosen MR approach. We observed significant genome-wide signals on the autosomes and BTAX for all tested traits, with most hits seen for performance traits and the known lowly heritable functional traits [48]. For instance, in the case of KET, we detected several associated markers on BTA6, including ARS-BFGL-NGS-118182, ARS-BFGL-NGS-17376, and BTB-01654826. Notably, Nayeri et al. [49] also identified ARS-BFGL-NGS-118182 as a marker for KET. Soares et al. [50] described a window on BTA6 ranging from 87,840,175 to 88,893,733 bp for subclinical KET, including all three markers found in this study.
Moreover, several authors identified different SNPs next to the GC gene region (86,953,984 – 87,007,062 bp) directly adjacent to BTA6, which are associated with ketosis or other metabolic disorders in Holstein dairy cattle [51, 52–53]. This finding aligns with additional markers found on BTA6 for other metabolic traits (e.g. BTB-00133212 for MIF). According to McCarthy et al. [54], a larger sample size or meta-analysis could increase the statistical power needed to detect significant associations, especially for lower incidence or complex traits. Furthermore, the numerous hits on BTAX support Sanchez et al.’s findings [55] that BTAX harbours great potential to obtain deeper insights into the genetic architecture of complex traits given the high number of markers and associated physiological pathways.
Variance components
We employed the biREML approach to estimate the direction and relative extent of interrelationships between the traits [39]. Due to computational constraints, we limited the number of animals in the biREML analysis to 48,118. Only those animals that showed expression for both phenotypes tested were included. The biREML analysis revealed weak genetic correlations between exposure and outcome in certain combinations with higher standard errors (see Table 2). This weak genetic correlation was especially noticeable for functional traits like BCS and overall scores.
Furthermore, we used DRPs as phenotypes for our analyses, which were already adjusted based on a mean term [34]. Using DRPs proved advantageous for the GWAS as it increased variance within the partially binary-coded traits. However, it must be noted that pre-correction could introduce bias when estimating variance components or genetic correlations, compared to potentially reduced bias within GWAS [56]. Similarly, it is crucial to avoid biased GWAS results as they can lead to biased MR outcomes [57]. The DRPs were collected at different stages of the routine breeding value estimation. As a result, some traits were corrected for the population mean and, therefore, inverted, for example a lower DRP value for MET implies a higher susceptibility, which is essential when interpreting the GSMR results in relation to the overall bxy. biREML was also used to perform independence control within the exposures and outcomes themselves, as this is one of the main rules for MR [12].
To address potential collider bias, Zhu et al. [17] originally implemented the mtCOJO (multi-trait-based conditional and joint analysis) method. However, in this study, we did not attain the 200,000 markers required for LD correction [58, 59]. As previously reported in the literature (e.g. [60]), we identified strong dependencies between the PKG, MKG, and FKG exposure traits and between individual traits from the metabolic and conformational domains. However, the analysis could not quantify three combinations: FSc and FSh versus NRc and BCS versus MTY. After removing the residual covariance, the estimation model revealed a strong genetic correlation among trait combinations (FSh‑NRc rG = − 0.994 ± 0.002, FSc‑NRc rG = − 0.985 ± 0.002, BCS‑MTY rG = − 0.988 ± 0.001).
Based on the reliability of our GSMR results, biREML had the potential to determine the direction and quantification of causal associations between two traits. When both methods were applied, the same DRPs and genotype data (except for BTAX) were subjected to the same potential limitations or sources of bias mentioned above (e.g. correction of the DRPs). The joint use of GSMR and biREML data enabled an investigation into the relationship between traits based on SNPs from biREML and IVs identified from GSMR. The results for each exposure trait were plotted to facilitate a comparison of the two approaches, and the overall correlation between the obtained rG from biREML and bxy of GSMR results was estimated (Additional file 5, Figures S5 to S7).
Results for the linear overall scores we obtained from both methods exhibited minimal correlation without statistical significance (e.g. OFL: rG and bxy r = − 0.01, p = 0.961; OCS: rG and bxy r = − 0.14, p = 0.630). Nevertheless, there were high correlations between the individual direct traits (e.g. PKG: rG and bxy r = 0.92, p < 0.001; KET: rG and bxy r = 0.93, p < 0.001), which emphasise the informative value and suitability of our approach. The potential limitations resulting in non-significant estimates or low correlations between methods could be attributed to the definition of the trait itself [61] or the independence of the individual traits used as instrumental variables [12].
Moreover, the linear overall scores do not directly quantify the traits; rather, they are extrapolated through other traits that are directly measured and subsequently indexed. This indirect quantification is evident when the results are compared between OBS (rG and bxy r = 0.46, p = 0.111) and BCS (rG and bxy r = 0.89, p < 0.001) (Additional file 5, Figure S5). Both traits are part of the complex conformation. BCS is scored directly and used as an independent trait, whereas OBS is indexed and considers additional factors such as stature, body depth, and chest width.
Mendelian randomisation
Zhu et al.’s GSMR approach [17] has already been applied to smaller cattle cohorts [62, 63]. This study observed significant effects for all tested exposure traits, with at least eight distinct reproduction-associated traits as outcomes. We identified and removed potential (horizontal) pleiotropic outliers following the HEIDI-outlier approach [26] for all these combinations. Notably, vertical pleiotropy does not invalidate the instrumental variable assumptions for MR, whereas horizontal pleiotropy would violate them [12].
In general, three major assumptions must be fulfilled by the IVs [24, 25]: (1) the IV is associated with the exposure, (2) the IV does not affect the outcome except potentially via the exposure, and (3) the IV is not associated with the outcome due to confounding pathways. We fulfilled these aspects by using a strict GWAS p-value significance threshold after Bonferroni correction, which is considered putatively overly conservative [64] but helps to avoid potential bias.
We also applied a strict clumping filter (r2 = 0.05) to remove SNPs in LD blocks with the most significant SNPs per trait. The application reduced the correlation between remaining SNPs while retaining those with the strongest trait-specific statistical evidence [65]. This approach was also relevant for confounding [58] and is supported by the assumption and testing for constant bxy under the hypothesis of direct trait-mediated effects. Thus, there is no heterogeneity in the variant-specific estimates [12, 17]. Although previous studies have described general relationships (e.g. [52]), the main objective was identifying the potential IVs responsible for such effects within a large, somewhat undistorted dataset.
One potential limitation of the GSMR approach is the unavailability of exposure traits and their definitions. In contrast to human studies, where self-determined behaviours such as smoking [22], alcohol consumption [28], or physical activities [29] are frequently used as exposures, comparable characteristics are elusive or unattainable in dairy cattle husbandry. However, negative energy balance could theoretically be used as an estimation characteristic instead of BCS as it does not directly display the breakdown of energy intake into allocation or acquisition over time [66], nor does it show clear interdependency with the investigated reproduction parameters [67]. In addition, measuring devices such as pedometers could be used to collect data on activity behaviour for exposure. However, it is worth noting that activity behaviour can be influenced by physiological processes that affect both the reproductive process [68] and diseases of the movement apparatus, such as lameness [69].
Furthermore, individual factors, such as management conditions, housing structure, herd hierarchy, or herd density, may serve as potential stratification factors for this [70], as opposed to human studies that lack such stratifications. It is also important to note that not all traits have linear relationships, which is not accounted for in the linear approximation of GSMR [17].
For instance, human studies demonstrated a U-shaped relationship between body mass index and mortality [26]. These studies align with the findings of Roche et al. [71], who reported negative impacts on performance, reproduction, and metabolism in dairy cows with high and low BCS scores. However, the selected method enables the quantification and chromosomal localisation of the IVs as drivers of the observed interrelationship. As a result, we can more precisely understand the genetic architecture and distinguish between horizontal and vertical pleiotropic effects.
Regarding the breeding value estimation, the results can be employed to account for known and novel interactions between traits. Moreover, achieving more accurate chromosomal localisation can enhance fine mapping or aid in choosing markers that offer greater informational value. Subsequently, the GSMR and variance component results could improve our understanding of the relationship between complex traits and facilitate further consideration of different traits and their associated weightings within the breeding schemes. A more accurate indexing of different traits could be achieved by better understanding their causal relationships.
Gene associations
In addition to the statistical evidence, we selected four traits to evaluate the possible physiological backgrounds shown by the IVs that might shed light on the causal associations found. We considered 47 different genes (38 on autosomes and 9 on BTAX) obtained from 128 different putative causal SNPGSMR through database research and subsequent literature review. Additional file 6, and Table S6 provide a more detailed overview of the results for all 128 identified SNPGSMR and their corresponding outcome traits.
As an illustration, the gene GYS2 (Glycogen Synthase 2) was identified through ARS-BFGL-NGS-103973 on BTA5 at 89,065,689 bp for BCS and has significant putative effects on 11 reproduction traits. Tribout et al. [72] identified GYS2 as associated with protein yield, while Wathes et al. [73] demonstrated up-regulated GYS2 expression in metabolically imbalanced Holstein Friesian dairy cows. In humans, GYS2 has been linked to polycystic ovary syndrome and obesity-related conditions [74]. These results directly link the observed causal connection between BCS conditions and reproductive performance. Furthermore, Dean [75] highlighted the importance of glycogen balance in mammals during early pregnancy and the contribution of GYS2 to endometrial glycogen levels.
Meanwhile, the gene NPFFR2 (neuropeptide FF receptor 2) located on BTA6 was identified through the same causal SNP (BTA-68275-no-rs) for three exposures: MKG, KET, and BCS. It was significantly linked to 2, 8, and 12 different reproductive traits, respectively. Examples of known and previously described traits affected by NPFFR2 in cattle include dairy form, productive life [76], milk, fat and protein yield [77], somatic cell score [76, 78], longevity [79], and clinical mastitis [80]. NPFFR2 is a functional G protein-coupled receptor involved in regulating the opioid system, cardiovascular function and neuroendocrine function [81]. Endogenous opioids interact with gonadotropin secretion in various mammalian species [82, 83], emphasising the link with reproductive traits.
Likewise, the gene GHR (growth hormone receptor, on BTA20 at 31,933,394 bp) identified for MKG showed a significant association with four different reproductive traits. The role of GHR in beef and dairy cattle has been widely discussed in the literature, and various associations have already been described, including growth, beef performance (e.g. [84]), and dairy performance (e.g. [85]). Waters et al. [85] identified a potential biomarker for yield evaluation in a GHR-related SNP (on BTA20 at 34,153,345 bp) within the untranslated medial exon. They also described the associations with lactation persistency [86], feed efficiency [87], and energy balance [88] as consistent with the suggested interrelationship discovered.
For BCS on BTA29, KCNQ1 (potassium voltage-gated channel subfamily Q member 1) was found to have a significant causal association with 11 different reproduction traits. Associations for cattle, including luteinising hormone [89], udder depth, and somatic cell score [72], have been described. Lafontaine and Sirard [90] highlighted the crucial role of DNA methylation of KCNQ1 in bovine follicles, while Chen et al. [91] demonstrated a link between KCNQ1 and the large offspring syndrome in bovine embryos. In a human study, Gómez-Úriz et al. [92] further highlighted an interaction between obesity and ischaemic stroke with KCNQ1 methylation.
BTAX, the second-largest chromosome in the bovine genome [93], represents more than 40% of the SNPs significantly identified in this study but less than 20% of the genes identified. Currently, most GWAS studies in cattle, such as Sahana et al. [80], exclude BTAX. However, Su et al. [94] demonstrated that considering markers on BTAX could improve the accuracy of genomic prediction. Moreover, Sanchez et al. [55] showed that X-linked genes impact various traits. In our study, three out of the nine genes we identified, namely GPC3 (glypican 3), MBNL3 (muscleblind-like splicing regulator 3), and TRPC5 (transient receptor potential cation channel subfamily C member 5), have previously been linked to dairy traits and stature [55]. Additionally, we found that the AFF2 gene (ALF transcription elongation factor 2) we identified on BTAX for KET and MKG was also reported in a recent study, associating it with milk urea nitrogen [95].
Although our study identified various SNPs, genes, and physiological processes that may be relevant to putative causal associations, we want to address the results of Waters et al. [85]. Using GHR as an example, they demonstrated the impact that a marker’s position could have on the detected effects. Increasing marker density through sequencing or imputation [96] may enhance the results of GWAS and the detection of effects, thereby facilitating subsequent analyses. However, it is important to note that the limited accuracy of genotype data for GWAS may introduce bias in subsequent analyses [57]. Simulations have demonstrated the effect of the chosen method on the accuracy of genotype imputation [97]. Moreover, accurately estimating the genetic architecture of complex traits requires a large number of individuals [98]. Therefore, a large number of animals and a sufficient density of markers are also required to estimate causal associations successfully between these traits.
Conclusion
In conclusion, this study estimated the GWAS summary statistics for 25 traits involving 235,164 individuals. Genome-wide significant SNPs for all traits were identified. Furthermore, we estimated variance components between these traits and analysed their causal associations. The GSMR approach detected significant causal association effects on reproduction traits. Subsequent gene association screening confirmed physiologically plausible effects for four traits related to conformation, metabolism, and production.
Moreover, we identified novel associations, providing leads for further analysis. A large sample size ensured reliable results, even for complex traits with strict thresholds. These findings can enhance our understanding of the genetic structure of these traits and their interrelationship beyond genetic correlation, which can be considered in future breeding strategies to preserve health and production simultaneously.
Acknowledgements
This work used the Scientific Compute Cluster at GWDG, the joint data centre of the Max Planck Society for the Advancement of Science (MPG) and the University of Göttingen. We further acknowledge support by the Open Access Publication Funds of the Göttingen University.
Author contributions
LS performed GWAS, GSMR, and downstream analyses and wrote the manuscript. JH provided the 50K SNP chip dataset, and ZL provided the DRPs. JT supervised the study and participated in its writing. JT, JB, and GT conceived and supervised the study. All authors have read and approved the final manuscript.
Funding
Open Access funding enabled and organized by Projekt DEAL. This work is part of the project ‘QTCC: From Quantitative Trait Correlation to Causation in dairy cattle’ and is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Projektnummer (project number: 448536632). The funders had no role in study design, data collection and analysis, interpretation of data, or writing the manuscript. The publication fee was covered by the Open Access Publication Funds of the Göttingen University. Open Access funding was provided by Projekt DEAL.
Availability of data and materials
The SNP chip genotype data, DRPs, and phenotypes cannot be shared openly, as they are the property of the national computing centre VIT (Vereinigte Informationssysteme Tierhaltung w.V., Verden) in Germany. Summary statistics for genome-wide significant GWAS and GSMR results are provided in the Additional files 1 to 6. Further summary statistics can be provided upon reasonable request.
Declarations
Ethics approval and consent to participate
Not applicable. No animals or animal materials have been used in this study, and ethical approval for the use of animals was not necessary.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Pryce, JE; Royal, MD; Garnsworthy, PC; Mao, IL. Fertility in the high-producing dairy cow. Livest Prod Sci; 2004; 86, pp. 125-135. [DOI: https://dx.doi.org/10.1016/S0301-6226(03)00145-3]
2. Hoekstra, J; van der Lugt, A; van der Werf, J; Ouweltjes, W. Genetic and phenotypic parameters for milk production and fertility traits in upgraded dairy cattle. Livest Prod Sci; 1994; 40, pp. 225-232. [DOI: https://dx.doi.org/10.1016/0301-6226(94)90090-6]
3. Walsh, SW; Williams, EJ; Evans, ACO. A review of the causes of poor fertility in high milk producing dairy cows. Anim Reprod Sci; 2011; 123, pp. 127-138.
4. Hansen, LB; Freeman, AE; Berger, PJ. Yield and fertility relationships in dairy cattle. J Dairy Sci; 1983; 66, pp. 293-305.
5. Simianer, H; Solbu, H; Schaeffer, LR. Estimated genetic correlations between disease and yield traits in dairy cattle. J Dairy Sci; 1991; 74, pp. 4358-4365.
6. Berry, DP; Wall, E; Pryce, JE. Genetics and genomics of reproductive performance in dairy and beef cattle. Animal; 2014; 8,
7. Hardie, LC; Haagen, IW; Heins, BJ; Dechow, CD. Genetic parameters and association of national evaluations with breeding values for health traits in US organic Holstein cows. J Dairy Sci; 2022; 105, pp. 495-508.
8. VIT. Jahresbericht 2020: Vereinigte Informationssysteme Tierhaltung w. V; 2021. https://www.vit.de/fileadmin/Wir-sind-vit/Jahresberichte/vit-JB2020-gesamt.pdf. Accessed 14 Sept 2022.
9. Berry, DP; Bermingham, ML; Good, M; More, SJ. Genetics of animal health and disease in cattle. Ir Vet J; 2011; 64, 5. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/21777492][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3102331][DOI: https://dx.doi.org/10.1186/2046-0481-64-5]
10. Gauly, M; Bollwein, H; Breves, G; Brügemann, K; Dänicke, S; Daş, G et al. Future consequences and challenges for dairy cow production systems arising from climate change in Central Europe - a review. Animal; 2013; 7, pp. 843-859.
11. Barkema, HW; von Keyserlingk, MAG; Kastelic, JP; Lam, TJGM; Luby, C; Roy, J-P et al. Invited review: Changes in the dairy industry affecting dairy cattle health and welfare. J Dairy Sci; 2015; 98, pp. 7426-7445.
12. Burgess, S; Davey Smith, G; Davies, NM; Dudbridge, F; Gill, D; Glymour, MM et al. Guidelines for performing Mendelian randomization investigations. Wellcome Open Res; 2019; 4, 186. [DOI: https://dx.doi.org/10.12688/wellcomeopenres.15555.2] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32760811]
13. O’Connor, LJ; Price, AL. Distinguishing genetic correlation from causation across 52 diseases and complex traits. Nat Genet; 2018; 50, pp. 1728-1734. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/30374074][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6684375][DOI: https://dx.doi.org/10.1038/s41588-018-0255-0]
14. Hernán, MA; Robins, JM. Instruments for causal inference: an epidemiologist’s dream?. Epidemiology; 2006; 17, pp. 360-372. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/16755261][DOI: https://dx.doi.org/10.1097/01.ede.0000222409.00878.37]
15. van der Waaij, EH; Holzhauer, M; Ellen, E; Kamphuis, C; de Jong, G. Genetic parameters for claw disorders in Dutch dairy cattle and correlations with conformation traits. J Dairy Sci; 2005; 88, pp. 3672-3678. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/16162542][DOI: https://dx.doi.org/10.3168/jds.S0022-0302(05)73053-8]
16. Gratten, J; Visscher, PM. Genetic pleiotropy in complex traits and diseases: implications for genomic medicine. Genome Med; 2016; 8, 78.
17. Zhu, Z; Zheng, Z; Zhang, F; Wu, Y; Trzaskowski, M; Maier, R et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat Commun; 2018; 9, 224.
18. Xiang, R; MacLeod, IM; Daetwyler, HD; de Jong, G; O’Connor, E; Schrooten, C et al. Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations. Nat Commun; 2021; 12, 860.
19. Werme, J; van der Sluis, S; Posthuma, D; de Leeuw, CA. An integrated framework for local genetic correlation analysis. Nat Genet; 2022; 54, pp. 274-282.
20. Cartwright, N. Are RCTs the gold standard?. BioSocieties; 2007; 2, pp. 11-20. [DOI: https://dx.doi.org/10.1017/S1745855207005029]
21. Mansournia, MA; Higgins, JPT; Sterne, JAC; Hernán, MA. Biases in randomized trials: a conversation between trialists and epidemiologists. Epidemiology; 2017; 28, pp. 54-59. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/27748683][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5130591][DOI: https://dx.doi.org/10.1097/EDE.0000000000000564]
22. Larsson, SC; Burgess, S. Appraising the causal role of smoking in multiple diseases: a systematic review and meta-analysis of Mendelian randomization studies. EBioMedicine; 2022; 82,
23. Boehm, FJ; Zhou, X. Statistical methods for Mendelian randomization in genome-wide association studies: a review. Comput Struct Biotechnol J; 2022; 20, pp. 2338-2351.
24. Martens, EP; Pestman, WR; de Boer, A; Belitser, SV; Klungel, OH. Instrumental variables: application and limitations. Epidemiology; 2006; 17, pp. 260-267. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/16617274][DOI: https://dx.doi.org/10.1097/01.ede.0000215160.88317.cb]
25. Greenland, S. An introduction to instrumental variables for epidemiologists. Int J Epidemiol; 2000; 29, pp. 722-729.
26. Zhu, Z; Zhang, F; Hu, H; Bakshi, A; Robinson, MR; Powell, JE et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet; 2016; 48, pp. 481-487.
27. Hackinger, S; Zeggini, E. Statistical methods to detect pleiotropy in human complex traits. Open Biol; 2017; 7, [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/29093210][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5717338][DOI: https://dx.doi.org/10.1098/rsob.170125] 170125.
28. Larsson, SC; Burgess, S; Mason, AM; Michaëlsson, K. Alcohol consumption and cardiovascular disease: a Mendelian randomization study. Circ Genom Precis Med; 2020; 13,
29. Papadimitriou, N; Dimou, N; Tsilidis, KK; Banbury, B; Martin, RM; Lewis, SJ et al. Physical activity and risks of breast and colorectal cancer: a Mendelian randomisation analysis. Nat Commun; 2020; 11, 597.
30. Pryce, JE; Coffey, MP; Brotherstone, S. The genetic relationship between calving interval, body condition score and linear type and management traits in registered Holsteins. J Dairy Sci; 2000; 83, pp. 2664-2671.
31. Butler, WR; Smith, RD. Interrelationships between energy balance and postpartum reproductive function in dairy cattle. J Dairy Sci; 1989; 72, pp. 767-783.
32. VIT. Estimation of breeding values for milk production traits, somatic cell score, conformation, productive life and reproduction traits in German dairy cattle; 2022. https://www.vit.de/fileadmin/DE/Zuchtwertschaetzung/Zws_Bes_eng.pdf. Accessed 1 Feb 2024.
33. Jairath, L; Dekkers, J; Schaeffer, LR; Liu, Z; Burnside, EB; Kolstad, B. Genetic evaluation for herd life in Canada. J Dairy Sci; 1998; 81, pp. 550-562.
34. Liu, Z; Masuda, Y. A deregression method for single-step genomic model using all genotype data. IB; 2021; 56, pp. 41-51.
35. Segelke, D; Chen, J; Liu, Z; Reinhardt, F; Thaller, G; Reents, R. Reliability of genomic prediction for German Holsteins using imputed genotypes from low-density chips. J Dairy Sci; 2012; 95, pp. 5403-5411.
36. Purcell, S; Neale, B; Todd-Brown, K; Thomas, L; Ferreira, MAR; Bender, D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet; 2007; 81, pp. 559-575.
37. Yang, J; Lee, SH; Goddard, ME; Visscher, PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet; 2011; 88, pp. 76-82.
38. Yang, J; Benyamin, B; McEvoy, BP; Gordon, S; Henders, AK; Nyholt, DR et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet; 2010; 42, pp. 565-569.
39. Lee, SH; Yang, J; Goddard, ME; Visscher, PM; Wray, NR. Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics; 2012; 28, pp. 2540-2542.
40. Yang, J; Zaitlen, NA; Goddard, ME; Visscher, PM; Price, AL. Advantages and pitfalls in the application of mixed-model association methods. Nat Genet; 2014; 46, pp. 100-106. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/24473328][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3989144][DOI: https://dx.doi.org/10.1038/ng.2876]
41. Willer, CJ; Li, Y; Abecasis, GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics; 2010; 26, pp. 2190-2191.
42. Devlin, B; Roeder, K. Genomic control for association studies. Biometrics; 1999; 55, pp. 997-1004.
43. Wickham, H. Ggplot2: Elegant graphics for data analysis; 2016; Cham, Springer international publishing: [DOI: https://dx.doi.org/10.1007/978-3-319-24277-4]
44. R Core Team. R: a language and environment for statistical computing; 2022; R Foundation for Statistical Computing:
45. McLaren, W; Gil, L; Hunt, SE; Riat, HS; Ritchie, GRS; Thormann, A et al. The ensembl variant effect predictor. Genome Biol; 2016; 17, pp. 1-14. [DOI: https://dx.doi.org/10.1186/s13059-016-0974-4]
46. Tweedie, S; Braschi, B; Gray, K; Jones, TEM; Seal, RL; Yates, B; Bruford, EA. Genenames.org: the HGNC and VGNC resources in 2021. Nucleic Acids Res; 2021; 49, pp. D939-D946.
47. Hanbo Chen. VennDiagram: generate high-resolution venn and euler plots: R package version 1.7.3; 2022. https://CRAN.R-project.org/package=VennDiagram. Accessed 4 Mar 2024.
48. Egger-Danner, C; Cole, JB; Pryce, JE; Gengler, N; Heringstad, B; Bradley, A; Stock, KF. Invited review: overview of new traits and phenotyping strategies in dairy cattle with a focus on functional traits. Animal; 2015; 9, pp. 191-207.
49. Nayeri, S; Schenkel, F; Fleming, A; Kroezen, V; Sargolzaei, M; Baes, C et al. Genome-wide association analysis for β-hydroxybutyrate concentration in Milk in Holstein dairy cattle. BMC Genet; 2019; 20, 58.
50. Soares, RAN; Vargas, G; Duffield, T; Schenkel, F; Squires, EJ. Genome-wide association study and functional analyses for clinical and subclinical ketosis in Holstein cattle. J Dairy Sci; 2021; 104, pp. 10076-10089.
51. Pacheco, HA; da Silva, S; Sigdel, A; Mak, CK; Galvão, KN; Texeira, RA et al. Gene Mapping and Gene-Set Analysis for Milk Fever Incidence in Holstein Dairy Cattle. Front Genet; 2018; 9, 465.
52. Pralle, RS; Schultz, NE; White, HM; Weigel, KA. Hyperketonemia GWAS and parity-dependent SNP associations in Holstein dairy cows intensively sampled for blood β-hydroxybutyrate concentration. Physiol Genomics; 2020; 52, pp. 347-357.
53. Schmidtmann, C; Segelke, D; Bennewitz, J; Tetens, J; Thaller, G. Genetic analysis of production traits and body size measurements and their relationships with metabolic diseases in German Holstein cattle. J Dairy Sci; 2023; 106, pp. 421-438.
54. McCarthy, MI; Abecasis, GR; Cardon, LR; Goldstein, DB; Little, J; Ioannidis, JPA; Hirschhorn, JN. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet; 2008; 9, pp. 356-369.
55. Sanchez, M-P; Escouflaire, C; Baur, A; Bottin, F; Hozé, C; Boussaha, M et al. X-linked genes influence various complex traits in dairy cattle. BMC Genomics; 2023; 24, 338.
56. Uffelmann, E; Huang, QQ; Munung, NS; de Vries, J; Okada, Y; Martin, AR et al. Genome-wide association studies. Nat Rev Methods Primers; 2021; 1, pp. 1-21. [DOI: https://dx.doi.org/10.1038/s43586-021-00056-9]
57. Brumpton, B; Sanderson, E; Heilbron, K; Hartwig, FP; Harrison, S; Vie, GÅ et al. Avoiding dynastic, assortative mating, and population stratification biases in Mendelian randomization through within-family analyses. Nat Commun; 2020; 11, 3519.
58. Bulik-Sullivan, BK; Loh, P-R; Finucane, HK; Ripke, S; Yang, J; Patterson, N et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet; 2015; 47, pp. 291-295.
59. Finucane, HK; Bulik-Sullivan, B; Gusev, A; Trynka, G; Reshef, Y; Loh, P-R et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat Genet; 2015; 47, pp. 1228-1235.
60. Strucken, EM; Bortfeldt, RH; Tetens, J; Thaller, G; Brockmann, GA. Genetic effects and correlations between production and fertility traits and their dependency on the lactation-stage in Holstein Friesians. BMC Genet; 2012; 13, 108.
61. Seidenspinner, T; Bennewitz, J; Reinhardt, F; Thaller, G. Need for sharp phenotypes in QTL detection for calving traits in dairy cattle. J Anim Breed Genet; 2009; 126, pp. 455-462.
62. Badia-Bringué, G; Canive, M; Fernandez-Jimenez, N; Lavín, JL; Casais, R; Blanco-Vázquez, C et al. Summary-data based Mendelian randomization identifies gene expression regulatory polymorphisms associated with bovine paratuberculosis by modulation of the nuclear factor Kappa β (NF-κß)-mediated inflammatory response. BMC Genomics; 2023; 24, 605. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37821814][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10568764][DOI: https://dx.doi.org/10.1186/s12864-023-09710-w]
63. Xiang, R; van den Berg, I; MacLeod, IM; Daetwyler, HD; Goddard, ME. Effect direction meta-analysis of GWAS identifies extreme, prevalent and shared pleiotropy in a large mammal. Commun Biol; 2020; 3, 88. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32111961][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7048789][DOI: https://dx.doi.org/10.1038/s42003-020-0823-6]
64. Johnson, RC; Nelson, GW; Troyer, JL; Lautenberger, JA; Kessing, BD; Winkler, CA; O’Brien, SJ. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genomics; 2010; 11, 724. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/21176216][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3023815][DOI: https://dx.doi.org/10.1186/1471-2164-11-724]
65. Marees, AT; de Kluiver, H; Stringer, S; Vorspan, F; Curis, E; Marie-Claire, C; Derks, EM. A tutorial on conducting genome-wide association studies: quality control and statistical analysis. Int J Methods Psychiatr Res; 2018; 27, [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/29484742][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6001694][DOI: https://dx.doi.org/10.1002/mpr.1608] e1608.
66. Puillet, L; Réale, D; Friggens, NC. Disentangling the relative roles of resource acquisition and allocation on animal feed efficiency: insights from a dairy cow model. Genet Sel Evol; 2016; 48, 72. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/27670924][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037647][DOI: https://dx.doi.org/10.1186/s12711-016-0251-8]
67. Fenwick, MA; Llewellyn, S; Fitzpatrick, R; Kenny, DA; Murphy, JJ; Patton, J; Wathes, DC. Negative energy balance in dairy cows is associated with specific changes in IGF-binding protein expression in the oviduct. Reproduction; 2008; 135, pp. 63-75.
68. Løvendahl, P; Chagunda, MGG. On the use of physical activity monitoring for estrus detection in dairy cows. J Dairy Sci; 2010; 93, pp. 249-259. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/20059923][DOI: https://dx.doi.org/10.3168/jds.2008-1721]
69. Mazrier, H; Tal, S; Aizinbud, E; Bargai, U. A field investigation of the use of the pedometer for the early detection of lameness in cattle. Can Vet J; 2006; 47, pp. 883-886. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/17017653][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1555681]
70. Shepley, E; Vasseur, E. Graduate student literature review: the effect of housing systems on movement opportunity of dairy cows and the implications on cow health and comfort. J Dairy Sci; 2021; 104, pp. 7315-7322.
71. Roche, JR; Friggens, NC; Kay, JK; Fisher, MW; Stafford, KJ; Berry, DP. Invited review: Body condition score and its association with dairy cow productivity, health, and welfare. J Dairy Sci; 2009; 92, pp. 5769-5801.
72. Tribout, T; Croiseau, P; Lefebvre, R; Barbat, A; Boussaha, M; Fritz, S et al. Confirmed effects of candidate variants for milk production, udder health, and udder morphology in dairy cattle. Genet Sel Evol; 2020; 52, 55.
73. Wathes, DC; Cheng, Z; Salavati, M; Buggiotti, L; Takeda, H; Tang, L et al. Relationships between metabolic profiles and gene expression in liver and leukocytes of dairy cows in early lactation. J Dairy Sci; 2021; 104, pp. 3596-3616.
74. Hwang, J-Y; Lee, E-J; Jin Go, M; Sung, Y-A; Lee, HJ; Heon Kwak, S et al. Genome-wide association study identifies GYS2 as a novel genetic factor for polycystic ovary syndrome through obesity-related condition. J Hum Genet; 2012; 57, pp. 660-664.
75. Dean, M. Glycogen in the uterus and fallopian tubes is an important source of glucose during early pregnancy†. Biol Reprod; 2019; 101, pp. 297-305. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/31201425][DOI: https://dx.doi.org/10.1093/biolre/ioz102]
76. Weller, JI; Bickhart, DM; Wiggans, GR; Tooker, ME; O’Connell, JR; Jiang, J et al. Determination of quantitative trait nucleotides by concordance analysis between quantitative trait loci and marker genotypes of US Holsteins. J Dairy Sci; 2018; 101, pp. 9089-9107.
77. Jiang, J; Ma, L; Prakapenka, D; VanRaden, PM; Cole, JB; Da, Y. A large-scale genome-wide association study in U.S. Holstein cattle. Front Genet; 2019; 10, 412.
78. Marete, A; Sahana, G; Fritz, S; Lefebvre, R; Barbat, A; Lund, MS et al. Genome-wide association study for milking speed in French Holstein cows. J Dairy Sci; 2018; 101, pp. 6205-6219.
79. Zhang, Q; Guldbrandtsen, B; Thomasen, JR; Lund, MS; Sahana, G. Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds. J Dairy Sci; 2016; 99, pp. 7289-7298.
80. Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, RF et al. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle. J Dairy Sci; 2014; 97, pp. 7258-7275.
81. Ankö, M-L; Panula, P. Regulation of endogenous human NPFF2 receptor by neuropeptide FF in SK-N-MC neuroblastoma cell line. J Neurochem; 2006; 96, pp. 573-584. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/16336216][DOI: https://dx.doi.org/10.1111/j.1471-4159.2005.03581.x]
82. Genazzani, AR; Genazzani, AD; Volpogni, C; Pianazzi, F; Li, GA; Surico, N; Petraglia, F. Opioid control of gonadotrophin secretion in humans. Hum Reprod; 1993; 8,
83. Goodman, RL; Coolen, LM; Anderson, GM; Hardy, SL; Valent, M; Connors, JM et al. Evidence that dynorphin plays a major role in mediating progesterone negative feedback on gonadotropin-releasing hormone neurons in sheep. Endocrinology; 2004; 145, pp. 2959-2967.
84. Sherman, EL; Nkrumah, JD; Murdoch, BM; Li, C; Wang, Z; Fu, A; Moore, SS. Polymorphisms and haplotypes in the bovine neuropeptide Y, growth hormone receptor, ghrelin, insulin-like growth factor 2, and uncoupling proteins 2 and 3 genes and their associations with measures of growth, performance, feed efficiency, and carcass merit in beef cattle. J Anim Sci; 2008; 86, pp. 1-16.
85. Waters, SM; McCabe, MS; Howard, DJ; Giblin, L; Magee, DA; MacHugh, DE; Berry, DP. Associations between newly discovered polymorphisms in the Bos taurus growth hormone receptor gene and performance traits in Holstein-Friesian dairy cattle. Anim Genet; 2011; 42, pp. 39-49.
86. Do, DN; Bissonnette, N; Lacasse, P; Miglior, F; Zhao, X; Ibeagha-Awemu, EM. A targeted genotyping approach to enhance the identification of variants for lactation persistency in dairy cows. J Anim Sci; 2019; 97, pp. 4066-4075. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/31581300][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6776498][DOI: https://dx.doi.org/10.1093/jas/skz279]
87. Abo-Ismail, MK; Lansink, N; Akanno, E; Karisa, BK; Crowley, JJ; Moore, SS et al. Development and validation of a small SNP panel for feed efficiency in beef cattle. J Anim Sci; 2018; 96, pp. 375-397.
88. Oikonomou, G; Angelopoulou, K; Arsenos, G; Zygoyiannis, D; Banos, G. The effects of polymorphisms in the DGAT1, leptin and growth hormone receptor gene loci on body energy, blood metabolic and reproductive traits of Holstein cows. Anim Genet; 2009; 40, pp. 10-17.
89. Fortes, MRS; Reverter, A; Hawken, RJ; Bolormaa, S; Lehnert, SA. Candidate genes associated with testicular development, sperm quality, and hormone levels of inhibin, luteinizing hormone, and insulin-like growth factor 1 in Brahman bulls. Biol Reprod; 2012; 87, 58. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/22811567][DOI: https://dx.doi.org/10.1095/biolreprod.112.101089]
90. Lafontaine, S; Sirard, M-A. IGF2R, KCNQ1, PLAGL1, and SNRPN DNA methylation is completed in bovine by the early antral follicle stage. Mol Reprod Dev; 2022; 89, pp. 290-297.
91. Chen, Z; Robbins, KM; Wells, KD; Rivera, RM. Large offspring syndrome: a bovine model for the human loss-of-imprinting overgrowth syndrome Beckwith-Wiedemann. Epigenetics; 2013; 8, pp. 591-601.
92. Gómez-Úriz, AM; Milagro, FI; Mansego, ML; Cordero, P; Abete, I; de Arce, A et al. Obesity and ischemic stroke modulate the methylation levels of KCNQ1 in white blood cells. Hum Mol Genet; 2015; 24, pp. 1432-1440. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/25429063][DOI: https://dx.doi.org/10.1093/hmg/ddu559]
93. Zimin, AV; Delcher, AL; Florea, L; Kelley, DR; Schatz, MC; Puiu, D et al. A whole-genome assembly of the domestic cow. Bos taurus Genome Biol; 2009; 10, R42. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/19393038][DOI: https://dx.doi.org/10.1186/gb-2009-10-4-r42]
94. Su, G; Guldbrandtsen, B; Aamand, GP; Strandén, I; Lund, MS. Genomic relationships based on X chromosome markers and accuracy of genomic predictions with and without X chromosome markers. Genet Sel Evol; 2014; 46, 47. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/25080199][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4137273][DOI: https://dx.doi.org/10.1186/1297-9686-46-47]
95. van den Berg, I; Ho, PN; Nguyen, TV; Haile-Mariam, M; MacLeod, IM; Beatson, PR et al. GWAS and genomic prediction of milk urea nitrogen in Australian and New Zealand dairy cattle. Genet Sel Evol; 2022; 54, 15. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/35183113][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8858489][DOI: https://dx.doi.org/10.1186/s12711-022-00707-9]
96. Druet, T; Macleod, IM; Hayes, BJ. Toward genomic prediction from whole-genome sequence data: impact of sequencing design on genotype imputation and accuracy of predictions. Heredity; 2014; 112, pp. 39-47.
97. Teng, J; Zhao, C; Wang, D; Chen, Z; Tang, H; Li, J et al. Assessment of the performance of different imputation methods for low-coverage sequencing in Holstein cattle. J Dairy Sci; 2022; 105, pp. 3355-3366.
98. Goddard, ME; Hayes, BJ. Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nat Rev Genet; 2009; 10, pp. 381-391.
Copyright BioMed Central Dec 2025