Hydrophobic interactions have long been established as essential to stabilizing structured proteins and as drivers of aggregation, but the impact of hydrophobicity on the functional significance of sequence variants has rarely been considered in a genome-wide context. Here we test the role of hydrophobicity in functional impact using a set of 70,000 disease-associated and non-disease-associated single nucleotide polymorphisms (SNPs), using enrichment of disease association as an indicator of functionality. We find that functional impact is uncorrelated with the hydrophobicity of the SNP itself and only weakly correlated with the average local hydrophobicity, but is strongly correlated with both the size and the minimum hydrophobicity of the contiguous hydrophobic domain that contains the SNP. Disease association varies by more than 6-fold as a function of contiguous hydrophobicity parameters, suggesting utility as a prior for identifying causal variation. We further find signatures of differential selective constraint on domain hydrophobicity, and find that SNPs that split a long hydrophobic region or join two short regions of contiguous hydrophobicity are particularly likely to be disease-associated. These trends are preserved for both aggregating and non-aggregating proteins, indicating that the role of contiguous hydrophobicity extends well beyond aggregation risk.
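The two domain-level quantities named above, the size and the minimum hydrophobicity of the contiguous hydrophobic stretch containing a variant, can be illustrated with a minimal sketch. The abstract does not state which hydropathy scale or cutoff was used, so the Kyte-Doolittle scale, the threshold of 0.0, and the function name `contiguous_hydrophobic_run` are all assumptions made for illustration only, not the authors' method.

```python
# Kyte-Doolittle hydropathy values (a standard published scale).
# ASSUMPTION: the source does not say which scale or threshold was used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def contiguous_hydrophobic_run(seq, pos, threshold=0.0):
    """Describe the hydrophobic run containing the 0-based position `pos`.

    Returns (start, end, length, min_hydropathy) with inclusive indices,
    or None if the residue at `pos` is not above the threshold.
    Hypothetical helper; the threshold of 0.0 is an illustrative choice.
    """
    def is_hydrophobic(i):
        return KD[seq[i]] > threshold

    if not is_hydrophobic(pos):
        return None

    # Extend left and right while neighbouring residues stay hydrophobic.
    start = pos
    while start > 0 and is_hydrophobic(start - 1):
        start -= 1
    end = pos
    while end < len(seq) - 1 and is_hydrophobic(end + 1):
        end += 1

    values = [KD[seq[i]] for i in range(start, end + 1)]
    return start, end, end - start + 1, min(values)


# Toy sequences (not real data): reference, and a variant replacing L with D.
reference = "DKAILVGSE"
variant = "DKAIDVGSE"
print(contiguous_hydrophobic_run(reference, 4))  # run A-I-L-V around the L
print(contiguous_hydrophobic_run(variant, 4))    # D is not hydrophobic
print(contiguous_hydrophobic_run(variant, 3))    # shortened run left of the D
```

In this toy example the substitution splits a four-residue run into two shorter ones, which is the kind of "splitting" variant the abstract singles out. A real analysis would also need to fix the scale, the cutoff, and the handling of non-standard residues.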