- Scala Machine Learning Projects
- Md. Rezaul Karim
- 234 words
- 2021-06-30 19:05:42
Population-Scale Clustering and Ethnicity Prediction
Understanding variations in genome sequences assists us in identifying people who are predisposed to common diseases, curing rare diseases, and finding the corresponding population group of individuals from a larger population group. Although classical machine learning techniques allow researchers to identify groups (that is, clusters) of related variables, the accuracy and effectiveness of these methods diminish for large and high-dimensional datasets such as the whole human genome.
On the other hand, Deep Neural Networks (DNNs) form the core of deep learning (DL) and provide algorithms to model complex, high-level abstractions in data. They can better exploit large-scale datasets to build complex models.
In this chapter, we apply the K-means algorithm to large-scale genomic data from the 1000 Genomes project to cluster genotypic variants at the population scale. We then train an H2O-based DNN model and a Spark-based random forest model to predict geographic ethnicity. The theme of this chapter is "give me your genetic variant data and I will tell you your ethnicity."
Along the way, we will configure H2O so that the same settings can be reused in upcoming chapters. In short, this end-to-end project covers the following topics:
- Population-scale clustering and geographic ethnicity prediction
- The 1000 Genomes project, a deep catalog of human genetic variants
- Algorithms and tools
- Using K-means for population-scale clustering
- Using H2O for ethnicity prediction
- Using random forest for ethnicity prediction
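To make the clustering idea concrete before the tooling is introduced, here is a minimal sketch of K-means (Lloyd's algorithm) in plain Scala. It is not the Spark-based pipeline the chapter builds; the object name `KMeansSketch`, the helper functions, and the toy genotype matrix are all hypothetical and chosen only for illustration. Each row is assumed to be one individual, and each column an alternate-allele count (0, 1, or 2) at one variant.

```scala
object KMeansSketch {
  type Vec = Vector[Double]

  // Squared Euclidean distance between two equal-length vectors.
  def sqDist(a: Vec, b: Vec): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the centroid closest to point p.
  def nearest(p: Vec, centroids: Vector[Vec]): Int =
    centroids.indices.minBy(i => sqDist(p, centroids(i)))

  // Component-wise mean of a non-empty group of points.
  def mean(points: Vector[Vec]): Vec =
    points.transpose.map(col => col.sum / col.size)

  // Lloyd's algorithm: alternate between assigning points to the nearest
  // centroid and recomputing each centroid as the mean of its points.
  def kMeans(points: Vector[Vec], init: Vector[Vec], maxIter: Int = 20): Vector[Vec] = {
    var centroids = init
    var changed = true
    var iter = 0
    while (changed && iter < maxIter) {
      val groups = points.groupBy(p => nearest(p, centroids))
      val updated = centroids.indices.toVector.map { i =>
        // An empty cluster keeps its previous centroid.
        groups.get(i).map(mean).getOrElse(centroids(i))
      }
      changed = updated != centroids
      centroids = updated
      iter += 1
    }
    centroids
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical toy data: 4 individuals x 4 variants.
    val samples: Vector[Vec] = Vector(
      Vector(0.0, 0.0, 1.0, 0.0),
      Vector(0.0, 1.0, 0.0, 0.0),
      Vector(2.0, 2.0, 1.0, 2.0),
      Vector(2.0, 1.0, 2.0, 2.0)
    )
    // Seed the two centroids with the first and third samples.
    val centroids = kMeans(samples, init = Vector(samples(0), samples(2)))
    val labels = samples.map(nearest(_, centroids))
    println(labels.mkString(","))
  }
}
```

On real genomic data the feature vectors are far wider and the sample count far larger, which is why a distributed implementation is the practical choice; the assignment and update steps stay conceptually the same.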