- Effective Amazon Machine Learning
- Alexis Perrier
- 2021-07-03 00:17:49
Missing from Amazon ML
Amazon ML offers supervised learning predictions for classification (binary and multiclass) and regression problems. It offers some very basic visualization of the original data and has a preset list of data transformations, such as binning or normalizing the data. It is efficient and simple. However, several functionalities that are important to the data scientist are unfortunately missing from the platform. Lacking these features may not be a deal breaker, but it nonetheless restricts the scope of problems Amazon ML can be applied to.
Some of the common machine learning features Amazon ML does not offer are as follows:
- Unsupervised learning: It is not possible to do clustering or dimensionality reduction of your data.
- A choice of models besides linear models: non-linear support vector machines, any type of Bayes classifier, neural networks, and tree-based algorithms (decision trees, random forests, or boosted trees) are all absent. All predictions and all experiments are built on linear regression or logistic regression trained with stochastic gradient descent (SGD).
- Data visualization capabilities are reduced to histograms and density plots.
- A choice of metrics: Amazon ML uses ROC-AUC for binary classification, the F1-score for multiclass classification, and RMSE for regression. It is not possible to assess model performance with any other metric.
- You cannot download your trained model and use it anywhere other than Amazon ML.
Finally, although it is not possible to use your own scripts (R, Python, Scala, and so on) directly within the Amazon ML platform, it is possible, and recommended, to preprocess the datasets with other AWS services such as AWS Lambda. Data manipulation beyond the transformations available in Amazon ML can also be carried out with SQL if your data is stored in one of the SQL-enabled AWS services (Athena, RDS, Redshift, and others).
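As an illustration of preprocessing outside Amazon ML, here is a minimal Python sketch of a Lambda-style function that adds a log-transformed column to a CSV dataset before it is handed to the platform. The function name `add_log_column`, the event keys `csv` and `column`, and the choice of a log transform are all assumptions made for this example, not something prescribed by the book. In a real deployment the function would more likely read the file from S3 and write the result back, which is omitted here to keep the sketch self-contained.

```python
import csv
import io
import math


def add_log_column(csv_text, column, new_column=None):
    """Return csv_text with an extra column holding log(1 + value) of `column`.

    Hypothetical helper: the name and behavior are illustrative only.
    Values below -1 would raise a ValueError from math.log1p.
    """
    new_column = new_column or f"log_{column}"
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [new_column]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        raw = row[column]
        # Keep missing values empty so they can still be treated as missing downstream.
        row[new_column] = "" if raw == "" else f"{math.log1p(float(raw)):.6f}"
        writer.writerow(row)
    return out.getvalue()


def lambda_handler(event, context):
    """Lambda-style entry point; the event layout is an assumption:
    it carries the CSV text and the column name directly."""
    return {"csv": add_log_column(event["csv"], event["column"])}
```

Calling `lambda_handler({"csv": "id,price\n1,0\n", "column": "price"}, None)` returns a dictionary whose `csv` entry contains the original columns plus `log_price`. The transformed file can then be used as a regular Amazon ML datasource.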