
Chapter 3. Fast SVM Implementations

Having experimented with online-style learning in the previous chapter, you may have been surprised by how simple yet effective and scalable it is compared to batch learning. Despite learning from just one example at a time, SGD can approximate the results you would obtain if all the data resided in core memory and you were using a batch algorithm. All it requires is that your stream is truly stochastic (that is, there are no trends in the data) and that the learner is well tuned to the problem (the learning rate is often the key parameter to fix).
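As a quick refresher, here is a minimal sketch (our own illustration, not a prescribed recipe) of such an online learner in Scikit-learn: SGDClassifier consuming a shuffled stream one mini-batch at a time through partial_fit. The dataset, batch size, and random seeds are illustrative choices:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    # Illustrative data; in a real stream, examples would arrive from disk or network
    X, y = make_classification(n_samples=10000, n_features=20, random_state=0)
    shuffle = np.random.RandomState(0).permutation(len(X))
    X, y = X[shuffle], y[shuffle]  # a stochastic stream: no trends in the data

    learner = SGDClassifier(loss='hinge', random_state=0)
    for start in range(0, len(X), 100):  # learn from mini-batches of 100 examples
        batch = slice(start, start + 100)
        learner.partial_fit(X[batch], y[batch], classes=np.unique(y))

    print(learner.score(X, y))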

Nevertheless, examining such achievements closely, the results are still only comparable to those of batch linear models, not to more sophisticated learners characterized by higher variance than bias, such as SVMs, neural networks, or bagging and boosting ensembles of decision trees.

For certain problems, such as tall and wide but sparse data, linear combinations alone may suffice, following the observation that a simple algorithm trained on more data often beats a more complex one trained on less. Yet, even sticking with linear models, we can accelerate and improve the learning of complex nonlinear relationships between the response and the features by explicitly mapping the existing features into higher-dimensional ones (using interactions of different orders, polynomial expansions, and kernel approximations).
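To make the idea concrete, here is a hedged sketch (an illustration of ours, with assumed parameter values) of one such explicit mapping: RBFSampler from Scikit-learn approximates an RBF kernel map, letting a plain linear SGD learner separate data that no straight line could:

    from sklearn.datasets import make_circles
    from sklearn.kernel_approximation import RBFSampler
    from sklearn.linear_model import SGDClassifier
    from sklearn.pipeline import make_pipeline

    # Concentric circles: a response that is nonlinear in the original features
    X, y = make_circles(n_samples=1000, noise=0.1, factor=0.4, random_state=0)

    linear = SGDClassifier(loss='hinge', random_state=0).fit(X, y)
    mapped = make_pipeline(
        RBFSampler(gamma=1.0, n_components=100, random_state=0),  # explicit kernel map
        SGDClassifier(loss='hinge', random_state=0),
    ).fit(X, y)

    print('linear features :', linear.score(X, y))  # near chance level
    print('mapped features :', mapped.score(X, y))  # markedly better

Here the nonlinear structure is impossible for a linear boundary in the original two features, but becomes linearly separable (approximately) in the 100-dimensional mapped space.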

In this chapter, we will therefore first introduce linear SVMs as an alternative machine learning algorithm to linear models, powered by a different approach to the problem of learning from data. Then, we will demonstrate how to create richer features from the existing ones in order to better solve our machine learning tasks when facing large scale data, especially tall data (that is, datasets with many cases to learn from).

In summary, in this chapter, we will cover the following topics:

  • Introducing SVMs and providing you with the basic concepts and math formulas to figure out how they work
  • Proposing SGD with hinge loss as a viable solution for large scale tasks that uses the same optimization approach as the batch SVM (see the sketch after this list)
  • Suggesting nonlinear approximations to accompany SGD
  • Offering an overview of other large scale online solutions, besides the SGD algorithm, that Scikit-learn makes available
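As a foretaste of the second bullet point, the following sketch (our own, assuming Scikit-learn) contrasts the batch linear SVM, LinearSVC, with SGDClassifier set to the hinge loss; both minimize a hinge objective, the former over all the data at once, the latter one example at a time:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=5000, n_features=30, random_state=0)

    batch_svm = LinearSVC(loss='hinge').fit(X, y)  # batch solver, hinge loss
    online_svm = SGDClassifier(loss='hinge', random_state=0).fit(X, y)  # SGD solver

    print('LinearSVC :', batch_svm.score(X, y))
    print('SGD+hinge :', online_svm.score(X, y))

On data that fits in memory, the two scores tend to be close; the advantage of the SGD version is that it would keep working, via partial_fit, when the data does not fit.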