
Why Python for data science and machine learning?

Before moving on to more technical discussions, I think it's helpful to explain the choice of Python as the programming language for this book. In the last decade, research in data science and machine learning has grown very rapidly, producing thousands of valuable papers and dozens of complete tools. In particular, thanks to its efficiency, elegance, and compactness, Python has been chosen by many researchers and programmers to build a complete scientific ecosystem that is freely available.

Nowadays, packages such as scikit-learn, SciPy, NumPy, Matplotlib, pandas, and many others represent the backbone of hundreds of production-ready systems, and their usage keeps growing. Moreover, deep learning frameworks such as Theano, TensorFlow, and PyTorch allow every Python user to create and train complex models without being limited by the speed of the interpreter, because the heavy computation runs in optimized native code. In fact, it's important to note that Python is no longer just a scripting language. It supports dozens of specific tasks (for example, web frameworks and graphics), and it can be interfaced with native code written in C or C++.

For these reasons, Python is an optimal choice in almost any data science project, and, thanks to its features, programmers from different backgrounds can learn to use it effectively in a short time. Other free solutions are also available (for example, R, Java, or Scala). R offers complete coverage of statistical and mathematical functions, but it lacks the support frameworks that are necessary to build complete applications. Conversely, Java and Scala have a complete ecosystem of production-ready libraries, but Java in particular is not as compact and easy to use as Python. Moreover, their support for native code is much more complex, and the majority of libraries rely exclusively on the JVM (with a consequent performance loss).

Scala has gained an important position in the big data landscape, thanks to its functional properties and the existence of frameworks such as Apache Spark (which can be employed to carry out machine learning tasks with big data). However, considering all the pros and cons, Python remains the optimal choice, and that's why it has been chosen for this book.
