
Scientific libraries used in the book

Throughout this book, certain libraries are necessary to implement the machine learning techniques discussed in each chapter. We briefly describe the most relevant of these libraries here:

  • SciPy is a collection of mathematical routines built on the NumPy array object. It is an open source project, so it continually gains additional methods contributed by developers around the world. Python software that employs SciPy routines can support advanced projects and applications comparable to those built with similar frameworks such as MATLAB, Octave, or RLab. A wide range of methods is available, from data manipulation and visualization functions to parallel computing routines, which enhances the versatility and power of the Python language (a short usage sketch is given after this list).
  • scikit-learn (sklearn) is an open source machine learning module for the Python programming language. It implements various algorithms for clustering, classification, and regression, including support vector machines, Naive Bayes, decision trees, random forests, k-means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and it interacts natively with numerical Python libraries such as NumPy and SciPy. Although most of the routines are written in Python, some functions are implemented in Cython to achieve better performance. For instance, support vector machines and logistic regression are written in Cython, wrapping external libraries (LIBSVM, LIBLINEAR). A classification sketch appears after this list.
  • The Natural Language Toolkit (NLTK) is a collection of libraries and functions for Natural Language Processing (NLP) in the Python language. NLTK is designed to support research and teaching on NLP and related topics, including artificial intelligence, cognitive science, information retrieval, linguistics, and machine learning. It also features a series of text processing routines for tokenization, stemming, tagging, parsing, semantic reasoning, and classification. NLTK includes sample code, sample data, and interfaces to more than 50 corpora and lexical databases (see the tokenization sketch after this list).
  • Scrapy is an open source web crawling framework for the Python programming language. Originally designed for scraping websites and for use as a general-purpose crawler, it is also suitable for extracting data through APIs. A Scrapy project is organized around spiders, classes that provide the set of instructions for crawling and parsing a site. It also features a web crawling shell that allows developers to test their assumptions about a site before actually implementing a spider. Scrapy is currently maintained by Scrapinghub Ltd., a web scraping development and services company. A minimal spider sketch is given after this list.
  • Django is a free and open source web application framework implemented in Python, following the model-view-controller (MVC) architectural pattern. Django is designed for the creation of complex, database-oriented websites. It also allows us to manage the application through an administrative interface, which can create, read, update, or delete the data used in the application. A number of established websites currently use Django, such as Pinterest, Instagram, Mozilla, The Washington Times, and Bitbucket. A single-file sketch is given at the end of this list.
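To give a feel for SciPy, here is a minimal sketch of calling one of its optimization routines on NumPy arrays; the Rosenbrock test function and the choice of solver are illustrative and are not taken from the book:

```python
import numpy as np
from scipy import optimize

def rosenbrock(x):
    # Classic optimization test function; its minimum lies at x = [1, 1, ..., 1].
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

x0 = np.zeros(3)                                   # arbitrary starting point
result = optimize.minimize(rosenbrock, x0, method="Nelder-Mead")
print(result.x)                                    # approximate minimizer, near [1, 1, 1]
```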
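For scikit-learn, a typical classification workflow might look like the following sketch; the Iris dataset, the 70/30 split, and the SVC parameters are illustrative choices rather than the book's own example:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a small built-in dataset and split it into training and test sets.
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

# SVC is the support vector classifier that wraps LIBSVM under the hood.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)                  # fit on the training split
print(clf.score(X_test, y_test))           # accuracy on the held-out split
```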
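For NLTK, a minimal tokenization and stemming sketch could look like this; the sentence is invented, and the snippet assumes the 'punkt' tokenizer models can be downloaded on first use:

```python
import nltk
from nltk.stem import PorterStemmer

# Fetch the 'punkt' tokenizer models the first time the script runs.
nltk.download("punkt", quiet=True)

sentence = "Natural language processing turns raw text into usable features."
tokens = nltk.word_tokenize(sentence)              # word-level tokenization
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]  # reduce words to their stems
print(tokens)
print(stems)
```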
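For Scrapy, the sketch below shows the general shape of a spider; the target site (Scrapinghub's public practice site) and the CSS selectors are placeholders that would need to be adapted, and the snippet assumes a reasonably recent Scrapy release for the .get() selector shortcut. It can be run with scrapy runspider from the command line:

```python
import scrapy

class QuotesSpider(scrapy.Spider):
    # A minimal spider: crawl one page and yield structured items.
    name = "quotes"                                   # spider identifier
    start_urls = ["http://quotes.toscrape.com/"]      # public practice site

    def parse(self, response):
        # Extract the text and author of each quote block on the page.
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
            }
```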
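Finally, for Django, the following single-file sketch shows how a view, a URL pattern, and settings fit together; the view, route, and placeholder secret key are hypothetical, and a real project would normally be generated with django-admin startproject instead:

```python
import sys

from django.conf import settings
from django.core.management import execute_from_command_line
from django.http import HttpResponse
from django.urls import path

# Configure just enough settings to run the development server.
settings.configure(
    DEBUG=True,
    SECRET_KEY="placeholder-development-key",   # hypothetical key, never reuse
    ROOT_URLCONF=__name__,                      # URL patterns live in this module
)

def home(request):
    # Return a plain-text response for the site root.
    return HttpResponse("Hello from Django")

urlpatterns = [path("", home)]

if __name__ == "__main__":
    # For example: python this_script.py runserver (the file name is an assumption)
    execute_from_command_line(sys.argv)
```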