Scientific libraries used in the book

Implementing the machine-learning techniques discussed in each chapter requires several libraries. The most relevant ones are briefly described here:

  • SciPy is a collection of mathematical methods built on NumPy array objects. It is an open source project, so it benefits from new methods continuously contributed by developers around the world. Python software that uses SciPy routines can serve in advanced projects or applications comparable to those built with similar frameworks such as MATLAB, Octave, or RLab. A wide range of methods is available, from data manipulation and visualization functions to parallel computing routines, which extends the versatility and potential of the Python language.
  • scikit-learn (sklearn) is an open source machine learning module for the Python programming language. It implements various clustering, classification, and regression algorithms, including support vector machines, Naive Bayes, decision trees, random forests, k-means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and it interacts natively with numerical Python libraries such as NumPy and SciPy. Although most of the routines are written in Python, some functions are implemented in Cython to achieve better performance. For instance, support vector machines and logistic regression are written in Cython, wrapping external libraries (LIBSVM and LIBLINEAR).
  • The Natural Language Toolkit (NLTK) is a collection of Python libraries and functions for Natural Language Processing (NLP). NLTK is designed to support research and teaching in NLP and related fields, including artificial intelligence, cognitive science, information retrieval, linguistics, and machine learning. It also features a series of text processing routines for tokenization, stemming, tagging, parsing, semantic reasoning, and classification. NLTK includes sample code, sample data, and interfaces to more than 50 corpora and lexical databases.
  • Scrapy is an open source web crawling framework for the Python programming language. Originally designed for scraping websites and as a general-purpose crawler, it is also suitable for extracting data through APIs. A Scrapy project is built around spiders, which provide the set of instructions the crawler follows. It also features a web crawling shell that lets developers test their ideas before actually implementing them. Scrapy is currently maintained by Scrapinghub Ltd., a web scraping development and services company.
  • Django is a free and open source web application framework implemented in Python that follows the model-view-controller architectural pattern. Django is designed for the creation of complex, database-oriented websites. It also lets us manage the application through an administrative interface, which can create, read, update, or delete the data used in the application. Several established websites currently use Django, such as Pinterest, Instagram, Mozilla, The Washington Times, and Bitbucket.
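
To make the point about scikit-learn interacting natively with NumPy and SciPy more concrete, here is a minimal sketch, not taken from the book. The toy data, the variable names, and the choice of k-means with two clusters are all assumptions made for illustration. It passes a plain NumPy array to a scikit-learn estimator and then hands the fitted cluster centres to a SciPy distance routine.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

# Two well-separated toy blobs (made-up data, for illustration only).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

# scikit-learn estimators accept the NumPy array directly.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

# SciPy operates on the same arrays: distance from every point
# to every fitted cluster centre, then the index of the closest one.
distances = cdist(points, model.cluster_centers_)
nearest = distances.argmin(axis=1)
```

No conversion step is needed between the three libraries, because the points, the labels, and the cluster centres are all ordinary NumPy arrays.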