官术网_书友最值得收藏!

Machine Learning Services

SQL Server 2016 has introduced a very important service for data science with SQL Server and this was the integration of the R language and R scripts into the SQL Server. With the release of SQL Server 2016, this feature was still named R Services, as the R Server was also available as a standalone product or add-on to other programs or frameworks. With SQL Server 2017, Microsoft has added the support for the Python language too, so you can choose from both major languages used in data science.

Neither of the languages are installed by default, so you need to include any of them during the installation, where you can choose the proper environment as a part of Machine Learning Services, which is the new name used since the 2017 release:

Installing languages R and Python

Machine Learning Services can be used to perform different work with the data stored or incoming to the database than we can regularly see on DBMS systems. As a common machine learning approach, with SQL Server, Machine Learning Services can be used to predict properties of new data by learning from a sample of data stored in the database. With Microsoft SQL Server 2016, the only way to run the R script inside of the database was to use a new stored procedure, sp_execute_external_script, which used, as one of the parameters, the R script that you would like to execute. With SQL Server 2017, you can also use a new function called PREDICT, which can be used for native scoring on the data, assuming that you have built and trained the models for machine learning before using that function.

Even if you install Machine Learning Services, running external scripts in R or Python is disabled by default. If you would like to enable the feature, you can use the following T-SQL code to make the configuration change on your instance:

sp_configure 'external scripts enabled',1
GO
RECONFIGURE
GO

When you use Python, you can use either Visual Studio, Visual Studio Code, Management Studio, or any other application used for Python development. However, you always need to consider which version of Python is being used with Machine Learning Services, as Python 2.x and 3.x differ a little. Microsoft is using Continuum Analytics Anaconda distribution of Python in Machine Learning Services; specifically, the version of Python 3.5.2 is being used at the time of writing this book.

To check which version of Python you are running, you can use following T-SQL code:

EXEC sp_execute_external_script @language = N'Python',
@script = N'import sys
print(sys.version)'

Important things to notice are the language parameter, which accepts two languages (either R or Python), marked with the N sign for unicode, and the script variable that contains the script itself, which is to be executed. If R is your language of choice, you can use a wide variety of tools available for development, if you keep in mind that version of R 3.3.3 is used with SQL Server. Using external tools for development can be more flexible and comfortable, compared with SQL Server Management Studio, but keep in mind that if you use any package that is not part of the SQL Server deployment, you'll need to do extra steps to install the package on the SQL Server, either for Python or for R.

As for Python, Microsoft has chosen the Anaconda distribution. The story with R language is different. R language was developed back in the 1980s, where it became a significant language used by data science and the academics community from the 1990s. When R become very popular and widely used, Microsoft joined the R consortium to help develop the R language and get better support for it. In 2015, Microsoft acquired a company named Revolution Analytics and rebranded many of Revolution Analytics's products. These included:

  • Microsoft R Server
  • Microsoft R Open
  • Data Science VM (available in Azure)

As a part of the effort with R language, Microsoft has created its own package repository called MRAN, available at https://mran.microsoft.com/download

Microsoft Open R is based on the R language release, but includes numerous improvements for performance, platforms, and scalability, while keeping full compatibility with packages available on the CRAN repository, the official R package repository for the R project, which is available at https://cran.r-project.org/.

While Microsoft R Open is free and is available for download for numerous platforms, Microsoft has also rebranded Revolution R Enterprise into two different products. One stream went into the SQL Server as R Services or Machine Learning Services in 2017 release and the other is available as a standalone installation for different operating systems and frameworks, such as the following:

  • Red Hat
  • Suse
  • Teradata
  • Hadoop
主站蜘蛛池模板: 田东县| 富裕县| 益阳市| 邢台县| 裕民县| 海宁市| 博野县| 鹿泉市| 专栏| 高碑店市| 贵定县| 郎溪县| 阳高县| 文化| 华容县| 稷山县| 安陆市| 德钦县| 伽师县| 云霄县| 莱西市| 湘潭市| 禄丰县| 拜泉县| 温州市| 白城市| 青铜峡市| 芒康县| 濮阳市| 章丘市| 乐平市| 伊宁县| 佛教| 蛟河市| 平果县| 大埔县| 波密县| 沁源县| 夏津县| 会宁县| 响水县|