- Hands-On Data Science with SQL Server 2017
- Marek Chmel Vladimír Mu?n?
- 523字
- 2021-06-10 19:13:57
SQL Server Integration Services
SQL Server Integration Services (SSIS) was first introduced with SQL Server 2005, which was a major release, where Microsoft revised most of the Sybase code into Microsoft code. Numerous changes did happen in the background of the SQL Server and three main business intelligence services were introduced, SSIS being one of them. The concept of integration service used with SQL Server was, however, introduced with SQL Server 7.0, known as Data Transformation Services (or DTS). As a relic of this service, you can still find executables, such as dtexec.exe, which are used to start an integration services package from the command line on the server.
The main idea of SSIS is to allow the Extract Transform and Load process to be running on your SQL Server. ETL is crucial for the data science work on SQL Server, because on many occasions, we don't work only with the data that is stored and processed on the SQL Server, but we need to get the data into the SQL Server from other systems.
The extract part of the ETL solution is where we can connect to various source systems, such as other DBMS systems, flat files, and other possible data sources. SSIS provides default connectors, but can be also extended by installing proper ODBC drivers to your SQL Server to be able to connect to more data sources if needed.
Data may not be stored in the proper format on the source systems, so frequently the data has to be modified, cleaned, or changed to be stored on the SQL Server. This is where the transform part of the ETL comes into play. SSIS provides numerous tasks that can be used in a sequence to work with incoming data, and to change the data in a way that is optimized for follow-up work with the dataset on the SQL Server.
A major development tool for Integration Services is SQL Server Data Tools (often shortened as SSDT). There are different versions available, depending on your version of SQL Server and also on your environment, and whether you're already running your own version of Visual Studio or not. If you don't have Visual Studio up and running, SSDT will install a minimal shell for the SSDT, so that the tools will be working for you.
To install SSDT, you'll need to download the installation media, as since SQL Server 2014 onward, Data Tools is no longer included on the installation and is provided as a separate download.
While installing Data Tools, you can select what features should be included on your instance, depending on the business intelligence features you have on the SQL Server for which you're getting ready to develop:

Once the installation is over, you can create your Integration Services projects, which will include the control and data flow for the tasks inside of the packages:
