15.1. Architecture
SSIS consists of four key parts:
Integration Services service
Manages storage of packages and monitors running Integration Services packages.
Integration Services object model
Comprises native and managed APIs for accessing Integration Services tools, command-line utilities, and custom applications.
Integration Services runtime
Saves the layout of packages, runs packages, and supports logging, breakpoints, configuration, connections, and transactions. SSIS runtime executables are the packages, containers, tasks, and event handlers that perform workflow functionality.
Data flows
Move data from source to destination with optional transformation. There are three types of data-flow
componentssource, transformation, and destination (load).
15.1.1. SSIS Objects
The SSIS object model is built on eight primary objects:
Package
A collection of connections, control-flow elements,
data-flow elements, event handlers, variables, and configurations either created using SSIS graphical-design tools or built programmatically.
Control flow
Tasks, containers, and constraints that connect executables into an ordered flow.
Data flow
Sources and destinations that extract and load the data, data transformations, and paths linking sources, transformations, and destinations. The data flow is created within a data-flow taskan executable that creates, orders, and runs the data flow.
Connection manager
Defines the connection string for accessing data that tasks, transformations, and event handlers in the package use.
Event handler
A workflow that runs in response to events raised by a package, task, or container.
Configuration
A set of name-value pairs that defines the properties of the package and its tasks, containers, variables, connections, and event handlers when the package runs. Separating configuration from the package lets you change the properties of the package without changing the package. These objects also facilitate deploying packages from development servers to production servers and moving packages between servers/environments.
Log provider
Defines the destination type and format used to log runtime information for packages, containers, and tasks.
System and user-defined variables
Stores values that SSIS packages, tasks, and event handlers use at runtime and exposes information about packages at runtime.
Figure 15-1 shows the relationship between the SSIS objects.
The following subsections describe each of these objects in more detail.
15.1.2. Control-Flow Elements
SSIS provides three different types of control-flow elements that can be nested:
Containers
Provide structures for grouping tasks
and implementing repeating control flow in packages. SSIS provides three types of containers
:
Foreach Loop container
Repeats control flow for each element in a collection.
For Loop container
Repeats control flow while a test expression evaluates as TRue.
Sequence container
Defines a subset of the control flow in a package. This lets you manage and execute a group of executables (tasks and containers) as a single unit.
Tasks
Provide functionality within the package. SSIS provides seven types of tasks:
Workflow tasks
Communicate with other processes to run packages or programs, send and receive messages between packages, send email messages, read Windows Management Instrumentation (WMI) data, and monitor WMI events.
Data-flow tasks
Define and run data flows that extract, transform, and load data.
Data-preparation tasks
Copy files and directories, download files and data, save data returned by web methods, and manipulate XML documents.
SQL Server tasks
Access, copy, create, modify, and delete SQL Server data and objects.
Analysis Services tasks
Create, modify, delete, and process Analysis Services objects. Analysis Services is discussed in Chapter 20.
Scripting tasks
Extend package functionality with custom scripts.
Maintenance tasks
Perform numerous administrative functionsback up the database, check database integrity, execute SQL Service Agent jobs, execute T-SQL statements, clean up history, notify operators, rebuild and reorganize indexes, shrink databases, and update statistics.
Precedent constraints
Connect containers and tasks within packages in an ordered flow. You can control the sequence of execution and specify conditions that determine whether containers and tasks run.
15.1.3. Data-Flow Components
Data-flow components are connected within a data-flow
task using integration service paths. These paths map the outputs of one data-flow component to the inputs of the next data-flow component in the data-flow task. SSIS provides three types of data-flow components:
Source
Retrieve data from an external data source available to components in the data flow. A source has one or more outputs that make source columns available to the next component in the data flow, and can have one or more error outputs. SSIS provides the sources described in Table 15-1. You can develop custom sources if these do not meet your needs.
Table 15-1. SSIS data-flow sourcesSource | Description |
---|
DataReader | Data from a .NET Framework data provider | Excel | Data from an Excel file | Flat File | Data from a flat file | OLE DB | Data from an OLE DB provider | Raw File | Raw data from a file | Script Component | Data from the results of executing a script | XML | Data from an XML file |
Transformation
Performs tasks such as updating, aggregating, cleaning, distributing, and merging data. A transformation can have single or multiple inputs and outputs depending on the task it performs, and can also have one or more error outputs. SSIS
provides business intelligence, row, rowset, split and join transformations, as well as other miscellaneous transformations. You can develop custom transformations if these do not meet your needs. See Microsoft SQL Server 2005 Books Online for more information about the built-in transformations.
Destination
Loads data from a data flow into external data sources or creates an in-memory DataSet. Destinations have one or more inputs and optionally one or more error outputs. SSIS provides the destinations described in Table 15-2. You can develop custom destinations if these do not meet your needs.
Table 15-2. SSIS data-flow destinationsDestination | Description |
---|
Data Mining Model Training | Trains a data-mining model | DataReader | Exposes data through an ADO.NET DataReader interface | Dimension Processing | Loads and processes an Analysis Services dimension | Excel | Loads an Excel file | Flat File | Loads a flat file | OLE DB | Loads an OLE DB data destination | Partition Processing | Loads and processes an Analysis Services partition | Raw File | Loads a raw file | Recordset | Exposes data through an ADO.NET Recordset | Script Component | Loads data using a script | SQL Server Mobile | Loads a SQL Server Mobile database | SQL Server Destination | Bulk loads data to a SQL Server 2005 table or view |
15.1.4. Connection Managers
A connection manager describes the connection to a data source for accessing data that tasks, transformations, and event handlers in the package use. SSIS creates the connections when a package runs. You can define multiple connections for a package. SSIS provides the connection manager types described in Table 15-3.
Table 15-3. SSIS connection manager typesType | Description |
---|
ADO | Connects to a data source using ADO | ADO.NET | Connects to a data source using the ADO.NET data provider | EXCEL | Connects to an Excel file | FILE | Connects to a single file or folder | FLATFILE | Connects to data in a single flat file | FTP | Connects to an FTP server | HTTP | Connects to a web service or web site | MSMQ | Connects to a Microsoft Message Queue (MSMQ) queue | MSOLAP90 | Connects to an instance of Analysis Services or to an Analysis Services project | MULTIFILE | Connects to multiple files and folders | MULTIFLATFILE | Connects to data in multiple flat files | ODBC | Connects to a data source using ODBC | OLEDB | Connects to a data source using OLE DB | SMOServer | Connects to SQL Server Management Objects (SMO) | SMTP | Connects to an SMTP server | SQLMOBILE | Connects to a SQL Server Mobile database | WMI | Connects to a WMI server |
15.1.5. Events
SSIS executablespackages, Foreach Loop containers, For Loop containers, Sequence containers, and task host containersraise events at runtime. You can write custom event handlers to extend package functionality and simplify administration. If an event does not have a handler, the event bubbles up to the next container in the package hierarchy until it is finally raised to the package. Table 15-4 describes the SSIS runtime events.
Table 15-4. SSIS runtime eventsEvent | Description |
---|
OnError | Raised by an executable when an error occurs | OnExecStatusChanged | Raised by an executable when its execution status changes | OnInformation | Raised by an executable during validation and execution to report information | OnPostExecute | Raised by an executable immediately after it finishes running | OnPostValidate | Raised by an executable immediately after it finishes validating | OnPreExecute | Raised by an executable immediately before it starts running | OnPreValidate | Raised by an executable immediately before it starts validating | OnProgress | Raised by an executable when progress has been made during execution | OnQueryCancel | Raised by an executable to determine whether it should stop running | OnTaskFailed | Raised by a task when it fails | OnVariableValueChanged | Raised by an executable when the value of a variable changes | OnWarning | Raised by an executable when a warning occurs |
15.1.6. Package Configurations
SSIS provides package configurations for updating property values at runtime. Each package configuration is a property-value pair. Configurations simplify deploying packages
to multiple servers, simplify moving packages between servers, and add flexibility to packages by allowing configuration properties to be easily changed. SSIS supports the package-configuration
types described in Table 15-5.
Table 15-5. SSIS package-configuration typesType | Description |
---|
XML configuration file | Configuration information is stored in an XML file, which can contain multiple configurations. | Environment variable | Configuration information is stored in an environment variable. | Registry entry | Configuration information is stored in the registry. | Parent package variable | Configuration information is stored in a variable in the package. | SQL Server table | Configuration information is stored in a SQL Server table, which can contain multiple configurations. |
15.1.7. Log Providers
SSIS includes log providers that implement logging in packages, containers, and tasks to help you audit and troubleshoot. SSIS offers five log providers, as described in Table 15-6. You can develop custom log providers if these do not meet your needs.
Table 15-6. Log providersProvider | ProgID | Description |
---|
Text file | DTS.LogProviderTextFile.1 | Writes log entries to a text file in comma-separated value (CSV) format | SQL Server Profiler | DTS.LogProviderSQLProfiler.1 | Writes log entries to SQL Server traces that can be viewed using SQL Server Profiler | SQL Server | DTS.LogProviderSQLServer.1 | Writes log entries to the sysdtslog90 table in a SQL Server 2005 database | Windows Event log | DTS.LogProviderEventLog.1 | Writes log entries to the Application log in the Windows Event log on the local computer | XML file | DTS.LogProviderXMLFile.1 | Writes log entries to an XML file |
15.1.8. Variables
Variables store values that SSIS
packages, tasks, and event handlers use at runtime. SSIS supports two types of variables
: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by SSIS. You can create user-defined variables for all SSIS container typespackages, Foreach Loop containers, For Loop containers, Sequence containers, tasks, and event handlers. Variables are scoped within the scope of a container and are accessible to the children of the container. A variable can raise an event when its value changes.
|