Novedades Denali Integration Services

Post on 07-Jul-2015

372 views 3 download

description

Novedades Denali Integration Services

Transcript of Novedades Denali Integration Services

SQL Server Integration Services

Novedades Denali Integration Services

Víctor Sánchez

BIN-226

DPA – BI Division SQL Server MCITP

VSanchez@SolidQ.com

Francisco González Mentor | Research SQL Server MCITP, MCT

FGonzalezDiaz@SolidQ.com

α El por qué de una herramienta de Integración

α Novedades en Denali CTP3 β Orientadas al Desarrollador

β Implementación, configuración y administración

α Data Quality Services

Agenda

El por qué de una herramienta para Integración

Un poco de historia

αDTS β SQL Server 7

β SQL Server 2000

αSSIS β SQL Server 2005

β SQL Server 2008

β SQL Server 2008R2

β SQL Server Denali

Integration Services

SQLU Summit

SSIS Source Destination

Novedades en Denali Integration Services

¿Has tenido que….

α Implementar variables expuestas en configuración?

α Pasar de variables en patrones Parent – Child?

α Generar de nuevo algún componente que has eliminado por error?

α Añadir anotaciones para documentar un paquete?

α Mapear componentes dependientes al sustituir un componente que genera nuevos id de linaje?

α Buscar elementos con expresiones?

α Esperar a que se resuelva una conexión (Validating…)

Como Developer:

Novedades SQL Server Denali – Integration Services

Mejoras en el interfaz de usuario

Ayuda a los nuevos usuarios

Aumento de la productividad

Administradores de conexiones

compartidos

Requisitos de Clientes

Deshacer

Flexibilidad en la autoría

(FOoA)

Formato DTSX (XML)

Implementación, configuración

y administración

Servidor SSIS

Parámetros

Resolución de problemas a

través del SSMS

Experiencia del desarrollador

¿Que hemos visto?

α Visual Studio 2010

α Soporte .NET 4

α Deshacer / Rehacer

α Nueva imagen

α Proyectos y parámetros

α FOoA

Migrando proyectos

αTodo lo que tienes ahora sigue funcionando en Denali

β Excepto…

γ Tareas Ejecutar Paquetes DTS 2000

γ Tareas ActiveX Script

αMigrar los proyectos al nuevo modelo es opcional

β La mayoría de los proyectos obtendrán beneficios

Qué necesitas saber

Preparando el cambio para el Project Deployment Model

αLos Proyectos γ Contienen paquetes dependientes? Tareas Ejecutar paquetes?

αParámetros β Se utilizan configuraciones?

β Se comparten configuraciones entre paquetes?

αAdministradores de conexión compartidos β Mis paquetes utilizan conexiones comunes?

αEjecución en el servidor γ Hay algún paquete almacenado externamente?

α Asistente para la conversión de proyectos

β Piensa en parámetros de proyecto en lugar de configuraciones compartidas entre paquetes

β No se requieren conexiones para las tareas Ejecutar paquete

α Redefinir en BIDS

β Considera utilizar Conexiones Compartidas globales

β Actualiza las tareas Ejecutar paquete para resolver las referencias basadas en expresiones

Migrando proyectos anteriores

Migrar un proyecto

¿Que hemos visto?

α Asistente Conversión de Proyectos

α Execute Package Task

α Conexiones compartidas

α Despliegue

Escenario: Despliegue de Proyectos

The Server Gives You…

Configuration

Set values for

parameters

Central connection

manager

configuration

Advanced property

override functionality

Security

Transparent

encryption of

projects and

parameter values

Row-level security to

control access to

packages

Management

Interactive package

execution and SQL

Agent integration

Dashboard and built

in reports for

troubleshooting

Integration Services

Integration Services

Versiones

Escenario: Resolviendo problemas

Troubleshooting Information

10,000 foot view 5000 foot view Ground view

Now Query execution state

while package is

running

• What is the current

status of this

package?

• How long has it

been running?

• How many rows

have been

transferred so far?

• Which phases has

my component

completed?

• How many buffers

are used by this

execution?

• How many buffers

have been spooled

to disk?

• Can I debug this

process?

Historical Control which

information is

captured at a server

level

• When this error

occurred, what was

the state of

components,

connection

manager, and

variables

• How long did this

package take to

run in the past?

• How much

memory was

available the last

time this package

was run?

• How many rows

does this package

usually transfer?

• How much time is

spent in each of

the components in

this data flow?

• Memory Dump files

(binary and textual)

Performance Issues

“These packages were running

well for the past 6 months, taking

less than an hour to complete.

Last night’s run took over 7

hours!”

“Where are the bottlenecks in my

package?”

“Can I get instance-specific

package execution counter

information?”

Component Timing & Performance Counters

Ability to find out time spent in the

data flow components

Time

Validate Pre Execute ProcessInput ProcessInput Post Execute

SELECT package_name, task_name, subcomponent_name, SUM(DATEDIFF(ms,start_time,end_time)) as active_time, DATEDIFF(ms,min(start_time),max(end_time)) as total_time FROM catalog.execution_component_phases WHERE execution_id = 1841 GROUP BY package_name, task_name, subcomponent_name, execution_path ORDER BY package_name, task_name, subcomponent_name, execution_path

Querying Performance

Information for a

running execution

select * from catalog.dm_execution_performance_counters (@execution_id)

dm_execution_performance_counter

Data Issues

“Some values in our data

warehouse don’t look right. What

went wrong?”

“No rows written for the last

nightly load. Are we dropping

data?”

“The package works on my dev

box... why is it failing in the

production machine?

Data Tap & Row Counts

Ability to perform data tap

Ability to find out the number of

rows transferred on the server

SELECT package_name, task_name, source_component_name, destination_component_name, rows_sent FROM catalog.execution_data_statistics WHERE execution_id =1836 ORDER BY source_component_name, destination_component_name

DashBoard

DashBoard

DashBoard

Escenario: Limpieza de datos

Limpieza de Datos.

id dni nombre fecha 1 1232 Paco 1

2 1232 Francisco A. 2

3 1232 Francisco Gonzalez 4

4 1234 Victoria Sanchez 5

5 1234 Victor Sanchez 9

sid nombre dni fecha id

1 Francisco Gonzalez 1232 1 1

1 Francisco Gonzalez 1232 2 2

1 Francisco Gonzalez 1232 4 3

2 Victor Sanchez 1234 5 4

2 Victor Sanchez 1234 9 5

Barriendo la casa

Common Data Quality Issues

Data

Quality

Issue Sample Data Problem

Standard Are data elements

consistently defined and

understood ?

Gender code = M, F, U in one system

and Gender code = 0, 1, 2 in another

system

Complete Is all necessary data present

?

20% of customers’ last name is blank,

50% of zip-codes are 99999

Accurate Does the data accurately

represent reality or a

verifiable source?

A Supplier is listed as ‘Active’ but

went out of business six years ago

Valid Do data values fall within

acceptable ranges?

Salary values should be between

60,000-120,000

Unique Data appears several times Both John Ryan and Jack Ryan

appear in the system – are they the

same person?

DQS High Level Scenarios

• Creating and managing the Data Quality

Knowledge Bases

• Discover knowledge from your org’s data samples

• Exploration and integration with 3rd party

reference data

Knowledge

Management &

Reference Data

• Correction, de-duplication and standardization of

the data Cleansing &

Matching

• Tools to monitor and control data quality

processes Administration

Batch Cleansing - Using SSIS

Microsoft Confidential—Preliminary Information Subject to

Change

Reference Data

Definition

Values/Rules

SSIS Data Flow

Source

+

Mappin

g

Data

correction

Component

SSIS Package

Destination

DQS & SSIS

SSIS Data Flow

No olvideis rellenar las evaluaciones en el Portal

del Summit!

Nos encontrareis en la zona de exposición en los

siguientes horarios α Todos los días de 09:30 a 18:00 horas

Víctor Sánchez

MCITP BI SQL 2008 | MCC 2011

VSanchez@SolidQ.com

DPA – BI Division Francisco González

SQL Server MCITP, MCT

FranciscoAGonzalez@SolidQ.com

Mentor | Researcher

Víctor Sánchez

MCITP BI SQL 2008 | MCC 2011

VSanchez@SolidQ.com

DPA – BI Division Francisco González

SQL Server MCITP, MCT

FranciscoAGonzalez@SolidQ.com

Mentor | Researcher