Novedades Denali Integration Services
-
Upload
solidq -
Category
Technology
-
view
371 -
download
3
description
Transcript of Novedades Denali Integration Services
SQL Server Integration Services
Novedades Denali Integration Services
Víctor Sánchez
BIN-226
DPA – BI Division SQL Server MCITP
Francisco González Mentor | Research SQL Server MCITP, MCT
α El por qué de una herramienta de Integración
α Novedades en Denali CTP3 β Orientadas al Desarrollador
β Implementación, configuración y administración
α Data Quality Services
Agenda
El por qué de una herramienta para Integración
Un poco de historia
αDTS β SQL Server 7
β SQL Server 2000
αSSIS β SQL Server 2005
β SQL Server 2008
β SQL Server 2008R2
β SQL Server Denali
Integration Services
SQLU Summit
SSIS Source Destination
Novedades en Denali Integration Services
¿Has tenido que….
α Implementar variables expuestas en configuración?
α Pasar de variables en patrones Parent – Child?
α Generar de nuevo algún componente que has eliminado por error?
α Añadir anotaciones para documentar un paquete?
α Mapear componentes dependientes al sustituir un componente que genera nuevos id de linaje?
α Buscar elementos con expresiones?
α Esperar a que se resuelva una conexión (Validating…)
Como Developer:
Novedades SQL Server Denali – Integration Services
Mejoras en el interfaz de usuario
Ayuda a los nuevos usuarios
Aumento de la productividad
Administradores de conexiones
compartidos
Requisitos de Clientes
Deshacer
Flexibilidad en la autoría
(FOoA)
Formato DTSX (XML)
Implementación, configuración
y administración
Servidor SSIS
Parámetros
Resolución de problemas a
través del SSMS
Experiencia del desarrollador
¿Que hemos visto?
α Visual Studio 2010
α Soporte .NET 4
α Deshacer / Rehacer
α Nueva imagen
α Proyectos y parámetros
α FOoA
Migrando proyectos
αTodo lo que tienes ahora sigue funcionando en Denali
β Excepto…
γ Tareas Ejecutar Paquetes DTS 2000
γ Tareas ActiveX Script
αMigrar los proyectos al nuevo modelo es opcional
β La mayoría de los proyectos obtendrán beneficios
Qué necesitas saber
Preparando el cambio para el Project Deployment Model
αLos Proyectos γ Contienen paquetes dependientes? Tareas Ejecutar paquetes?
αParámetros β Se utilizan configuraciones?
β Se comparten configuraciones entre paquetes?
αAdministradores de conexión compartidos β Mis paquetes utilizan conexiones comunes?
αEjecución en el servidor γ Hay algún paquete almacenado externamente?
α Asistente para la conversión de proyectos
β Piensa en parámetros de proyecto en lugar de configuraciones compartidas entre paquetes
β No se requieren conexiones para las tareas Ejecutar paquete
α Redefinir en BIDS
β Considera utilizar Conexiones Compartidas globales
β Actualiza las tareas Ejecutar paquete para resolver las referencias basadas en expresiones
Migrando proyectos anteriores
Migrar un proyecto
¿Que hemos visto?
α Asistente Conversión de Proyectos
α Execute Package Task
α Conexiones compartidas
α Despliegue
Escenario: Despliegue de Proyectos
The Server Gives You…
Configuration
Set values for
parameters
Central connection
manager
configuration
Advanced property
override functionality
Security
Transparent
encryption of
projects and
parameter values
Row-level security to
control access to
packages
Management
Interactive package
execution and SQL
Agent integration
Dashboard and built
in reports for
troubleshooting
Integration Services
Integration Services
Versiones
Escenario: Resolviendo problemas
Troubleshooting Information
10,000 foot view 5000 foot view Ground view
Now Query execution state
while package is
running
• What is the current
status of this
package?
• How long has it
been running?
• How many rows
have been
transferred so far?
• Which phases has
my component
completed?
• How many buffers
are used by this
execution?
• How many buffers
have been spooled
to disk?
• Can I debug this
process?
Historical Control which
information is
captured at a server
level
• When this error
occurred, what was
the state of
components,
connection
manager, and
variables
• How long did this
package take to
run in the past?
• How much
memory was
available the last
time this package
was run?
• How many rows
does this package
usually transfer?
• How much time is
spent in each of
the components in
this data flow?
• Memory Dump files
(binary and textual)
Performance Issues
“These packages were running
well for the past 6 months, taking
less than an hour to complete.
Last night’s run took over 7
hours!”
“Where are the bottlenecks in my
package?”
“Can I get instance-specific
package execution counter
information?”
Component Timing & Performance Counters
Ability to find out time spent in the
data flow components
Time
Validate Pre Execute ProcessInput ProcessInput Post Execute
SELECT package_name, task_name, subcomponent_name, SUM(DATEDIFF(ms,start_time,end_time)) as active_time, DATEDIFF(ms,min(start_time),max(end_time)) as total_time FROM catalog.execution_component_phases WHERE execution_id = 1841 GROUP BY package_name, task_name, subcomponent_name, execution_path ORDER BY package_name, task_name, subcomponent_name, execution_path
Querying Performance
Information for a
running execution
select * from catalog.dm_execution_performance_counters (@execution_id)
dm_execution_performance_counter
Data Issues
“Some values in our data
warehouse don’t look right. What
went wrong?”
“No rows written for the last
nightly load. Are we dropping
data?”
“The package works on my dev
box... why is it failing in the
production machine?
Data Tap & Row Counts
Ability to perform data tap
Ability to find out the number of
rows transferred on the server
SELECT package_name, task_name, source_component_name, destination_component_name, rows_sent FROM catalog.execution_data_statistics WHERE execution_id =1836 ORDER BY source_component_name, destination_component_name
DashBoard
DashBoard
DashBoard
Escenario: Limpieza de datos
Limpieza de Datos.
id dni nombre fecha 1 1232 Paco 1
2 1232 Francisco A. 2
3 1232 Francisco Gonzalez 4
4 1234 Victoria Sanchez 5
5 1234 Victor Sanchez 9
sid nombre dni fecha id
1 Francisco Gonzalez 1232 1 1
1 Francisco Gonzalez 1232 2 2
1 Francisco Gonzalez 1232 4 3
2 Victor Sanchez 1234 5 4
2 Victor Sanchez 1234 9 5
Barriendo la casa
Common Data Quality Issues
Data
Quality
Issue Sample Data Problem
Standard Are data elements
consistently defined and
understood ?
Gender code = M, F, U in one system
and Gender code = 0, 1, 2 in another
system
Complete Is all necessary data present
?
20% of customers’ last name is blank,
50% of zip-codes are 99999
Accurate Does the data accurately
represent reality or a
verifiable source?
A Supplier is listed as ‘Active’ but
went out of business six years ago
Valid Do data values fall within
acceptable ranges?
Salary values should be between
60,000-120,000
Unique Data appears several times Both John Ryan and Jack Ryan
appear in the system – are they the
same person?
DQS High Level Scenarios
• Creating and managing the Data Quality
Knowledge Bases
• Discover knowledge from your org’s data samples
• Exploration and integration with 3rd party
reference data
Knowledge
Management &
Reference Data
• Correction, de-duplication and standardization of
the data Cleansing &
Matching
• Tools to monitor and control data quality
processes Administration
Batch Cleansing - Using SSIS
Microsoft Confidential—Preliminary Information Subject to
Change
Reference Data
Definition
Values/Rules
SSIS Data Flow
Source
+
Mappin
g
Data
correction
Component
SSIS Package
Destination
DQS & SSIS
SSIS Data Flow
No olvideis rellenar las evaluaciones en el Portal
del Summit!
Nos encontrareis en la zona de exposición en los
siguientes horarios α Todos los días de 09:30 a 18:00 horas
Víctor Sánchez
MCITP BI SQL 2008 | MCC 2011
DPA – BI Division Francisco González
SQL Server MCITP, MCT
Mentor | Researcher
Víctor Sánchez
MCITP BI SQL 2008 | MCC 2011
DPA – BI Division Francisco González
SQL Server MCITP, MCT
Mentor | Researcher