SSIS and Sql Server Journey

Personal Notes of Sarabjit Singh and Bhavpreet Singh

SSIS Task: For loop Vs For Each Loop

For Loop

For Loop is used for looping through a number of tasks for a set number of times.

Simply:->  initialize, verify condition, increment\assign

For Each

ForEach Loop loops through various things such as files, objects, ADO connections, etc.

Above screen shot show we have task configured for “Foreach file enumerator” which can be used to move\copy\delete file (of given file) in a given folder

Enumerator configuration:

Folder: Specify the folder path from where we need to process the files

Flies: File name (format) \ file extension

  • Fully qualified: File name along with location is returned .Eg:  C:\Example.txt

  • Name and extension: The file name with its extension is returned.Eg: Example.txt

  • Name only: The file name without its extension is returned.Eg: Example

Variable Mapping

Result of the for loop will be assigned to variable and its value can be used further.

  • Foreach File Enumerator: Enumerates files in a folder

  • Foreach Item Enumeration: Enumerates items in a collection, such as the executables specified in an Execute Process task.

  • Foreach ADO Enumerator: Enumerates rows in a table, such as the rows in an ADO recordset.

  • Foreach ADO.NET Schema Rowset Enumerator: Enumerates schema information about a data source.

  • Foreach From Variable Enumerator: Enumerates a list of objects in a variable, such as an array or ADO.NET DataTable.

  • Foreach NodeList Enumeration: Enumerates the result set of an XML Path Language (XPath) expression.

  • Foreach SMO Enumerator: Enumerates a list of SQL Server Management Objects (SMO) objects, such as a list of views in a database.

Leave a comment »

SSIS: Delete Empty Error Log Files With Dynamic Names

Many a times we have a scenario when we have to read from a file write the dirty row into a text file.
In my case, I had to create the text file for each run hence I created a file with dynamic name.
SSIS would create the file even when there is no error.
Now we need to delete the file in case the file is empty i.e. there are no error rows.

As we all know rowcount task come in hand to calculate the no. of records which error out.

ErrorCount
As the name of the error file is dynamic, we use the EvaluateAsExpression property of the variable.
ErrorFilePathEvaluated
As you can see in the screen short above we have a variable name
DataErrorFilePathEvaluated and the property EvaluateAsExpression is set to true. The expression value is set to append datetime to it.

The variable DataErrorFilePathEvaluated is also used as the ConnectionString of the output error file and same is used as the ConnectionString for the file system task to delete the file in case no records error out.

package

Hope it Helps 🙂

 

Leave a comment »

Run Package in 32 Bit\ 64 Bit Mode from Command Prompt

This is an extension to my earlier post

On a 64 bit OS, the default path to the 64 bit would be "c:\Program Files\Microsoft SQL Server\100\DTS\Binn\DTExec.exe"

Still on 64 bit OS, the path to the 32 version would be "c:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\DTExec.exe"

Following is the code to run an SSIS package from command prompt:

1. RUN In 32 Bit mode:

"c:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\DTExec.exe" /file C:\PackageFolder\Package.dtsx /CONF C:\ConfigFile.dtsConfig

2. RUN In 64 Bit mode:

"c:\Program Files\Microsoft SQL Server\100\DTS\Binn\DTExec.exe" /file C:\PackageFolder\Package.dtsx /CONF C:\ConfigFile.dtsConfig
1 Comment »

SSIS Script task: FileNotFoundException: Could not load file or assembly While refering to third party DLL

While referring to a third party DLL in the SSIS Script task we encountered following error:

Error: System.Reflection.TargetInvocationException: Exception has been thrown by the target
of an invocation. —> System.IO.FileNotFoundException: Could not load file or assembly
‘xyzzz, Version=5.3.0.26835, Culture=neutral, PublicKeyToken=5b5f377bc08a4d32’ or one
of its dependencies. The system cannot find the file specified.
File name: ‘xyzzz, Version=5.3.0.26835, Culture=neutral,
PublicKeyToken=5b5f377bc08a4d32’
at ST_d83cb23ff54e47979cb61222ad9d6b85.csproj.ScriptMain.CreatePdf(String pstrFileName,
String pstrHtml)
at ST_d83cb23ff54e47979cb61222ad9d6b85.csproj.ScriptMain.GeneratePdf(DataSet pobjDs)
at ST_d83cb23ff54e47979cb61222ad9d6b85.csproj.ScriptMain.Main()

Solution: Both the solutions specified work fine. Either of them can be refereed.

Solution 1: Place the xyzzz.DLL to location “C:\Windows\assembly” (this is where the GAC
will place all the files). The file gets INSTALLED there and you need to UNINSTALL to remove it.

Solution 2 (for 32 bit processor): Place the xyzzz.DLL to location “C:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn

Solution 2 (for 64 bit processor): Place the xyzzz.DLL to location “c:\Program Files\Microsoft SQL Server\100\DTS\Binn\”

Hope it Helps 🙂

Feel free to suggest a better option.

11 Comments »

Executing SSIS Package via command prompt using multiple config files

Normally we have only one config file and this is how we execute the package via command prompt::

/FILE "F:PackageFolder\Package.dtsx" /CONFIGFILE "F:\ConfigFileLocation\FirstPackageConfigFile.dtsConfig" /CHECKPOINTING OFF  /REPORTING EW 

In case we have multiple config files this is how the command goes:

/FILE "F:PackageFolder\Package.dtsx" /CONFIGFILE "F:\ConfigFileLocation\PackageConfigFile.dtsConfig" /CONFIGFILE  "F:\ConfigFileLocation\SecondPackageConfigFile.dtsConfig" /CHECKPOINTING OFF  /REPORTING EW 

Hope it helps! 🙂

1 Comment »

SSIS Package EXCEL Connection Error: Solution

Issue Description:

Package was running fine

Error:

[Connection manager “Dest File”] Error: SSIS Error Code DTS_E_OLEDB_EXCEL_NOT_SUPPORTED: The Excel Connection Manager is not supported in the 64-bit version of SSIS, as no OLE DB provider is available.

Solution:

You have an Excel connection. The package will need to run in 32 bit mode. To enable 32 bit mode, right click the project in the Solution Explorer and go to properties. Under Debugging, set the property Run64BitRunTime to False.

Leave a comment »

SSIS Container Good Examples

For Each Container Example

http://beyondrelational.com/modules/2/blogs/106/posts/11122/loop-through-each-record-in-a-text-file-recordset-destination.aspx

Mapping SP variable in OLEDB Source–>

Leave a comment »

Interview question – Quick Links

SSRS

http://www.venkateswarlu.co.in/FAQ/SSRS_Interview_Questions_1.aspx

http://www.venkateswarlu.co.in/FAQ/SSRS_Interview_Questions_2.aspx

http://venkateswarlu.co.in/FAQ/CHAPTER_10.aspx

http://sqlserversolutions.blogspot.com/2011/06/ssrs-interview-questions.html

Running value function: http://mangalpardeshi.blogspot.in/2009/03/runningvalue-function-reporting.html

SSRS Formatting

http://sqlsafety.blogspot.in/2010/02/ssrs-formatting.html

SSRS Functions:

http://www.venkateswarlu.co.in/BI/ssrs/SSRS_Common_Functions__Text.aspx

SSRS Tips and Tricks

http://www.sqlservercentral.com/articles/Reporting+Services+(SSRS)/67660/

Dataware housing

http://blog.sqlauthority.com/2007/07/25/sql-server-data-warehousing-interview-questions-and-answers-introduction/

SSIS

http://venkateswarlu.co.in/FAQ/Chapter_1.aspx — –V  Good

http://www.sql-server-business-intelligence.com/sql-server/interview-questions-and-answers/ssis-interview-questions-and-answers-pdf-download

http://www.aboutsql.in/

SSIS Event Handlers and Error Logging:

http://beyondrelational.com/modules/12/tutorials/24/tutorials/9686/getting-started-with-ssis-part-10-event-handling-and-logging.

Fuzzy Grouping:

http://www.bimonkey.com/2009/11/the-fuzzy-grouping-transformation/

http://f5debug.net/2011/04/25/sql-server-integration-services-ssis-%E2%80%93-part-45-%E2%80%93-fuzzy-grouping-transformation-in-ssis/

Fuzzy Look Up: http://www.bimonkey.com/2009/06/the-fuzzy-lookup-transformation/

SSIS Performance Tuning: http://technet.microsoft.com/en-us/library/cc966529.aspx

SSIS Check Points:

http://simonworth.wordpress.com/2009/11/08/ssis-package-properties-checkpoints/

Multicast Transformation vs Condition Split: 

http://www.mssqltips.com/sqlservertip/2047/ssis-multicast-transformation-vs-conditional-split-transformation/

.Net: http://faq.programmerworld.net/programming/asp.net-interview-questions-answers.htm

Sql Server Performance tuning: http://blog.sqlauthority.com/2008/04/25/sql-server-optimization-rules-of-thumb-best-practices/ <–do Check the comments to the post

SDLC:

http://www.wiziq.com/tutorial/119305-Software-Engineering-What-Why-amp-How

SQL:

http://blog.sqlauthority.com/sql-server-interview-questions-and-answers/

User Defined Datatypes: http://www.venkateswarlu.co.in/articles/SQL/UserDefined_Data_Types_in_sql_server.aspx

Performance Related: http://www.venkateswarlu.co.in/articles/SQL/SQL_performance.aspx

http://www.venkateswarlu.co.in/articles/SQL/Performance_Tuning.aspx

Rank, Dense_Rank, row_number: http://www.venkateswarlu.co.in/articles/SQL/RANK_RowNumber_DenseRank.aspx

Miscellaneous: http://venkateswarlu.co.in/sqlserver/articles.aspx

 

.Net:

http://www.dotnetfunda.com/interview/showcatquestion.aspx?category=33

http://faq.programmerworld.net/programming/asp.net-interview-questions-answers.htm

Single tier – MS access, MS excel- Single file – direct acess —usually single user application

2 tier – Client – Server application  –multi user apps.

Multi Tier applications?

–      3 tier or N tier

–      Presentation Layer , Business Logic and Database Layer

  • Presentation Layer – UI or Front end
    • Handles User interactions – Screen validations
    • Should not contain business logic
  • Business Logic Layer – Middle ware or Back end
    • Set of rules to process the information
    • Should not contain presentation or data access code
  • Data Layer
    • Database , Datasets , RDBMS\DBMS
    • Provide access to back-end i.e. database

3 Tier Architecture Advantages:

–      Easy to maintain

–      Components are resusable

–      Faster development by division of work

  • Web designer can do presentation
  • Software engineer can do the logic part
  • DB devloper and admin can handle the backend and data modeling

http://blog.simcrest.com/what-is-3-tier-architecture-and-why-do-you-need-it/

http://queens.db.toronto.edu/~papaggel/courses/csc309/docs/lectures/web-architectures.pdf

Design Patterns?

Solution to common problems defined by gang of four programmers.

  • Factory  –  Creates an instance of several derived classes
  • Abstract Factory –  Creates an instance of several families of classes
  • Singlton – single instance of a class can exists
  • Builder – Separates object construction from its representation

data modeling?  

  • Process of defining and analysing data requirements to support business  processes.
  • ER diagrams are designed in VISIO to model the data requirements

SDLC Models?

Software development life cycle :

  • Waterfall model
  • Rapid application development Model
  • Incremental\Iterative Model
  • Proto-type Model
  • Spiral Model

Waterfall model – requirement garthering, designing, implementation, verification\testing, maintainace

  • Each phase has to be completed before the next starts
  • A review takes place after a particular phase completes and the next starts. – review is with the aim to check if the project is on right path.
  • No overlapping of phases
  • Easy to use and implement
  • Not good for projects where requirements changes frequently.
  • Poor model for complex projects

Incremental\Iterative Model This method doesn’t require to start with full requirement.

  • Also called Multi – waterfall model
  • Development and delivery is broken down into increments

Proto-type Model In this model , a proto-type (an early approximation of a final system or product) is build for user acceptance and the improvements are done untill the users are convinced.

Spiral Model  Combined features of proto-type and waterfall model.

  • Similar to incremental model
  • Planing, Risk analysis, engineering and evaluation

Rapid application development Model Based on component integration techniques

  • Short development cycle and linear squential model
  • Phases: business modeling , data modeling , process modeling, application modeling, testing.

http://www.slideshare.net/SivaprasanthRentala1975/sdlc-models

White box testing(tranasparent testing) – internal – logical\data flow testing

Black box testing – input and desired output testing, not concerened with the inner code.

Grey box à combination of white and black box

SSIS – Tasks?

  • Data Flow task –extracts data from source, allow transformations, and the load data into target data destination
  • File System task – copy/move/delete files and folder over a file system
  • FTP task  – copy/move/delete files and folder over a FTP
  • Execute package
  • Execute SQL
  • Execute Script
  • Send Mail
  • Execute sql server agent job
  • Notify operator
  • Back up database
  • Bulk insert
  • For Loop:  Repeat a task a fixed number of times
  •  Foreach Loop: Repeat a task by enumerating over a group of objects

http://venkateswarlu.co.in/FAQ/Chapter_1.aspx#.UPvFTB1vNAM

SSIS Auditing and Error Handling?

http://hardik-bhavsar.blogspot.in/2011/08/auditing-and-error-handling-in-ssis.html

SSIS deploying packages:

  1. Database – schedule using Jobs
  2. File system – schedule using control M
  3. Can be called via command prompt or execute the .dtsx package

Sql Server

Normalisation – process of organising data in form of table and define relationship and hence form a realtional database.

RDBMS – ACID – atomicity , consistency, isolation, durability

1 NF – eliminating redundant groups

2 NF – eliminating redundant data

3NF – Eliminate Columns Not Dependent On Key

Trigers – DDL (drop table, create table, alter table or login events)

DML (insert update delete)

Instead off – fired instead off insert update delete

SSRS – variables – overview?

Delete duplicate rows


WITH CTE (COl1,Col2, DuplicateCount) AS
 (SELECT COl1,Col2,
 ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) ASDuplicateCount
 FROM DuplicateRcordTable)
 DELETE FROM CTE WHERE DuplicateCount >1

Master – contain information for all databases

Model – template for database —

MSDB –dts packages, jobs

Resource: read only – contain all system objects – sys.objects

Split Function: write a split function to get output of (1,2,3)  in a table in different rows

Scope Identity and @@identity

@@IDENTITY

It returns the last identity value generated for any table in the current session, across all scopes.

Let me explain this… suppose we create an insert trigger on table which inserts a row in another table with generate an identity column, then @@IDENTITY returns that identity record which is created by trigger.

SCOPE_IDENTITY

It returns the last identity value generated for any table in the current session and the current scope.

Let me explain this… suppose we create an insert trigger on table which inserts a row in another table with generate an identity column, then SCOPE_IDENTITY result is not affected but if a trigger or a user defined function is affected on the same table that produced the value returns that identity record then SCOPE_IDENTITY returns that identity record which is created by trigger or a user defined function.

IDENT_CURRENT

It returns the last identity value generated for a specific table in any session and any scope.

In other words, we can say it is not affected by scope and session, it only depends on a particular table and returns that table related identity value which is generated in any session or scope.

Date , date part functions

http://shawpnendu.blogspot.in/2009/05/datepart-dateadd-datediff-sql-server.html

Case and if statement: difference , syntax


SELECT   ProductNumber, Category =
CASE ProductLine
WHEN 'R' THEN 'Road'
WHEN 'M' THEN 'Mountain'
WHEN 'T' THEN 'Touring'
WHEN 'S' THEN 'Other sale items'
ELSE 'Not for sale'
END,
Name
FROM Production.Product
ORDER BY ProductNumber;

‘If statement’ cannot be used inside select statement <— v important

Table variable and #table difference – advantages and disadvantages:

http://www.codeproject.com/Articles/415184/Table-Variable-V-S-Temporary-Table

major disadvantage of table variable

  • Table variables are Transaction neutral. They are variables and thus aren’t bound to a transaction.
  • Temp tables behave same as normal tables and are bound by transactions

https://ssisdevelopers.wordpress.com/2013/03/04/temp-table-vs-table-variable-sql-server/

Power of CTE

http://stackoverflow.com/questions/14274942/sql-server-cte-and-recursion-example

http://blog.sqlauthority.com/2012/04/24/sql-server-introduction-to-hierarchical-query-using-a-recursive-cte-a-primer/

Can two different SPs create same #table or ##table (temp tables)

Rank dense rank row number

Pivot and unpivot

Find the top third row from a dataset

Find the day(Monday, Tuesday etc) on the first day of the current month (lets say today is 25th feb)

Steps for optimisation

OOPS Concepts? Inheritance, polimorphism (operation overloding and overridding) , encapsulation,abstraction

http://www.c-sharpcorner.com/UploadFile/eecabral/OOPSand.NET211102005075520AM/OOPSand.NET2.aspx

http://manishagrahari.blogspot.in/2011/08/oops.html

http://dotnetstories.wordpress.com/2009/06/21/object-oriented-programming-concepts-with-c3-0/

Boxing – converting a value type to reference type

unboxing — reverese i.e. converting a reference type to value type

http://stackoverflow.com/questions/2111857/why-do-we-need-boxing-and-unboxing-in-c

Delegates – function pointers

Cloud computing

Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). The name comes from the use of a cloud-shaped symbol as an abstraction for the complex infrastructure it contains in system diagrams. Cloud computing entrusts remote services with a user’s data, software and computation. There are many types of public cloud computing:[1]

  • Infrastructure as a service (IaaS)
  • Platform as a service (PaaS)
  • Software as a service (SaaS)
  • Network as a service (NaaS)
  • Storage as a service (STaaS)
  • Security as a service (SECaaS)

In the business model using software as a service, users are provided access to application software and databases. The cloud providers manage the infrastructure and platforms on which the applications run. SaaS is sometimes referred to as “on-demand software” and is usually priced on a pay-per-use basis. SaaS providers generally price applications using a subscription fee.

Advantages:

  • SaaS allows a business the potential to reduce IT operational costs by outsourcing hardware and software maintenance and support to the cloud provider.
  • This enables the business to reallocate IT operations costs away from hardware/software spending and personnel expenses, towards meeting other IT goals.
  • In addition, with applications hosted centrally, updates can be released without the need for users to install new software.

Disadvantages:

  • One drawback of SaaS is that the users’ data are stored on the cloud provider’s server. As a result, there could be unauthorized access to the data.
  • Basically from the security point of view it is not the safest as compared to others.

Software as a Service (SaaS)

Software as a service (SaaS) sometimes referred to as “on-demand software”, is a software delivery model in which software and associated data are centrally hosted on the cloud. SaaS is typically accessed by users using a thin client via a web browser.e.g. Facebook.

Leave a comment »

SSIS Expression To Append Time and Date with File Name Dynamically

Use Following expression to append date and time at the time of package execution with the file name.

"D:\output\LogFile"
+ (DT_WSTR, 50)(DT_DBDATE)GETDATE()+"#"
+(DT_STR,2,1252)DATEPART("hh",GETDATE())+"-"
+(DT_STR,2,1252)DATEPART("mi" ,GETDATE())+"-"
+(DT_STR,2,1252)DATEPART("ss",GETDATE())
+".txt"

Evaluated Expression would look some thing like this:

D:\output\LogFile2012-01-17#12-11-54.txt

Using Variable:

@[User::OutputFileLocation]  : Variable to hold file location

@[User::OutputFileName]  : Variable to hold file name

Expression:

@[User::OutputFileLocation]
+ @[User::OutputFileName]
+ (DT_WSTR, 50)(DT_DBDATE)GETDATE()+"#"
+(DT_STR,2,1252)DATEPART("hh",GETDATE())+"-"
+(DT_STR,2,1252)DATEPART("mi" ,GETDATE())+"-"
+(DT_STR,2,1252)DATEPART("ss",GETDATE())
+".txt"

Hope it Helps 🙂

5 Comments »

BIDS 2008 does not support XML datatype in OLE DB Command

Senario Overview:
We have some SP which expect some XML Type input parameters and also return XML Type parameter.
Having a requirement to insert data in bulk to Database we planned to use SSIS.
Working on the package, I found that SSIS 2008 doesn’t supports XML datatypes in OLE DB Command.
Error Details:
Operand type clash: int is incompatible with xml
Operand type clash: int is incompatible with xml
Googled.. and found that the issue has been solved in latest releases .. 🙂
But, No luck after installing the latest patch.
Found the following work around:
1. Create a new SP and use NVARCHAR(MAX) instead of Input XML Datatypes and call this SP from SSIS ( We could have modified the original though, but preferred avoiding dependencies issues)
2. Call the main SP from the new SP after type casting to XML
Now new issue:
As we have one XML Output datatype . It still went on giving error even after casting to NVARCHAR(MAX)
Finally, NVARCHAR(4000) worked fine.
Not a robust solution but good enuf to full fill our requirements 🙂
Leave a comment »