SQL Partition
| Written by Thomas Ivarsson | ||||||||
| Thursday, 29 May 2008 23:54 | ||||||||
| How do I group dimension members dynamically in MDX? Source: MSDN SSAS Newsgroup.
You can create calculated members for dimension and then use them in the query. Example below will create 3 calculated members based on filter condition:
Result will be
|
SSIS Community Tasks and Components
This is a list of SSIS Tasks and Components at codeplex:
In my work, I often to use SQL statement to view tables with a large number columns and check the data against the source data. It is not convenient to see the table in tabular format. Therefore, I wrote the following stored procedure (sp_ViewData) to view the data in form format (vertical view).
ETL Naming Conventions
This document guides ETL developers for the ETL Naming Conventions:
Definitions:
PascalCasing (aka Upper Camel Casing) – The first character of each word is capitalised. Example: FirstName
camelCasing – The first character of each word is capitalized except the first one Example: firstName
*UPPERCASE – All characters are uppercase. Use these identifiers only for abbreviations (eg: ADO) and for Transact SQL keywords (eg: SELECT Id FROM Table)
(reference: http://en.wikipedia.org/wiki/Naming_convention_(programming
) )
1. Package Name Convention
As the number of packages will grow during the project life, it is suggested to have naming convention for ETL package names so that the package can be manageable. The current development culture of defining namespaces
for objects. Accordingly, the following package naming convention should be applied to package names:
[ProjectName].[DestinationDatabaseName].[SubjectName].[Action]_[Table Name].dtsx
OR
[ProjectName].[DestinationDatabaseName].[SubjectName].[Package_Description_Purpose].dtsx
eg.
AdventureWork.EDW.Sales.Load_Internet_Sales.dtxs
AdventureWork.EDW.Sales.Transform_Internet_Sales.dtxs
AdventureWork.EDW.Sales.Load_DimCustomer.dtxs
| Package Name Part | Case |
|---|---|
| Namespace | Pascal Case |
| Package Name | Pascal Case separated by underscores for defining packagce so they stand out in the code e.g. Load_CostData |
2. Name Convention for Tasks/Compontents
Use naming conventions for your tasks and components. I suggest using acronymns at the start of the name and there are some suggestions for these acronymns at the end of this article. This approach does not help a great deal at design-time where the tasks and components are easily identifiable but can be invaluable at debug-time and run-time. e.g. My suggested acronymn for a Data Flow Task is DFT so the name of a data flow task that populates a table called MyTable could be “DFT Load MyTable”.
The acronymns below should be used at the beginning of the names of tasks to identify what type of task it is.
| Task | Prefix |
| For Loop Container | FLC |
| Foreach Loop Container | FELC |
| Sequence Container | SEQC |
| ActiveX Script | AXS |
| Analysis Services Execute DDL | ASE |
| Analysis Services Processing | ASP |
| Bulk Insert | BLK |
| Data Flow | DFT |
| Data Mining Query | DMQ |
| Execute DTS 2000 Package | EDPT |
| Execute Package | EPT |
| Execute Process | EPR |
| Execute SQL | SQL |
| File System | FSYS |
| FTP | FTP |
| Message Queue | MSMQ |
| Script | SCR |
| Send Mail | SMT |
| Transfer Database | TDB |
| Transfer Error Messages | TEM |
| Transfer Jobs | TJT |
| Transfer Logins | TLT |
| Transfer Master Stored Procedures | TSP |
| Transfer SQL Server Objects | TSO |
| Web Service | WST |
| WMI Data Reader | WMID |
| WMI Event Watcher | WMIE |
| XML | XML |
These acronymns should be used at the beginning of the names of components to identify what type of component it is.
| Component | Prefix |
| DataReader Source | DR_SRC |
| Excel Source | EX_SRC |
| Flat File Source | FF_SRC |
| OLE DB Source | OLE_SRC |
| Raw File Source | RF_SRC |
| XML Source | XML_SRC |
| Aggregate | AGG |
| Audit | AUD |
| Character Map | CHM |
| Conditional Split | CSPL |
| Copy Column | CPYC |
| Data Conversion | DCNV |
| Data Mining Query | DMQ |
| Derived Column | DER |
| Export Column | EXPC |
| Fuzzy Grouping | FZG |
| Fuzzy Lookup | FZL |
| Import Column | IMPC |
| Lookup | LKP |
| Merge | MRG |
| Merge Join | MRGJ |
| Multicast | MLT |
| OLE DB Command | CMD |
| Percentage Sampling | PSMP |
| Pivot | PVT |
| Row Count | CNT |
| Row Sampling | RSMP |
| Script Component | SCR |
| Slowly Changing Dimension | SCD |
| Sort | SRT |
| Term Extraction | TEX |
| Term Lookup | TEL |
| Union All | ALL |
| Unpivot | UPVT |
| Data Mining Model Training | DMMT_DST |
| DataReader Destination | DR_DST |
| Dimension Processing | DP_DST |
| Excel Destination | EX_DST |
| Flat File Destination | FF_DST |
| OLE DB Destination | OLE_DST |
| Partition Processing | PP_DST |
| Raw File Destination | RF_DST |
| Recordset Destination | RS_DST |
| SQL Server Destination | SS_DST |
| SQL Server Mobile Destination | SSM_DST |
3. Variable Names
Name convention for SSIS package should be in Pascal Casing
E.g :
IsOk
IsDateEarlierThanOctober
SourceConnectionString
The following common variable names should be used across packages
SourceConnectionString
TargetConnectionString
LogConnectionString
4. Script
SSIS supports both VB.NET & C# Scripts. In HICAR project, the VB.NET Script is used across the SSIS package, particularly Script component. The following table summarizes the capitalisation rules for identifiers and provides examples for the different types of identifiers.
| Identifier | Case | Example |
|---|---|---|
| Class | Pascal | AppDomain |
| Enumeration type | Pascal | ErrorLevel |
| Enumeration values | Pascal | FatalError |
| Event | Pascal | ValueChanged |
| Exception class | Pascal | WebException |
| Read-only static field | Pascal | RedValue |
| Interface | Pascal | IDisposable |
| Method | Pascal | ToString |
| Namespace | Pascal | System.Drawing |
| Parameter | Camel | typeName |
| Property | Pascal | BackColor |
Data Cleansing Strategies
There are serveral methods for data cleansing:
| Cleansing Strategy | Description |
|---|---|
| Remove | Does not populate the target table with error records |
| Custom | Custom function in the target table |
| Set to Min | Sets the attribute value of the error record to the minimum value defined in the data rule. |
| Set to Max | Sets the attribute value of the error record to the maximum value defined in the data rule. |
| Similarity | Uses a similarity algorithm based on permitted domain values to find a value that is similar to the error record. |
| Soundex | Uses a soundex algorithm based on permitted domain values to find a value that is similar to the error record. |
| Merge | Merge duplicate records into a single row. |
1) System Plfatform
DataStage: Windows, Unix, Other Platforms
Informatica: Windows, Unix
Microsoft SSIS: Windows
2) Parallelism
DataStage: Have both sequence (DataStage Server) and parallel (DataStage Enterprise Server )process
Informatica: ?
Microsoft SSIS: ?
3) Deployment facility
DataStage: No
Informatica: Yes
Microsoft SSIS: Yes
4) . Numbers of available transformation functions
Informatica:. 58
DataStage: 28
Microsoft SSIS: 30
5) Slowly Changing Dimension
Informatica:. Supports Full history, recent values, Current & Prev values.
DataStage: Supports only through Custom scripts. Does not have a wizard to do this
Microsoft SSIS: Yes, with a Wizard but difficult to modify
6). Time Dimension generation
Informatica:. Does not support.
DataStage: Does not support
Microsoft SSIS: Does not support, but Time Dimension can be generated by using Microsoft SSAS
7). Rejected Records
Informatica:. Can be captured
DataStage: Cannot be captured in separate file
Microsoft SSIS: Can use conditional split transformation and error redirection to either CSV files or tables.
8). Debugging Facility
Informatica:. Not Supported
DataStage: Supports basic debugging facilities for testing.
Microsoft SSIS: can use Breakpoint, Data Viewer and error handling
9) Application Integration Functionality, support for real Time Data Exchange
Informatica:. Not Available
DataStage: Not Available.
Microsoft SSIS: Not Available.
10) Metadata: Ability to view & navigate metadata on the web
Informatica:. Does not support
DataStage: Job sessions can be monitored using Informatica Classes
Microsoft SSIS: Does not support
11) Ability to Customize views of metadata for different users (DBA Vs Business user)
Informatica:. Supports.
DataStage: Not Available
Microsoft SSIS: Does not support
12) Metadata repository can be stored in RDBMS
Informatica:. Yes
DataStage: No. But the proprietary meta data can be moved to a RDBMS using the DOC Tool
Microsoft SSIS: Does not support
13) Support And Maintenance, Command line operation
Informatica:. Pmcmd –server interface for command line
DataStage: Not Available
Microsoft SSIS: Usage of Dtexec.exe
14) Ability to maintain versions of mappings
Informatica:. Yes
DataStage: No
Microsoft SSIS: No
15) Job Controlling & Scheduling, Alerts like sending mails
Informatica:. Supported.
DataStage: Does not support directly ( no option). But possible to call custom programs after the job get executed)
Microsoft SSIS: Job Agent, Email Notification
16) Languages
Informatica: ?
DataStage: Script, C++
Microsoft SSIS: C# (2005,2008), VB.NET (2005)
17) ETL Proccess
Informatica: ?
DataStage: DataStage Job
Microsoft SSIS: package
18) Support for looping the source row (For While Loop)
Informatica:. Supports for comparing immediate previous record
DataStage: Does not support
Microsoft SSIS: Yes. ForEach Loop
The below MDX fetches the Currency Codes based on the the @Agency parameter value. The STRTOSET function convert string to SET.
WITH MEMBER [Measures].[ParameterCaption] AS [Currency].[Currency Code].CURRENTMEMBER.MEMBER_CAPTION MEMBER [Measures].[ParameterValue] AS [Currency].[Currency Code].CURRENTMEMBER.UNIQUENAME MEMBER [Measures].[ParameterLevel] AS [Currency].[Currency Code].CURRENTMEMBER.LEVEL.ORDINAL SELECT {[Measures].[ParameterCaption], [Measures].[ParameterValue], [Measures].[ParameterLevel]} ON COLUMNS , FILTER ( [Currency].[Currency Code].CHILDREN,NOT ISEMPTY( [Measures].[Closing Balance] ) ) ON ROWS FROM ( SELECT ( STRTOSET(@Agency, CONSTRAINED) ) ON COLUMNS FROM [My Cube])
SQL, C# html code format
Usage of the below to format SQL codes: