Tuesday, February 1, 2011

pivot stage

This can be done very easily if you know how many columns will be generated after reverse pivoting(normalization) . For example if you have a table where you have data for all months in the same column but in different rows.

In this case you know that column will be changed to twelve different columns.For normalization you need a key. Process is as follows:

Assume your source data is like this:
Key month
aaa Jan
aaa Feb
aaa Mar
aaa Apr
aaa May
aaa Jun
aaa Jul
aaa Aug
aaa Sep
aaa oct
aaa nov
aaa Dec
bbb Jan
bbb Feb
: :
: :
: :

1.Sort your data based on that key column(key1)
2.Use transformer to assign key column value to stage variable stg1.
3.Initialize stg1 with the value which is not expected in the source data (like '')
3.Compare Key1 value with stg1 if it matches then then concatenate the value of another column

Output of the transformer should be :
Key month
aaa Jan
aaa Jan;Feb
aaa Jan;Feb;Mar
aaa Jan;Feb;Mar;Apr
aaa Jan;Feb;Mar;Apr;May
aaa Jan;Feb;Mar;Apr;May;Jun
aaa Jan;Feb;Mar;Apr;May;Jun;Jul
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;nov
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;nov;Dec
bbb Jan
bbb Jan;Feb

After transformer use hash file, aggregation or remove duplicate stage to keep the last value for a key column and output will be like :
Key month
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;nov;Dec
bbb Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;nov;Dec

This is required data which is delimited. You can insert it into the table as well.

*****************
actually my requirement is different, that is like
my source is like
ename month sal
10 jan 20000
10 feb 20000
10 mar 20000
10 apr 20000
10 may 20000
10 jun 20000
10 jul 20000
10 aug 20000
10 sep 20000
10 oct 20000
10 nov 20000
10 dec 20000

my output is like
ename jan feb mar apr may jun jul aug sep oct nov dec
10 20000 20000 20000 20000 20000 20000 20000 20000 20000 20000 2000 20000

how to do this requirement in datastage.
*************
Hers s one of the solution for this requirement.

Have a transformer with a stage variable derivation-
if @INROWNUM=1 then DSLink223.sal else stagevar:',':DSLink223.sal

At the output of the tfm, u ll ve 13 columns as mentioned by you
ename jan feb...................... dec
For these outputs the derivation would be
ename------>ename
jan----------->Field(stagevar,',',1)
Feb----------->Field(stagevar,',',2)
|
|
|
Dec------------>Field(stagevar,',',12)

Please note that the tfm must be in sequential mode.
Next to the tfm stage u can use a remove duplicate stage to filter out the last record. Key-ename.

2 comments:

  1. my requirement is like this

    col
    abc123
    abc124
    abc125

    output
    col1 col2
    abc 123
    abc 124
    abc 125

    how to solve this?

    ReplyDelete
  2. Hi Tiru,
    you can solve using seq.file->transformer->data set
    transformer stage---create two variable like A and B
    Left(dslink3.col,3)---A
    Right(DSlink3.col,3)--B

    derivation append two new columns Like-Col1 and Col2
    A-----Col1
    B.....Col2

    ReplyDelete