In this case you know that column will be changed to twelve different columns.For normalization you need a key. Process is as follows:
Assume your source data is like this:
Key month
aaa Jan
aaa Feb
aaa Mar
aaa Apr
aaa May
aaa Jun
aaa Jul
aaa Aug
aaa Sep
aaa oct
aaa nov
aaa Dec
bbb Jan
bbb Feb
: :
: :
: :
1.Sort your data based on that key column(key1)
2.Use transformer to assign key column value to stage variable stg1.
3.Initialize stg1 with the value which is not expected in the source data (like '')
3.Compare Key1 value with stg1 if it matches then then concatenate the value of another column
Output of the transformer should be :
Key month
aaa Jan
aaa Jan;Feb
aaa Jan;Feb;Mar
aaa Jan;Feb;Mar;Apr
aaa Jan;Feb;Mar;Apr;May
aaa Jan;Feb;Mar;Apr;May;Jun
aaa Jan;Feb;Mar;Apr;May;Jun;Jul
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;
bbb Jan
bbb Jan;Feb
After transformer use hash file, aggregation or remove duplicate stage to keep the last value for a key column and output will be like :
Key month
aaa Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;
bbb Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;oct;
This is required data which is delimited. You can insert it into the table as well.
*****************
actually my requirement is different, that is like
my source is like
ename month sal
10 jan 20000
10 feb 20000
10 mar 20000
10 apr 20000
10 may 20000
10 jun 20000
10 jul 20000
10 aug 20000
10 sep 20000
10 oct 20000
10 nov 20000
10 dec 20000
my output is like
ename jan feb mar apr may jun jul aug sep oct nov dec
10 20000 20000 20000 20000 20000 20000 20000 20000 20000 20000 2000 20000
how to do this requirement in datastage.
my source is like
ename month sal
10 jan 20000
10 feb 20000
10 mar 20000
10 apr 20000
10 may 20000
10 jun 20000
10 jul 20000
10 aug 20000
10 sep 20000
10 oct 20000
10 nov 20000
10 dec 20000
my output is like
ename jan feb mar apr may jun jul aug sep oct nov dec
10 20000 20000 20000 20000 20000 20000 20000 20000 20000 20000 2000 20000
how to do this requirement in datastage.
*************
Hers s one of the solution for this requirement.
Have a transformer with a stage variable derivation-
if @INROWNUM=1 then DSLink223.sal else stagevar:',':DSLink223.sal
At the output of the tfm, u ll ve 13 columns as mentioned by you
ename jan feb...................... dec
For these outputs the derivation would be
ename------>ename
jan----------->Field(stagevar,',',1)
Feb----------->Field(stagevar,',',2)
|
|
|
Dec------------>Field(stagevar,',',12)
Please note that the tfm must be in sequential mode.
Next to the tfm stage u can use a remove duplicate stage to filter out the last record. Key-ename.
Have a transformer with a stage variable derivation-
if @INROWNUM=1 then DSLink223.sal else stagevar:',':DSLink223.sal
At the output of the tfm, u ll ve 13 columns as mentioned by you
ename jan feb...................... dec
For these outputs the derivation would be
ename------>ename
jan----------->Field(stagevar,',',1)
Feb----------->Field(stagevar,',',2)
|
|
|
Dec------------>Field(stagevar,',',12)
Please note that the tfm must be in sequential mode.
Next to the tfm stage u can use a remove duplicate stage to filter out the last record. Key-ename.
my requirement is like this
ReplyDeletecol
abc123
abc124
abc125
output
col1 col2
abc 123
abc 124
abc 125
how to solve this?
Hi Tiru,
ReplyDeleteyou can solve using seq.file->transformer->data set
transformer stage---create two variable like A and B
Left(dslink3.col,3)---A
Right(DSlink3.col,3)--B
derivation append two new columns Like-Col1 and Col2
A-----Col1
B.....Col2