Tuesday, February 1, 2011

Difference between sequential file, dataset, and fileset

Sequential file:
---> Extracting from or loading to a sequential file is limited to a maximum of 2 GB.
---> When used as a source, its data is converted from ASCII into the native format at the time of compilation.
---> It does not support null values.
---> A sequential file can be accessed on only one node.
Dataset:
---> It preserves partitioning. It stores data on the nodes, so when you read from a dataset you do not have to repartition the data.
---> It stores data in binary, in the internal format of DataStage, so reading from or writing to a dataset takes less time than with any other file type.
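
To make the binary-format point concrete, here is a toy Python sketch. It is not DataStage code: the record layout and the helper names (write_text, read_text, write_binary, read_binary) are made up for illustration, and the real internal dataset format is different. It only shows the general idea that a text record has to be parsed and converted field by field, while a record stored in a fixed binary layout can be unpacked directly into native values.

```python
import struct

# Hypothetical record layout, for illustration only: (id: int32, amount: float64).
# This is NOT the real internal format used by DataStage datasets.
RECORD = struct.Struct("<id")

def write_text(rows):
    """Serialize rows as comma-delimited ASCII lines, like a sequential file."""
    return "".join(f"{rid},{amount!r}\n" for rid, amount in rows).encode("ascii")

def read_text(blob):
    """Split each line and convert every field from text to a native type."""
    rows = []
    for line in blob.decode("ascii").splitlines():
        rid, amount = line.split(",")
        rows.append((int(rid), float(amount)))
    return rows

def write_binary(rows):
    """Pack each row into the fixed-size binary layout."""
    return b"".join(RECORD.pack(rid, amount) for rid, amount in rows)

def read_binary(blob):
    """Unpack fixed-size records directly; no text parsing is needed."""
    return list(RECORD.iter_unpack(blob))

rows = [(1, 10.5), (2, 20.25), (3, 30.0)]
assert read_text(write_text(rows)) == rows
assert read_binary(write_binary(rows)) == rows
```

Both paths return the same rows; the difference is the amount of per-field conversion work on the read side, which is the cost the binary format avoids.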
Fileset:
---> It stores data in a format similar to that of a sequential file. The only advantage of using a fileset over a sequential file is that it preserves the partitioning scheme.
---> You can view the data, but in the order defined by the partitioning scheme.
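
The idea of preserving the partitioning scheme can be sketched in plain Python. This is only a toy model under stated assumptions, not how DataStage implements it: the function names, the ".ctl" control file, and the modulo partitioner are all hypothetical. Rows are split across partitions by key, each partition is written to its own readable text file, and a small control file records which data files belong together. Reading it back returns the partitions exactly as written, so nothing has to be repartitioned, and the data appears in partition order rather than in the original row order.

```python
import json
import os
import tempfile

def hash_partition(rows, key, num_partitions):
    """Assign each row to a partition by key (modulo stands in for a real hash)."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[row[key] % num_partitions].append(row)
    return parts

def write_partitioned(parts, directory, name):
    """Write one text data file per partition plus a hypothetical control file."""
    data_files = []
    for i, part in enumerate(parts):
        path = os.path.join(directory, f"{name}.part{i}.txt")
        with open(path, "w") as f:
            for row in part:
                f.write(json.dumps(row) + "\n")
        data_files.append(path)
    control = os.path.join(directory, f"{name}.ctl")
    with open(control, "w") as f:
        json.dump({"files": data_files}, f)
    return control

def read_partitioned(control):
    """Read the partitions back in the layout recorded by the control file."""
    with open(control) as f:
        meta = json.load(f)
    parts = []
    for path in meta["files"]:
        with open(path) as f:
            parts.append([json.loads(line) for line in f])
    return parts

rows = [{"cust_id": i, "amount": i * 10} for i in range(6)]
with tempfile.TemporaryDirectory() as d:
    parts = hash_partition(rows, "cust_id", 3)
    restored = read_partitioned(write_partitioned(parts, d, "orders"))

assert restored == parts  # same partitions come back, no repartitioning
# Viewing the data follows partition order, not the original row order.
assert [r["cust_id"] for p in restored for r in p] == [0, 3, 1, 4, 2, 5]
```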

3 comments:

  1. Very useful. A short and quick answer, just what I was looking for.

  2. Excellent answer that states the main differences precisely.

  3. Why is a dataset faster compared to a fileset?
