How to read a binary EBCDIC file that is on Hadoop

trenicar
Premium Member
Posts: 51
Joined: Thu Sep 18, 2003 4:38 am

Post by trenicar »

Hi

I have various binary EBCDIC files which I need to read from Hadoop.

I have read files like these with a CFF stage (EBCDIC) when they were available as ordinary files. How can I read them from Hadoop? Do I need to land the files somewhere first and then use the CFF stage?

The problem is that they are very large files, hundreds of GB.

Or is there another way?
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

I'm not yet familiar with Hadoop connectivity, so others will need to offer advice on that, but you should have two main options: read the file directly from the Hadoop server using CFF, or FTP the file to your local DataStage server first.

You can use FTP directly into your processing job. I have dozens of jobs doing that from the mainframe. However, an FTP session is less stable than a local read. CFF is definitely the preferred method.
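
For anyone who wants to sanity-check a file outside DataStage before deciding where to land it, here is a minimal Python sketch of streaming fixed-length EBCDIC records without holding the whole file in memory. Everything specific in it is an assumption for illustration: the 8-byte record layout (a 5-byte text field followed by a 3-byte COMP-3 field), the cp037 code page, and the function names. Your copybook defines the real layout and your site may use a different code page.

```python
import io
from typing import BinaryIO, Iterator

RECORD_LENGTH = 8   # hypothetical: 5-byte text field + 3-byte COMP-3 field
CODEC = "cp037"     # assumed EBCDIC code page (US/Canada); yours may differ


def iter_records(stream: BinaryIO, record_length: int = RECORD_LENGTH) -> Iterator[bytes]:
    """Yield fixed-length records one at a time from a buffered binary stream."""
    while True:
        record = stream.read(record_length)
        if not record:
            return  # clean end of input
        if len(record) < record_length:
            raise ValueError(f"truncated record: got {len(record)} bytes")
        yield record


def decode_text(field: bytes, codec: str = CODEC) -> str:
    """Decode an EBCDIC character field and drop trailing padding spaces."""
    return field.decode(codec).rstrip()


def unpack_comp3(field: bytes) -> int:
    """Decode a COMP-3 (packed decimal) field to an integer.

    Each byte holds two decimal digits, except the last byte, which holds
    one digit in the high nibble and the sign in the low nibble (0xD = negative).
    """
    digits = []
    for byte in field[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(field[-1] >> 4)
    sign = field[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    return -value if sign == 0x0D else value


# Small in-memory sample standing in for the real file.
sample = (
    "HELLO".encode(CODEC) + bytes([0x12, 0x34, 0x5C])    # 12345, positive
    + "WORLD".encode(CODEC) + bytes([0x00, 0x00, 0x1D])  # 1, negative
)
rows = [
    (decode_text(rec[:5]), unpack_comp3(rec[5:]))
    for rec in iter_records(io.BytesIO(sample))
]
```

The same `iter_records` loop should work on `sys.stdin.buffer`, so the data could be piped in from `hdfs dfs -cat` rather than copied to local disk first. This only helps for inspection; it is not a substitute for the CFF stage in a job.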
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872