Reading multiple xlsx files

deesh
Participant
Posts: 193
Joined: Mon Oct 08, 2007 2:57 am

Post by deesh »

Hi,

My requirement is similar to this one: I receive multiple .xlsx files.
Can you show me what the template looks like?

My requirements:

1. Receive multiple files and read them into a text file.
2. The target has a maximum of 100 columns, but each file has a different column count.
Example: File1 contains 10 columns, File2 contains 15 columns, and File3 contains 100 columns.
3. The .xlsx file headers (column names) contain spaces and special characters. How should I handle them?

If you have any ideas, please help us out.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I split your post out into its own topic.
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

1) Do you have the unstructured file stage that can read/write Excel? If not, do you have another way to read from Excel? Excel itself can write to a database directly, if you want to use a staging-table approach. You could also dump the Excel files to CSV, which is simple to handle.

2) This seems easy enough. You just need a way to relate the data across the files, whether that is row 1 = row 1 across files, some key, etc.

3) There are all kinds of string processing functions that can handle specific or general problems like this. Renaming the columns by dropping or replacing the bad characters might be all you need, and it is simple to do.
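
As a rough illustration of point 3, here is a minimal Python sketch of header cleanup done outside DataStage, for example in a pre-processing script. The function names `clean_header` and `clean_headers`, the upper-casing, and the `C_` and `COL` fallbacks are assumptions made for this example, not anything DataStage requires.

```python
import re


def clean_header(name):
    """Turn one raw spreadsheet header into a plain column name (illustrative rules)."""
    # Replace each run of characters other than letters, digits and underscore
    # with a single underscore, then trim underscores left at either end.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip()).strip("_")
    if not cleaned:
        cleaned = "COL"  # hypothetical fallback for headers with no usable characters
    if cleaned[0].isdigit():
        cleaned = "C_" + cleaned  # hypothetical prefix so the name does not start with a digit
    return cleaned.upper()


def clean_headers(names):
    """Clean a list of headers and keep the results unique by adding _2, _3, ..."""
    used = set()
    result = []
    for raw in names:
        base = clean_header(raw)
        candidate, n = base, 1
        while candidate in used:
            n += 1
            candidate = f"{base}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result
```

For instance, `clean_headers(["Cust Name", "Amount ($)", "2nd Addr", "Cust-Name"])` returns `["CUST_NAME", "AMOUNT", "C_2ND_ADDR", "CUST_NAME_2"]`. The same replace-and-trim idea could also be expressed with string functions inside a job, if you prefer to keep everything in DataStage.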
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

UCDI wrote: 1) do you have the unstructured file stage that can read/write excel?
Not in version 8.

In version 8 you will need to save in CSV format, and read that, or use an ODBC driver for Microsoft Excel that is compatible both with DataStage version 8 and the .xlsx format for Excel. (And there are strict format rules when using ODBC.)
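
If you go the CSV route, the varying column counts can be evened out before the job reads the data. The following Python sketch is only one possible approach. It assumes the files have already been saved as CSV, that column N in every file maps to column N of the target (the thread does not confirm this), and that each file starts with one header row. The names `pad_row`, `merge_csv` and `TARGET_WIDTH` are made up for this example.

```python
import csv
import io

TARGET_WIDTH = 100  # maximum column count mentioned in the question


def pad_row(row, width=TARGET_WIDTH):
    """Return the row extended with empty strings up to the target width."""
    if len(row) > width:
        raise ValueError(f"row has {len(row)} columns, expected at most {width}")
    return row + [""] * (width - len(row))


def merge_csv(sources, out, width=TARGET_WIDTH, skip_header=True):
    """Copy rows from several open CSV sources into one output, padded to a fixed width.

    Assumes positional alignment: column N of every source is column N of the target.
    Returns the number of data rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    total = 0
    for src in sources:
        reader = csv.reader(src)
        if skip_header:
            next(reader, None)  # drop the header row of each source
        for row in reader:
            writer.writerow(pad_row(row, width))
            total += 1
    return total


if __name__ == "__main__":
    # Small in-memory demonstration with a width of 4 instead of 100.
    a = io.StringIO("h1,h2\n1,2\n")
    b = io.StringIO("h1,h2,h3\n3,4,5\n")
    result = io.StringIO()
    merge_csv([a, b], result, width=4)
    print(result.getvalue(), end="")
```

With real files, the sources would be file objects opened with `newline=""`, as the `csv` module documentation recommends. The resulting fixed-width text file can then be read with a single 100-column definition.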
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.