Getting part of a filename

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Getting part of a filename

Post by karthi_gana »

Hi All,

I have 6 files.

say for example

FileName: 562009_cons_file.txt
Content:
Name Age Salary
aa 35 80000
bb 42 95000

Now i want to get the name 'cons' from the file name and need to store it in the table.

Table Structure:
create table emp
(
Name varchar(25),
Age int,
Salary int,
Platform varchar(25)
)

If you look at the file content, we don't have the value for Platform. We need to get this value from the filename.


if the file name looks like

FileName: 562009_uma_file.txt
Content:
Name Age Salary
aa 35 90000
bb 27 55000

then i have to get 'uma' from the filename and then need to insert it into the table.

Final Output:

Select * from emp will give the below output.

aa 35 80000 cons
bb 42 95000 cons
aa 35 90000 uma
bb 27 55000 uma

Inputs are welcome!
Karthik
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

What have you tried ?
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Hi ,

I hope you have MKSTOOL kit installled on your machine .So , you can use some unix commands to get that part from filename and pass it as a parameter to the job which is loading data into DB table .
Nag
nirdesh2
Participant
Posts: 56
Joined: Thu Nov 20, 2008 12:18 pm
Location: Noida

Post by nirdesh2 »

You can use 'File Name Column' option to get the filename with the file data.
Nirdesh Kumar
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not in a Server job. You can easily use Field() to get what you want from the filename, the trick will be getting it. How do you process multiple files, one at a time?
-craig

"You can never have too many knives" -- Logan Nine Fingers
sbass1
Premium Member
Premium Member
Posts: 211
Joined: Wed Jan 28, 2009 9:00 pm
Location: Sydney, Australia

Post by sbass1 »

The way I might approach this is a script which gets a filename as an argument, then echos each line of that file to the console, appending the filename to the beginning or end of the line. You set that script as a filter for a sequential file stage, then use parsing functions in DS (eg. the Field function) to parse the filename to get the portion you want to load in the database.

This is much easier if you use Perl, vbscript, or PowerShell, but here is a rough .BAT file that might get you started:

Code: Select all

@echo off
set f=%1%
for /F "delims=~" %%l in (%f%) do echo %f% %%l
If I name this file ListLines.bat, then call it as ListLines.bat ListLines.bat, my output is:

Code: Select all

C:\My Batch Scripts>ListLines.bat ListLines.bat
ListLines.bat @echo off
ListLines.bat set f=%1%
ListLines.bat for /F "delims=
Oops on the last line...make sure you pick a delimiter that is NOT in your data. Again, this is easier in say Perl (probably faster too). If I were using Perl I might just parse the filename token in the script instead of DS.

Read "help for" in a cmd window for more details.

HTH,
Scott
Some people are the Michael Jordan's of Datastage. I'm more like Muggsy Bogues :-)
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That was going to be the next suggestion. :wink:

Done that before on UNIX but had no idea how to do it in a batch file and was poking around online trying to stimulate my old brain cells. Thanks for saving them from an early death.
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

I have six files in a folder. But it may increase in the future.

ap_06052009_cons.txt
ap_05052009_cons.txt
ap_04052009_cons.txt
ap_06052009_uma.txt
ap_05052009_uma.txt
ap_04052009_uma.txt


so i used FOLDER stage to concatenate all the files into a single file.

Because all files has the same meta data. I need to insert all thos file content into a single table.

Folder Stage --> Transform --> Sequential File (It will contain all the values)

I created a job like above.

FOLDER Stage:

Output tab --> Properties --> Wildcard = ap*.txt
Column tab --> I have created two columns. 1) FileName 2) Content

Sequential Stage:

Input tab --> Genaral --> #loadfiledirectory#ap_all_files.txt ( merged result will be stored in this file)

As i said earlier the table doesn't have a column called 'Platform'.

so i need to get the Platform name form the filename itself.

Where i have to use this Field()?
How to implement .BAT file logic here?
Karthik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use a Folder stage to read the file. This way you can get the file name. Then you can use a Row Splitter stage to get the separate records.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ps. The Folder stage does not 'concatenate all files to a single file'. It does, however, get you the entire file contents in a single row. One row per file, in essence, hence the need for the Row Splitter.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Use the Field() function downstream of the Folder stage in a transformer. Think of your filename field as an "underscore delimited string" and then ask Field() to get you the 3rd field from the string. Then either substring off everything in the result before the dot or just the first three characters if you know it is always three.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Unless the filename includes the full path (as Parallel does), in which case delimit on '/' first to get the actual filename then delimit on '_'
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

True... from what I recall there is an option in the Folder stage to return either the full pathname or just the filename. Can't check right now.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Folder stage gives the option of fully qualified pathname or not. Choose not. Then, for example, apply an "MCN" Oconv() to the file name to get the numeric portion.
Last edited by ray.wurlod on Wed May 06, 2009 11:18 pm, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

8) Except the numbers aren't what they need from the filename.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply