Parsing of Data
Moderators: chulett, rschirm, roy
Parsing of Data
Hi all,
I have posted some part of this topic before.Not sure how to go with parsing of data. I have a single column Varchar(2000). I want to parse the row into multiple rows. Column looks like,
12:10:07: Dialing12:10:07: Connecting12:10:07: ABC: abc:9433769781$100.88.77.0:878712:10:07: ABCD: 000012:10:07: Agent Initializing12:10:18: On Call12:10:18: Assigned to operator12:10:18: Waiting for Supervisor12:10:18: Waiting for Manager12:11:13: Call Ended12:11:13: Call Not connected
Now, I want to parse the data like, in multiple rows. This is an example for a single time. So, time is not constant.
12:10:07: Dialing
12:10:07: Connecting
12:10:07: ABC: abc:9433769782$100.88.77.0:8787
12:10:07: ABCD: 0000
12:10:07: Agent Initializing
12:10:18: On Call
12:10:18: Assigned to operator
12:10:18: Waiting for Supervisor
12:10:18: Waiting for Manager
12:11:13: Call Ended
12:11:13: Call Not connected
The problem is the identifier.We have to probably tell datastage whenever you see any integer:integer:integer: then break it as a separate.
Any thought how we could achieve this....
I have posted some part of this topic before.Not sure how to go with parsing of data. I have a single column Varchar(2000). I want to parse the row into multiple rows. Column looks like,
12:10:07: Dialing12:10:07: Connecting12:10:07: ABC: abc:9433769781$100.88.77.0:878712:10:07: ABCD: 000012:10:07: Agent Initializing12:10:18: On Call12:10:18: Assigned to operator12:10:18: Waiting for Supervisor12:10:18: Waiting for Manager12:11:13: Call Ended12:11:13: Call Not connected
Now, I want to parse the data like, in multiple rows. This is an example for a single time. So, time is not constant.
12:10:07: Dialing
12:10:07: Connecting
12:10:07: ABC: abc:9433769782$100.88.77.0:8787
12:10:07: ABCD: 0000
12:10:07: Agent Initializing
12:10:18: On Call
12:10:18: Assigned to operator
12:10:18: Waiting for Supervisor
12:10:18: Waiting for Manager
12:11:13: Call Ended
12:11:13: Call Not connected
The problem is the identifier.We have to probably tell datastage whenever you see any integer:integer:integer: then break it as a separate.
Any thought how we could achieve this....
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Sounds like an ideal candidate for looping in a Transformer stage. Search for the next nn:nn:nn pattern, which signals the start of the next line. Or maybe search for the next colon character and parse around that.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
This is a design issue on the source data. They should provide it to you with a clear delimiter, even just one space, showing the end of the text message for the timestamp. That way, if the text message ends with a numeric, you aren't forced to parse the data with byte-level edits.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
All,
Have one more question. Can we convert a :SPACE delimiter to just pipe or any other delimiter. I am trying to use convert function in transformer, but not sure if COLONSPACE together can be converted with other delimiter.
As an example,
12:10:07: Dialing
12:10:07: Connecting
I want to replace : space after 12:10:07 by a pipeline or any other delimiter.
Thanks,
Have one more question. Can we convert a :SPACE delimiter to just pipe or any other delimiter. I am trying to use convert function in transformer, but not sure if COLONSPACE together can be converted with other delimiter.
As an example,
12:10:07: Dialing
12:10:07: Connecting
I want to replace : space after 12:10:07 by a pipeline or any other delimiter.
Thanks,
So, use a Server job or from what I've read there is a PX version somewhere in the 9.x release. There's also a pxEReplace posted here which may or may not be problematical, an exact search should turn it (and the discussion of it) up.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Any help on looping structure of the data? I am kind of confused on how to break it when it comes to nn:nn:nn.ray.wurlod wrote:Sounds like an ideal candidate for looping in a Transformer stage. Search for the next nn:nn:nn pattern, which signals the start of the next line. Or maybe search for the next colon character and parse around that.
Thanks a lot.