Page 1 of 1

Splitting CSV fields with Quotes

Posted: Wed Feb 22, 2006 10:52 pm
by jzparad
Is there a nice simple way of splitting a CSV record that contains quoted strings with embedded delimiters?

Example:

Code: Select all

,,,"Pint, US liquid",PT,,,,

Thanks in advance!

Posted: Wed Feb 22, 2006 11:44 pm
by ray.wurlod
Yes. Get "them" to deliver you a legal CSV file! Nothing could be easier.

If it's an export from Excel, go for tab-delimited.

Posted: Wed Feb 22, 2006 11:57 pm
by jzparad
Thanks Ray.

I was hoping that since a lot of the plug-ins provide a way of defining a quote character that maybe there was some obscure function that offered the same functionality.

Posted: Thu Feb 23, 2006 2:09 am
by ray.wurlod
It's almost an impossible problem to solve without writing a routine. You read the entire row as a single string and process it through the routine that has a "WithinQuotes" flag that tells it whether to ignore the comma inside the quotes, and maybe change the real delimiters to something else, such as a tab.

Code: Select all

FUNCTION FixLine(TheLine, QuoteCharacter)
* Changes a comma-delimited line with embedded commas in quotes to a tab-delimited line

Equate Tab To Char(9)
Equate Comma To ","

String = TheLine
LenString = Len(TheLine)
WithinQuotes = @FALSE

Ans = ""

For Pos = 1 To LenString
   Ch = String[Pos,1]
   If Ch = QuoteCharacter
   Then
      WithinQuotes = Not(WithinQuotes)
   End
   Else
      If Ch = Comma
      Then
         If WithinQuotes
         Then
            Ans := Comma
         End
         Else
            Ans := Tab
         End
      End
      Else
         Ans := Ch
      End
   End
Next Pos

If WithinQuotes
Then
   Call DSTransformError("Mismatched Quotes", "FixLine")
End

RETURN(Ans)

Posted: Fri Feb 24, 2006 9:53 pm
by rleishman
It's the weekend, so I'm probably on delay. I know I'll regret this in the morning, but isn't

Code: Select all

,,,"Pint, US liquid",PT,,,,
a legal CSV record? The SEQ file stage would handle this easily delimiter set to (,) and Quote Character set to (").

I thought the optional quotes were there to handle embedded delimiters (commas).

Further, embedded quotes are nicely handled by doubling them:

Code: Select all

,,,"Pint, ""US"" liquid",PT,,,,

Posted: Fri Feb 24, 2006 10:55 pm
by jzparad
Yes, it would - if you used the SEQ file stage. What I wanted was a function that could be used in BASIC (which I should have stated in the original question).