I have an unusual CSV file whose lines look like this:
"31 lip 2021,""Inna opłata"",""--"",""--"",""--"",""--"",""--"",""--"",""--"",""--"",""-1,29"",""EUR"",""2 sie 2021"",""111"",""mBank *7981"",""Środki zostały wysłane"",""--"",""111"",""--"",--,""--"",""--"",""--"",""--"",""--"",""--"",""0%"",""--"",""--"",""--"",""--"",""--"",""-5,7"",""PLN"",""4,43151"",""FEE-111"",""Opłata za nazwę pomocniczą przedmiotu """
I used GenericParserAdapter, but the result is not what I expected:
Result (ItemArray):
[0] "31 lip 2021" object {string}
[1] ""Inna opłata"" object {string}
[2] ""--"" object {string}
[3] ""--"" object {string}
[4] ""--"" object {string}
[5] ""--"" object {string}
[6] ""--"" object {string}
[7] ""--"" object {string}
[8] ""--"" object {string}
[9] ""--"" object {string}
[10] """-1" object {string}
[11] "29""" object {string}
[12] ""EUR"" object {string}
[13] ""2 sie 2021"" object {string}
[14] ""111"" object {string}
[15] ""mBank *7981"" object {string}
[16] ""Środki zostały wysłane"" object {string}
[17] ""--"" object {string}
[18] ""111"" object {string}
[19] ""--"" object {string}
[20] "--" object {string}
[21] ""--"" object {string}
[22] ""--"" object {string}
[23] ""--"" object {string}
[24] ""--"" object {string}
[25] ""--"" object {string}
[26] ""--"" object {string}
[27] ""0%"" object {string}
[28] ""--"" object {string}
[29] ""--"" object {string}
[30] ""--"" object {string}
[31] ""--"" object {string}
[32] ""--"" object {string}
[33] """-5" object {string}
[34] "7""" object {string}
[35] ""PLN"" object {string}
[36] """4" object {string}
[37] "43151""" object {string}
[38] ""FEE-111"" object {string}
[39] """Opłata za nazwę pomocniczą przedmiotu " object {string}
Columns 10 and 11 are split (as are 33/34 and 36/37), but each pair is a single value and should not be split.
How do I configure the parser properly (or split the line some other way) to resolve this?
2 Answers
Finally, I resolved the problem like this:
Somehow the full row is converted to a single field, and all double quotes are escaped with another double quote.
The row should look like this instead (which parses fine):
One solution might be to parse the data twice: a first pass to recover the original row from the single quoted field, then a second pass to split that recovered row into its fields.
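The two-pass idea can be sketched roughly as follows. This is only an illustration in Python using the standard `csv` module, not GenericParser code. The function name `parse_double_encoded` is hypothetical, and the sketch assumes that every affected row arrives as exactly one quoted field.

```python
import csv
import io


def parse_double_encoded(text):
    """Parse CSV text in which each row was wrapped as one quoted field.

    Hypothetical helper: assumes a row that parses to a single field
    is a wrapped row that needs a second parsing pass.
    """
    rows = []
    # Pass 1: reading the wrapped row strips the outer quotes and
    # collapses every doubled quote ("") back into a single quote (").
    for outer in csv.reader(io.StringIO(text, newline="")):
        if not outer:
            continue  # skip blank lines
        if len(outer) == 1:
            # Pass 2: parse the recovered line as ordinary CSV.
            rows.extend(csv.reader(io.StringIO(outer[0], newline="")))
        else:
            # Row was not wrapped; keep the fields as parsed.
            rows.append(outer)
    return rows


if __name__ == "__main__":
    # Shortened version of the row from the question.
    sample = '"31 lip 2021,""Inna opłata"",""--"",""-1,29"",""EUR"",--"'
    print(parse_double_encoded(sample))
    # [['31 lip 2021', 'Inna opłata', '--', '-1,29', 'EUR', '--']]
```

With this approach the value `-1,29` stays in one field, because by the second pass its surrounding quotes are ordinary text qualifiers again. The same two-step structure should carry over to other CSV parsers: read each line as a single qualified field first, then feed the resulting string back into the parser.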