GenericParserAdapter - parse eBay CSV transactions

RomiGrabarz
October 30, 2021

I have a CSV file in an unfamiliar format, where the lines look like this:

"31 lip 2021,""Inna opłata"",""--"",""--"",""--"",""--"",""--"",""--"",""--"",""--"",""-1,29"",""EUR"",""2 sie 2021"",""111"",""mBank *7981"",""Środki zostały wysłane"",""--"",""111"",""--"",--,""--"",""--"",""--"",""--"",""--"",""--"",""0%"",""--"",""--"",""--"",""--"",""--"",""-5,7"",""PLN"",""4,43151"",""FEE-111"",""Opłata za nazwę pomocniczą przedmiotu """

I've used GenericParserAdapter, but the result is not what I expected.
Result (ItemArray):

        [0] "31 lip 2021"   object {string}
        [1] ""Inna opłata""   object {string}
        [2] ""--""    object {string}
        [3] ""--""    object {string}
        [4] ""--""    object {string}
        [5] ""--""    object {string}
        [6] ""--""    object {string}
        [7] ""--""    object {string}
        [8] ""--""    object {string}
        [9] ""--""    object {string}
        [10]    """-1"    object {string}
        [11]    "29"""    object {string}
        [12]    ""EUR""   object {string}
        [13]    ""2 sie 2021""    object {string}
        [14]    ""111""   object {string}
        [15]    ""mBank *7981""   object {string}
        [16]    ""Środki zostały wysłane""    object {string}
        [17]    ""--""    object {string}
        [18]    ""111""   object {string}
        [19]    ""--""    object {string}
        [20]    "--"    object {string}
        [21]    ""--""    object {string}
        [22]    ""--""    object {string}
        [23]    ""--""    object {string}
        [24]    ""--""    object {string}
        [25]    ""--""    object {string}
        [26]    ""--""    object {string}
        [27]    ""0%""    object {string}
        [28]    ""--""    object {string}
        [29]    ""--""    object {string}
        [30]    ""--""    object {string}
        [31]    ""--""    object {string}
        [32]    ""--""    object {string}
        [33]    """-5"    object {string}
        [34]    "7""" object {string}
        [35]    ""PLN""   object {string}
        [36]    """4" object {string}
        [37]    "43151""" object {string}
        [38]    ""FEE-111""   object {string}
        [39]    """Opłata za nazwę pomocniczą przedmiotu "    object {string}

Columns 10 and 11 are split (as are 33 and 34, and 36 and 37), but each pair is a single value with a decimal comma and must not be split.
How do I configure the parser properly (or split the line some other way) to resolve this?

Tags: csv parsing

Answers

Chosen as BEST ANSWER

In the end I resolved the problem like this:

            // Needs: using System.Data; using System.IO; using System.Text; using GenericParsing;
            // GetEncoding() is not a built-in string method; presumably a custom extension (not shown).
            var kodowanie = sciezkaPliku.GetEncoding();
            var plik = new StringBuilder();
            var linie = File.ReadAllLines(sciezkaPliku, kodowanie);
            for (int i = 0; i < linie.Length; i++) // reuse the lines already read
            {
                plik.AppendLine(linie[i]
                    .Trim('"')                  // drop the quotes wrapping the whole row
                    .Replace(",\"\"", ";")      // ,"" opens a field  -> ;
                    .Replace("\"\",", ";")      // "", closes a field -> ;
                    .Replace("\"\"", ""));      // remove any remaining doubled quotes
            }
            sciezkaPliku = $"{sciezkaPliku}_parsed";
            // WriteAllText overwrites an existing file, so no separate delete is needed.
            File.WriteAllText(sciezkaPliku, plik.ToString(), kodowanie);
            using (var parser = new GenericParserAdapter(sciezkaPliku, kodowanie))
            {
                parser.FirstRowHasHeader = true;
                parser.ColumnDelimiter = ';';
                var pozycje = parser.GetDataTable();

                foreach (DataRow item in pozycje.Rows)
                {
                    // ToDo
                }
            }

(Edit)

"31 lip 2021,""Inna opłata"",""--"",""--"",""--"",""--"",""--"",""--"",""--"",""--"",""-1,29"",""EUR"",""2 sie 2021"",""111"",""mBank *7981"",""Środki zostały wysłane"",""--"",""111"",""--"",--,""--"",""--"",""--"",""--"",""--"",""--"",""0%"",""--"",""--"",""--"",""--"",""--"",""-5,7"",""PLN"",""4,43151"",""FEE-111"",""Opłata za nazwę pomocniczą przedmiotu """

Somehow the full row is converted to a single field, and all double quotes are escaped with another double quote.

The row should look like this instead (which parses fine):

31 lip 2021,"Inna opłata","--","--","--","--","--","--","--","--","-1,29","EUR","2 sie 2021","111","mBank *7981","Środki zostały wysłane","--","111","--",--,"--","--","--","--","--","--","0%","--","--","--","--","--","-5,7","PLN","4,43151","FEE-111","Opłata za nazwę pomocniczą przedmiotu "

One solution might be to parse the data twice. First to convert to the original row, then to parse the data.
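
As a rough illustration of that first pass, here is a minimal sketch that undoes one level of quoting on a single line using only the base class library. The class name CsvUnwrapper and the method UnwrapLine are hypothetical names made up for this example, and it assumes every wrapped row sits on exactly one line, as in the sample above.

```csharp
using System;

public static class CsvUnwrapper
{
    // Hypothetical helper: undoes one level of CSV quoting on a single line.
    // A line that is not wrapped in double quotes is returned unchanged.
    public static string UnwrapLine(string line)
    {
        if (line.Length < 2 || line[0] != '"' || line[line.Length - 1] != '"')
        {
            return line;
        }

        // Drop exactly one quote from each end, then collapse every
        // doubled quote ("") into a single quote (").
        return line.Substring(1, line.Length - 2).Replace("\"\"", "\"");
    }
}

public static class Program
{
    public static void Main()
    {
        // Shortened version of the row from the question.
        string wrapped = "\"31 lip 2021,\"\"-1,29\"\",\"\"EUR\"\"\"";

        // Prints: 31 lip 2021,"-1,29","EUR"
        Console.WriteLine(CsvUnwrapper.UnwrapLine(wrapped));
    }
}
```

The unwrapped lines could then be written to a new file and parsed in the second pass with GenericParserAdapter using its default comma delimiter, so the quoted values such as "-1,29" should stay in one column. Unlike trimming all quotes, this keeps the inner quoting intact instead of switching to a different delimiter.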
