I am trying to load a CSV file to my MYSQL database, however before I do so I need to modify it slightly. The CSV file is Pipe delimitated (|)
I have a column in the CSV file called Party:Identification. This column has results such as "a:hello, b:hi c:151 ……" This can go on infinitely.
I only need to get the value for c. I have come up with a method that works for this, however I am stuck on how to modify the value before the file is inserted into the database.
I tried replacing all the ":" in the headers with "" and then using .transform to modify the values, however this doesn’t appear to change the values in the column, only the header. Code is attached below.
csv.parseFile(req.file.path, {
headers: headers => headers.map(function (header) {
const newHeaders = header.replaceAll(" ", "").replaceAll(":", "")
console.log(newHeaders)
return newHeaders
}),
delimiter: '|'
})
.transform(function(data) {
console.log(data)
PartyIdentification: getPartyID(data.partyIdentification)
})
.on("error", (err) => console.error(err))
.on("finish", function () {
query("LOAD DATA LOCAL INFILE '" +
file +
"' INTO TABLE table " +
" FIELDS TERMINATED BY '|'" +
" LINES TERMINATED BY 'n'" +
" IGNORE 1 ROWS;").then(r =>
console.log(file)
)
})
function getPartyID(str) {
if (str === undefined) return ""
const split = str.split(",")
const value = split.find(val => {
return val.includes("c")
})
if(value === undefined) return ""
return (value.split(":")[1].trim())
}
2
Answers
You can use a regex to parse the value of
c:123
in a string:Output:
Explanation of regex:
b
— word boundaryc:
— literal text([^ ]*)
— capture group 1 with value, up to and excluding spaceUPDATE 1: Based on additional question on how to insert modified data into MySQL, here is a solution that does not use
INFILE
, but instead loads the file into memory (here simulated withconst input
), modifies the data as needed, and constructs a SQL statement that inserts all the data. IMPORTANT: You likely want to add escapes against SQL injections.Output:
Don’t bother fixing the csv file before loading, simply toss the unwanted columns as you
LOAD
it.This, for example, will load only the 3rd column:
That is, capture the unwanted columns into @variables that you will then ignore.
If you need to remove the
c:
before storing into the table, then(or whatever expression will work. See also
SUBSTRING_INDEX
in case it would work better.)LOAD DATA
is plenty fast, even in this wasteful mode. And a lot less coding on your part.