I have a large number of txt files with a single column of data. There are no headers in the files.
The data are an email address followed by a : and then a string of varchar, which sometimes includes :s.
My goal is to convert the following:
[email protected]:v@rch:r$tR:ng
[email protected]::multipleTypes
[email protected]:&ofTxtGoAfT3rThe:
into a TSV with headers:
column1 column2
[email protected] v@rch:r$tR:ng
[email protected] :multipleTypes
[email protected] &ofTxtGoAfT3rThe:
These files will then be uploaded into a postgres database.
Any insight/advice is greatly appreciated.
4 Answers
sed can do that. Without the /g flag, its substitution replaces only the first occurrence on each line, so replacing : with a tab leaves any later colons intact.

In Postgres you can split the text at the appropriate position and export the whole result with COPY to a TSV. This would look like:
db<>fiddle here
@choroba is correct about sed. This can also be done in Python.
This splits the text at the first :
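The code from this answer did not survive, so here is a minimal sketch of the same idea rather than the original. It assumes one record per line, no tab characters inside the data, and the header names column1 and column2 from the question. The function name colon_to_tsv and the example.com addresses are placeholders invented for illustration.

```python
import io


def colon_to_tsv(src, dst, headers=("column1", "column2")):
    """Copy lines from src to dst, replacing only the first ':' with a tab."""
    dst.write("\t".join(headers) + "\n")
    for line in src:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # partition() splits at the first ':' only, so later colons
        # stay inside the second column untouched.
        left, _, right = line.partition(":")
        dst.write(f"{left}\t{right}\n")


# Placeholder input standing in for one of the txt files.
sample = io.StringIO(
    "user1@example.com:v@rch:r$tR:ng\n"
    "user2@example.com::multipleTypes\n"
    "user3@example.com:&ofTxtGoAfT3rThe:\n"
)
out = io.StringIO()
colon_to_tsv(sample, out)
print(out.getvalue(), end="")
```

For real files, the same function should work with handles from open(path, newline="") in place of the StringIO objects. A line with no : at all would end up with an empty second column, which may or may not be what you want.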
Using bash parameter expansion to split the lines (for example, ${line%%:*} for the part before the first : and ${line#*:} for everything after it) and printf to format the output: