skip to Main Content

sorry for the confusing title but I hope this explains my question.

I have a tab delimited text file that looks like this if I were to display it as a table:

 ════════╤════════╤════════╤═════════╤═════════╤═════════╤═════════ 
  item   │ id     │ info   │ PERSON1 │ PERSON2 │ PERSON3 │ PERSON4  
 ════════╪════════╪════════╪═════════╪═════════╪═════════╪═════════ 
  [data] │ [data] │ [data] │ YES     │ NO      │ NO      │ N/A      
 ────────┼────────┼────────┼─────────┼─────────┼─────────┼───────── 
  [data] │ [data] │ [data] │ NO      │ N/A     │ NO      │ N/A      
 ════════╧════════╧════════╧═════════╧═════════╧═════════╧═════════ 

My issue is I have hundreds of these "PERSON_X" columns. I want to upload this file into MySQL but in a way where I can combine the "person" columns into individual counts, like so:

 ════════╤════════╤════════╤════════════╤═══════════╤════════════ 
  item   │ id     │ info   │ PERSON_YES │ PERSON_NO │ PERSON N/A  
 ════════╪════════╪════════╪════════════╪═══════════╪════════════ 
  [data] │ [data] │ [data] │ 1          │ 2         │ 1           
 ────────┼────────┼────────┼────────────┼───────────┼──────────── 
  [data] │ [data] │ [data] │ 0          │ 2         │ 2           
 ════════╧════════╧════════╧════════════╧═══════════╧════════════ 

Is there a way to do this? I am inexperienced/learning. The data found in the "PERSON_X" columns will always be ‘YES’, ‘NO’, or "n/a". What can I do?

I tried creating and uploading a table to start and possibly work from there, but there are way too many columns to establish in the "CREATE TABLE" command that it became infeasible. I have tried to look into different methods of loading the data, but it hasn’t gone well so I am trying my luck asking here.

2

Answers


  1. It will be wordy, but you can do it by assigning the fields of the file to user variables.

    LOAD DATA INFILE 'filename'
    INTO TABLE tablename
    FIELDS TERMINATED BY 't'
    (item, id, info, @person1, @person2, @person3, ...)
    SET person_yes = (@person1 = 'YES') + (@person2 = 'YES') + (@person3 = 'YES') + ...,
        person_no = (@person1 = 'NO') + (@person2 = 'NO') + (@person3 = 'NO') + ...,
        person_na = (@person1 = 'N/A') + (@person2 = 'N/A') + (@person3 = 'N/A') + ...
    

    As suggested in a comment, this will be easier in a programming language that has arrays that you can loop over, instead of having to define hundreds of variables that you have to name explicitly.

    Login or Signup to reply.
  2. You can create 3 additional columns in your excel file

    For PERSON_YES, PERSON_NO and PERSON_NA

    and define the formulas of

    • =COUNTIF(D2:I2,"YES")
    • =COUNTIF(D3:I3,"YES")
    • =COUNTIF(D2:I2,"NO")
    • =COUNTIF(D3:I3,"NO")
    • =COUNTIF(D2:I2,"N/A")
    • =COUNTIF(D3:I3,"N/A")

    the first two formulas being for YES, NO and N/A respectively, of course you can adjust the range to your thousands as well as the rows if you have more rows. Then you can create the formula of

    =CONCAT("insert into yourtable(id, item, info, person_yes, person_no, person_na) values('", B2, "', '", A2, "', '", C2, "', ", J2, ", ", K2, ", ", L2, ");")
    

    for the second row and drag the rule downwards to generate the script for all your records (the last 3 items will be different for you instead of J2, K2 and L2 due to the differences between the number of columns), resulting in scripts like this:

    insert into yourtable(id, item, info, person_yes, person_no, person_na) values('1', 'first', 'inf1', 3, 2, 1);
    insert into yourtable(id, item, info, person_yes, person_no, person_na) values('2', 'second', 'inf2', 1, 3, 2);
    

    See

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search