skip to Main Content

I want to change my text from the form below (JSON) into a sas dataset.

[
  {
    "id":86374,
    "course":[
      "artificial intelligence",
      "data structure",
      "linear algebra"
     ]
   }
   {
    "id":12319,
    "course":[
      "data structure",
      "algorithm",
      "database",
      "linear algebra"
     ]
   }
...


]

Since the number of courses vary by ID, I don’t know how to save all course names (maybe in array form) into sas dataset.

2

Answers


  1. For Json file, it could be imported by proc groovy, here is data step. If it is unknown how many courses each student has taken, so temporary array is used, then transpose data to wide data for each ID.

    filename myjson 'd:json.txt';
    data temp;
       array temp (100) $100. _temporary_;
       infile myjson lrecl=10000 truncover scanover ;
           input @'"id":' ID $10.;
           input;    
           do i=1 by 1 ;
           input temp(i) $100. ;
           if not anyalpha(temp(i)) then leave;
           end;
           id=compress(id,',');
           do j=1 by 1;
              var=compress(temp(j),',""');
              if not anyalpha(temp(j)) then leave;
              output;
           end;
           drop i j;
    run;
    
    proc transpose data=temp out=want(drop=_name_) prefix=Couse;
    by id notsorted;
    var var;
    run;
    
    Login or Signup to reply.
  2. I would avoid using arrays for the initial import if you don’t know how many courses any one ID might have, and just import the data in long format initially. That way, transposing it can be dealt with as a separate problem later on.

    Here is an alternative approach for importing the data into long format that will work for any number of courses per ID:

    data want;
    infile cards scanover dlm=',' dsd;
    input @'"id":' id;
    do until(index(_INFILE_,']'));
        input @;
        if not(indexc(_INFILE_,'[]')) then do;
            input course :$100.;
            output;
        end;
        else input;
    end;
    cards;
    [
      {
        "id":86374,
        "course":[
          "artificial intelligence",
          "data structure",
          "linear algebra"
         ]
       }
       {
        "id":12319,
        "course":[
          "data structure",
          "algorithm",
          "database",
          "linear algebra"
         ]
       }
    ]
    ;
    run;
    

    This assumes that none of the course names contain square bracket characters. You could allow for this possibility with some slightly more complex logic to check for square brackets at the start or end of a json line.

    As this seems to be causing some confusion, here’s how you need to modify this to use it:

    filename myjson "/path/to/my/json/file.txt";
    
    data want;
    infile myjson scanover dlm=',' dsd;
    input @'"id":' id;
    do until(index(_INFILE_,']'));
        input @;
        if not(indexc(_INFILE_,'[]')) then do;
            input course :$100.;
            output;
        end;
        else input;
    end;
    run;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search