skip to Main Content

I have a problem with mysqldump that might be solved by configuring mysqldump differently, but probably will be solved by just piping the output through a shell script.

Basically, mysqldump always outputs the tables in the same order, but it list all columns (other than id) for each table in random order.

So, the first run might output this…

create TABLE `ONE` ( 
  `id` int NOT NULL AUTO_INCREMENT,
  `column_a` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_b` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_c` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
create TABLE `TWO` ( 
  `id` int NOT NULL AUTO_INCREMENT,
  `column_x` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_y` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_z` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

…and on the second run, it might produce something like this:

create TABLE `ONE` ( 
  `id` int NOT NULL AUTO_INCREMENT,
  `column_c` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_b` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_a` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
create TABLE `TWO` ( 
  `id` int NOT NULL AUTO_INCREMENT,
  `column_y` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_x` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `column_z` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

I’d like to pipe the result through a shellscript that always sorts the lines in the same way. What script would achieve this? The script needs to run on a build agent that runs on Ubuntu, so if it is possible and practical to use standard GNU tools like awk then that would be superior to using custom tools.

2

Answers


  1. how to sort adjacent lines that have the same level of indentation

    I would harness GNU AWK for this task following way, let file.txt content be

    ABLE
      CHARLIE
      BAKER
        123
      EASY
      DOG
      FOX
    

    then

    awk 'BEGIN{PROCINFO["sorted_in"]="@val_str_asc";indent=-1}{match($0,/[^[:space:]]/)}RSTART==indent{arr[NR]=$0;next}{indent=RSTART;for(i in arr){print arr[i]};delete arr;arr[NR]=$0}END{for(i in arr){print arr[i]}}' file.txt
    

    gives output

    ABLE
      BAKER
      CHARLIE
        123
      DOG
      EASY
      FOX
    

    Explanation: I inform GNU AWK that Scanning Order should be values-as-strings, ascending and set to indent to -1 (value never reached). For each line I use match string function to detect indent size, by finding position of 1st non-white-space character. If indent is same as previous I only add current line to array arr under key being number of line, otherwise I save indent size in variable indent, output all from array in predefined order, cleanse array (so it become empty) and add current line to array. After all lines are processed (END) I output content of array arr.

    (tested in GNU Awk 5.1.0)

    Login or Signup to reply.
  2. The man-page of the basic UNIX sort commands starts with:

    DESCRIPTION
       Write sorted concatenation of all FILE(s) to standard output.
       ...
    
       ... Ordering options:
    
       -b, --ignore-leading-blanks
              ignore leading blanks
    

    Why not try, using sort -b?

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search