skip to Main Content

I tried to implement the quicksort in x86-64 Assembler, on Linux. Since I’m not fully comfortable with it yet, I wrote the partition algorithm in C. It seems to work but something must be off, because adding a call to fprintf in my C code (that the Asm calls into) causes a segfault. I figure I must be messing up the stack, or not honoring the calling convention, but I can’t figure out where.

Something worth noting is that the segfault only happens if extra arguments are passed to printf. If only the format string is passed, it works fine.

There are two files: quicksort.s and quicksort.c. I compiled them on GCC 9.4.0 under Ubuntu 20.04.1, with the command gcc quicksort.c quicksort.s -o quicksort, then ran the resulting program using ./quicksort.

I first wrote the code in C:

void quicksort(int* arr, size_t n) {
    if (n <= 1) return;
    int p = partition(arr, n);
    quicksort(arr, p);
    p += 1;
    quicksort(arr+p, n-p);
}

And then translated it to x86-64 Assembler, yielding this:

# quicksort.s

# Takes an array of integers in %rsi
# Takes the size of the array in %rsi
.global quicksort
quicksort:
    cmpq $1, %rsi
    jle early_exit

    pushq %rbp
    movq %rsp, %rbp

    # rbx, r12 and r13 are callee-saved
    pushq %rbx
    pushq %r12
    pushq %r13

    # put arguments in callee-saved registers for later use
    movq %rdi, %rbx
    movq %rsi, %r12

    # p = partition(arr, n)
    call partition

    # put result of partition in callee-saved register for later use
    movq %rax, %r13

    # quicksort(arr, p)
    movq %rbx, %rdi
    movq %r13, %rsi
    call quicksort

    # p += 1
    addq $1, %r13

    # quicksort(arr+p*sizeof(int), n-p)
    leaq (%rbx, %r13, 4), %rdi
    movq %r12, %rsi
    subq %r13, %rsi
    call quicksort

    popq %r13
    popq %r12
    popq %rbx

    movq %rbp, %rsp
    popq %rbp

early_exit:
    ret

Here is the C code it calls into:

// quicksort.c
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>

void quicksort(int* arr, size_t n);

size_t partition(int* arr, size_t n) {
    // commenting the next line fixes the segfault
    fprintf(stderr, "%d", 10);

    size_t l = 1;
    size_t r = n;
    int p = arr[0];

    while (l < r) {
        while (l < r && arr[l] <= p) l++;
        while (l < r && arr[r-1] > p) r--;
        if (l == r) break;
        int temp = arr[l];
        arr[l] = arr[r-1];
        arr[r-1] = temp;
        l++;
        r--;
    }
    arr[0] = arr[r-1];
    arr[r-1] = p;
    return r-1;
}

int main() {
    #define LEN 100
    int arr[LEN] = { };

    for (int i = 0; i < LEN; ++i)
        arr[i] = rand();

    quicksort(arr, LEN);

    for (int i = 0; i < LEN; ++i)
        printf(" %d", arr[i]);
    putchar('n');
}

2

Answers


  1. Your stack isn’t aligned right for call partition, which means it won’t be aligned right in fprintf either, which that function depends on (as the ABI entitles every function to). Do sub $8, %rsp or one more dummy push before that to align it, and then undo it afterwards.

    Login or Signup to reply.
  2. You’re not maintaining stack alignment. The ABI says that %rsp must always be a multiple of 16 right before a call instruction. call itself pushes one 8-byte quantity (the return address), so the stack pointer is always congruent to 8 (mod 16) at function entry, and it’s your responsibility to fix that before you make another call.

    This only causes a crash when the call to fprintf is uncommented because fprintf is actually doing something that takes advantage of this ABI requirement (specifically, using some of the x86-64 vector instructions, probably to accelerate binary-to-decimal conversion). partition by itself doesn’t do anything that cares.

    The easiest way for you to fix it will be to junk the frame pointer. It’s not required on x86-64 and that way you will be pushing an odd number of registers, which gives you the proper stack alignment as a side effect.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search