I tried to implement the quicksort in x86-64 Assembler, on Linux. Since I’m not fully comfortable with it yet, I wrote the partition algorithm in C. It seems to work but something must be off, because adding a call to fprintf
in my C code (that the Asm calls into) causes a segfault. I figure I must be messing up the stack, or not honoring the calling convention, but I can’t figure out where.
Something worth noting is that the segfault only happens if extra arguments are passed to printf. If only the format string is passed, it works fine.
There are two files: quicksort.s
and quicksort.c
. I compiled them on GCC 9.4.0 under Ubuntu 20.04.1, with the command gcc quicksort.c quicksort.s -o quicksort
, then ran the resulting program using ./quicksort
.
I first wrote the code in C:
void quicksort(int* arr, size_t n) {
if (n <= 1) return;
int p = partition(arr, n);
quicksort(arr, p);
p += 1;
quicksort(arr+p, n-p);
}
And then translated it to x86-64 Assembler, yielding this:
# quicksort.s
# Takes an array of integers in %rsi
# Takes the size of the array in %rsi
.global quicksort
quicksort:
cmpq $1, %rsi
jle early_exit
pushq %rbp
movq %rsp, %rbp
# rbx, r12 and r13 are callee-saved
pushq %rbx
pushq %r12
pushq %r13
# put arguments in callee-saved registers for later use
movq %rdi, %rbx
movq %rsi, %r12
# p = partition(arr, n)
call partition
# put result of partition in callee-saved register for later use
movq %rax, %r13
# quicksort(arr, p)
movq %rbx, %rdi
movq %r13, %rsi
call quicksort
# p += 1
addq $1, %r13
# quicksort(arr+p*sizeof(int), n-p)
leaq (%rbx, %r13, 4), %rdi
movq %r12, %rsi
subq %r13, %rsi
call quicksort
popq %r13
popq %r12
popq %rbx
movq %rbp, %rsp
popq %rbp
early_exit:
ret
Here is the C code it calls into:
// quicksort.c
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
void quicksort(int* arr, size_t n);
size_t partition(int* arr, size_t n) {
// commenting the next line fixes the segfault
fprintf(stderr, "%d", 10);
size_t l = 1;
size_t r = n;
int p = arr[0];
while (l < r) {
while (l < r && arr[l] <= p) l++;
while (l < r && arr[r-1] > p) r--;
if (l == r) break;
int temp = arr[l];
arr[l] = arr[r-1];
arr[r-1] = temp;
l++;
r--;
}
arr[0] = arr[r-1];
arr[r-1] = p;
return r-1;
}
int main() {
#define LEN 100
int arr[LEN] = { };
for (int i = 0; i < LEN; ++i)
arr[i] = rand();
quicksort(arr, LEN);
for (int i = 0; i < LEN; ++i)
printf(" %d", arr[i]);
putchar('n');
}
2
Answers
Your stack isn’t aligned right for
call partition
, which means it won’t be aligned right infprintf
either, which that function depends on (as the ABI entitles every function to). Dosub $8, %rsp
or one more dummypush
before that to align it, and then undo it afterwards.You’re not maintaining stack alignment. The ABI says that
%rsp
must always be a multiple of 16 right before acall
instruction.call
itself pushes one 8-byte quantity (the return address), so the stack pointer is always congruent to 8 (mod 16) at function entry, and it’s your responsibility to fix that before you make another call.This only causes a crash when the call to
fprintf
is uncommented becausefprintf
is actually doing something that takes advantage of this ABI requirement (specifically, using some of the x86-64 vector instructions, probably to accelerate binary-to-decimal conversion).partition
by itself doesn’t do anything that cares.The easiest way for you to fix it will be to junk the frame pointer. It’s not required on x86-64 and that way you will be pushing an odd number of registers, which gives you the proper stack alignment as a side effect.