I’m coding in NASM, and I don’t understand what is going on. The Linux distro is 64-bit Ubuntu 16, but I’m assembling 32-bit code.

Expected output –> "Number is: 2"

Actual output –> "number is: 134520868"

Code:

%include "io.inc"

section .data
    n1 db 2 ; i know it's a bad practice to define a db variable, it's just for a test
    msg: db 'number is: %d',10,0

section .text
extern printf
global main

main:
    push ebp
    mov ebp, esp

    push dword n1
    push msg
    call printf

    mov esp, ebp
    pop ebp
    ret

I’ve tried defining n1 with dd and pushing the contents of n1, even through a register like eax.

Update: even when I do push dword [n1], the only thing that changes is that the output becomes "Number is: 1836404226"…

3 Answers


  1. You want a 32-bit number, so use dd. You also don’t want to print the address of the variable, but its contents. So:

    n1 dd 2 ; ...
    

    and

    push dword [n1]
    push msg
    call printf
    
  2. i know it’s a bad practice to define a db variable,

    It is not bad practice.

    If the variable is meant to be a byte, then using db is the absolute right thing to do.

    On the other hand, if the variable is not meant to be a byte, then using db is still not bad practice; it is a mistake.

    it’s just for a test

    It being a test does not somehow invert the laws of the universe and make wrong things right.

    You have two options:

    The x86 tag covers 16-, 32-, and 64-bit code. If you are targeting a 16-bit architecture, an int is 16 bits long. (The number 1836404226 that you are seeing does not fit in 16 bits, and your code uses ebp and esp, so you are evidently assembling 32-bit code.)

    So, if you are targeting a 16-bit architecture, declare your variable using dw, which is 16 bits, and use push word [n1] or push [word ptr n1] or push [n1] depending on your assembler’s syntax flavor. (NASM takes push word [n1].) The rest of the code remains the same.

    Alternatively, keep the variable defined with db, but use movsx ax, byte [n1] or movzx ax, byte [n1] followed by push ax to sign-extend or zero-extend that byte into a word and push it onto the stack. (NASM needs the byte size specifier here because it cannot infer the operand size from the label.) The rest of the code remains the same.

    If you are targeting 32-bit x86 rather than 16-bit, which is what the question describes, then use dd instead of dw, use dword instead of word, and use eax instead of ax.

    To answer the actual question, your printf() prints a large number because you are not printing the actual variable, you are printing the address of the variable. (And addresses tend to be arbitrary large numbers.)

    However, even if you did push word [n1] it would still print some nonsense number, because n1 has been defined with db, so it is a byte. That byte is followed by another byte (the letter ‘n’ in your case), and when these two bytes are read together as a word they form some other nonsensical number which is different from 2.
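
As a rough illustration of both points, here is a minimal C sketch. The array `data_section` is a hypothetical stand-in for the `.data` layout in the question (the byte 2 followed by the start of the message), and the helper names are made up for this example; it is not the asker's program.

```c
#include <stdint.h>

/* Hypothetical stand-in for .data: n1 (one byte) followed by the start of msg. */
static const unsigned char data_section[] = { 2, 'n', 'u', 'm', 'b', 'e', 'r' };

/* Like `push n1`: the value handed over is the address, not the byte stored there. */
uintptr_t address_of_n1(void) {
    return (uintptr_t)&data_section[0];
}

/* Like `push word [n1]`: two bytes are read, low byte first (x86 is little-endian). */
uint16_t word_at_n1(void) {
    return (uint16_t)(data_section[0] | (data_section[1] << 8));
}

/* Like `movzx ax, byte [n1]`: only the one byte that n1 really occupies. */
uint16_t byte_at_n1(void) {
    return data_section[0];
}
```

With this layout, `word_at_n1()` gives 0x6e02 (28162) while `byte_at_n1()` gives 2; the result of `address_of_n1()` depends on where the array ends up in memory.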

  3. First of all, push n1 pushes the address. push dword [n1] loads 4 bytes from memory and pushes them.
    See also: Basic use of immediates vs. square brackets in YASM/NASM x86 assembly


    msg: db ... comes right after n1: db 2, so a 4-byte (dword) load gets 3 bytes of ASCII characters as the high bytes of the integer. A %d format takes all 4 of those bytes as the int to print.

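    x86 is little-endian, so if you’d used %x in your format string you’d see 0x6d754e02 for the value you reported (1836404226). Notice that the low byte is the 2 you loaded; 4e is 'N' in ASCII / UTF-8, 75 is 'u', and 6d is 'm'. That means the string you actually ran started with a capital N; the lowercase 'number' in the posted code would give 0x6d756e02 instead. You could also use GDB to look at stack memory before the call. See the bottom of the x86 tag wiki for asm GDB tips.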
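
To make the byte breakdown concrete, here is a small C sketch that splits a 32-bit value into its bytes in little-endian memory order. The helper names `byte_of` and `split_le32` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Extract byte `index` of a dword; index 0 is the byte at the lowest address on x86. */
unsigned char byte_of(uint32_t value, unsigned index) {
    return (unsigned char)((value >> (8 * index)) & 0xFF);
}

/* Write the four bytes of `value` into out[0..3] in little-endian memory order. */
void split_le32(uint32_t value, unsigned char out[4]) {
    for (unsigned i = 0; i < 4; i++) {
        out[i] = byte_of(value, i);
    }
}
```

Fed the number the question reports, 1836404226, this yields the bytes 0x02, 'N', 'u', 'm': the byte of n1 followed by the first three characters of a message starting with a capital N. Fed 0x6d756e02, the second byte comes out as a lowercase 'n', matching the string in the posted code.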


    I know it’s a bad practice to define a db variable, it’s just for a test

    It’s not bad practice to have a char, uint8_t, or int8_t global variable. It’s no worse than a uint32_t global. Whatever size your data is, you need to use appropriate instructions for it. This is assembly, there’s no compiler to implicitly convert types for you.
    In this case, you need to load just a byte if you don’t want to pull in other garbage.

    The things you can do include:

    Zero- or Sign-extend a byte into a register and push that

    movzx eax, byte [n1] / push eax will load just a byte from memory, zero-extending to 32 bits (the width of EAX). movsx is the same but with sign-extension.
    This is what a C compiler would do for printf("...", n1) with int8_t n1 = 2;
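
The difference between the two extensions only shows up when bit 7 of the byte is set. A minimal C sketch, with hypothetical helper names, of what each instruction leaves in the 32-bit register:

```c
#include <stdint.h>

/* Counterpart of `movzx eax, byte [n1]`: the upper 24 bits become zero. */
int32_t zero_extend8(uint8_t b) {
    return (int32_t)b;
}

/* Counterpart of `movsx eax, byte [n1]`: the upper 24 bits copy bit 7 of the byte.
   Flipping bit 7 and subtracting 0x80 avoids relying on implementation-defined casts. */
int32_t sign_extend8(uint8_t b) {
    return (int32_t)(b ^ 0x80) - 0x80;
}
```

For the byte 2 both give 2, so either instruction works for this test. For the byte 0xFE, zero-extension gives 254 and sign-extension gives -2, so the choice depends on whether the byte is meant as uint8_t or int8_t.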

    Load high garbage but tell printf to only look at the low byte

    Since you know you have valid data after your byte in this case, you can’t segfault from going off the end of a page into an unmapped one.
    push dword [n1] and use %hhd in your format string to treat arg as int8_t, only looking at the low byte. See the Glibc printf man page.

    Loading garbage past the end of a variable isn’t something you can express in C, except maybe with memcpy(&tmp_int, &n1, sizeof(tmp_int));. A very clever compiler could potentially do this asm optimization if you used a %hhd format string so it knew the high 3 bytes of what it pushed didn’t matter. Your n1 happens to be 4-byte aligned (since it’s at the start of your .data in this file), so a 4-byte load can’t be split across cache lines, so there’s no downside.

    Note that standard calling conventions, including i386 System V, allow narrow args to contain high garbage; they don’t have to be zero- or sign-extended to 32 bits. (Except maybe as an undocumented extension required by clang; at least that’s the case for register args in x86-64 Sys V.) And anyway, in C terms printf is taking an int and the %hd / %hhd conversions truncate it: since it’s variadic, the default argument promotions apply, so narrow integer types promote to int.
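
The truncation done by %hhd can be checked from C without any assembly. This is a minimal sketch with hypothetical helper names; it uses snprintf so the result can be inspected, and the behaviour described is what glibc does when the int is converted back to signed char.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats `arg` with %hhd: printf converts the promoted int back to
   signed char, so only the low byte is printed. */
int format_low_byte(char *buf, size_t size, int arg) {
    return snprintf(buf, size, "number is: %hhd\n", arg);
}

/* Same argument with %d, for comparison: all 32 bits are printed. */
int format_whole_int(char *buf, size_t size, int arg) {
    return snprintf(buf, size, "number is: %d\n", arg);
}
```

Passing 0x6d756e02 (the dword a 4-byte load of n1 picks up from the posted code) to `format_low_byte` produces "number is: 2", while `format_whole_int` produces "number is: 1836412418".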

    Reserve 4 bytes for your global

    n1: dd 2 allows push dword [n1] to load 0x00000002 from memory, so a %d conversion to print the whole thing as an int will print just 2.

    In C terms, this is static int n1 = 2; instead of static int8_t n1 = 2;
