
I recently started learning about memory management and read about relative addresses and physical addresses, and a question came to mind:

When I print a variable’s address, does it show the relative (virtual) address or the physical address at which the variable is located in memory?

And another question regarding memory management:

Why does this code produce the same stack pointer value for each run (from Shellcoder’s Handbook, page 28)?
Does any program that I run produce this address?

// find_start.c
#include <stdio.h>

// Leaves the stack pointer in eax, the 32-bit x86 return register.
unsigned long find_start(void)
{
    __asm__("movl %esp, %eax");
}

int main()
{
    printf("0x%lx\n", find_start());
}

If we compile this and run this a few times, we get:

shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8

I would appreciate it if someone could clarify this topic for me.

3 Answers


  1. When I print a variable’s address, does it show the relative (virtual) address or the physical address at which the variable is located in memory?

    The counterpart to a relative address is an absolute address. That has nothing to do with the distinction between virtual and physical addresses.

    On most common modern operating systems, such as Windows, Linux and macOS, you will never encounter physical addresses unless you are writing a driver. These are handled internally by the operating system. You will only be working with virtual addresses.
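
    As a minimal sketch of what "printing an address" looks like in practice, the program below prints the address of a global, a stack variable and a heap allocation. The names global_counter and print_address are made up for this illustration. Every value it prints is a virtual address; the physical location is never visible to the program.

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    int global_counter = 7;   /* static storage (hypothetical example variable) */

    /* Prints one labelled pointer value; returns whatever printf returns. */
    int print_address(const char *label, const void *p)
    {
        return printf("%-6s %p\n", label, (void *)p);
    }

    int main(void)
    {
        int local = 42;                                /* automatic storage (stack) */
        int *heap_value = malloc(sizeof *heap_value);  /* dynamic storage (heap) */
        if (heap_value == NULL)
            return 1;

        /* %p prints the pointer value, which is a virtual address. */
        print_address("global", &global_counter);
        print_address("stack", &local);
        print_address("heap", heap_value);

        free(heap_value);
        return 0;
    }
    ```

    The exact values depend on the operating system, the compiler and whether address randomization is enabled, so no particular output should be expected.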

    Why does this code produce the same stack pointer value for each run (from Shellcoder’s Handbook, page 28)?

    On most modern operating systems, every process has its own virtual address space. The executable is loaded at its preferred base address in that address space if possible; otherwise it is loaded at another address (relocated). The preferred base address of an executable file is normally stored in its header. Depending on the operating system and CPU, the heap is probably created at a higher address than the executable, since the heap normally grows upward (towards higher addresses). Because the stack normally grows downward (towards lower addresses), it is usually placed at a high address and grows towards address 0; on 32-bit Linux, for example, it starts near the top of the user address space, which is why the value in the question begins with 0xbfff.

    Since the preferred load address is the same every time you run the executable, it is likely that the virtual memory addresses are the same. However, this may change if address space layout randomization (ASLR) is used. Also, just because the virtual memory addresses are the same does not mean that the physical memory addresses are the same, too.
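
    To check whether randomization is in play on a Linux machine, something like the following sketch may help. It reads /proc/sys/kernel/randomize_va_space (0 means off, 1 or 2 means on) and prints the address of a stack variable. The helper name read_aslr_setting is made up for this example, and the file only exists on Linux.

    ```c
    #include <stdio.h>

    /* Returns the integer stored in the given file, or -1 if it cannot be read.
       read_aslr_setting is a hypothetical helper, not a system API. */
    int read_aslr_setting(const char *path)
    {
        FILE *f = fopen(path, "r");
        int value = -1;

        if (f == NULL)
            return -1;
        if (fscanf(f, "%d", &value) != 1)
            value = -1;
        fclose(f);
        return value;
    }

    int main(void)
    {
        int local = 0;
        int aslr = read_aslr_setting("/proc/sys/kernel/randomize_va_space");

        if (aslr < 0)
            printf("ASLR setting: unknown (not Linux, or file unreadable)\n");
        else
            printf("ASLR setting: %d (0 = off, 1 or 2 = on)\n", aslr);

        /* Run the program several times and compare this line between runs. */
        printf("address of a stack variable: %p\n", (void *)&local);
        return 0;
    }
    ```

    With randomization on, the stack address should differ from run to run. Running the program as setarch "$(uname -m)" -R ./program disables randomization for that one run, which should reproduce the constant value seen in the book.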

    Does any program that I run produce this address?

    Depending on your operating system, you can set the preferred base address at which your program is loaded into virtual memory in the linker settings. Many programs may still have the same base address as your program, probably because they were built using the same linker with default settings.

    Are virtual addresses per program? Let’s say I have 2 programs: program1 and program2. Can program2 access program1’s memory?

    It is not possible for program2 to access program1’s memory directly, because they have separate virtual address spaces. However, it is possible for one program to ask the operating system for permission to access another process’s address space. The operating system will normally grant this permission, provided that the program has sufficient privileges. On Windows, this can be accomplished, for example, with the function WriteProcessMemory. Linux offers similar functionality through ptrace and through /proc/[pid]/mem. See this link for further information.
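
    As a harmless illustration of the /proc/[pid]/mem mechanism, the sketch below has a process read one of its own variables through /proc/self/mem, using the variable's virtual address as the file offset. It is Linux-specific, the helper name read_int_via_proc_mem is invented for this example, and reading another process's memory this way additionally requires ptrace-level permission.

    ```c
    #define _POSIX_C_SOURCE 200809L   /* for pread */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Reads the int at virtual address addr of the calling process via
       /proc/self/mem. Returns 0 on success, -1 if the file cannot be
       opened or read (for example on a non-Linux system). */
    int read_int_via_proc_mem(const int *addr, int *out)
    {
        int fd = open("/proc/self/mem", O_RDONLY);
        ssize_t n;

        if (fd < 0)
            return -1;
        /* The file offset is interpreted as a virtual address. */
        n = pread(fd, out, sizeof *out, (off_t)(uintptr_t)addr);
        close(fd);
        return n == (ssize_t)sizeof *out ? 0 : -1;
    }

    int main(void)
    {
        int secret = 1234;
        int copy = 0;

        if (read_int_via_proc_mem(&secret, &copy) == 0)
            printf("read %d from virtual address %p\n", copy, (void *)&secret);
        else
            printf("could not read /proc/self/mem on this system\n");
        return 0;
    }
    ```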

  2. You get virtual addresses. Your program never gets to see physical addresses. Ever.

    Can program2 access program1’s memory?

    No, because you don’t have addresses that point to program1’s memory. If you have virtual address 0xabcd1234 in the program1 process and you try to read it from the program2 process, you get program2’s 0xabcd1234 (or a crash if there is no such address in program2). It’s not a permission check: the CPU does not go to the memory and see “oh, this is program1’s memory, I shouldn’t access it”. It’s program2’s own memory space.

    But yes, if you use “shared memory” to ask the OS to map the same physical memory into both processes.

    And yes, if you use ptrace or /proc/<pid>/mem to ask the OS nicely to read from the other process’s memory, and you have permission to do that, then it will do that.
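
    Both points can be seen in one small POSIX sketch. After fork, parent and child have the same virtual address for private_value but separate copies of it, while a MAP_SHARED mapping is backed by the same physical memory in both. The names private_value and run_demo are made up for this example.

    ```c
    #define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int private_value = 1;   /* same virtual address in parent and child */

    /* Forks a child that writes 2 to both a private variable and a shared
       mapping, then reports what the parent sees. Returns 0 on success. */
    int run_demo(int *private_seen, int *shared_seen)
    {
        pid_t pid;
        int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (shared == MAP_FAILED)
            return -1;
        *shared = 1;
        private_value = 1;

        pid = fork();
        if (pid < 0) {
            munmap(shared, sizeof *shared);
            return -1;
        }
        if (pid == 0) {
            private_value = 2;   /* changes only the child's copy */
            *shared = 2;         /* changes memory that both processes map */
            _exit(0);
        }
        waitpid(pid, NULL, 0);

        *private_seen = private_value;   /* the parent's copy is untouched */
        *shared_seen = *shared;          /* the child's write is visible */
        munmap(shared, sizeof *shared);
        return 0;
    }

    int main(void)
    {
        int p = 0, s = 0;

        if (run_demo(&p, &s) != 0) {
            perror("run_demo");
            return 1;
        }
        printf("parent sees private_value = %d, shared value = %d\n", p, s);
        return 0;
    }
    ```

    The parent should report private_value = 1 and shared value = 2: the child's write to the private variable went to the child's own memory, even though the address was identical.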

    Why does this code produce the same stack pointer value for each run (from Shellcoder’s Handbook, page 28)? Does any program that I run produce this address?

    Apparently, that program always has that stack pointer value. Different programs might have different stack pointers. And if you put more local variables in main, or call find_start from a different function, you will get a different stack pointer value because there will be more data pushed on the stack.
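
    The effect of call depth can be sketched without inline assembly. The example below assumes GCC or Clang, since it relies on __builtin_frame_address and the noinline attribute; frame_at_depth is a made-up name. Each extra level of nesting should report a different frame address, lower on platforms where the stack grows downward.

    ```c
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Returns the frame address seen by the innermost of depth + 1 nested
       calls. noinline keeps every call a real stack frame. */
    __attribute__((noinline))
    uintptr_t frame_at_depth(int depth)
    {
        volatile uintptr_t deeper;

        if (depth == 0)
            return (uintptr_t)__builtin_frame_address(0);

        /* Storing to a volatile keeps this from becoming a tail call. */
        deeper = frame_at_depth(depth - 1);
        return deeper;
    }

    int main(void)
    {
        int depth;

        for (depth = 0; depth < 3; depth++)
            printf("depth %d: frame at 0x%" PRIxPTR "\n",
                   depth, frame_at_depth(depth));
        return 0;
    }
    ```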

    Note: even if you run the program twice at the same time, the address will be the same, because they are virtual addresses, and every process has its own virtual address space. They will be different physical addresses but you don’t see the physical addresses.

    In the stack overflow example in the book I mentioned, they overwrite the return address on the stack with the address of an exploit stored in the environment variables. How does it work?

    It all works within one process.

  3. Only focusing on a small part of your question.

    #include <stdio.h>
    // find_start.c
    unsigned long find_start(void)
    {
        __asm__("movl %esp, %eax");
    }
    unsigned long nest ( void )
    {
        return(find_start());
    }
    int main()
    {
        printf("0x%lx\n",find_start());
        printf("0x%lx\n",nest());
    }
    
    gcc so.c -o so
    ./so
    0x50e381a0
    0x50e38190
    

    There is no magic here. The virtual address space allows programs to be built the same way. I don’t need to know where my program is going to live. Each program can be compiled for the same address space, and when loaded and run they can all see the same virtual addresses, because each is mapped to a separate physical address space.

    readelf -a so
    

    (readelf works, but I prefer objdump)

    objdump -D so

    Disassembly of section .text:
    
    0000000000000540 <_start>:
     540:   31 ed                   xor    %ebp,%ebp
     542:   49 89 d1                mov    %rdx,%r9
     545:   5e                      pop    %rsi
    
    ....
    
    
    000000000000064a <find_start>:
     64a:   55                      push   %rbp
     64b:   48 89 e5                mov    %rsp,%rbp
     64e:   89 e0                   mov    %esp,%eax
     650:   90                      nop
     651:   5d                      pop    %rbp
     652:   c3                      retq   
    
    0000000000000653 <nest>:
     653:   55                      push   %rbp
     654:   48 89 e5                mov    %rsp,%rbp
     657:   e8 ee ff ff ff          callq  64a <find_start>
     65c:   5d                      pop    %rbp
     65d:   c3                      retq   
    
    000000000000065e <main>:
     65e:   55                      push   %rbp
     65f:   48 89 e5                mov    %rsp,%rbp
     662:   e8 e3 ff ff ff          callq  64a <find_start>
     667:   48 89 c6                mov    %rax,%rsi
     66a:   48 8d 3d b3 00 00 00    lea    0xb3(%rip),%rdi        # 724 <_IO_stdin_used+0x4>
     671:   b8 00 00 00 00          mov    $0x0,%eax
     676:   e8 a5 fe ff ff          callq  520 <printf@plt>
     67b:   e8 d3 ff ff ff          callq  653 <nest>
    

    So, two things, or maybe more than two. Our entry point _start is at a low virtual address. On this system, with this compiler, I would expect most or all programs to start in the same or a similar place. In some cases it may depend on what is in the program, but it should be somewhere low.

    The stack pointer though, if you check above and now as I type stuff:

    0x355d38d0
    0x355d38c0
    

    it has changed.

    0x4ebf1760
    0x4ebf1750
    
    0x31423240
    0x31423230
    
    0xa63188d0
    0xa63188c0
    

    Those were a few runs within a few seconds. The stack is a relative thing, not absolute, so there is no real need for a fixed address that is the same every time. It needs to be in a space that belongs to this user/thread, and virtual, since it goes through the MMU for protection reasons. There is no reason for a virtual address to not equal the physical address. The kernel code/driver that manages the MMU for a platform is programmed to do it a certain way. You can have the address space for code start at 0x0000 for every program, and you might wish the address space for data to be the same, zero based, but for the stack it doesn’t matter. And on my machine, with my OS, this particular version, on this particular day, it isn’t consistent.

    I originally thought your question was different. The value depends on factors that are specific to your build and settings. For a specific build, a single call to find_start sees the stack pointer at a fixed offset from where the stack started, because each function that uses the stack puts it back the way it found it. Assuming you can’t change the compilation of the program while it is running, the nesting for a given call will be the same, and the stack consumption of each function along the way will be the same.

    I added another layer, and the disassembly shows that main, nest and find_start all adjust the stack pointer (unoptimized). That is why the two values in each run are 0x10 apart: the extra call into nest pushes an 8-byte return address, and nest then pushes the 8-byte rbp. If I added or removed code in one or more of the functions to change their stack usage, that delta could change.

    But

    gcc -O2 so.c -o so
    objdump -D so > so.txt
    ./so
    0x0
    0x0
    
    Disassembly of section .text:
    
    0000000000000560 <main>:
     560:   48 83 ec 08             sub    $0x8,%rsp
     564:   89 e0                   mov    %esp,%eax
     566:   48 8d 35 e7 01 00 00    lea    0x1e7(%rip),%rsi        # 754 <_IO_stdin_used+0x4>
     56d:   31 d2                   xor    %edx,%edx
     56f:   bf 01 00 00 00          mov    $0x1,%edi
     574:   31 c0                   xor    %eax,%eax
     576:   e8 c5 ff ff ff          callq  540 <__printf_chk@plt>
     57b:   89 e0                   mov    %esp,%eax
     57d:   48 8d 35 d0 01 00 00    lea    0x1d0(%rip),%rsi        # 754 <_IO_stdin_used+0x4>
     584:   31 d2                   xor    %edx,%edx
     586:   bf 01 00 00 00          mov    $0x1,%edi
     58b:   31 c0                   xor    %eax,%eax
     58d:   e8 ae ff ff ff          callq  540 <__printf_chk@plt>
     592:   31 c0                   xor    %eax,%eax
     594:   48 83 c4 08             add    $0x8,%rsp
     598:   c3                      retq   
    

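    The optimizer didn’t treat the value left in eax as the return value. find_start has no return statement, so once it is inlined nothing connects the asm to the result, and the value passed to printf ends up being zero (xor %edx,%edx).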

    unsigned long fun ( void )
    {
        return(0x12345678);
    }
    
    00000000000006b0 <fun>:
     6b0:   b8 78 56 34 12          mov    $0x12345678,%eax
     6b5:   c3                      retq 
    

    calling convention looks fine.

    Put find_start in a separate file so the optimizer can’t remove it

    gcc -O2 so.c sp.c -o so
    ./so
    0xb1192fc8
    0xb1192fc8
    ./so
    0x7aa979d8
    0x7aa979d8
    ./so
    0x485134c8
    0x485134c8
    ./so
    0xa8317c98
    0xa8317c98
    ./so
    0x2ba70b8
    0x2ba70b8
    
    Disassembly of section .text:
    
    0000000000000560 <main>:
     560:   48 83 ec 08             sub    $0x8,%rsp
     564:   e8 67 01 00 00          callq  6d0 <find_start>
     569:   48 8d 35 f4 01 00 00    lea    0x1f4(%rip),%rsi        # 764 <_IO_stdin_used+0x4>
     570:   48 89 c2                mov    %rax,%rdx
     573:   bf 01 00 00 00          mov    $0x1,%edi
     578:   31 c0                   xor    %eax,%eax
     57a:   e8 c1 ff ff ff          callq  540 <__printf_chk@plt>
     57f:   e8 4c 01 00 00          callq  6d0 <find_start>
     584:   48 8d 35 d9 01 00 00    lea    0x1d9(%rip),%rsi        # 764 <_IO_stdin_used+0x4>
     58b:   48 89 c2                mov    %rax,%rdx
     58e:   bf 01 00 00 00          mov    $0x1,%edi
     593:   31 c0                   xor    %eax,%eax
     595:   e8 a6 ff ff ff          callq  540 <__printf_chk@plt>
    

    With find_start in a separate file, the compiler could not inline it. It can still see nest, so it inlined that, removing the stack change that came with the extra call. So now the value is the same, nested or not.
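
    Another way to keep the optimizer from discarding the value, sketched here under the assumption of GCC or Clang, is to use extended asm with an output operand and an explicit return, so the compiler knows where the result goes. This is not the book's code. It also reads the full 64-bit rsp on x86-64, whereas the original movl %esp only captures the low 32 bits there.

    ```c
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Returns the current stack pointer. The "=r" output operand tells the
       compiler which variable receives the register value. */
    __attribute__((noinline))
    uintptr_t find_start(void)
    {
        uintptr_t sp;

    #if defined(__x86_64__)
        __asm__ volatile("movq %%rsp, %0" : "=r"(sp));
    #elif defined(__i386__)
        __asm__ volatile("movl %%esp, %0" : "=r"(sp));
    #else
        /* Not x86: fall back to the frame address, which is near the
           stack pointer but not identical to it. */
        sp = (uintptr_t)__builtin_frame_address(0);
    #endif
        return sp;
    }

    int main(void)
    {
        printf("0x%" PRIxPTR "\n", find_start());
        return 0;
    }
    ```

    Built with or without -O2, this should print a nonzero address. Whether it is the same on every run still depends on address randomization.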
