I recently started learning about memory management and I read about relative addresses and physical addresses, and a question appeared in my mind:
When I print a variable’s address, does it show the relative (virtual) address or the physical address at which the variable is located in memory?
And another question regarding memory management:
Why does this code produce the same stack pointer value for each run (from Shellcoder’s Handbook, page 28)?
Does any program that I run produce this address?
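// find_start.c
#include <stdio.h>
unsigned long find_start(void)
{
    /* 32-bit x86 only: copy the stack pointer into eax, the return-value register */
    __asm__("movl %esp, %eax");
}
int main()
{
    printf("0x%lx\n", find_start());
}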
If we compile this and run this a few times, we get:
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
shellcoders@debian:~/chapter_2$ ./find_start
0xbffffad8
I would appreciate it if someone could clarify this topic for me.
3 Answers
The counterpart to a relative address is an absolute address. That has nothing to do with the distinction between virtual and physical addresses.
On most common modern operating systems, such as Windows, Linux and macOS, unless you are writing a driver, you will never encounter physical addresses. These are handled internally by the operating system. You will only be working with virtual addresses.
On most modern operating systems, every process has its own virtual address space. The executable is loaded at its preferred base address in that address space if possible; otherwise it is loaded at another address (relocated). The preferred base address of an executable file is normally stored in its header. Depending on the operating system and CPU, the heap is probably created at a higher address, since the heap normally grows upward (towards higher addresses). The stack normally grows downward (towards lower addresses). Where it is created depends on the operating system: it may be placed below the load address of the executable and grow towards address 0, whereas 32-bit Linux places it near the top of the user address space, which is why the value in the question is 0xbffffad8.
Since the preferred load address is the same every time you run the executable, it is likely that the virtual memory addresses are the same. However, this may change if address space layout randomization (ASLR) is used. Also, just because the virtual memory addresses are the same does not mean that the physical memory addresses are the same, too.
Depending on your operating system, you can set the preferred base address in which your program is loaded into virtual memory in the linker settings. Many programs may still have the same base address as your program, probably because both programs were built using the same linker with default settings.
It is not possible for program2 to access program1’s memory directly, because they have separate virtual address spaces. However, it is possible for one program to ask the operating system for permission to access another process’s address space. The operating system will normally grant this permission, provided that the program has sufficient privileges. On Windows, this can be accomplished, for example, with the function WriteProcessMemory. Linux offers similar functionality through ptrace and by writing to /proc/[pid]/mem.
You get virtual addresses. Your program never gets to see physical addresses. Ever.
No, because you don’t have addresses that point to program1’s memory. If you have virtual address 0xabcd1234 in the program1 process, and you try to read it from the program2 process, you get program2’s 0xabcd1234 (or a crash if there is no such address in program2). It’s not a permission check; it’s not like the CPU goes to the memory and sees “oh, this is program1’s memory, I shouldn’t access it”. It’s program2’s own memory space.
But yes, if you use “shared memory” to ask the OS to put the same physical memory in both processes.
And yes, if you use ptrace or /proc/<pid>/mem to ask the OS nicely to read from the other process’s memory, and you have permission to do that, then it will do that.
Apparently, that program always has that stack pointer value. Different programs might have different stack pointers. And if you put more local variables in main, or call find_start from a different function, you will get a different stack pointer value because there will be more data pushed on the stack.
Note: even if you run the program twice at the same time, the address will be the same, because they are virtual addresses, and every process has its own virtual address space. They will be different physical addresses but you don’t see the physical addresses.
It all works within one process.
Only focusing on a small part of your question.
There is no magic here. The virtual address space allows programs to be built the same way. I don’t need to know where my program is going to live: each program can be compiled for the same address space, and when loaded and run they can all see the same virtual address space, because they are all mapped to separate physical address spaces.
(don’t but I prefer objdump)
objdump -D so
So, two things, or maybe more than two. Our entry point _start is in RAM at a low virtual address. On this system with this compiler I would expect all or most programs to start in the same or a similar place; in some cases it may depend on what is in my program, but it should be somewhere low.
The stack pointer though, if you check above and now as I type stuff:
it has changed.
a few times within a few seconds. The stack is a relative thing, not absolute, so there is no real need to create a fixed address that is the same every time. It needs to be in a space that is related to this user/thread, and virtual, since it goes through the MMU for protection reasons. There is no reason for a virtual address to not equal the physical address. The kernel code/driver that manages the MMU for a platform is programmed to do it a certain way. You can have the address space for code start at 0x0000 for every program, and you might wish the address space for data to be the same, zero based, but for the stack it doesn’t matter. And on my machine, with my OS, this particular version, on this particular day, it isn’t consistent.
I originally thought your question was different. The answer depends on factors that are specific to your build and settings. For a specific build, a single call to find_start is going to see the stack pointer at a fixed relative offset. Each function that uses the stack puts it back the way it found it, so, assuming you can’t change the compilation of the program while it is running, for a single instance of the call the nesting will be the same and the stack consumption of each function along the way will be the same.
I added another layer, and looking at the disassembly, main, nest and find_start all adjust the stack pointer (unoptimized), which is why for these runs the values are 0x10 apart. If I added or removed code in one or more of the functions to change its stack usage, that delta could change.
But
The optimizer didn’t recognize the return value for some reason.
calling convention looks fine.
Put find_start in a separate file so the optimizer can’t remove it
I didn’t let it inline those functions. It can see nest, so it inlined it, removing the stack change that came with it. So now the value, nested or not, is the same.