Whenever you execute a program (either from a shell or by replacing a forked child's process image using one of the exec() family of functions), the system call execve(2) is eventually invoked.
The prototype of this system call is int execve(const char *filename, char *const argv[], char *const envp[]);
This system call, among other things (such as verifying execute permission on filename and handling setuid), works out the value of argc by counting the entries in argv, and sets up the new user stack with argc, argv and envp, alongside the process's .data and .text. On x86-64 it stores argc at (%rsp), argv[0] at LP_SIZE(%rsp), and so on up to argv[argc] = NULL at (LP_SIZE*(argc+1))(%rsp); similarly, envp[0] is at (LP_SIZE*(argc+2))(%rsp), followed by the remaining entries and a terminating NULL. Here LP_SIZE is the size of a long (and of a pointer) in bytes.
NOTE: both argv and envp are NULL-terminated arrays.
The arguments to the function int main(int argc, char *argv[], char *envp[]); are usually implementation-defined and specified by the platform's ABI. C99 neither blesses nor forbids the envp argument to main.
On Linux, the executable of a program is created according to the ELF specification.
Typically the executable is linked so that some glibc startup functions are called before main to make sure that argc is initialized from the stack (these functions typically include _start, __libc_csu_init, __libc_start_main, etc.).
You can experiment with the binaries produced on Linux using binutils (notably the readelf and objdump utilities, among others).
The following example uses a simple C program that prints its command-line arguments and environment variables.
Assuming the above program is saved as argc.c, it can be compiled with the command
gcc -o argc argc.c
This should give you an executable file argc (an executable ELF object).
Now you can disassemble the executable .sections of argc using the objdump utility that comes with the binutils package.
objdump -d argc > argc.objdump
will write the disassembly (its exact form depends on the architecture you're working on) to the file argc.objdump.
The disassembly is divided into various .sections as specified by the platform-specific ABI. On a Linux platform, our argc program has the following .sections with executable instructions.
.init: process initialization code.
.plt: procedure linkage table.
.text: program text, i.e. the executable instructions of the program.
.fini: process finalization code.
All the .sections of an ELF object can be listed with objdump --section-headers argc
As can be seen in the disassembled executable .sections of the argc executable, the _start code works out argc and argv, passes argc, argv, init, fini and rtld_fini as arguments (in registers or on the stack, depending on the ABI), and calls __libc_start_main.
__libc_start_main takes the following arguments:
1. address of the main function,
2. argc,
3. argv,
4. init,
5. fini,
6. rtld_fini, and
7. stack_end
and is responsible for finally calling main() with the appropriate arguments. A lot more goes on in __libc_start_main; please read glibc's code for more details.