Whenever you execute a program (either from a shell, or by replacing a forked child's process image using one of the exec() family of functions), the system call execve(2) is eventually invoked.
The prototype of this system call is int execve(const char *filename, char *const argv[], char *const envp[]);
This system call, among other things (like verifying execute permission on the filename, handling setuid etc.), figures out the value of argc by counting the entries of argv, and copies argc, argv and envp onto the new user stack, along with setting up the process's .data and .text. At process entry argc is stored at (%rsp), argv[0] at LP_SIZE(%rsp), and so on up to the terminator argv[argc] = NULL at (LP_SIZE*(argc+1))(%rsp). Similarly, envp[0] follows at (LP_SIZE*(argc+2))(%rsp), and the environment ends with its own NULL entry. Here LP_SIZE is the size of a pointer in bytes (8 on x86-64).
NOTE: both argv and envp are NULL-terminated arrays.
The arguments to the function int main(int argc, char *argv[], char *envp[]); are usually implementation defined and specified by the platform's ABI. C99 neither blesses nor forbids an envp argument to the main function.
On Linux the executable of a program is created according to the ELF specification. Typically the executable is linked such that some glibc startup functions run before main to make sure that argc (together with argv and envp) is read from the stack and passed on (these functions typically include _start, __libc_csu_init, __libc_start_main etc.).
You can experiment with the binary files produced on Linux using binutils (notably the readelf and objdump utilities, among others).
The following is an example where a simple C program prints the command line arguments and environment variables.
#include <stdio.h>

int main(int argc, char *argv[], char *envp[])
{
    int i;

    // dump args
    for (i = 0; i < argc; ++i) {
        printf("%d = %s\n", i, argv[i]);
    }

    // dump environment
    for (i = 0; envp[i] != NULL; ++i) {
        printf("%s\n", envp[i]);
    }

    return 0;
}
Assuming that the above program's filename is argc.c, it can be compiled using the command
gcc -o argc argc.c
This should give you an executable file argc (an executable ELF object).
Now you can disassemble the executable sections of argc using the objdump utility that comes with the binutils package.
objdump -d argc > argc.objdump
will give you the following (or an equivalent, depending on the architecture you're working on) in the file argc.objdump.
argc:     file format elf64-x86-64


Disassembly of section .init:

0000000000400418 <_init>:
  400418:  48 83 ec 08           sub    $0x8,%rsp
  40041c:  48 8b 05 d5 0b 20 00  mov    0x200bd5(%rip),%rax        # 600ff8 <_DYNAMIC+0x1d0>
  400423:  48 85 c0              test   %rax,%rax
  400426:  74 05                 je     40042d <_init+0x15>
  400428:  e8 53 00 00 00        callq  400480 <__gmon_start__@plt>
  40042d:  48 83 c4 08           add    $0x8,%rsp
  400431:  c3                    retq

Disassembly of section .plt:

0000000000400440 <puts@plt-0x10>:
  400440:  ff 35 c2 0b 20 00     pushq  0x200bc2(%rip)        # 601008 <_GLOBAL_OFFSET_TABLE_+0x8>
  400446:  ff 25 c4 0b 20 00     jmpq   *0x200bc4(%rip)        # 601010 <_GLOBAL_OFFSET_TABLE_+0x10>
  40044c:  0f 1f 40 00           nopl   0x0(%rax)

0000000000400450 <puts@plt>:
  400450:  ff 25 c2 0b 20 00     jmpq   *0x200bc2(%rip)        # 601018 <_GLOBAL_OFFSET_TABLE_+0x18>
  400456:  68 00 00 00 00        pushq  $0x0
  40045b:  e9 e0 ff ff ff        jmpq   400440 <_init+0x28>

0000000000400460 <printf@plt>:
  400460:  ff 25 ba 0b 20 00     jmpq   *0x200bba(%rip)        # 601020 <_GLOBAL_OFFSET_TABLE_+0x20>
  400466:  68 01 00 00 00        pushq  $0x1
  40046b:  e9 d0 ff ff ff        jmpq   400440 <_init+0x28>

0000000000400470 <__libc_start_main@plt>:
  400470:  ff 25 b2 0b 20 00     jmpq   *0x200bb2(%rip)        # 601028 <_GLOBAL_OFFSET_TABLE_+0x28>
  400476:  68 02 00 00 00        pushq  $0x2
  40047b:  e9 c0 ff ff ff        jmpq   400440 <_init+0x28>

0000000000400480 <__gmon_start__@plt>:
  400480:  ff 25 aa 0b 20 00     jmpq   *0x200baa(%rip)        # 601030 <_GLOBAL_OFFSET_TABLE_+0x30>
  400486:  68 03 00 00 00        pushq  $0x3
  40048b:  e9 b0 ff ff ff        jmpq   400440 <_init+0x28>

Disassembly of section .text:

0000000000400490 <_start>:
  400490:  31 ed                 xor    %ebp,%ebp
  400492:  49 89 d1              mov    %rdx,%r9
  400495:  5e                    pop    %rsi
  400496:  48 89 e2              mov    %rsp,%rdx
  400499:  48 83 e4 f0           and    $0xfffffffffffffff0,%rsp
  40049d:  50                    push   %rax
  40049e:  54                    push   %rsp
  40049f:  49 c7 c0 a0 06 40 00  mov    $0x4006a0,%r8
  4004a6:  48 c7 c1 30 06 40 00  mov    $0x400630,%rcx
  4004ad:  48 c7 c7 80 05 40 00  mov    $0x400580,%rdi
  4004b4:  e8 b7 ff ff ff        callq  400470 <__libc_start_main@plt>
  4004b9:  f4                    hlt
  4004ba:  66 90                 xchg   %ax,%ax
  4004bc:  0f 1f 40 00           nopl   0x0(%rax)

00000000004004c0 <deregister_tm_clones>:
  4004c0:  b8 47 10 60 00        mov    $0x601047,%eax
  4004c5:  55                    push   %rbp
  4004c6:  48 2d 40 10 60 00     sub    $0x601040,%rax
  4004cc:  48 83 f8 0e           cmp    $0xe,%rax
  4004d0:  48 89 e5              mov    %rsp,%rbp
  4004d3:  77 02                 ja     4004d7 <deregister_tm_clones+0x17>
  4004d5:  5d                    pop    %rbp
  4004d6:  c3                    retq
  4004d7:  b8 00 00 00 00        mov    $0x0,%eax
  4004dc:  48 85 c0              test   %rax,%rax
  4004df:  74 f4                 je     4004d5 <deregister_tm_clones+0x15>
  4004e1:  5d                    pop    %rbp
  4004e2:  bf 40 10 60 00        mov    $0x601040,%edi
  4004e7:  ff e0                 jmpq   *%rax
  4004e9:  0f 1f 80 00 00 00 00  nopl   0x0(%rax)

00000000004004f0 <register_tm_clones>:
  4004f0:  b8 40 10 60 00        mov    $0x601040,%eax
  4004f5:  55                    push   %rbp
  4004f6:  48 2d 40 10 60 00     sub    $0x601040,%rax
  4004fc:  48 c1 f8 03           sar    $0x3,%rax
  400500:  48 89 e5              mov    %rsp,%rbp
  400503:  48 89 c2              mov    %rax,%rdx
  400506:  48 c1 ea 3f           shr    $0x3f,%rdx
  40050a:  48 01 d0              add    %rdx,%rax
  40050d:  48 d1 f8              sar    %rax
  400510:  75 02                 jne    400514 <register_tm_clones+0x24>
  400512:  5d                    pop    %rbp
  400513:  c3                    retq
  400514:  ba 00 00 00 00        mov    $0x0,%edx
  400519:  48 85 d2              test   %rdx,%rdx
  40051c:  74 f4                 je     400512 <register_tm_clones+0x22>
  40051e:  5d                    pop    %rbp
  40051f:  48 89 c6              mov    %rax,%rsi
  400522:  bf 40 10 60 00        mov    $0x601040,%edi
  400527:  ff e2                 jmpq   *%rdx
  400529:  0f 1f 80 00 00 00 00  nopl   0x0(%rax)

0000000000400530 <__do_global_dtors_aux>:
  400530:  80 3d 05 0b 20 00 00  cmpb   $0x0,0x200b05(%rip)        # 60103c <_edata>
  400537:  75 11                 jne    40054a <__do_global_dtors_aux+0x1a>
  400539:  55                    push   %rbp
  40053a:  48 89 e5              mov    %rsp,%rbp
  40053d:  e8 7e ff ff ff        callq  4004c0 <deregister_tm_clones>
  400542:  5d                    pop    %rbp
  400543:  c6 05 f2 0a 20 00 01  movb   $0x1,0x200af2(%rip)        # 60103c <_edata>
  40054a:  f3 c3                 repz retq
  40054c:  0f 1f 40 00           nopl   0x0(%rax)

0000000000400550 <frame_dummy>:
  400550:  48 83 3d c8 08 20 00  cmpq   $0x0,0x2008c8(%rip)        # 600e20 <__jcr_end__>
  400557:  00
  400558:  74 1e                 je     400578 <frame_dummy+0x28>
  40055a:  b8 00 00 00 00        mov    $0x0,%eax
  40055f:  48 85 c0              test   %rax,%rax
  400562:  74 14                 je     400578 <frame_dummy+0x28>
  400564:  55                    push   %rbp
  400565:  bf 20 0e 60 00        mov    $0x600e20,%edi
  40056a:  48 89 e5              mov    %rsp,%rbp
  40056d:  ff d0                 callq  *%rax
  40056f:  5d                    pop    %rbp
  400570:  e9 7b ff ff ff        jmpq   4004f0 <register_tm_clones>
  400575:  0f 1f 00              nopl   (%rax)
  400578:  e9 73 ff ff ff        jmpq   4004f0 <register_tm_clones>
  40057d:  0f 1f 00              nopl   (%rax)

0000000000400580 <main>:
  400580:  55                    push   %rbp
  400581:  48 89 e5              mov    %rsp,%rbp
  400584:  48 83 ec 30           sub    $0x30,%rsp
  400588:  89 7d ec              mov    %edi,-0x14(%rbp)
  40058b:  48 89 75 e0           mov    %rsi,-0x20(%rbp)
  40058f:  48 89 55 d8           mov    %rdx,-0x28(%rbp)
  400593:  c7 45 fc 00 00 00 00  movl   $0x0,-0x4(%rbp)
  40059a:  eb 2f                 jmp    4005cb <main+0x4b>
  40059c:  8b 45 fc              mov    -0x4(%rbp),%eax
  40059f:  48 98                 cltq
  4005a1:  48 8d 14 c5 00 00 00  lea    0x0(,%rax,8),%rdx
  4005a8:  00
  4005a9:  48 8b 45 e0           mov    -0x20(%rbp),%rax
  4005ad:  48 01 d0              add    %rdx,%rax
  4005b0:  48 8b 10              mov    (%rax),%rdx
  4005b3:  8b 45 fc              mov    -0x4(%rbp),%eax
  4005b6:  89 c6                 mov    %eax,%esi
  4005b8:  bf c0 06 40 00        mov    $0x4006c0,%edi
  4005bd:  b8 00 00 00 00        mov    $0x0,%eax
  4005c2:  e8 99 fe ff ff        callq  400460 <printf@plt>
  4005c7:  83 45 fc 01           addl   $0x1,-0x4(%rbp)
  4005cb:  8b 45 fc              mov    -0x4(%rbp),%eax
  4005ce:  3b 45 ec              cmp    -0x14(%rbp),%eax
  4005d1:  7c c9                 jl     40059c <main+0x1c>
  4005d3:  c7 45 fc 00 00 00 00  movl   $0x0,-0x4(%rbp)
  4005da:  eb 23                 jmp    4005ff <main+0x7f>
  4005dc:  8b 45 fc              mov    -0x4(%rbp),%eax
  4005df:  48 98                 cltq
  4005e1:  48 8d 14 c5 00 00 00  lea    0x0(,%rax,8),%rdx
  4005e8:  00
  4005e9:  48 8b 45 d8           mov    -0x28(%rbp),%rax
  4005ed:  48 01 d0              add    %rdx,%rax
  4005f0:  48 8b 00              mov    (%rax),%rax
  4005f3:  48 89 c7              mov    %rax,%rdi
  4005f6:  e8 55 fe ff ff        callq  400450 <puts@plt>
  4005fb:  83 45 fc 01           addl   $0x1,-0x4(%rbp)
  4005ff:  8b 45 fc              mov    -0x4(%rbp),%eax
  400602:  48 98                 cltq
  400604:  48 8d 14 c5 00 00 00  lea    0x0(,%rax,8),%rdx
  40060b:  00
  40060c:  48 8b 45 d8           mov    -0x28(%rbp),%rax
  400610:  48 01 d0              add    %rdx,%rax
  400613:  48 8b 00              mov    (%rax),%rax
  400616:  48 85 c0              test   %rax,%rax
  400619:  75 c1                 jne    4005dc <main+0x5c>
  40061b:  b8 00 00 00 00        mov    $0x0,%eax
  400620:  c9                    leaveq
  400621:  c3                    retq
  400622:  66 2e 0f 1f 84 00 00  nopw   %cs:0x0(%rax,%rax,1)
  400629:  00 00 00
  40062c:  0f 1f 40 00           nopl   0x0(%rax)

0000000000400630 <__libc_csu_init>:
  400630:  41 57                 push   %r15
  400632:  41 89 ff              mov    %edi,%r15d
  400635:  41 56                 push   %r14
  400637:  49 89 f6              mov    %rsi,%r14
  40063a:  41 55                 push   %r13
  40063c:  49 89 d5              mov    %rdx,%r13
  40063f:  41 54                 push   %r12
  400641:  4c 8d 25 c8 07 20 00  lea    0x2007c8(%rip),%r12        # 600e10 <__frame_dummy_init_array_entry>
  400648:  55                    push   %rbp
  400649:  48 8d 2d c8 07 20 00  lea    0x2007c8(%rip),%rbp        # 600e18 <__init_array_end>
  400650:  53                    push   %rbx
  400651:  4c 29 e5              sub    %r12,%rbp
  400654:  31 db                 xor    %ebx,%ebx
  400656:  48 c1 fd 03           sar    $0x3,%rbp
  40065a:  48 83 ec 08           sub    $0x8,%rsp
  40065e:  e8 b5 fd ff ff        callq  400418 <_init>
  400663:  48 85 ed              test   %rbp,%rbp
  400666:  74 1e                 je     400686 <__libc_csu_init+0x56>
  400668:  0f 1f 84 00 00 00 00  nopl   0x0(%rax,%rax,1)
  40066f:  00
  400670:  4c 89 ea              mov    %r13,%rdx
  400673:  4c 89 f6              mov    %r14,%rsi
  400676:  44 89 ff              mov    %r15d,%edi
  400679:  41 ff 14 dc           callq  *(%r12,%rbx,8)
  40067d:  48 83 c3 01           add    $0x1,%rbx
  400681:  48 39 eb              cmp    %rbp,%rbx
  400684:  75 ea                 jne    400670 <__libc_csu_init+0x40>
  400686:  48 83 c4 08           add    $0x8,%rsp
  40068a:  5b                    pop    %rbx
  40068b:  5d                    pop    %rbp
  40068c:  41 5c                 pop    %r12
  40068e:  41 5d                 pop    %r13
  400690:  41 5e                 pop    %r14
  400692:  41 5f                 pop    %r15
  400694:  c3                    retq
  400695:  66 66 2e 0f 1f 84 00  data32 nopw %cs:0x0(%rax,%rax,1)
  40069c:  00 00 00 00

00000000004006a0 <__libc_csu_fini>:
  4006a0:  f3 c3                 repz retq
  4006a2:  66 90                 xchg   %ax,%ax

Disassembly of section .fini:

00000000004006a4 <_fini>:
  4006a4:  48 83 ec 08           sub    $0x8,%rsp
  4006a8:  48 83 c4 08           add    $0x8,%rsp
  4006ac:  c3                    retq
The disassembled file is divided into various sections, as specified by the platform-specific ABI. On a Linux platform, our argc program has the following sections containing executable instructions.
.init: process initialization code.
.plt: procedure linkage table.
.text: program text, i.e. the executable instructions of the program.
.fini: finalization code of the process.
All the sections of an ELF object can be listed with objdump --section-headers argc
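As can be seen in the above disassembly of the executable sections of the argc executable object, the _start code pops argc off the stack (pop %rsi), takes the address of argv from the stack pointer (mov %rsp,%rdx), and then passes the address of main, argc, argv, init, fini and rtld_fini in registers (%rdi, %rsi, %rdx, %rcx, %r8 and %r9 respectively, as the x86-64 calling convention requires), pushes stack_end onto the stack, and calls __libc_start_main.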
__libc_start_main uses the following arguments:
1. address of the main function,
2. argc,
3. argv,
4. init,
5. fini,
6. rtld_fini, and
7. stack_end
and is responsible for finally calling main() with the appropriate arguments. There's a lot that goes on in __libc_start_main; please read glibc's code for more details.