The keywords __stdcall and __cdecl specify
32-bit calling conventions. That's why they are not relevant for 64-bit
programs (i.e. x64). On x64, there is only the standard
calling convention and the extended __vectorcall
calling convention.
Why does x64 use mov rather than push? I
assume it's just more efficient and wasn't available in x86.
That is not the reason. Both of these instructions (push and mov) also
exist in x86 assembly language.
Your compiler is probably not emitting a push instruction for the x64
code because it must adjust the stack pointer directly anyway, in order
to create 32 bytes of "shadow space" for the called function. See this
link (which was provided by @NateEldredge) for further information
on "shadow space".
rbp is the frame pointer on x86_64. In your generated
code, it gets a snapshot of the stack pointer (rsp) so that
when adjustments are made to rsp (i.e. reserving space for
local variables or pushing values onto the stack), local
variables and function parameters are still accessible from a constant
offset from rbp.
A lot of compilers offer frame pointer omission as an optimization
option; this will make the generated assembly code access variables
relative to rsp instead and free up rbp as
another general purpose register for use in functions.
In the case of GCC, which I'm guessing you're using from the AT&T
assembler syntax, that switch is -fomit-frame-pointer. Try
compiling your code with that switch and see what assembly code you get.
You will probably notice that when accessing values relative to
rsp instead of rbp, the offset from the
pointer varies throughout the function.
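A small function with a few locals is enough to try this. The sketch below is only a hypothetical test case (the name sum_squares is not from the question). One way to compare the two modes, assuming GCC, is to compile it once with `gcc -S -O0 -fno-omit-frame-pointer` and once with `gcc -S -O0 -fomit-frame-pointer`, then look at how the locals are addressed in each listing.

```c
/* Hypothetical test function: a few locals so that the compiler has
   something to address relative to rbp or rsp at -O0. */
int sum_squares(int n)
{
    int total = 0;                  /* local kept on the stack at -O0 */
    for (int i = 1; i <= n; i++) {
        int sq = i * i;             /* another stack local */
        total += sq;
    }
    return total;
}
```

The exact offsets depend on the compiler version and target, so none are quoted here.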
The __stdcall calling convention is
used to call Win32 API functions. The callee cleans the stack, so the
compiler makes vararg functions
__cdecl. Functions that use this calling
convention require a function prototype. The
__stdcall modifier is
Microsoft-specific.
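The reason vararg functions fall back to __cdecl is that only the caller knows how many arguments it pushed at a given call site, so only the caller can clean them up. A minimal sketch in standard C (sum_ints is a hypothetical name, and no calling-convention keyword is written, so the compiler's default applies):

```c
#include <stdarg.h>

/* Hypothetical variadic function. The callee learns the argument count
   only at run time, from 'count', so it cannot be compiled with a fixed
   "pop N bytes on return" the way a callee-cleans convention requires. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* fetch the next int argument */
    }
    va_end(args);
    return total;
}
```

Calls such as sum_ints(2, 10, 20) and sum_ints(3, 1, 2, 3) pass different amounts of data to the same function, which is why the cleanup has to happen at the call site.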
Argument-passing convention: By value, unless a pointer or reference type is passed.
Stack-maintenance responsibility: Called function pops its own arguments from the stack.
Name-decoration convention: An underscore (_) is prefixed to the name. The name is followed by the at sign (@) followed by the number of bytes (in decimal) in the argument list. Therefore, the function declared as int func( int a, double b ) is decorated as follows: _func@12 (4 bytes for the int plus 8 bytes for the double).
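The decoration rule can be checked with a few lines of C. This is only a sketch: stdcall_decorate is a hypothetical helper, not part of any toolchain, and it assumes that each argument is widened to a multiple of 4 bytes on the 32-bit stack.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the x86 __stdcall decorated name
   "_name@bytes" into 'out'.
   Assumption: every argument occupies a multiple of 4 bytes on the
   stack, so each size is rounded up before summing. */
int stdcall_decorate(char *out, size_t out_len, const char *name,
                     const size_t *arg_sizes, size_t arg_count)
{
    size_t total = 0;

    for (size_t i = 0; i < arg_count; i++) {
        total += (arg_sizes[i] + 3) / 4 * 4;   /* round up to 4 bytes */
    }
    /* Returns the number of characters that the full name needs. */
    return snprintf(out, out_len, "_%s@%zu", name, total);
}
```

With the sizes {4, 8} for int and double and the name "func", the helper produces _func@12, matching the example above.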
The Microsoft-specific __thiscall calling convention is used on
C++ class member functions on the x86 architecture. It's the default
calling convention used by member functions that don't use variable
arguments (vararg functions).
Under __thiscall, the callee cleans the stack, which is impossible
for vararg functions. Arguments are pushed on the stack from right to
left. The this pointer is passed via register ECX, and not on the stack.
On ARM, ARM64, and x64 machines,
__thiscall is accepted and ignored by the
compiler. That's because they use a register-based calling convention by
default.
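C has no member functions, but writing the call out in plain C makes the hidden this argument visible. In the sketch below, the struct and function names are hypothetical. A C++ call such as counter.add(5) corresponds roughly to counter_add(&counter, 5), and under __thiscall on x86 the pointer would travel in ECX while n goes on the stack.

```c
/* Plain-C view of a C++ member function. Names are hypothetical. */
struct Counter {
    int value;
};

/* 'self' plays the role of the C++ 'this' pointer. */
int counter_add(struct Counter *self, int n)
{
    self->value += n;
    return self->value;
}
```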
The Microsoft x64 calling convention[18][19] is followed on Windows and
pre-boot UEFI (for long mode on x86-64). The first four
arguments are placed in registers. That means RCX, RDX, R8, R9 for
integer, struct or pointer arguments (in that order), and XMM0, XMM1,
XMM2, XMM3 for floating point arguments. Additional arguments are pushed
onto the stack (right to left). Integer return values (similar to x86)
are returned in RAX if 64 bits or less. Floating point return values are
returned in XMM0. Parameters less than 64 bits long are not zero
extended; the high bits are not zeroed.
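The register assignment can be written down as a small lookup. This is a sketch with hypothetical names (arg_class, ms_x64_arg_location). It also relies on one detail that the paragraph above does not spell out but that Microsoft's documentation of the convention does: the register is chosen by argument position, so a floating point value in the second position uses XMM1, not XMM0.

```c
#include <stddef.h>

/* Hypothetical classification of an argument for this sketch. */
enum arg_class { ARG_INTEGER, ARG_FLOAT };

/* Returns the register that carries the argument at zero-based position
   'index', or "stack" for the fifth argument onward. */
const char *ms_x64_arg_location(size_t index, enum arg_class cls)
{
    static const char *const int_regs[4]   = { "RCX", "RDX", "R8", "R9" };
    static const char *const float_regs[4] = { "XMM0", "XMM1", "XMM2", "XMM3" };

    if (index >= 4) {
        return "stack";
    }
    return cls == ARG_FLOAT ? float_regs[index] : int_regs[index];
}
```

For a function taking (int, double, int), the lookup gives RCX, XMM1 and R8.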
Structs and unions with sizes that match integers are passed and
returned as if they were integers. Otherwise they are replaced with a
pointer when used as an argument. When an oversized struct return is
needed, another pointer to a caller-provided space is prepended as the
first argument, shifting all other arguments to the right by one
place.[20]
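"Sizes that match integers" means 1, 2, 4 or 8 bytes. A minimal check, with a hypothetical function name, that only looks at the size and ignores any other property of the type:

```c
#include <stdbool.h>
#include <stddef.h>

/* True when a struct or union of 'size' bytes is passed like an integer
   of the same size; false when the caller passes a pointer to it.
   Sketch only: it considers nothing but the size. */
bool ms_x64_struct_passed_as_integer(size_t size)
{
    return size == 1 || size == 2 || size == 4 || size == 8;
}
```

An 8-byte struct holding two 32-bit ints therefore travels in a register, while a 12-byte struct holding three of them is passed by pointer.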
When compiling for the x64 architecture in a Windows context (whether
using Microsoft or non-Microsoft tools), stdcall, thiscall, cdecl, and
fastcall all resolve to using this convention.
In the Microsoft x64 calling convention, it is the caller's
responsibility to allocate 32 bytes of "shadow space" on the stack right
before calling the function (regardless of the actual number of
parameters used), and to pop the stack after the call. The shadow space
is used to spill RCX, RDX, R8, and R9,[21]
but must be made available to all functions, even those with fewer than
four parameters.
The registers RAX, RCX, RDX, R8, R9, R10, R11 are considered volatile
(caller-saved).[22] Here "volatile" means that a call may overwrite them;
it is unrelated to the C volatile keyword.
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are
considered nonvolatile (callee-saved).[22] A function that uses them
must restore their values before it returns.
For example, a function taking 5 integer arguments will take the
first to fourth in registers, and the fifth will be pushed on top of the
shadow space. So when the called function is entered, the stack will be
composed of (in ascending order) the return address, followed by the
shadow space (32 bytes) followed by the fifth parameter.
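The layout at function entry can be expressed as offsets from RSP. The sketch below uses hypothetical names and assumes 8-byte stack slots, with the return address at [RSP+0].

```c
#include <stddef.h>

enum {
    SLOT_SIZE     = 8,                          /* one stack slot on x64 */
    REGISTER_ARGS = 4,                          /* RCX, RDX, R8, R9 */
    SHADOW_SPACE  = REGISTER_ARGS * SLOT_SIZE   /* 32 bytes */
};

/* Offset from RSP, at function entry, of the stack slot that belongs to
   the argument at zero-based position 'index'. The return address sits
   at offset 0. For index 0..3 the slot lies in the shadow space; for
   index >= 4 it is where the caller stored the argument. */
size_t ms_x64_entry_offset(size_t index)
{
    return SLOT_SIZE + index * SLOT_SIZE;
}
```

For the fifth argument (index 4) this gives 8 bytes of return address plus 32 bytes of shadow space, so the value is found at [RSP+40].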
Calls the _main function, which performs the initialization that gcc
needs. The call instruction pushes the return address (the address of
the instruction that follows the call) onto the stack and jumps to the
address of _main.