NO$GBA hates the "push sp" instruction, libgcc uses it

Posted: Fri Jul 29, 2011 7:46 am
by Dwedit
It appears that NO$GBA does not emulate the push (stmfd) instruction correctly when SP is one of the registers which is pushed onto the stack.

Unfortunately, libgcc uses this instruction a few times:
_aeabi_ldivmod
_aeabi_uldivmod
_arm_cmpdf2
_arm_cmpsf2
libunwind

For example, in _aeabi_ldivmod:

Code:

  2c:	e92d6000 	push	{sp, lr}
  30:	ebfffffe 	bl	0 <__gnu_ldivmod_helper>
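On the push line, NO$GBA stores a value for SP that is 8 less than it should be. Since 8 bytes is exactly the size of the two registers being pushed, it looks like it stores the already-decremented SP instead of the value SP had before the push.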

In the ldivmod helper:

Code:

  20:	9a08      	ldr	r2, [sp, #32]
  22:	1a1b      	subs	r3, r3, r0
  24:	418c      	sbcs	r4, r1
  26:	6013      	str	r3, [r2, #0]
  28:	6054      	str	r4, [r2, #4]
It reads the pushed stack pointer value back from the stack, then stores two words (r3 and r4) through it.
In NO$GBA, because that pointer is 8 too low, the stores overwrite the saved return address, and the function returns to something invalid (I saw it return to address 00000000).
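
To make the failure mode concrete, here is a minimal C model of the stack traffic described above. It is only a sketch: the Cpu struct, the toy stack at STACK_BASE, the return address value and the function names are all invented for illustration, and the helper's own stack frame is left out, so the pointer is read from [sp] rather than [sp, #32]. The "correct" branch assumes real hardware stores the value SP had before the push, which is what the report implies.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE  0x1000u  /* hypothetical address of the lowest stack word */
#define STACK_WORDS 16u      /* size of the toy stack, in 32-bit words */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* toy stack memory */
    uint32_t sp;               /* stack pointer (byte address) */
    uint32_t lr;               /* link register */
} Cpu;

/* Translate a byte address into a pointer to a word of the toy stack. */
static uint32_t *word_at(Cpu *c, uint32_t addr)
{
    assert(addr % 4u == 0u);
    assert(addr >= STACK_BASE && addr < STACK_BASE + 4u * STACK_WORDS);
    return &c->mem[(addr - STACK_BASE) / 4u];
}

/* Model of "push {sp, lr}". buggy != 0 stores the decremented SP instead. */
static void push_sp_lr(Cpu *c, int buggy)
{
    uint32_t old_sp = c->sp;
    uint32_t new_sp = old_sp - 8u;

    *word_at(c, new_sp)      = buggy ? new_sp : old_sp; /* SP slot */
    *word_at(c, new_sp + 4u) = c->lr;                   /* LR slot */
    c->sp = new_sp;
}

/* Returns the address that the final "bx lr" would jump to. */
uint32_t simulate_ldivmod_return(int buggy, uint32_t rem_lo, uint32_t rem_hi)
{
    Cpu c = { {0}, STACK_BASE + 4u * STACK_WORDS, 0x08000123u };
    uint32_t rem_ptr;

    c.sp -= 8u;            /* sub sp, sp, #8 : room for the remainder */
    push_sp_lr(&c, buggy); /* push {sp, lr} */

    /* Helper: load the remainder pointer, store the two remainder words. */
    rem_ptr = *word_at(&c, c.sp);
    *word_at(&c, rem_ptr)      = rem_lo;
    *word_at(&c, rem_ptr + 4u) = rem_hi;

    c.lr = *word_at(&c, c.sp + 4u); /* ldr lr, [sp, #4] */
    return c.lr;
}
```

With these definitions, simulate_ldivmod_return(0, 3, 0) gives back the original 0x08000123, while simulate_ldivmod_return(1, 3, 0) gives 0, because the high word of the remainder lands on top of the saved lr. That would match the return to 00000000 seen in NO$GBA.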

So is it possible to get rid of the "Store the stack pointer onto the stack" thing that libgcc is using, so code that divides 64-bit integers will work correctly on NO$GBA?

Re: NO$GBA hates the "push sp" instruction, LibGCC uses it

Posted: Fri Jul 29, 2011 8:02 am
by Dwedit
To clarify, this post concerns the libgcc THUMB library that is included in devkitARM, specifically the file "C:\devkitpro\devkitARM\lib\gcc\arm-eabi\4.5.1\thumb\libgcc.a".
Edit: I have just confirmed that the problem also happens in version 4.6.1.

Re: NO$GBA hates the "push sp" instruction, LibGCC uses it

Posted: Fri Jul 29, 2011 11:53 am
by WinterMute
Why are you reporting no$gba bugs here?

There are absolutely no circumstances under which I will ever be convinced to modify the toolchain to work around bugs in emulators. If it works on the target hardware there is no issue.

Re: NO$GBA hates the "push sp" instruction, LibGCC uses it

Posted: Sat Jul 30, 2011 1:00 pm
by HeavyDude
Surely you should be reporting this to the authors of NO$GBA. If development has stopped then that's tough luck.

Use another emulator (or use a DS... just a thought).

Re: NO$GBA hates the "push sp" instruction, LibGCC uses it

Posted: Sat Jul 30, 2011 5:48 pm
by Dwedit
Development stopped in Feb 2008. The author even stopped shipping the product to people who wanted to pay for it, and disappeared from the face of the internet. Rumor has it that Nintendo bought it out and turned it into their debugger, but that's just a rumor.
And if I wanted to hack the EXE to try to fix bugs, there's copy protection in the way.
It's a very nice debugger because it just works well, has data breakpoints, shows symbols, etc.
Whereas Insight + VisualBoyAdvance just doesn't work at all. Insight refuses to show code that isn't in C files, can't connect to VBA, etc.

I was also looking at why __aeabi_ldivmod and __gnu_ldivmod_helper are two separate functions in the first place. Normally, since __gnu_ldivmod_helper is only called by __aeabi_ldivmod, I'd expect it to be inlined. But it was not inlined, and it appears that the reason is just to have it return early when dividing by zero to avoid a bunch of stack pushes.
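
For reference, the helper's job can be sketched in C roughly like this. The name ldivmod_helper_sketch is made up so that it does not clash with the real symbol, and plain "/" stands in for whatever division routine libgcc calls internally, so the real source may differ in detail. The point is only that the third argument is a pointer the helper writes the remainder through, which is why the caller has to pass a valid stack address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __gnu_ldivmod_helper: returns the quotient and
   stores the remainder through the pointer supplied by the caller. */
int64_t ldivmod_helper_sketch(int64_t a, int64_t b, int64_t *remainder)
{
    int64_t quotient;

    assert(remainder != 0);        /* the caller must pass a real address */

    quotient   = a / b;            /* assumption: libgcc uses its own routine here */
    *remainder = a - b * quotient; /* matches the subs/sbcs + two str in the listing */
    return quotient;
}
```

For example, calling it with a = -7 and b = 2 returns -3 and stores -1 as the remainder.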

Re: NO$GBA hates the "push sp" instruction, libgcc uses it

Posted: Sat Oct 01, 2011 10:32 pm
by Dwedit
I have confirmed that including this ASM code makes 64-bit division compatible with NO$GBA:

Code:

 .text
 .align
 .pool
 
 .global __aeabi_ldivmod
 .global __aeabi_uldivmod

__aeabi_ldivmod:
	cmp	r3, #0
	cmpeq	r2, #0
	bne	0f
	cmp	r1, #0
	cmpeq	r0, #0
	movlt	r1, #-2147483648	@ 0x80000000
	movlt	r0, #0
	mvngt	r1, #-2147483648	@ 0x80000000
	mvngt	r0, #0
	b	__aeabi_ldiv0
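0:
	sub	sp, sp, #8		@ reserve 8 bytes for the 64-bit remainder
	mov	r12, sp			@ copy SP so the pointer is pushed from an ordinary register
	push	{r12, lr}		@ [sp] = pointer to remainder slot, [sp, #4] = return address
	bl	__gnu_ldivmod_helper	@ quotient comes back in r0:r1
	ldr	lr, [sp, #4]		@ reload the return address
	add	sp, sp, #8		@ drop the pushed pointer and lr
	pop	{r2, r3}		@ remainder into r2:r3
	bx	lr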

__aeabi_uldivmod:
	cmp	r3, #0
	cmpeq	r2, #0
	bne	0f
	cmp	r1, #0
	cmpeq	r0, #0
	mvnne	r1, #0
	mvnne	r0, #0
	b	__aeabi_ldiv0
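0:
	sub	sp, sp, #8		@ reserve 8 bytes for the 64-bit remainder
	mov	r12, sp			@ copy SP so the pointer is pushed from an ordinary register
	push	{r12, lr}		@ [sp] = pointer to remainder slot, [sp, #4] = return address
	bl	__gnu_uldivmod_helper	@ quotient comes back in r0:r1
	ldr	lr, [sp, #4]		@ reload the return address
	add	sp, sp, #8		@ drop the pushed pointer and lr
	pop	{r2, r3}		@ remainder into r2:r3
	bx	lr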
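
A quick way to check 64-bit division in an emulator is a small C self-test along these lines. This is only a sketch: divmod64_selfcheck and the variable names are invented here, and it assumes that with an ARM EABI GCC, 64-bit "/" and "%" on values not known at compile time normally end up calling __aeabi_ldivmod and __aeabi_uldivmod. The operands are volatile so the compiler cannot fold the divisions away at build time.

```c
#include <stdint.h>

/* volatile: forces the divisions to happen at run time */
static volatile int64_t  s_num = -100000000000LL;
static volatile int64_t  s_den = 7;
static volatile uint64_t u_num = 0xFFFFFFFFFFFFFFFFULL;
static volatile uint64_t u_den = 0x100000000ULL;

/* Returns 1 if signed and unsigned 64-bit division and modulo give the
   expected results, 0 otherwise. */
int divmod64_selfcheck(void)
{
    int64_t  sa = s_num, sb = s_den;
    uint64_t ua = u_num, ub = u_den;

    int64_t  sq = sa / sb, sr = sa % sb; /* signed quotient and remainder */
    uint64_t uq = ua / ub, ur = ua % ub; /* unsigned quotient and remainder */

    return sq == -14285714285LL && sr == -5
        && uq == 0xFFFFFFFFULL && ur == 0xFFFFFFFFULL;
}
```

If the emulator mishandles the stack as described earlier in the thread, a call to this function would be expected to crash or never return rather than return 0, so it is best called early and with a visible marker before and after.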