Copyright (c) Hyperion Entertainment and contributors.
AmiWest 2013 Lesson 2
Interpreting Crash Reports
So your program has crashed and you managed to capture the crash report from either Reaper or Grim Reaper or both. This lesson will focus on interpreting that crash report so that you can find the source of the bug.
Using Crash-Logs for Debugging
A good introduction to interpreting crash logs is available in the article Using Crash-Logs for Debugging by Hans de Ruiter.
Here is a quick summary of the important points:
- Use -gstabs for debugging. There already exists a detailed explanation of the Stabs format if you wish to learn more.
- Check your stack boundaries first. AmigaOS does not have automatic stack enlargement.
- The addr2line tool can be used to find the source code line of a crash.
Since that article was written, Reaper and Grim Reaper have gained the ability to interpret debug symbols automatically. This capability is also available to applications via Exec's ObtainDebugSymbol() and ReleaseDebugSymbol() functions.
Zero in on the Problem
The address zero crash program from Lesson 1 will be used to demonstrate how to interpret a crash log.
Without Stabs
First, we'll look at the output of Reaper without stabs debugging information present. Just run the program as you did before. You should see output from Reaper similar to the following:
Stack trace:
    (0x5b911ce0) module Code:AmiWest2013/zero at 0x7FC5C224 (section 5 @ 0x200)
    (0x5b911d00) native kernel module Kickstart/newlib.library.kmod+0x000020ac
    (0x5b911d70) native kernel module Kickstart/newlib.library.kmod+0x00002d14
    (0x5b911f10) native kernel module Kickstart/newlib.library.kmod+0x00002ef0
    (0x5b911f50) _start()+0x170 (section 1 @ 0x16C)
    (0x5b911f90) native kernel module Kickstart/dos.library.kmod+0x00024f18
    (0x5b911fc0) native kernel module Kickstart/kernel+0x0003c958
    (0x5b911fd0) native kernel module Kickstart/kernel+0x0003c9d8
Stack traces should be read from top to bottom. In this case, the program crashed in section 5 at offset 0x200, which is not very useful to a human.
Disassembly of crash site:
     7fc5c214: 38000000   li        r0,0
     7fc5c218: 901f0008   stw       r0,8(r31)
     7fc5c21c: 813f0008   lwz       r9,8(r31)
     7fc5c220: 38000000   li        r0,0
    >7fc5c224: 90090000   stw       r0,0(r9)
     7fc5c228: 38000000   li        r0,0
     7fc5c22c: 7c030378   mr        r3,r0
     7fc5c230: 81610000   lwz       r11,0(r1)
     7fc5c234: 83ebfffc   lwz       r31,-4(r11)
     7fc5c238: 7d615b78   mr        r1,r11
This is a disassembly of the crash site in PowerPC assembly language. Although it looks rather cryptic, you can easily look the instructions up with Google. The arrow (">") points to the offending instruction, stw, which means "store word" according to this PowerPC assembly site. So your program was attempting to store (write) something when it crashed.
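To make the raw instruction words in the listing less mysterious, here is a minimal sketch that decodes the two load/store instructions seen above. It relies on the PowerPC D-form layout (6-bit primary opcode, two 5-bit register fields, signed 16-bit displacement) and handles only lwz (opcode 32) and stw (opcode 36). The names decode_dform and format_dform are made up for this illustration and are not part of any SDK.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of a PowerPC D-form instruction (used by lwz, stw and similar). */
struct dform {
    unsigned opcode; /* primary opcode, top 6 bits              */
    unsigned rt;     /* register stored (stw) or loaded (lwz)   */
    unsigned ra;     /* base address register                   */
    int      disp;   /* signed 16-bit displacement              */
};

/* Split a 32-bit instruction word into its D-form fields. */
static struct dform decode_dform(uint32_t insn)
{
    struct dform f;
    f.opcode = (insn >> 26) & 0x3F;
    f.rt     = (insn >> 21) & 0x1F;
    f.ra     = (insn >> 16) & 0x1F;
    f.disp   = (int)(insn & 0xFFFF);
    if (f.disp >= 0x8000)
        f.disp -= 0x10000; /* sign-extend the 16-bit displacement */
    return f;
}

/* Write the instruction as text into buf.
 * Only lwz and stw are recognized in this sketch.
 * Returns 0 on success, -1 for any other opcode. */
static int format_dform(uint32_t insn, char *buf, size_t len)
{
    struct dform f = decode_dform(insn);
    const char *name;

    if (f.opcode == 32)
        name = "lwz";
    else if (f.opcode == 36)
        name = "stw";
    else
        return -1;

    snprintf(buf, len, "%s r%u,%d(r%u)", name, f.rt, f.disp, f.ra);
    return 0;
}
```

Passing 0x90090000, the word at the crash address, to format_dform yields "stw r0,0(r9)": the value in r0 is stored to the address held in r9 plus a displacement of 0.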
With Stabs
As mentioned in Hans' article, you can take that stack trace and find out what line that "section 5 @ 0x200" information translates into using the addr2line program in your SDK. Another approach is to run the debug version of your program in CodeBench and let the OS translate things for you.
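If you have many crash logs to process, pulling the section number and offset out of a stack trace line by hand gets tedious. The following is a rough sketch of how that could be automated. It assumes the "(section N @ 0xOFFSET)" text looks exactly like the Reaper output shown in this lesson, and parse_section_offset is a hypothetical helper name, not an SDK function.

```c
#include <stdio.h>
#include <string.h>

/* Extract the section number and offset from one stack trace line.
 * Assumes the Reaper format "(section N @ 0xOFFSET)".
 * Returns 1 if both values were found, 0 otherwise. */
static int parse_section_offset(const char *line,
                                unsigned *section, unsigned *offset)
{
    const char *p = strstr(line, "(section ");

    if (p == NULL)
        return 0; /* e.g. a kernel module line with no section info */

    /* %x accepts an optional leading "0x" */
    return sscanf(p, "(section %u @ %x", section, offset) == 2;
}
```

The offset obtained this way is the value you would hand to addr2line; see Hans' article for the exact invocation.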
Run the zero crashing program again but this time change the target. You want to run the executable with the ".debug" suffix on it. When you do you will see output similar to the following:
Stack trace:
    (0x59e4ace0) [zero.c:20] main()+0x1c (section 1 @ 0x200)
    (0x59e4ad00) native kernel module Kickstart/newlib.library.kmod+0x000020ac
    (0x59e4ad70) native kernel module Kickstart/newlib.library.kmod+0x00002d14
    (0x59e4af10) native kernel module Kickstart/newlib.library.kmod+0x00002ef0
    (0x59e4af50) _start()+0x170 (section 1 @ 0x16C)
    (0x59e4af90) native kernel module Kickstart/dos.library.kmod+0x00024f18
    (0x59e4afc0) native kernel module Kickstart/kernel+0x0003c958
    (0x59e4afd0) native kernel module Kickstart/kernel+0x0003c9d8
The new element is "[zero.c:20]", which means file zero.c, line 20. Line 20 of the source file contains the following C code:
*zero = 0;
In C, that means store a 0 at the address held in the variable zero. Since zero holds address 0, the program is trying to store a 0 at address 0 in the computer.
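The full zero.c source is not reproduced in this lesson, so the following is only a sketch of the kind of defensive check that avoids this class of crash. The helper name store_zero is hypothetical. It refuses to write through a pointer that holds address 0 instead of letting the store instruction fault.

```c
#include <stddef.h>

/* Hypothetical helper: store 0 through p, but only if p is not NULL.
 * Returns 0 on success, -1 if the pointer holds address 0. */
static int store_zero(int *p)
{
    if (p == NULL)
        return -1; /* writing here is what crashed the example program */

    *p = 0;
    return 0;
}
```

A check like this only catches address 0. Any other invalid address will still crash and produce a report like the one above.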
Disassembly of crash site:
     7fc5c214: 38000000   li        r0,0
     7fc5c218: 901f0008   stw       r0,8(r31)
     7fc5c21c: 813f0008   lwz       r9,8(r31)
     7fc5c220: 38000000   li        r0,0
    >7fc5c224: 90090000   stw       r0,0(r9)
     7fc5c228: 38000000   li        r0,0
     7fc5c22c: 7c030378   mr        r3,r0
     7fc5c230: 81610000   lwz       r11,0(r1)
     7fc5c234: 83ebfffc   lwz       r31,-4(r11)
     7fc5c238: 7d615b78   mr        r1,r11
Nothing has changed in the disassembly because Stabs does not affect the generated machine code. Stabs only adds information about the C source code that was compiled into it.
Stack Boundaries
We didn't focus on the stack boundaries above but that is generally the very first thing you should check. When using Reaper that information is output for every crash and looks similar to the following:
    Stack pointer (0x59e4ace0) is inside bounds
    Redzone is OK (4)
This means the stack pointer in the Stabs stack trace example above is inside the stack boundaries. It also tells you the red zone was not touched.
Now, this does not guarantee your program was good and didn't blow its stack. It only means the OS didn't catch it doing so. The odds are good your program behaved well in terms of stack use but it is never an absolute guarantee. When in doubt, increase the size of the stack your program needs using a stack cookie. It could save you many hours of fruitless searching for a bug which isn't really there; you just ran out of stack.
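As a rough illustration, a stack cookie is just a specially formatted string embedded in the executable. The sketch below assumes the AmigaOS 4 "$STACK:" convention, where the number is the minimum stack size in bytes. The 65536 figure is an arbitrary example value, and stack_cookie_size is a hypothetical helper added only so the string can be sanity-checked.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed cookie form: "$STACK:" followed by the minimum stack size in
 * bytes. The "used" attribute stops GCC from discarding the otherwise
 * unreferenced string. 65536 is an example value only. */
static const char __attribute__((used)) stack_cookie[] = "$STACK:65536";

/* Hypothetical helper: return the size encoded in a cookie string,
 * or -1 if the string does not start with "$STACK:". */
static long stack_cookie_size(const char *cookie)
{
    static const char prefix[] = "$STACK:";

    if (strncmp(cookie, prefix, sizeof prefix - 1) != 0)
        return -1;

    return strtol(cookie + sizeof prefix - 1, NULL, 10);
}
```

Pick a size that comfortably covers your deepest call chain and largest local buffers. Being generous here is cheap compared with chasing a phantom bug.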
Other Stuff
There is even more information in the crash reports but it won't be covered in detail here. I hope we will have time at AmiWest to explore the other areas.