WinDbg Disassembler Confusion

[Update: I forgot to mention that, in the buggy installation of windbg, the target address shown for the call instruction in question appears to be the instruction's raw operand value, rather than that value treated as relative to the current EIP. The update appears below.]
I've had a sneaking suspicion lately that some of the disassembled code I've been looking at in WinDbg for managed code that's been JITed has been a little off kilter. But it only seems off when it comes to the disassembly for JITed code. The disassembly for normal native code looks fine. And if I step through the JITed code, it actually executes just fine. It's just that the disassembler seems to be out of whack.
So this afternoon I sat down to see if I could clear things up (prove/disprove my suspicion). Unfortunately for me, I believe I've proved my suspicion (the windbg disassembler is out of whack), but I haven't been able to correct it. So I thought I'd post my results here on the off chance that someone out there's had a similar experience and can explain what's going on.
I happened to be poking around at the managed exception handling mechanism, so the code I'm looking at looks like this:
class Program {
  ...
  static void ThrowException( Exception ex ) {
    throw ex;
  }
}
In WinDbg, I used the SOS !bpmd command to set a breakpoint on the ThrowException method. When the breakpoint hits, here's what the disassembly looks like on a machine (XP Pro SP2) where windbg seems to be working just fine:
Breakpoint 1 hit
...
0:000> u eip
01200420 56              push esi
01200421 8bf1            mov esi,ecx
01200423 833dc82da20000  cmp dword ptr ds:[0A22DC8h],0
0120042a 7405            je 01200431
0120042c e8cd1ee978      call mscorwks!JIT_DbgIsJustMyCode (7a0922fe)
01200431 90              nop
01200432 8bce            mov ecx,esi
01200434 e89340e978      call mscorwks!JIT_Throw (7a0944cc)
01200439 cc              int 3
The line of code in question is the one at address 01200434 (the call to mscorwks!JIT_Throw). WinDbg is kind enough to disassemble the instruction bytes (e89340e978, opcode plus operand) and show us that the target address for that call instruction is 7a0944cc. But if you wanted to do it yourself (which will become germane in a moment), WinDbg figures that out by parsing the instruction at address 01200434 like so…
The first byte (e8) is the x86 'call' instruction, which in this case is followed by a 32-bit operand giving the target address of the call relative to the address of the next instruction. That operand is displayed above as a sequence of bytes (9340e978) in little-endian order, so to see the actual relative offset as a 32-bit value, we can do this:
0:000> dd eip+1 l1
01200435  78e94093
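The byte-order flip that dd performs here can also be reproduced outside the debugger. What follows is only a minimal Python sketch, not anything WinDbg itself runs; the function name decode_rel32 is made up for illustration, and it assumes the instruction is the 5-byte E8 form of call seen in the listing above.

```python
import struct

def decode_rel32(instruction_bytes: bytes) -> int:
    """Return the signed 32-bit displacement of an x86 near call (opcode E8).

    decode_rel32 is a hypothetical helper name used only for this sketch.
    """
    if len(instruction_bytes) < 5 or instruction_bytes[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 rel32 call instruction")
    # "<i" means little-endian signed 32-bit; the operand starts at offset 1.
    (rel32,) = struct.unpack_from("<i", instruction_bytes, 1)
    return rel32

# Instruction bytes shown at 01200434 in the disassembly above.
raw = bytes.fromhex("e89340e978")
print(hex(decode_rel32(raw)))  # 0x78e94093
```

The displacement is read as signed because a call can just as easily point backwards, as several of the calls inside mscorwks!JIT_Throw do.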
So the target address is at 'offset' 78e94093 relative to the current instruction pointer (after the opcode and its operand have been consumed, which is 5 bytes). So we can do the math ourselves in windbg like so:
0:000> ? eip + 5 + poi(eip+1)
Evaluate expression: 2047427788 = 7a0944cc
Note that this results in the same target address (7a0944cc) that windbg had shown us was the call to mscorwks!JIT_Throw. Just for completeness, we can confirm this to be the case like so:
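For anyone who wants to check the arithmetic away from the debugger, here is a small Python sketch of the same computation as the `? eip + 5 + poi(eip+1)` expression. The function name call_target is invented for this example, and the 5-byte instruction length is an assumption that only holds for the E8 rel32 form of call shown here.

```python
def call_target(instruction_address: int, rel32: int) -> int:
    """Target of a 5-byte E8 call: next instruction address plus displacement.

    call_target is a hypothetical helper name used only for this sketch.
    """
    next_instruction = instruction_address + 5  # 1 opcode byte + 4 operand bytes
    # Mask to 32 bits so that negative or wrapped displacements behave like x86.
    return (next_instruction + rel32) & 0xFFFFFFFF

# The call at 01200434 with operand 78e94093, as read with dd above.
print(hex(call_target(0x01200434, 0x78E94093)))  # 0x7a0944cc
```

The same function reproduces the other call in the listing: address 0120042c with operand 78e91ecd gives 7a0922fe, which is what windbg shows for mscorwks!JIT_DbgIsJustMyCode.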
0:000> u 7a0944cc
mscorwks!JIT_Throw:
7a0944cc 6894000000      push    94h
7a0944d1 b8d497317a      mov     eax,offset mscorwks!GetManagedNameForTypeInfo+0x1b355 (7a3197d4)
7a0944d6 e848f3ddff      call    mscorwks!_EH_prolog3_catch (79e73823)
7a0944db 8bf9            mov     edi,ecx
7a0944dd e8761bf0ff      call    mscorwks!ResetCurrentContext (79f96058)
7a0944e2 8d4d84          lea     ecx,[ebp-7Ch]
7a0944e5 e81a09deff      call    mscorwks!LazyMachStateCaptureState (79e74e04)
7a0944ea 85c0            test    eax,eax
So that's the reference case on the machine where windbg seems to be in working order. Repeating the same exercise on the buggy installation of windbg (on the same app, using the same version of windbg, XP, and the CLR), here's what I see when the same breakpoint hits:
Breakpoint 1 hit
...
0:000> u eip
00d20420 56              push    esi
00d20421 8bf1            mov     esi,ecx
00d20423 833d102f910000  cmp     dword ptr ds:[912F10h],0
00d2042a 7405            je      00d20431
*** WARNING: Unable to verify checksum for C:\WINDOWS\assembly\NativeImages_v2.0.50727_32\mscorlib\4918a15d4f2a914781651294adafd2ce\mscorlib.ni.dll
00d2042c e8cd1e3779      call    mscorlib_ni+0x2b1ecd (79371ecd)
00d20431 90              nop
00d20432 8bce            mov     ecx,esi
00d20434 e893403779      call    mscorlib_ni+0x2b4093 (79374093)
00d20439 cc              int     3
Right off the bat, it's apparent that windbg is trying to locate symbols for mscorlib for some reason, even though this code doesn't call into mscorlib. Also, windbg isn't able to resolve what it believes is the target address in mscorlib to anything more useful than a module-relative offset. Furthermore, disassembling what windbg indicates is the target function being called (79374093) yields what I refer to in technical jargon as gobbledygook:
0:000> u 79374093
mscorlib_ni+0x2b4093:
79374093 2900           sub dword ptr [eax],eax
79374095 b801000000     mov eax,1
7937409a 8986880a0000   mov dword ptr [esi+0A88h],eax
793740a0 83be880a000001 cmp dword ptr [esi+0A88h],1
793740a7 0f94c0         sete al
793740aa 0fb6c0         movzx eax,al
793740ad 5e             pop esi
793740ae 5f             pop edi
Note, BTW, that the address windbg's disassembler computes (79374093) is actually the raw operand value in the call instruction:
0:000> dd eip+1 l1
00d20435 79374093
So it would seem that windbg, on this installation, isn't computing the target of the call relative to the current instruction pointer - it's just displaying the operand for the call instruction as-is.
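To make the two interpretations concrete, here is a hedged Python sketch that takes a call instruction's address, its bytes, and the target a disassembler displayed, and reports which reading the displayed value matches. The function name and the labels it returns are invented for this example; it says nothing about how windbg's disassembler is actually implemented.

```python
import struct

def classify_displayed_target(address: int, code: bytes, displayed: int) -> str:
    """Say whether a displayed call target matches the relative or raw reading.

    classify_displayed_target is a hypothetical helper used only for this sketch.
    """
    if len(code) < 5 or code[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 rel32 call instruction")
    # Read the 4 operand bytes as a little-endian unsigned 32-bit value.
    (operand,) = struct.unpack_from("<I", code, 1)
    # Correct reading: displacement from the address of the next instruction.
    relative_target = (address + 5 + operand) & 0xFFFFFFFF
    if displayed == relative_target:
        return "relative"
    if displayed == operand:
        return "raw operand"
    return "neither"

# Buggy installation: call at 00d20434, displayed target 79374093.
print(classify_displayed_target(0x00D20434, bytes.fromhex("e893403779"), 0x79374093))  # raw operand
# Working installation: call at 01200434, displayed target 7a0944cc.
print(classify_displayed_target(0x01200434, bytes.fromhex("e89340e978"), 0x7A0944CC))  # relative
```

The other call in the buggy listing (00d2042c, bytes e8cd1e3779, displayed as 79371ecd) comes out as "raw operand" too, which fits the pattern.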
To top it all off, if I single step (using windbg's 't' command) into the target method, I correctly 'land' at mscorwks!JIT_Throw:
0:000> u eip
00d20434 e893403779      call    mscorlib_ni+0x2b4093 (79374093)
00d20439 cc              int     3
0:000> t
eax=00913148 ebx=0012f4ac ecx=0128a084 edx=00000000 esi=0128a084 edi=00000000
eip=7a0944cc esp=0012f448 ebp=0012f478 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
mscorwks!JIT_Throw:
7a0944cc 6894000000      push    94h
When I first saw this, I thought my symbol cache was corrupted or otherwise out of date. But doing .reload /f /o (even after manually deleting my local symbol file cache) didn't change the output. Since the Pentium doesn't seem to have any issues actually making the method call, I decided to decode the call instruction manually and compare the result on this buggy installation of windbg against the working one. So with the instruction pointer sitting at the address of the call instruction in question, I performed the same computation of the call target as shown above:
0:000> ? eip + 5 + poi(eip+1)
Evaluate expression: 2047427788 = 7a0944cc
THAT looks a lot better, since it corresponds to where the Pentium actually landed when the 't' command was used above. But just to round off the operation, here's confirmation of where that call instruction is going to land:
0:000> u 7a0944cc
mscorwks!JIT_Throw:
7a0944cc 6894000000      push    94h
7a0944d1 b8d497317a      mov     eax,offset mscorwks!GetManagedNameForTypeInfo+0x1b355 (7a3197d4)
7a0944d6 e848f3ddff      call    mscorwks!_EH_prolog3_catch (79e73823)
7a0944db 8bf9            mov     edi,ecx
7a0944dd e8761bf0ff      call    mscorwks!ResetCurrentContext (79f96058)
7a0944e2 8d4d84          lea     ecx,[ebp-7Ch]
7a0944e5 e81a09deff      call    mscorwks!LazyMachStateCaptureState (79e74e04)
7a0944ea 85c0            test    eax,eax
This little exercise would seem to prove that the windbg disassembler is screwed up on one of my machines. The really frustrating thing (other than the time I've lost on this issue) is that I can't seem to repair windbg. I've blown away my symbol cache as mentioned earlier, removed the Debugging Tools for Windows, and downloaded (again, just to be sure) and then reinstalled DTW from scratch, all with no change in behavior. I've also compared screenshots of windbg's Help/About box on the good and bad installations, and as far as I can tell they're identical.
The only difference between the two machines I've been comparing is that one (the working one) is a regular WinXP machine, while the other (the buggy one) is the same setup running in Virtual PC 2004. But the OS, CLR, and tools on the host and guest OS are (as near as I can tell) identical.
Basically, I'm stumped.
Suggestions, anyone?

Posted Dec 21 2006, 07:28 PM by mike-woodring

Comments

Dude wrote re: WinDbg Disassembler Confusion
on 12-21-2006 4:09 PM
Try the !U extension from SOS.
Mike wrote re: WinDbg Disassembler Confusion
on 12-21-2006 5:03 PM
!sos.u shows the same output, just with clr-aware annotations. But the address computation for the call instruction comes up with the same erroneous value.
Dmitriy Zaslavskiy wrote re: WinDbg Disassembler Confusion
on 12-27-2006 2:16 PM
This is not exactly the same, but smells the same :)

http://blogs.msdn.com/vancem/archive/2006/09/05/742062.aspx
Jason Haley wrote Interesting Finds: January 1, 2007
on 01-01-2007 12:21 PM
arcos wrote re: WinDbg Disassembler Confusion
on 01-05-2007 8:12 AM
You could try running tlist on the windbg process on both machines (tlist windbg). This will output the version and checksum of all DLLs loaded in the process. Comparing the output may reveal the answer to your question...
Some Assembly Required wrote WinDbg Bug: Disassembling JITed Code
on 01-15-2007 12:35 PM
