APLawrence - Information and Resources for Unix and Linux Systems, Bloggers and the self-employed















News Group Posts

double panic smp osr5 crash dump

interpret



From: Bela Lubkin <belal@sco.com>
Subject: Re: Server crashes - need help! :(
Date: Mon, 30 Dec 2002 12:35:07 GMT
References: <3E0F2797.1040102@dniq-online.com>
<20021229115029.I10531@mammoth.ca.sco.com>
<3E0FDE64.505@dniq-online.com>

Farlander wrote:

> Ok :) Here's the most frequent one:
> 
> KERNEL STACK TRACE FOR PROCESS 94:
> STKADDR   FRAMEPTR  FUNCTION   POSSIBLE ARGUMENTS
> e0000844  e0000970  prf_task_s (0x4,0,0x1000,0xe)
> e0000978  e0000994  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0x9d4)
> e000099c  e00009c8  k_trap     (u+0x9d4)
>            e00009d4  kern_trap  from 0xf0013ae5 in bcpalign
>    ax:dffda000 cx:     400 dx:   1ffda bx:    1000 fl:    10206 ds: 160 fs:   0
>    sp:e0000a04 bp:e0000a24 si:dffda000 di:c0120000 err:       0 es: 160 gs:   0
> e00009dc  e0000a24  bcpalign   (tmpva_pages,0xc0120000,0x1000,0x1ffda)
> e0000a2c  e0000a50  dumpnextpa (0xc0120000,u+0xb30,0x3,got_RESERVEDFLT+0x26c)
> e0000a58  e0000b74  sysdump    (0x4,0,0xfd8bd2b8,0xe)
> e0000b7c  e0000b98  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0xbd8)
> e0000ba0  e0000bcc  k_trap     (u+0xbd8)
>            e0000bd8  kern_trap  from 0xf005f234 in freeb
>    ax:ffffffff cx:       1 dx:f03560c4 bx:fd8bd2b8 fl:    10282 ds: 160 fs:   0
>    sp:e0000c08 bp:e0000c30 si:fd8c87d8 di:       0 err:       0 es: 160 gs:   0
> e0000be0  e0000c30  freeb      (0xfd8c87d8,0xfd8c87d8,0x1,0xf2d745e8)
> e0000c38  e0000c48  freemsg    (0xfd8c87d8,0xfd8c87d8,0xf2d745e8,0xf2d5c700)
> e0000c50  e0000c88  sr_device  (0xf2d5c700,0xfd8c87d8,0xfd8c87d8,0xf2d5c700)
> e0000c90  e0000cb4  sramsendcm (0xf2d5c700,0xfd8c87d8,0xf27aaa00,0)
> e0000cbc  e0000cd4  _dlgn_send (0xfd8c87d8,0xfd8c87d8,0,0xfd8c87d8)
> e0000cdc  e0000cf4  _dlgn_putc (0xf2d5c700,0xfd8c87d8,0xfce2df7c,streams+0x1998)
> e0000cfc  e0000d18  dlgnwput   (0xfce2df7c,0xfd8c87d8,0xfce2bfb4,0)
> e0000d20  e0000d44  putnext    (0xfce2bfb4,0xfd8c87d8,streams+0x1998,0)
> e0000d4c  e0000d7c  strputpmsg (inode+0x12de0,u+0xdcc,u+0xdc0,0)
> e0000d84  e0000d9c  strputmsg  (inode+0x12de0,u+0xdcc,u+0xdc0,0)
> e0000da4  e0000ddc  msgio      (0x2)
> e0000de4  e0000de8  putmsg     (0x80d5810,0x80d1b34,0x80d1ab4,0x80474d0)
> e0000df0  e0000e10  systrap    (u+0xe1c)
>            e0000e1c  scall_noke from 0x80053348
>    ax:      56 cx:       4 dx:       0 bx: 80d5810 fl:      202 ds:  1f fs:   0
>    sp:e0000e4c bp: 804742c si: 80d1b34 di: 80d1ab4 err:      56 es:  1f gs:   0












Well, that's clearly in the Dialogic driver (and then a double-panic in
the panic dump writing code...!)

When it hits this double-panic (2nd panic in bcpalign()), has it printed
any of the dump-in-progress dots?

>    And here's another one - for msgcount:
> 
> KERNEL STACK TRACE FOR PROCESS 87:
> STKADDR   FRAMEPTR  FUNCTION   POSSIBLE ARGUMENTS
> e0000910  e0000a3c  prf_task_s (0x4,0,0x1000,0xe)
> e0000a44  e0000a60  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0xaa0)
> e0000a68  e0000a94  k_trap     (u+0xaa0)
>            e0000aa0  kern_trap  from 0xf0013ae5 in bcpalign
>    ax:dffda000 cx:     400 dx:   1ffda bx:    1000 fl:    10206 ds: 160 fs:   0
>    sp:e0000ad0 bp:e0000af0 si:dffda000 di:c0110000 err:       0 es: 160 gs:   0
> e0000aa8  e0000af0  bcpalign   (tmpva_pages,0xc0110000,0x1000,0x1ffda)
> e0000af8  e0000b1c  dumpnextpa (0xc0110000,u+0xbfc,0x3,got_RESERVEDFLT+0x26c)
> e0000b24  e0000c40  sysdump    (0x4,0,0,0xe)
> e0000c48  e0000c64  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0xca4)
> e0000c6c  e0000c98  k_trap     (u+0xca4)
>            e0000ca4  kern_trap  from 0xf005fc7a in msgcount
>    ax:       0 cx:      78 dx:       0 bx:       0 fl:    10286 ds: 160 fs:   0
>    sp:e0000cd4 bp:e0000ce8 si:f3005e50 di:       7 err:       0 es: 160 gs:   0
> e0000cac  e0000ce8  msgcount (0xfd8c84c0,0xfd4297f0,0xfd423e78,shlock_str_qnext)
> e0000cf0  e0000d18  putnextqru (0,0,0x1,0)
> e0000d20  e0000d44  queuerun   (0xfd8c98a0,0x1)
> e0000d4c  e0000d54  runqueues (u+0xdcc,inode+0x52d50,u+0x1148,region+0xcae0)
> e0000d5c  e0000d7c  strputpmsg (inode+0x52d50,u+0xdcc,u+0xdc0,0)
> e0000d84  e0000d9c  strputmsg  (inode+0x52d50,u+0xdcc,u+0xdc0,0)
> e0000da4  e0000ddc  msgio      (0x2)
> e0000de4  e0000de8  putmsg     (0x80d5870,0x80d1b94,0x80d1b0c,u+0xe10)
> e0000df0  e0000e10  systrap    (u+0xe1c)
>            e0000e1c  scall_noke from 0x80053348
>    ax:      56 cx:       2 dx:       0 bx: 80d5870 fl:      202 ds:  1f fs:   0
>    sp:e0000e4c bp: 8047508 si: 80d1b94 di: 80d1b0c err:      56 es:  1f gs:   0
> 
>    I hope there's something useful in there that I'm missing...

Again with the double-panic...

This doesn't have the Dialogic driver on the stack, but if it fed bad
data into a STREAMS queue, this could be related.

If the double-panics are happening after the dump has printed some dots,
there is something bad happening in the hardware.  Something like a DMA
transfer being written to a wrong address, corrupting memory not owned
by the driver.  The loop in sysdump() that calls dumpnextpage() and then
bcopy() (which we see here as "bcpalign") uses the same addresses over
and over.  0xc0110000 is the unchanging address of a disk buffer it's
using to stage writes.  tmpva_pages is the unchanging virtual address at
which it is sequentially mapping every page of memory.  The mapping
cannot fail (if no memory existed at that physical address, it would
just get all 0xff's).  So if it double-panics after some dots have been
printed, something very strange is happening.  In fact it's pretty
strange even if this is the first page.
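
To make the "same addresses over and over" point concrete, here is a toy model of that loop. It is only a sketch of the behaviour described above, not SCO kernel code: dump_pages(), the memory dictionary, and the tuple layout are hypothetical names invented for illustration, and the staging address is simply the one visible in the second trace.

```python
PAGE_SIZE = 0x1000
TMPVA_WINDOW = "tmpva_pages"   # fixed virtual window (symbolic; real address not assumed)
STAGING_BUF = 0xC0110000       # staging buffer address seen in the second trace


def dump_pages(memory, page_count):
    """Toy model of the dump loop: one fixed source window, one fixed buffer.

    memory maps a physical page number to that page's bytes. A page that
    is absent reads back as all 0xff, as described in the text above.
    Returns a list of (source, destination, data) tuples, one per page.
    """
    staged = []
    for pfn in range(page_count):
        # "Map" physical page pfn at the fixed window.
        window = memory.get(pfn, b"\xff" * PAGE_SIZE)
        # Copy window -> staging buffer (the bcopy step), then record the write.
        staged.append((TMPVA_WINDOW, STAGING_BUF, bytes(window)))
    return staged


copies = dump_pages({0: b"\x00" * PAGE_SIZE}, page_count=3)
# Every iteration uses the same source and destination addresses.
print({(src, hex(dst)) for src, dst, _ in copies})
```

In this model only the page contents change from one iteration to the next, which is why a fault in the copy after earlier pages went through cleanly is so hard to explain in software terms.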



What is the value of register CR2 in these dumps?  That's the address it
got the fault on.  Should be the same as either %esi or %edi in the last
trap frame in the stack trace (si:dffda000 di:c0110000 in the 2nd
example).
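
As a rough illustration of that check, the sketch below parses the two register lines of the bcpalign trap frame from the second trace and reports which of %esi or %edi a given CR2 value corresponds to. The function names are made up for this example, and the CR2 values passed in are placeholders, since the real CR2 is not shown in the posted dumps. Treating any address within one page of the register as a match is an assumption, made because the copy covers 0x1000 bytes.

```python
import re

# Register lines copied from the bcpalign trap frame in the second trace.
TRAP_FRAME = """\
   ax:dffda000 cx:     400 dx:   1ffda bx:    1000 fl:    10206 ds: 160 fs:   0
   sp:e0000ad0 bp:e0000af0 si:dffda000 di:c0110000 err:       0 es: 160 gs:   0
"""


def parse_trap_registers(text):
    """Return {register name: integer value} from 'name: hexvalue' pairs."""
    pairs = re.findall(r"\b([a-z]{2,3}):\s*([0-9a-f]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_operand(regs, cr2, length=0x1000):
    """Return 'si' or 'di' if cr2 lies in that register's copy range, else None.

    Assumption: the copy spans `length` bytes starting at each register.
    """
    for reg in ("si", "di"):
        if regs[reg] <= cr2 < regs[reg] + length:
            return reg
    return None


regs = parse_trap_registers(TRAP_FRAME)
# Placeholder CR2 values, NOT taken from the real dump.
for cr2 in (0xDFFDA000, 0xC0110000, 0x00001234):
    print(hex(cr2), "->", faulting_operand(regs, cr2))
```

A CR2 matching %esi would point at the source of the copy (the page mapped at tmpva_pages), and one matching %edi at the destination (the staging buffer). A value matching neither would be stranger still.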

For that matter, how are you displaying these stacks?!  Those are
crash(ADM) output.  To get crash output on a panic, you would need a
finished panic dump, but these show the system going down in flames in
mid-dump!  I could understand scodb traces: you could be using a serial
console and capturing the output.  But crash output from a double-panic
in the dump code?!?

>Bela<
 


This post tagged:

       - Bela
       - SCO_OSR5



