Channel: File Services and Storage Forum

MPIO Crashing Server 2008 R2 SP1

Hi all:

Thanks in advance for taking the time to read my thread.

Here is the background:

  • We have a Dell server running Server 2008 R2 SP1, connected to a Winchester Systems SAN via a dual-port HBA to Channel 0 on Storage Processor A and Channel 0 on Storage Processor B.
  • We have the vendor's drivers installed and the MPIO feature installed.
  • The drives properly appear as Multi-Path Disk Devices within Device Manager | Disk Drives.
  • MPIO is configured for Fail Over Only.
  • We can successfully expose the LUNs to the server over either *individual* channel with no issues accessing the storage (effectively confirming that both paths function properly when used independently).
  • This server has failover clustering installed as part of Exchange 2010 DAGs.

Reproducible Issue:

  • While CH:0 on SP:A is already exposed, we expose CH:0 on SP:B.
  • We reboot so that Device Manager picks up the changes.
  • We go into Device Manager | Multi-Path Disk Device Properties and click on the MPIO tab, or run "mpclaim -s -d" from the CLI.
  • The server blue-screens and crashes, every single time.
  • I've tried working with our storage vendor, and they believe it is a Microsoft issue.

Here is the WinDbg analysis of MEMORY.DMP:

*********************************************
*                                                                             *
*                        Bugcheck Analysis                           *
*                                                                             *
*********************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000014, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff880010771c2, address which referenced memory

FOLLOWUP_IP:
msdsm!DsmpQueryLoadBalancePolicy+232
fffff880`010771c2 8b4814          mov     ecx,dword ptr [rax+14h]

SYMBOL_STACK_INDEX:  3
SYMBOL_NAME:  msdsm!DsmpQueryLoadBalancePolicy+232
FOLLOWUP_NAME:  MachineOwner
MODULE_NAME: msdsm
IMAGE_NAME:  msdsm.sys
DEBUG_FLR_IMAGE_TIMESTAMP:  4ce7a476
FAILURE_BUCKET_ID:  X64_0xD1_msdsm!DsmpQueryLoadBalancePolicy+232
BUCKET_ID:  X64_0xD1_msdsm!DsmpQueryLoadBalancePolicy+232

0: kd> lmvm msdsm
start             end                 module name
fffff880`01060000 fffff880`01086000   msdsm      (pdb symbols)          d:\symbols\msdsm.pdb\E4D203DABED04CC8A14C0F3894E777D11\msdsm.pdb
    Loaded symbol image file: msdsm.sys
    Image path: \SystemRoot\system32\DRIVERS\msdsm.sys
    Image name: msdsm.sys
    Timestamp:        Sat Nov 20 05:35:34 2010 (4CE7A476)
    CheckSum:         000251CF
    ImageSize:        00026000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

--------------------------------------------------------------------------------
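
For anyone reading the dump, the arithmetic behind it can be sketched in a few lines of Python. This is only an illustration built from the values shown above, not a diagnosis: the variable names are my own, and the conclusion about rax assumes the faulting instruction is the `mov ecx,dword ptr [rax+14h]` listed under FOLLOWUP_IP (its address does match Arg4).

```python
from datetime import datetime, timezone

# Values copied from the bugcheck output above.
arg1_referenced = 0x14                # Arg1: memory referenced
faulting_ip = 0xFFFFF880010771C2      # Arg4: address which referenced memory
displacement = 0x14                   # from: mov ecx,dword ptr [rax+14h]

# Values copied from the "lmvm msdsm" output above.
module_start = 0xFFFFF88001060000
module_end = 0xFFFFF88001086000
image_timestamp = 0x4CE7A476

# The instruction reads [rax+14h] and the referenced address was 0x14,
# so rax would have been 0, i.e. a field read through a NULL pointer.
inferred_rax = arg1_referenced - displacement

# Check that the faulting instruction really lies inside msdsm.sys.
in_msdsm = module_start <= faulting_ip < module_end
offset_in_image = faulting_ip - module_start

# The image timestamp is seconds since the Unix epoch; decode it in UTC.
build_time_utc = datetime.fromtimestamp(image_timestamp, tz=timezone.utc)

print("inferred rax:", hex(inferred_rax))
print("faulting address inside msdsm:", in_msdsm)
print("offset into msdsm.sys:", hex(offset_in_image))
print("msdsm.sys build time (UTC):", build_time_utc.isoformat())
```

If that reading is right, the crash looks like msdsm dereferencing a NULL structure pointer while querying the load balance policy, which would be consistent with it happening only when the MPIO tab or "mpclaim -s -d" asks for that information.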

It should be noted that we have tested whether failover happens at all when both storage processors are exposed, by physically pulling the fibre connected to the HBA. I can confirm that failover occurs instantly and successfully; however, we can never access the MPIO tab without the server crashing.

I have googled this issue and come across a few KB articles that try to address it, but they have not helped:

Article ID: 2277440 - not applicable as we are running R2 SP1
Article ID: 981379 - did not help

Many other existing articles did not directly address our issue, so they were not installed.

Thanks again for any assistance offered.


