Discussion:
qla2xxx Call Trace
deeepdish
2014-06-25 16:21:32 UTC
Permalink
Hello,


Yesterday I noticed a call trace in our logs (see below). We have a
BL490G6 blade with a QHM2462 dual port HBA. Running Fedora 20, Kernel
3.14.8 (3.14.8-200.fc20.x86_64).

This traget is FC attached to a Windows 2012 R2 initiator, and we were
coping files to this target. We also have a few ESXi 5.5 FC initators
attached, however no significant data / workload (only persistent state
volume presetned from this target -- esxi syslog / scratch files -- no
datastores).

Our Windows host required a reboot to reconnect to the storage following this.


[206531.614797] perf samples too long (5119 > 5000), lowering
kernel.perf_event_max_sample_rate to 25000
[271975.274719] Updating session ffff8811dcfd5400 from port
10:00:00:00:c9:7c:46:cb loop_id 0 -> 3 s_id 4:15:2 -> 4:15:1
[271975.530474] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.532897] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.535317] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.537410] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.539735] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.541864] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.544101] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.546231] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.548372] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.550367] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.552355] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.391584] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.394393] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.400470] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.402645] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.420583] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.422681] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.432491] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.434647] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.436720] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.439220] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.441602] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272035.854146] rport-2:0-2: blocked FC remote port time out: removing rport
[272036.366038] rport-4:0-2: blocked FC remote port time out: removing rport
[272125.030152] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.031632] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.033072] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.034528] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.035975] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.037399] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.038913] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.040304] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.041703] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.043090] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.044449] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.080846] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.082360] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.092077] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.093416] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.094827] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.096175] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.097470] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.099954] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.102486] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.104924] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.107245] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[274840.673963] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[274964.855121] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277650.557507] rport-4:0-1: blocked FC remote port time out: removing rport
[277650.558399] rport-2:0-1: blocked FC remote port time out: removing rport
[277654.524681] rport-2:0-0: blocked FC remote port time out: removing rport
[277654.588691] rport-4:0-0: blocked FC remote port time out: removing rport
[277659.355670] rport-2:0-4: blocked FC remote port time out: removing rport
[277659.387629] rport-4:0-4: blocked FC remote port time out: removing rport
[277665.914306] rport-2:0-5: blocked FC remote port time out: removing rport
[277666.298253] rport-4:0-5: blocked FC remote port time out: removing rport
[277857.318094] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[277857.696163] qla2xxx [0000:06:00.1]-505f:4: Link is operational (4 Gbps).
[277868.907849] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277873.543152] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277879.810702] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277886.608317] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[278889.211595] rport-2:0-3: blocked FC remote port time out: removing rport
[278889.723480] rport-4:0-3: blocked FC remote port time out: removing rport
[278978.164847] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.167581] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.170228] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.172845] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.175435] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.178025] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.180480] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.182923] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.185292] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.187662] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.190164] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.221698] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.224230] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.226672] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.229046] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.231498] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.233835] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.236204] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.238529] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.240874] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.243192] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.245552] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[279221.460067] Updating session ffff8811ea22c200 from port
50:01:43:80:03:be:95:2a loop_id 6 -> 5 s_id 4:e:2 -> 4:f:2
[279230.698828] Updating session ffff8811e9ca5300 from port
50:01:43:80:03:be:95:28 loop_id 3 -> 5 s_id 3:e:2 -> 3:f:2
[279230.698833] ------------[ cut here ]------------
[279230.698842] WARNING: CPU: 0 PID: 458 at
drivers/scsi/qla2xxx/tcm_qla2xxx.c:1570
tcm_qla2xxx_update_sess+0x20d/0x2f0 [tcm_qla2xxx]()
[279230.698844] Already have lport_fcport_map entry for s_id 3:f:2
[279230.698846] Modules linked in: team_mode_roundrobin tcm_qla2xxx
target_core_pscsi target_core_file target_core_iblock iscsi_target_mod
target_core_mod team cfg80211 rfkill coretemp kvm_intel bnx2x kvm
ipmi_si bcache mdio libcrc32c serio_raw crc32c_intel microcode gpio_ich
i7core_edac iTCO_wdt iTCO_vendor_support ipmi_msghandler lpc_ich
edac_core mfd_core hpwdt acpi_power_meter shpchp hpilo nfsd auth_rpcgss
nfs_acl lockd sunrpc raid1 radeon qla2xxx i2c_algo_bit drm_kms_helper
ttm drm scsi_transport_fc ata_generic i2c_core pata_acpi scsi_tgt hpsa
[279230.698891] CPU: 0 PID: 458 Comm: qla2xxx_2_dpc Tainted: G
I 3.14.8-200.fc20.x86_64 #1
[279230.698893] Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
[279230.698895] 0000000000000000 000000004800ade8 ffff8811e8591b98
ffffffff816f0502
[279230.698900] ffff8811e8591be0 ffff8811e8591bd0 ffffffff8108a1cd
0000000000030f02
[279230.698904] ffff8811e8d15040 ffff8811eb0dd860 ffff8811e9ca5300
0000000000030f02
[279230.698909] Call Trace:
[279230.698919] [<ffffffff816f0502>] dump_stack+0x45/0x56
[279230.698928] [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[279230.698932] [<ffffffff8108a24c>] warn_slowpath_fmt+0x5c/0x80
[279230.698937] [<ffffffffa06223cd>]
tcm_qla2xxx_update_sess+0x20d/0x2f0 [tcm_qla2xxx]
[279230.698956] [<ffffffffa012682d>] qlt_fc_port_added+0x11d/0x280 [qla2xxx]
[279230.698966] [<ffffffffa00cf594>]
qla2x00_reg_remote_port+0x124/0x1e0 [qla2xxx]
[279230.698977] [<ffffffffa00d6e58>] qla2x00_update_fcport+0x88/0x210
[qla2xxx]
[279230.698987] [<ffffffffa00d70eb>]
qla2x00_async_login_done+0x10b/0x130 [qla2xxx]
[279230.698996] [<ffffffffa00c92e8>] qla2x00_do_work+0x1c8/0x270 [qla2xxx]
[279230.699002] [<ffffffff816f3d22>] ? __schedule+0x2e2/0x740
[279230.699011] [<ffffffffa00caaae>] qla2x00_do_dpc+0x13e/0x880 [qla2xxx]
[279230.699020] [<ffffffffa00ca970>] ? qla2x00_relogin+0x290/0x290 [qla2xxx]
[279230.699024] [<ffffffff810ae2d1>] kthread+0xe1/0x100
[279230.699028] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279230.699034] [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
[279230.699037] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279230.699040] ---[ end trace 2ca64747f944acdb ]---
[279232.564068] rport-4:0-7: blocked FC remote port time out: removing rport
[279232.565248] rport-2:0-7: blocked FC remote port time out: removing rport
[279236.147319] rport-4:0-6: blocked FC remote port time out: removing rport
[279236.148508] rport-2:0-6: blocked FC remote port time out: removing rport
[279237.555031] ------------[ cut here ]------------
[279237.555042] WARNING: CPU: 0 PID: 9944 at
drivers/scsi/qla2xxx/tcm_qla2xxx.c:710
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
[279237.555044] Modules linked in: team_mode_roundrobin tcm_qla2xxx
target_core_pscsi target_core_file target_core_iblock iscsi_target_mod
target_core_mod team cfg80211 rfkill coretemp kvm_intel bnx2x kvm
ipmi_si bcache mdio libcrc32c serio_raw crc32c_intel microcode gpio_ich
i7core_edac iTCO_wdt iTCO_vendor_support ipmi_msghandler lpc_ich
edac_core mfd_core hpwdt acpi_power_meter shpchp hpilo nfsd auth_rpcgss
nfs_acl lockd sunrpc raid1 radeon qla2xxx i2c_algo_bit drm_kms_helper
ttm drm scsi_transport_fc ata_generic i2c_core pata_acpi scsi_tgt hpsa
[279237.555089] CPU: 0 PID: 9944 Comm: kworker/0:5 Tainted: G W
I 3.14.8-200.fc20.x86_64 #1
[279237.555091] Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
[279237.555108] Workqueue: events qlt_del_sess_work_fn [qla2xxx]
[279237.555110] 0000000000000000 00000000191bd17a ffff8811d6991cf0
ffffffff816f0502
[279237.555115] 0000000000000000 ffff8811d6991d28 ffffffff8108a1cd
ffff8811ea22c500
[279237.555119] ffff8811e30c8840 ffff8811eb0dda60 ffff8808ea8d6c40
ffff8811e30c8800
[279237.555124] Call Trace:
[279237.555134] [<ffffffff816f0502>] dump_stack+0x45/0x56
[279237.555143] [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[279237.555147] [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
[279237.555152] [<ffffffffa0622ecc>]
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
[279237.555165] [<ffffffffa0120102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
[279237.555170] [<ffffffffa0621b52>]
tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
[279237.555174] [<ffffffffa0623791>] tcm_qla2xxx_put_sess+0x41/0x50
[tcm_qla2xxx]
[279237.555187] [<ffffffffa012155a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
[279237.555193] [<ffffffff810a68e6>] process_one_work+0x176/0x430
[279237.555196] [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
[279237.555200] [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
[279237.555204] [<ffffffff810ae2d1>] kthread+0xe1/0x100
[279237.555207] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279237.555214] [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
[279237.555217] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279237.555219] ---[ end trace 2ca64747f944acdc ]---
[279238.495966] Updating session ffff8811e9ca5300 from port
50:01:43:80:03:be:95:28 loop_id 5 -> 1 s_id 3:f:2 -> 3:f:2
[279240.277453] qla2xxx [0000:06:00.0]-5030:2: Error entry - invalid
handle/queue.
[279240.278748] qla2xxx [0000:06:00.0]-00af:2: Performing ISP error
recovery - ha=ffff8808ea390000.
[279240.690404] rport-2:0-8: blocked FC remote port time out: removing rport
[279241.138278] rport-4:0-8: blocked FC remote port time out: removing rport
[279244.253914] Updating session ffff8811e9ca5a80 from port
50:01:43:80:03:ba:2f:8e loop_id 4 -> 1 s_id 4:b:2 -> 4:a:1
[279245.210435] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[279436.245736] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[279436.474941] qla2xxx [0000:06:00.1]-505f:4: Link is operational (4 Gbps).
[279442.667609] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[279458.514391] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
deeepdish
2014-07-07 18:06:23 UTC
Permalink
Hello,

I've been able to obtain additional logs. The call traces pertaining
to tcm_qla2xxx are quite frequent when there's a ESXi (5.5.0 update 1)
connected.


ul 07 13:53:21 localhost.localdomain kernel: Updating session
ffff8808e8c8e480 from port 50:01:43:80:03:ba:2f:8e loop_id 3 -> 1 s_id
4:a:1 -> 4:a:1
Jul 07 13:53:18 localhost.localdomain kernel: rport-2:0-3: blocked FC
remote port time out: removing rport
Jul 07 13:53:18 localhost.localdomain kernel: rport-4:0-3: blocked FC
remote port time out: removing rport
Jul 07 13:53:16 localhost.localdomain kernel: rport-4:0-5: blocked FC
remote port time out: removing rport
Jul 07 13:53:15 localhost.localdomain kernel: ---[ end trace
71a5cd3df560c243 ]---
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ?
insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8170083c>]
ret_from_fork+0x7c/0xb0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ?
insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae2d1>]
kthread+0xe1/0x100
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a7460>] ?
rescuer_thread+0x3b0/0x3b0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a757b>]
worker_thread+0x11b/0x3a0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a68e6>]
process_one_work+0x176/0x430
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010b55a>]
qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e7791>]
tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e5b52>]
tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010a102>]
qlt_unreg_sess+0x22/0x80 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e6ecc>]
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a2fa>]
warn_slowpath_null+0x1a/0x20
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a1cd>]
warn_slowpath_common+0x7d/0xa0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff816f0502>]
dump_stack+0x45/0x56
Jul 07 13:53:15 localhost.localdomain kernel: Call Trace:
Jul 07 13:53:15 localhost.localdomain kernel: ffff8811e0676040
ffff8808ec0b4660 ffff8811ea3c2c40 ffff8811e0676000
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000
ffff8811dad15d28 ffffffff8108a1cd ffff8811d8ac7280
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000
0000000037487358 ffff8811dad15cf0 ffffffff816f0502
Jul 07 13:53:15 localhost.localdomain kernel: Workqueue: events
qlt_del_sess_work_fn [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: Hardware name: HP
ProLiant BL490c G6, BIOS I21 07/02/2013
Jul 07 13:53:15 localhost.localdomain kernel: CPU: 0 PID: 1985 Comm:
kworker/0:3 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
Jul 07 13:53:15 localhost.localdomain kernel: Modules linked in:
tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock
iscsi_target_mod target_core_mod team_mode_roundrobin team cfg80211
rfkill iTCO_wdt iTCO_ven
Jul 07 13:53:15 localhost.localdomain kernel: WARNING: CPU: 0 PID: 1985
at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
Jul 07 13:53:15 localhost.localdomain kernel: ------------[ cut here
]------------
Jul 07 13:53:15 localhost.localdomain kernel: ---[ end trace
71a5cd3df560c242 ]---
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ?
insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8170083c>]
ret_from_fork+0x7c/0xb0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ?
insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae2d1>]
kthread+0xe1/0x100
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a7460>] ?
rescuer_thread+0x3b0/0x3b0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a757b>]
worker_thread+0x11b/0x3a0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a68e6>]
process_one_work+0x176/0x430
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010b55a>]
qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e7791>]
tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e5b52>]
tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010a102>]
qlt_unreg_sess+0x22/0x80 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e6ecc>]
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a2fa>]
warn_slowpath_null+0x1a/0x20
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a1cd>]
warn_slowpath_common+0x7d/0xa0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff816f0502>]
dump_stack+0x45/0x56
Jul 07 13:53:15 localhost.localdomain kernel: Call Trace:
Jul 07 13:53:15 localhost.localdomain kernel: ffff8808eaff0840
ffff8808de20d260 ffff8811e8f5d440 ffff8808eaff0800
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000
ffff8811dad15d28 ffffffff8108a1cd ffff8808ea3d3b00
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000
0000000037487358 ffff8811dad15cf0 ffffffff816f0502
Jul 07 13:53:15 localhost.localdomain kernel: Workqueue: events
qlt_del_sess_work_fn [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: Hardware name: HP
ProLiant BL490c G6, BIOS I21 07/02/2013
Jul 07 13:53:15 localhost.localdomain kernel: CPU: 0 PID: 1985 Comm:
kworker/0:3 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
Jul 07 13:53:15 localhost.localdomain kernel: Modules linked in:
tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock
iscsi_target_mod target_core_mod team_mode_roundrobin team cfg80211
rfkill iTCO_wdt iTCO_ven
Jul 07 13:53:15 localhost.localdomain kernel: WARNING: CPU: 0 PID: 1985
at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
Jul 07 13:53:15 localhost.localdomain kernel: ------------[ cut here
]------------
Jul 07 13:53:09 localhost.localdomain kernel: rport-4:0-2: blocked FC
remote port time out: removing rport
Jul 07 13:53:09 localhost.localdomain kernel: rport-2:0-0: blocked FC
remote port time out: removing rport
Jul 07 13:52:58 localhost.localdomain kernel: Updating session
ffff8808e8c8e480 from port 50:01:43:80:03:ba:2f:8e loop_id 2 -> 3 s_id
4:b:1 -> 4:a:1
Jul 07 13:52:57 localhost.localdomain kernel: Updating session
ffff8808e8c8e680 from port 50:01:43:80:03:ba:2f:8c loop_id 1 -> 3 s_id
3:b:1 -> 3:a:1
Post by deeepdish
Hello,
Yesterday I noticed a call trace in our logs (see below). We have a
BL490G6 blade with a QHM2462 dual port HBA. Running Fedora 20, Kernel
3.14.8 (3.14.8-200.fc20.x86_64).
This traget is FC attached to a Windows 2012 R2 initiator, and we were
coping files to this target. We also have a few ESXi 5.5 FC initators
attached, however no significant data / workload (only persistent state
volume presetned from this target -- esxi syslog / scratch files -- no
datastores).
Our Windows host required a reboot to reconnect to the storage following this.
[206531.614797] perf samples too long (5119 > 5000), lowering
kernel.perf_event_max_sample_rate to 25000
[271975.274719] Updating session ffff8811dcfd5400 from port
10:00:00:00:c9:7c:46:cb loop_id 0 -> 3 s_id 4:15:2 -> 4:15:1
[271975.530474] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.532897] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.535317] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.537410] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.539735] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.541864] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.544101] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.546231] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.548372] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.550367] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.552355] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.391584] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.394393] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.400470] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.402645] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.420583] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.422681] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.432491] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.434647] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.436720] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.439220] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.441602] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272035.854146] rport-2:0-2: blocked FC remote port time out: removing rport
[272036.366038] rport-4:0-2: blocked FC remote port time out: removing rport
[272125.030152] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.031632] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.033072] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.034528] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.035975] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.037399] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.038913] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.040304] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.041703] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.043090] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.044449] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.080846] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.082360] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.092077] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.093416] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.094827] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.096175] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.097470] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.099954] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.102486] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.104924] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.107245] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[274840.673963] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[274964.855121] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277650.557507] rport-4:0-1: blocked FC remote port time out: removing rport
[277650.558399] rport-2:0-1: blocked FC remote port time out: removing rport
[277654.524681] rport-2:0-0: blocked FC remote port time out: removing rport
[277654.588691] rport-4:0-0: blocked FC remote port time out: removing rport
[277659.355670] rport-2:0-4: blocked FC remote port time out: removing rport
[277659.387629] rport-4:0-4: blocked FC remote port time out: removing rport
[277665.914306] rport-2:0-5: blocked FC remote port time out: removing rport
[277666.298253] rport-4:0-5: blocked FC remote port time out: removing rport
[277857.318094] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[277857.696163] qla2xxx [0000:06:00.1]-505f:4: Link is operational (4 Gbps).
[277868.907849] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277873.543152] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277879.810702] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277886.608317] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[278889.211595] rport-2:0-3: blocked FC remote port time out: removing rport
[278889.723480] rport-4:0-3: blocked FC remote port time out: removing rport
[278978.164847] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.167581] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.170228] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.172845] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.175435] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.178025] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.180480] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.182923] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.185292] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.187662] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.190164] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.221698] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.224230] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.226672] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.229046] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.231498] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.233835] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.236204] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.238529] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.240874] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.243192] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.245552] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[279221.460067] Updating session ffff8811ea22c200 from port
50:01:43:80:03:be:95:2a loop_id 6 -> 5 s_id 4:e:2 -> 4:f:2
[279230.698828] Updating session ffff8811e9ca5300 from port
50:01:43:80:03:be:95:28 loop_id 3 -> 5 s_id 3:e:2 -> 3:f:2
[279230.698833] ------------[ cut here ]------------
[279230.698842] WARNING: CPU: 0 PID: 458 at
drivers/scsi/qla2xxx/tcm_qla2xxx.c:1570
tcm_qla2xxx_update_sess+0x20d/0x2f0 [tcm_qla2xxx]()
[279230.698844] Already have lport_fcport_map entry for s_id 3:f:2
[279230.698846] Modules linked in: team_mode_roundrobin tcm_qla2xxx
target_core_pscsi target_core_file target_core_iblock iscsi_target_mod
target_core_mod team cfg80211 rfkill coretemp kvm_intel bnx2x kvm
ipmi_si bcache mdio libcrc32c serio_raw crc32c_intel microcode gpio_ich
i7core_edac iTCO_wdt iTCO_vendor_support ipmi_msghandler lpc_ich
edac_core mfd_core hpwdt acpi_power_meter shpchp hpilo nfsd auth_rpcgss
nfs_acl lockd sunrpc raid1 radeon qla2xxx i2c_algo_bit drm_kms_helper
ttm drm scsi_transport_fc ata_generic i2c_core pata_acpi scsi_tgt hpsa
[279230.698891] CPU: 0 PID: 458 Comm: qla2xxx_2_dpc Tainted: G
I 3.14.8-200.fc20.x86_64 #1
[279230.698893] Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
[279230.698895] 0000000000000000 000000004800ade8 ffff8811e8591b98
ffffffff816f0502
[279230.698900] ffff8811e8591be0 ffff8811e8591bd0 ffffffff8108a1cd
0000000000030f02
[279230.698904] ffff8811e8d15040 ffff8811eb0dd860 ffff8811e9ca5300
0000000000030f02
[279230.698919] [<ffffffff816f0502>] dump_stack+0x45/0x56
[279230.698928] [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[279230.698932] [<ffffffff8108a24c>] warn_slowpath_fmt+0x5c/0x80
[279230.698937] [<ffffffffa06223cd>]
tcm_qla2xxx_update_sess+0x20d/0x2f0 [tcm_qla2xxx]
[279230.698956] [<ffffffffa012682d>] qlt_fc_port_added+0x11d/0x280 [qla2xxx]
[279230.698966] [<ffffffffa00cf594>]
qla2x00_reg_remote_port+0x124/0x1e0 [qla2xxx]
[279230.698977] [<ffffffffa00d6e58>] qla2x00_update_fcport+0x88/0x210
[qla2xxx]
[279230.698987] [<ffffffffa00d70eb>]
qla2x00_async_login_done+0x10b/0x130 [qla2xxx]
[279230.698996] [<ffffffffa00c92e8>] qla2x00_do_work+0x1c8/0x270 [qla2xxx]
[279230.699002] [<ffffffff816f3d22>] ? __schedule+0x2e2/0x740
[279230.699011] [<ffffffffa00caaae>] qla2x00_do_dpc+0x13e/0x880 [qla2xxx]
[279230.699020] [<ffffffffa00ca970>] ? qla2x00_relogin+0x290/0x290 [qla2xxx]
[279230.699024] [<ffffffff810ae2d1>] kthread+0xe1/0x100
[279230.699028] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279230.699034] [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
[279230.699037] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279230.699040] ---[ end trace 2ca64747f944acdb ]---
[279232.564068] rport-4:0-7: blocked FC remote port time out: removing rport
[279232.565248] rport-2:0-7: blocked FC remote port time out: removing rport
[279236.147319] rport-4:0-6: blocked FC remote port time out: removing rport
[279236.148508] rport-2:0-6: blocked FC remote port time out: removing rport
[279237.555031] ------------[ cut here ]------------
[279237.555042] WARNING: CPU: 0 PID: 9944 at
drivers/scsi/qla2xxx/tcm_qla2xxx.c:710
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
[279237.555044] Modules linked in: team_mode_roundrobin tcm_qla2xxx
target_core_pscsi target_core_file target_core_iblock iscsi_target_mod
target_core_mod team cfg80211 rfkill coretemp kvm_intel bnx2x kvm
ipmi_si bcache mdio libcrc32c serio_raw crc32c_intel microcode gpio_ich
i7core_edac iTCO_wdt iTCO_vendor_support ipmi_msghandler lpc_ich
edac_core mfd_core hpwdt acpi_power_meter shpchp hpilo nfsd auth_rpcgss
nfs_acl lockd sunrpc raid1 radeon qla2xxx i2c_algo_bit drm_kms_helper
ttm drm scsi_transport_fc ata_generic i2c_core pata_acpi scsi_tgt hpsa
[279237.555089] CPU: 0 PID: 9944 Comm: kworker/0:5 Tainted: G W
I 3.14.8-200.fc20.x86_64 #1
[279237.555091] Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
[279237.555108] Workqueue: events qlt_del_sess_work_fn [qla2xxx]
[279237.555110] 0000000000000000 00000000191bd17a ffff8811d6991cf0
ffffffff816f0502
[279237.555115] 0000000000000000 ffff8811d6991d28 ffffffff8108a1cd
ffff8811ea22c500
[279237.555119] ffff8811e30c8840 ffff8811eb0dda60 ffff8808ea8d6c40
ffff8811e30c8800
[279237.555134] [<ffffffff816f0502>] dump_stack+0x45/0x56
[279237.555143] [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[279237.555147] [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
[279237.555152] [<ffffffffa0622ecc>]
tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
[279237.555165] [<ffffffffa0120102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
[279237.555170] [<ffffffffa0621b52>]
tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
[279237.555174] [<ffffffffa0623791>] tcm_qla2xxx_put_sess+0x41/0x50
[tcm_qla2xxx]
[279237.555187] [<ffffffffa012155a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
[279237.555193] [<ffffffff810a68e6>] process_one_work+0x176/0x430
[279237.555196] [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
[279237.555200] [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
[279237.555204] [<ffffffff810ae2d1>] kthread+0xe1/0x100
[279237.555207] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279237.555214] [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
[279237.555217] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279237.555219] ---[ end trace 2ca64747f944acdc ]---
[279238.495966] Updating session ffff8811e9ca5300 from port
50:01:43:80:03:be:95:28 loop_id 5 -> 1 s_id 3:f:2 -> 3:f:2
[279240.277453] qla2xxx [0000:06:00.0]-5030:2: Error entry - invalid
handle/queue.
[279240.278748] qla2xxx [0000:06:00.0]-00af:2: Performing ISP error
recovery - ha=ffff8808ea390000.
[279240.690404] rport-2:0-8: blocked FC remote port time out: removing rport
[279241.138278] rport-4:0-8: blocked FC remote port time out: removing rport
[279244.253914] Updating session ffff8811e9ca5a80 from port
50:01:43:80:03:ba:2f:8e loop_id 4 -> 1 s_id 4:b:2 -> 4:a:1
[279245.210435] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[279436.245736] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[279436.474941] qla2xxx [0000:06:00.1]-505f:4: Link is operational (4 Gbps).
[279442.667609] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[279458.514391] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
Nicholas A. Bellinger
2014-07-31 23:37:34 UTC
Permalink
Hi deeepdish,

Apologies for the delayed response, comments are inline below.
Post by deeepdish
Hello,
I've been able to obtain additional logs. The call traces pertaining
to tcm_qla2xxx are quite frequent when there's a ESXi (5.5.0 update 1)
connected.
Jul 07 13:53:21 localhost.localdomain kernel: Updating session ffff8808e8c8e480 from port 50:01:43:80:03:ba:2f:8e loop_id 3 -> 1 s_id 4:a:1 -> 4:a:1
Jul 07 13:53:18 localhost.localdomain kernel: rport-2:0-3: blocked FC remote port time out: removing rport
Jul 07 13:53:18 localhost.localdomain kernel: rport-4:0-3: blocked FC remote port time out: removing rport
Jul 07 13:53:16 localhost.localdomain kernel: rport-4:0-5: blocked FC remote port time out: removing rport
Jul 07 13:53:15 localhost.localdomain kernel: ---[ end trace 71a5cd3df560c243 ]---
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae2d1>] kthread+0xe1/0x100
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a68e6>] process_one_work+0x176/0x430
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010b55a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e7791>] tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e5b52>] tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010a102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e6ecc>] tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff816f0502>] dump_stack+0x45/0x56
Jul 07 13:53:15 localhost.localdomain kernel: ffff8811e0676040 ffff8808ec0b4660 ffff8811ea3c2c40 ffff8811e0676000
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 ffff8811dad15d28 ffffffff8108a1cd ffff8811d8ac7280
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 0000000037487358 ffff8811dad15cf0 ffffffff816f0502
Jul 07 13:53:15 localhost.localdomain kernel: Workqueue: events qlt_del_sess_work_fn [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
Jul 07 13:53:15 localhost.localdomain kernel: CPU: 0 PID: 1985 Comm: kworker/0:3 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
Jul 07 13:53:15 localhost.localdomain kernel: Modules linked in: tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod team_mode_roundrobin team cfg80211 rfkill iTCO_wdt iTCO_ven
Jul 07 13:53:15 localhost.localdomain kernel: WARNING: CPU: 0 PID: 1985 at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710 tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
Jul 07 13:53:15 localhost.localdomain kernel: ------------[ cut here ]------------
Jul 07 13:53:15 localhost.localdomain kernel: ---[ end trace 71a5cd3df560c242 ]---
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae2d1>] kthread+0xe1/0x100
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a68e6>] process_one_work+0x176/0x430
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010b55a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e7791>] tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e5b52>] tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010a102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e6ecc>] tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff816f0502>] dump_stack+0x45/0x56
Jul 07 13:53:15 localhost.localdomain kernel: ffff8808eaff0840 ffff8808de20d260 ffff8811e8f5d440 ffff8808eaff0800
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 ffff8811dad15d28 ffffffff8108a1cd ffff8808ea3d3b00
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 0000000037487358 ffff8811dad15cf0 ffffffff816f0502
Jul 07 13:53:15 localhost.localdomain kernel: Workqueue: events qlt_del_sess_work_fn [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
Jul 07 13:53:15 localhost.localdomain kernel: CPU: 0 PID: 1985 Comm: kworker/0:3 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
Jul 07 13:53:15 localhost.localdomain kernel: Modules linked in: tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod team_mode_roundrobin team cfg80211 rfkill iTCO_wdt iTCO_ven
Jul 07 13:53:15 localhost.localdomain kernel: WARNING: CPU: 0 PID: 1985 at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710 tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
Jul 07 13:53:15 localhost.localdomain kernel: ------------[ cut here ]------------
Jul 07 13:53:09 localhost.localdomain kernel: rport-4:0-2: blocked FC remote port time out: removing rport
Jul 07 13:53:09 localhost.localdomain kernel: rport-2:0-0: blocked FC remote port time out: removing rport
Jul 07 13:52:58 localhost.localdomain kernel: Updating session ffff8808e8c8e480 from port 50:01:43:80:03:ba:2f:8e loop_id 2 -> 3 s_id 4:b:1 -> 4:a:1
Jul 07 13:52:57 localhost.localdomain kernel: Updating session ffff8808e8c8e680 from port 50:01:43:80:03:ba:2f:8c loop_id 1 -> 3 s_id 3:b:1 -> 3:a:1
Would it be possible to obtain target side logs with the following debug
bits enabled..? The first involves enabling qla2xxx debugging at
modprobe qla2xxx time with:

modprobe qla2xxx ql2xextended_error_logging=0x1e40f000

and the second is enabling tcm_qla2xxx debugging (via debugfs +
CONFIG_DYNAMIC_DEBUG=y) with:

echo 'module tcm_qla2xxx +p' > /debug/dynamic_debugging/control

Having the logs with both of these debug bits enabled would be very
helpful in pinpointing the issue.

Also, adding CC' for Roland + Joern, as they where the last ones to
touched this specific piece of code..

Roland + Joern, I know that at some point you had converted from btree
-> radix trees due to some issues in lib/btree.c code, but never pushed
this conversion patch for tcm_qla2xxx upstream. Can you give an idea of
what caused to conversion to radix trees, and if the bug reported here
is similar..?

Thanks,

--nab
Post by deeepdish
Post by deeepdish
Hello,
Yesterday I noticed a call trace in our logs (see below). We have a
BL490G6 blade with a QHM2462 dual port HBA. Running Fedora 20, Kernel
3.14.8 (3.14.8-200.fc20.x86_64).
This traget is FC attached to a Windows 2012 R2 initiator, and we were
coping files to this target. We also have a few ESXi 5.5 FC initators
attached, however no significant data / workload (only persistent state
volume presetned from this target -- esxi syslog / scratch files -- no
datastores).
Our Windows host required a reboot to reconnect to the storage following this.
[206531.614797] perf samples too long (5119 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[271975.274719] Updating session ffff8811dcfd5400 from port 10:00:00:00:c9:7c:46:cb loop_id 0 -> 3 s_id 4:15:2 -> 4:15:1
[271975.530474] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.532897] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.535317] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.537410] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.539735] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.541864] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.544101] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.546231] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.548372] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.550367] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271975.552355] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.391584] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.394393] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.400470] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.402645] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.420583] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.422681] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.432491] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.434647] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.436720] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.439220] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[271981.441602] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272035.854146] rport-2:0-2: blocked FC remote port time out: removing rport
[272036.366038] rport-4:0-2: blocked FC remote port time out: removing rport
[272125.030152] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.031632] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.033072] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.034528] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.035975] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.037399] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.038913] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.040304] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.041703] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.043090] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.044449] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.080846] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.082360] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.092077] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.093416] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.094827] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.096175] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.097470] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.099954] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.102486] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.104924] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[272125.107245] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[274840.673963] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[274964.855121] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277650.557507] rport-4:0-1: blocked FC remote port time out: removing rport
[277650.558399] rport-2:0-1: blocked FC remote port time out: removing rport
[277654.524681] rport-2:0-0: blocked FC remote port time out: removing rport
[277654.588691] rport-4:0-0: blocked FC remote port time out: removing rport
[277659.355670] rport-2:0-4: blocked FC remote port time out: removing rport
[277659.387629] rport-4:0-4: blocked FC remote port time out: removing rport
[277665.914306] rport-2:0-5: blocked FC remote port time out: removing rport
[277666.298253] rport-4:0-5: blocked FC remote port time out: removing rport
[277857.318094] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[277857.696163] qla2xxx [0000:06:00.1]-505f:4: Link is operational (4 Gbps).
[277868.907849] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277873.543152] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277879.810702] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[277886.608317] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[278889.211595] rport-2:0-3: blocked FC remote port time out: removing rport
[278889.723480] rport-4:0-3: blocked FC remote port time out: removing rport
[278978.164847] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.167581] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.170228] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.172845] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.175435] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.178025] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.180480] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.182923] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.185292] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.187662] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.190164] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.221698] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.224230] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.226672] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.229046] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.231498] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.233835] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.236204] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.238529] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.240874] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.243192] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[278978.245552] MODE SENSE: unimplemented page/subpage: 0x0a/0x01
[279221.460067] Updating session ffff8811ea22c200 from port 50:01:43:80:03:be:95:2a loop_id 6 -> 5 s_id 4:e:2 -> 4:f:2
[279230.698828] Updating session ffff8811e9ca5300 from port 50:01:43:80:03:be:95:28 loop_id 3 -> 5 s_id 3:e:2 -> 3:f:2
[279230.698833] ------------[ cut here ]------------
[279230.698842] WARNING: CPU: 0 PID: 458 at drivers/scsi/qla2xxx/tcm_qla2xxx.c:1570 tcm_qla2xxx_update_sess+0x20d/0x2f0 [tcm_qla2xxx]()
[279230.698844] Already have lport_fcport_map entry for s_id 3:f:2
[279230.698846] Modules linked in: team_mode_roundrobin tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod team cfg80211 rfkill coretemp kvm_intel bnx2x kvm ipmi_si bcache mdio libcrc32c serio_raw crc32c_intel microcode gpio_ich i7core_edac iTCO_wdt iTCO_vendor_support ipmi_msghandler lpc_ich edac_core mfd_core hpwdt acpi_power_meter shpchp hpilo nfsd auth_rpcgss nfs_acl lockd sunrpc raid1 radeon qla2xxx i2c_algo_bit drm_kms_helper ttm drm scsi_transport_fc ata_generic i2c_core pata_acpi scsi_tgt hpsa
[279230.698891] CPU: 0 PID: 458 Comm: qla2xxx_2_dpc Tainted: GI 3.14.8-200.fc20.x86_64 #1
[279230.698893] Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
[279230.698895] 0000000000000000 000000004800ade8 ffff8811e8591b98 ffffffff816f0502
[279230.698900] ffff8811e8591be0 ffff8811e8591bd0 ffffffff8108a1cd 0000000000030f02
[279230.698904] ffff8811e8d15040 ffff8811eb0dd860 ffff8811e9ca5300 0000000000030f02
[279230.698919] [<ffffffff816f0502>] dump_stack+0x45/0x56
[279230.698928] [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[279230.698932] [<ffffffff8108a24c>] warn_slowpath_fmt+0x5c/0x80
[279230.698937] [<ffffffffa06223cd>] tcm_qla2xxx_update_sess+0x20d/0x2f0 [tcm_qla2xxx]
[279230.698956] [<ffffffffa012682d>] qlt_fc_port_added+0x11d/0x280 [qla2xxx]
[279230.698966] [<ffffffffa00cf594>] qla2x00_reg_remote_port+0x124/0x1e0 [qla2xxx]
[279230.698977] [<ffffffffa00d6e58>] qla2x00_update_fcport+0x88/0x210 [qla2xxx]
[279230.698987] [<ffffffffa00d70eb>] qla2x00_async_login_done+0x10b/0x130 [qla2xxx]
[279230.698996] [<ffffffffa00c92e8>] qla2x00_do_work+0x1c8/0x270 [qla2xxx]
[279230.699002] [<ffffffff816f3d22>] ? __schedule+0x2e2/0x740
[279230.699011] [<ffffffffa00caaae>] qla2x00_do_dpc+0x13e/0x880 [qla2xxx]
[279230.699020] [<ffffffffa00ca970>] ? qla2x00_relogin+0x290/0x290 [qla2xxx]
[279230.699024] [<ffffffff810ae2d1>] kthread+0xe1/0x100
[279230.699028] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279230.699034] [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
[279230.699037] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279230.699040] ---[ end trace 2ca64747f944acdb ]---
[279232.564068] rport-4:0-7: blocked FC remote port time out: removing rport
[279232.565248] rport-2:0-7: blocked FC remote port time out: removing rport
[279236.147319] rport-4:0-6: blocked FC remote port time out: removing rport
[279236.148508] rport-2:0-6: blocked FC remote port time out: removing rport
[279237.555031] ------------[ cut here ]------------
[279237.555042] WARNING: CPU: 0 PID: 9944 at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710 tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
[279237.555044] Modules linked in: team_mode_roundrobin tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod team cfg80211 rfkill coretemp kvm_intel bnx2x kvm ipmi_si bcache mdio libcrc32c serio_raw crc32c_intel microcode gpio_ich i7core_edac iTCO_wdt iTCO_vendor_support ipmi_msghandler lpc_ich edac_core mfd_core hpwdt acpi_power_meter shpchp hpilo nfsd auth_rpcgss nfs_acl lockd sunrpc raid1 radeon qla2xxx i2c_algo_bit drm_kms_helper ttm drm scsi_transport_fc ata_generic i2c_core pata_acpi scsi_tgt hpsa
[279237.555089] CPU: 0 PID: 9944 Comm: kworker/0:5 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
[279237.555091] Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
[279237.555108] Workqueue: events qlt_del_sess_work_fn [qla2xxx]
[279237.555110] 0000000000000000 00000000191bd17a ffff8811d6991cf0 ffffffff816f0502
[279237.555115] 0000000000000000 ffff8811d6991d28 ffffffff8108a1cd ffff8811ea22c500
[279237.555119] ffff8811e30c8840 ffff8811eb0dda60 ffff8808ea8d6c40 ffff8811e30c8800
[279237.555134] [<ffffffff816f0502>] dump_stack+0x45/0x56
[279237.555143] [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[279237.555147] [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
[279237.555152] [<ffffffffa0622ecc>] tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
[279237.555165] [<ffffffffa0120102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
[279237.555170] [<ffffffffa0621b52>] tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
[279237.555174] [<ffffffffa0623791>] tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
[279237.555187] [<ffffffffa012155a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
[279237.555193] [<ffffffff810a68e6>] process_one_work+0x176/0x430
[279237.555196] [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
[279237.555200] [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
[279237.555204] [<ffffffff810ae2d1>] kthread+0xe1/0x100
[279237.555207] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279237.555214] [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
[279237.555217] [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
[279237.555219] ---[ end trace 2ca64747f944acdc ]---
[279238.495966] Updating session ffff8811e9ca5300 from port 50:01:43:80:03:be:95:28 loop_id 5 -> 1 s_id 3:f:2 -> 3:f:2
[279240.277453] qla2xxx [0000:06:00.0]-5030:2: Error entry - invalid handle/queue.
[279240.278748] qla2xxx [0000:06:00.0]-00af:2: Performing ISP error recovery - ha=ffff8808ea390000.
[279240.690404] rport-2:0-8: blocked FC remote port time out: removing rport
[279241.138278] rport-4:0-8: blocked FC remote port time out: removing rport
[279244.253914] Updating session ffff8811e9ca5a80 from port 50:01:43:80:03:ba:2f:8e loop_id 4 -> 1 s_id 4:b:2 -> 4:a:1
[279245.210435] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[279436.245736] qla2xxx [0000:06:00.0]-505f:2: Link is operational (4 Gbps).
[279436.474941] qla2xxx [0000:06:00.1]-505f:4: Link is operational (4 Gbps).
[279442.667609] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
[279458.514391] MODE SENSE: unimplemented page/subpage: 0x1c/0x02
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger
2014-08-01 00:22:01 UTC
Permalink
Post by Nicholas A. Bellinger
Hi deeepdish,
Apologies for the delayed response, comments are inline below.
Post by deeepdish
Hello,
I've been able to obtain additional logs. The call traces pertaining
to tcm_qla2xxx are quite frequent when there's a ESXi (5.5.0 update 1)
connected.
Jul 07 13:53:21 localhost.localdomain kernel: Updating session ffff8808e8c8e480 from port 50:01:43:80:03:ba:2f:8e loop_id 3 -> 1 s_id 4:a:1 -> 4:a:1
Jul 07 13:53:18 localhost.localdomain kernel: rport-2:0-3: blocked FC remote port time out: removing rport
Jul 07 13:53:18 localhost.localdomain kernel: rport-4:0-3: blocked FC remote port time out: removing rport
Jul 07 13:53:16 localhost.localdomain kernel: rport-4:0-5: blocked FC remote port time out: removing rport
Jul 07 13:53:15 localhost.localdomain kernel: ---[ end trace 71a5cd3df560c243 ]---
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae2d1>] kthread+0xe1/0x100
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a68e6>] process_one_work+0x176/0x430
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010b55a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e7791>] tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e5b52>] tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010a102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e6ecc>] tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff816f0502>] dump_stack+0x45/0x56
Jul 07 13:53:15 localhost.localdomain kernel: ffff8811e0676040 ffff8808ec0b4660 ffff8811ea3c2c40 ffff8811e0676000
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 ffff8811dad15d28 ffffffff8108a1cd ffff8811d8ac7280
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 0000000037487358 ffff8811dad15cf0 ffffffff816f0502
Jul 07 13:53:15 localhost.localdomain kernel: Workqueue: events qlt_del_sess_work_fn [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
Jul 07 13:53:15 localhost.localdomain kernel: CPU: 0 PID: 1985 Comm: kworker/0:3 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
Jul 07 13:53:15 localhost.localdomain kernel: Modules linked in: tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod team_mode_roundrobin team cfg80211 rfkill iTCO_wdt iTCO_ven
Jul 07 13:53:15 localhost.localdomain kernel: WARNING: CPU: 0 PID: 1985 at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710 tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
Jul 07 13:53:15 localhost.localdomain kernel: ------------[ cut here ]------------
Jul 07 13:53:15 localhost.localdomain kernel: ---[ end trace 71a5cd3df560c242 ]---
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8170083c>] ret_from_fork+0x7c/0xb0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae1f0>] ? insert_kthread_work+0x40/0x40
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810ae2d1>] kthread+0xe1/0x100
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a7460>] ? rescuer_thread+0x3b0/0x3b0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a757b>] worker_thread+0x11b/0x3a0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff810a68e6>] process_one_work+0x176/0x430
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010b55a>] qlt_del_sess_work_fn+0x8a/0xe0 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e7791>] tcm_qla2xxx_put_sess+0x41/0x50 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e5b52>] tcm_qla2xxx_release_session+0x12/0x20 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa010a102>] qlt_unreg_sess+0x22/0x80 [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffffa08e6ecc>] tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a2fa>] warn_slowpath_null+0x1a/0x20
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
Jul 07 13:53:15 localhost.localdomain kernel: [<ffffffff816f0502>] dump_stack+0x45/0x56
Jul 07 13:53:15 localhost.localdomain kernel: ffff8808eaff0840 ffff8808de20d260 ffff8811e8f5d440 ffff8808eaff0800
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 ffff8811dad15d28 ffffffff8108a1cd ffff8808ea3d3b00
Jul 07 13:53:15 localhost.localdomain kernel: 0000000000000000 0000000037487358 ffff8811dad15cf0 ffffffff816f0502
Jul 07 13:53:15 localhost.localdomain kernel: Workqueue: events qlt_del_sess_work_fn [qla2xxx]
Jul 07 13:53:15 localhost.localdomain kernel: Hardware name: HP ProLiant BL490c G6, BIOS I21 07/02/2013
Jul 07 13:53:15 localhost.localdomain kernel: CPU: 0 PID: 1985 Comm: kworker/0:3 Tainted: G W I 3.14.8-200.fc20.x86_64 #1
Jul 07 13:53:15 localhost.localdomain kernel: Modules linked in: tcm_qla2xxx target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod team_mode_roundrobin team cfg80211 rfkill iTCO_wdt iTCO_ven
Jul 07 13:53:15 localhost.localdomain kernel: WARNING: CPU: 0 PID: 1985 at drivers/scsi/qla2xxx/tcm_qla2xxx.c:710 tcm_qla2xxx_clear_nacl_from_fcport_map+0x16c/0x180 [tcm_qla2xxx]()
Jul 07 13:53:15 localhost.localdomain kernel: ------------[ cut here ]------------
Jul 07 13:53:09 localhost.localdomain kernel: rport-4:0-2: blocked FC remote port time out: removing rport
Jul 07 13:53:09 localhost.localdomain kernel: rport-2:0-0: blocked FC remote port time out: removing rport
Jul 07 13:52:58 localhost.localdomain kernel: Updating session ffff8808e8c8e480 from port 50:01:43:80:03:ba:2f:8e loop_id 2 -> 3 s_id 4:b:1 -> 4:a:1
Jul 07 13:52:57 localhost.localdomain kernel: Updating session ffff8808e8c8e680 from port 50:01:43:80:03:ba:2f:8c loop_id 1 -> 3 s_id 3:b:1 -> 3:a:1
Would it be possible to obtain target side logs with the following debug
bits enabled..? The first involves enabling qla2xxx debugging at
modprobe qla2xxx ql2xextended_error_logging=0x1e40f000
and the second is enabling tcm_qla2xxx debugging (via debugfs +
echo 'module tcm_qla2xxx +p' > /debug/dynamic_debugging/control
Having the logs with both of these debug bits enabled would be very
helpful in pinpointing the issue.
Also, adding CC' for Roland + Joern, as they where the last ones to
touched this specific piece of code..
Roland + Joern, I know that at some point you had converted from btree
-> radix trees due to some issues in lib/btree.c code, but never pushed
this conversion patch for tcm_qla2xxx upstream. Can you give an idea of
what caused to conversion to radix trees, and if the bug reported here
is similar..?
Roland + Joern, I'm specifically talking about the following patches:

https://github.com/rolandd/qla-target/commit/77bd729e5e9f858495dfded7c08e548bc9ecc9be
https://github.com/rolandd/qla-target/commit/79ae07a141950c0008480a46f91d72318a1d799e

Can you please confirm if these are related to what deeepdish has
reported here..?

Thanks,

--nab
Roland Dreier
2014-08-01 14:56:30 UTC
Permalink
On Thu, Jul 31, 2014 at 5:22 PM, Nicholas A. Bellinger
Post by Nicholas A. Bellinger
https://github.com/rolandd/qla-target/commit/77bd729e5e9f858495dfded7c08e548bc9ecc9be
https://github.com/rolandd/qla-target/commit/79ae07a141950c0008480a46f91d72318a1d799e
Can you please confirm if these are related to what deeepdish has
reported here..?
Our latest tree doesn't use the btree -> radix tree conversion -- we
decided there is nothing wrong with the btree code itself. So from
that POV upstream is fine.

However the first (77bd...) commit may be related. Certainly the
comment is true: the qla2xxx target stuff is racy, especially around
fabric changes (initiators/sessions coming and going). It looks like
the original trace has to do with session update, so yeah, it probably
is hitting one of those races.

- R.
Jörn Engel
2014-08-01 16:13:24 UTC
Permalink
Post by Roland Dreier
On Thu, Jul 31, 2014 at 5:22 PM, Nicholas A. Bellinger
Roland + Joern, I'm specifically talking about the following patche=
https://github.com/rolandd/qla-target/commit/77bd729e5e9f858495dfde=
d7c08e548bc9ecc9be
Post by Roland Dreier
https://github.com/rolandd/qla-target/commit/79ae07a141950c0008480a=
46f91d72318a1d799e
Post by Roland Dreier
Can you please confirm if these are related to what deeepdish has
reported here..?
=20
Our latest tree doesn't use the btree -> radix tree conversion -- we
decided there is nothing wrong with the btree code itself. So from
that POV upstream is fine.
If memory serves, replacing the btree with a radix tree helped because
the radix tree is more robust against races. With a btree the entries
get moved left and right a lot and only rarely have NULL neighbors.
At the end of the day, the conversion allowed us to survive in spite
of other unfixed bugs a bit longer.

We should go through the list of our patches soon. I would be amazed
if there weren't some forgotten fixes we haven't sent to you yet. I
will take the blame for that.

J=C3=B6rn

--
The only good bug is a dead bug.
-- Starship Troopers
Nicholas A. Bellinger
2014-10-03 22:26:58 UTC
Permalink
Post by Roland Dreier
On Thu, Jul 31, 2014 at 5:22 PM, Nicholas A. Bellinger
Post by Nicholas A. Bellinger
https://github.com/rolandd/qla-target/commit/77bd729e5e9f858495dfded7c08e548bc9ecc9be
https://github.com/rolandd/qla-target/commit/79ae07a141950c0008480a46f91d72318a1d799e
Can you please confirm if these are related to what deeepdish has
reported here..?
Our latest tree doesn't use the btree -> radix tree conversion -- we
decided there is nothing wrong with the btree code itself. So from
that POV upstream is fine.
However the first (77bd...) commit may be related. Certainly the
comment is true: the qla2xxx target stuff is racy, especially around
fabric changes (initiators/sessions coming and going). It looks like
the original trace has to do with session update, so yeah, it probably
is hitting one of those races.
Just a heads up that commit 77bd729e5 from above has been included into
target-pending/for-next:

qla_target: don't delete changed nacls
https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?h=for-next&id=f4c24db1b7ad0ce84409e15744d26c6f86a96840

Deeepdish, it would be really helpful if you can verify the following
patch on your tcm_qla2xxx setup, and see if you can reproduce the loss
of FC connectivity with a Windows 2012 R2 initiator.

Note that Joern's patch still has a WARN_ON() in place when the scenario
occurs (eg: you'll still see a stack backtrace in dmesg when the issue
occurs), but now tcm_qla2xxx should be able to successfully recover.

Please let us know if/when your able to verify.

--nab

deeepdish
2014-07-08 20:03:16 UTC
Permalink
Hello,

I'm having a lot of stability issues with LIO presenting storage over
FC to ESXi. The log details are contained within this thread and
range from call traces on the TCM_QLA2xxx module to LUNs loosing
connectivity to ESXi. The hosts themselves are able to connect to
other storage presented off an HDS array without issue.

Our setup involves Fedora 20, Kernel 3.15.3 (latest)
HP BL490G6 Blade
1 x QHM2462
LVM backed block backstores
4 ESXi 5.5 update 1 nodes
is_not_rot=0 set on all block devices

I suspect its something we're doing wrong as there are successful
commercial products running LIO as their FC target without issue.

Are there any FC specific considerations beyond targetcli provisioning
that need to be considered when connecting ESXi?

targetcli setup is fairly vanilla:

targetcli shell version 2.1.fb35
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/backstores/b...L01-VMSR-0000> cd /
/> ls
o- /
.........................................................................................................................
[...]
o- backstores
..............................................................................................................
[...]
| o- block
..................................................................................................
[Storage Objects: 2]
| | o- LABCL01-VMSR-0000
.............................................
[/dev/data/LABCL01-VMSR-0000 (2.0TiB) write-thru activated]
| | o- LABCL01-VMSR-0001
.............................................
[/dev/data/LABCL01-VMSR-0001 (2.0TiB) write-thru activated]
| o- fileio
.................................................................................................
[Storage Objects: 0]
| o- pscsi
..................................................................................................
[Storage Objects: 0]
| o- ramdisk
................................................................................................
[Storage Objects: 0]
o- iscsi
............................................................................................................
[Targets: 0]
o- loopback
.........................................................................................................
[Targets: 0]
o- qla2xxx
..........................................................................................................
[Targets: 2]
| o- naa.5001438003be9560
.............................................................................................
[gen-acls]
| | o- acls
............................................................................................................
[ACLs: 4]
| | | o- naa.5001438003ba2f8c
...................................................................................
[Mapped LUNs: 1]
| | | | o- mapped_lun0
.......................................................................
[lun0 block/LABCL01-VMSR-0000 (rw)]
| | | o- naa.5001438003be9528
...................................................................................
[Mapped LUNs: 1]
| | | | o- mapped_lun1
.......................................................................
[lun1 block/LABCL01-VMSR-0001 (rw)]
| | | o- naa.5001438003be9530
...................................................................................
[Mapped LUNs: 1]
| | | | o- mapped_lun0
.......................................................................
[lun0 block/LABCL01-VMSR-0000 (rw)]
| | | o- naa.50014380072d6878
...................................................................................
[Mapped LUNs: 1]
| | | o- mapped_lun1
.......................................................................
[lun1 block/LABCL01-VMSR-0001 (rw)]
| | o- luns
............................................................................................................
[LUNs: 2]
| | o- lun0
............................................................
[block/LABCL01-VMSR-0000 (/dev/data/LABCL01-VMSR-0000)]
| | o- lun1
............................................................
[block/LABCL01-VMSR-0001 (/dev/data/LABCL01-VMSR-0001)]
| o- naa.5001438003be9562
.............................................................................................
[gen-acls]
| o- acls
............................................................................................................
[ACLs: 4]
| | o- naa.5001438003ba2f8e
...................................................................................
[Mapped LUNs: 1]
| | | o- mapped_lun1
.......................................................................
[lun1 block/LABCL01-VMSR-0001 (rw)]
| | o- naa.5001438003be9532
...................................................................................
[Mapped LUNs: 1]
| | | o- mapped_lun1
.......................................................................
[lun1 block/LABCL01-VMSR-0001 (rw)]
| | o- naa.5001438003be952a
...................................................................................
[Mapped LUNs: 1]
| | | o- mapped_lun0
.......................................................................
[lun0 block/LABCL01-VMSR-0000 (rw)]
| | o- naa.50014380072d687a
...................................................................................
[Mapped LUNs: 1]
| | o- mapped_lun0
.......................................................................
[lun0 block/LABCL01-VMSR-0000 (rw)]
| o- luns
............................................................................................................
[LUNs: 2]
| o- lun0
............................................................
[block/LABCL01-VMSR-0000 (/dev/data/LABCL01-VMSR-0000)]
| o- lun1
............................................................
[block/LABCL01-VMSR-0001 (/dev/data/LABCL01-VMSR-0001)]
o- vhost
............................................................................................................
[Targets: 0]


Thank you.
Loading...