[lttng-dev] Segfault at v_read() called from lib_ring_buffer_try_reserve_slow() in LTTng-UST traced app - CPU/VMware dependent

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Sep 3 11:25:47 EDT 2015


----- On Sep 2, 2015, at 10:14 PM, David OShea David.OShea at quantum.com wrote:

> For the record, it appears that upgrading from VMware ESXi version 5.0.0, build 469512,
> to version 5.5.0, build 2068190 ("Update 2") resolved this issue.  However, we had
> other hosts running version 5.1.0, build 799733, which should have been set to the
> same CPU architecture (Nehalem) and which didn't have the issue, so presumably the
> fix was already included in that version.

That's good news, thanks for letting us know!

Mathieu

> 
> Thanks,
> David
> 
>> -----Original Message-----
>> From: Mathieu Desnoyers [mailto:mathieu.desnoyers at efficios.com]
>> Sent: Thursday, 15 January 2015 1:21 PM
>> To: David OShea
>> Cc: lttng-dev
>> Subject: Re: [lttng-dev] Segfault at v_read() called from
>> lib_ring_buffer_try_reserve_slow() in LTTng-UST traced app - CPU/VMware
>> dependent
>> 
>> ----- Original Message -----
>> > From: "David OShea" <David.OShea at quantum.com>
>> > To: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>
>> > Cc: "lttng-dev" <lttng-dev at lists.lttng.org>
>> > Sent: Wednesday, January 14, 2015 9:45:01 PM
>> > Subject: RE: [lttng-dev] Segfault at v_read() called from
>> lib_ring_buffer_try_reserve_slow() in LTTng-UST traced app
>> > - CPU/VMware dependent
>> >
>> > Hi Mathieu,
>> >
>> > > -----Original Message-----
>> > > From: Mathieu Desnoyers [mailto:mathieu.desnoyers at efficios.com]
>> > > Sent: Tuesday, 13 January 2015 2:06 AM
>> > > To: David OShea
>> > > Cc: lttng-dev
>> > > Subject: Re: [lttng-dev] Segfault at v_read() called from
>> > > lib_ring_buffer_try_reserve_slow() in LTTng-UST traced app -
>> CPU/VMware
>> > > dependent
>> > [...]
>> > > > > Is it possible that this is an issue in LTTng, or should I work out
>> > > > > how the kernel works out which CPU it is running on and then look
>> > > > > into whether there are any VMware bugs in this area?
>> > > >
>> > > > This appears to be very likely a VMware bug. /proc/cpuinfo should
>> > > > show 4 CPUs (and sysconf(_SC_NPROCESSORS_CONF) should return 4) if
>> > > > the current CPU number can be 0, 1, 2, 3 throughout execution.
>> >
>> > /proc/cpuinfo shows two CPUs:
>> >
>> > processor       : 0
>> > vendor_id       : GenuineIntel
>> > cpu family      : 6
>> > model           : 26
>> > model name      : Intel(R) Xeon(R) CPU           X7550  @ 2.00GHz
>> > stepping        : 4
>> > microcode       : 8
>> > cpu MHz         : 1995.000
>> > cache size      : 18432 KB
>> > physical id     : 0
>> > siblings        : 1
>> > core id         : 0
>> > cpu cores       : 1
>> > apicid          : 0
>> > initial apicid  : 0
>> > fpu             : yes
>> > fpu_exception   : yes
>> > cpuid level     : 11
>> > wp              : yes
>> > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm ida dts
>> > bogomips        : 3990.00
>> > clflush size    : 64
>> > cache_alignment : 64
>> > address sizes   : 40 bits physical, 48 bits virtual
>> > power management:
>> >
>> > processor       : 1
>> > vendor_id       : GenuineIntel
>> > cpu family      : 6
>> > model           : 26
>> > model name      : Intel(R) Xeon(R) CPU           X7550  @ 2.00GHz
>> > stepping        : 4
>> > microcode       : 8
>> > cpu MHz         : 1995.000
>> > cache size      : 18432 KB
>> > physical id     : 2
>> > siblings        : 1
>> > core id         : 0
>> > cpu cores       : 1
>> > apicid          : 2
>> > initial apicid  : 2
>> > fpu             : yes
>> > fpu_exception   : yes
>> > cpuid level     : 11
>> > wp              : yes
>> > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm ida dts
>> > bogomips        : 3990.00
>> > clflush size    : 64
>> > cache_alignment : 64
>> > address sizes   : 40 bits physical, 48 bits virtual
>> > power management:
>> >
>> > > You might want to look at the sysconf(3) manpage, especially the
>> > > parts about _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN. My guess
>> > > is that vmware is lying about the number of "possible" CPUs
>> > > (_SC_NPROCESSORS_CONF).
>> >
>> > _SC_NPROCESSORS_CONF = 2
>> > _SC_NPROCESSORS_ONLN = 2
>> >
>> > Thanks for the pointers, I will look into possible VMware bugs.
>> >
>> > Out of curiosity, what happens if I happened to have a system with
>> > hot-pluggable CPUs - does _SC_NPROCESSORS_CONF reflect the maximum
>> > number of CPUs I can insert, and that is how many LTTng will support?
>> 
>> Yes, exactly.
>> 
>> Thanks,
>> 
>> Mathieu
>> 
>> >
>> > Thanks,
>> > David
>> >
>> >
>> 
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


