[lttng-dev] current_thread_info() not respecting program order with gcc 4.8.x

Peter Zijlstra peterz at infradead.org
Tue Nov 19 10:57:49 EST 2013


On Tue, Nov 19, 2013 at 03:29:12PM +0000, Mathieu Desnoyers wrote:
> Hi,
> 
> I got a bug report on ARM which appears to be caused by an aggressive
> gcc optimisation starting from gcc 4.8.x due to lack of constraints on
> the current_thread_info() inline assembly. The only logical
> explanation for his issue I see so far is that the read of the
> preempt_count within might_sleep() is reordered with preempt_enable()
> or preempt_disable(). AFAIU, this kind of issue also applies to other
> architectures.
> 
> First thing, preempt enable/disable only contains barrier() between
> the inc/dec and the inside of the critical section, not the outside.
> Therefore, we are relying on program order to ensure that the
> preempt_count() read in might_sleep() is not reordered with the
> preempt count inc/dec.
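> 
> For illustration, the structure is roughly the following (a simplified
> sketch, not the exact kernel macros):
> 
>   #define preempt_disable() \
>   do { \
>           inc_preempt_count();    /* nothing orders this wrt code before */ \
>           barrier(); \
>   } while (0)
> 
>   #define preempt_enable() \
>   do { \
>           barrier(); \
>           dec_preempt_count();    /* nothing orders this wrt code after */ \
>   } while (0)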
> 
> However, looking at ARM arch/arm/include/asm/thread_info.h:
> 
> static inline struct thread_info *current_thread_info(void)
> {
> 	register unsigned long sp asm ("sp");
> 	return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
> }
> 
> The inline assembly has no clobber and is not volatile. (this is also
> true for all other architectures I've looked at so far, which includes
> x86 and powerpc)
> 
> As long as someone does:
> 
> struct thread_info *myinfo = current_thread_info();
> 
> load from myinfo->preempt_count;
> store to myinfo->preempt_count;
> 
> The program order should be preserved, because the read and the write
> are done wrt the same base. However, what we have here in the case of
> might_sleep() followed by preempt_enable() is:
> 
> load from current_thread_info()->preempt_count;
> store to current_thread_info()->preempt_count;
> 
> Since each current_thread_info() is a different asm ("sp") without
> clobber nor volatile, AFAIU, the compiler is within its rights to
> reorder them.
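> 
> Spelled out, after inlining by hand, the sequence looks roughly like
> this (sketch):
> 
>   {       /* read side, from might_sleep() */
>           register unsigned long sp asm ("sp");
>           struct thread_info *ti = (void *)(sp & ~(THREAD_SIZE - 1));
>           /* ... load ti->preempt_count ... */
>   }
>   {       /* write side, from preempt_enable() */
>           register unsigned long sp asm ("sp");
>           struct thread_info *ti = (void *)(sp & ~(THREAD_SIZE - 1));
>           /* ... store to ti->preempt_count ... */
>   }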
> 
> One possible solution to this might be to add "memory" clobber and
> volatile to this inline asm, but I fear it would put way too many
> constraints on the compiler optimizations (too heavyweight).
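> 
> Something along these lines is what I have in mind, i.e. read sp
> through an explicit volatile asm with a "memory" clobber (untested
> sketch, ARM only):
> 
>   static inline struct thread_info *current_thread_info(void)
>   {
>           unsigned long sp;
> 
>           asm volatile ("mov %0, sp" : "=r" (sp) : : "memory");
>           return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
>   }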
> 
> Have any of you experienced this issue? Any thoughts on the matter?
> 
> I'm providing the original bug report email exchange below. I had
> confirmation that adding barrier() outside
> preempt_enable()/preempt_disable() does indeed work around the issue,
> but I fear that this work-around would leave the
> current_thread_info() semantic gap in place: people expect program
> order to be respected.
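> 
> I.e. the confirmed work-around boils down to something like this
> (sketch, whether done in the macros themselves or at the call sites):
> 
>   barrier();
>   preempt_disable();
>   /* ... */
>   preempt_enable();
>   barrier();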

This all brings to mind commit 509eb76ebf977 ("ARM: 7747/1: pcpu: ensure
__my_cpu_offset cannot be re-ordered across barrier()").

GCC seems overly happy with reordering asm() thingies.

Ideally we'd be able to tell gcc that:

  register unsigned long *sp asm ("sp")

not only acts like a variable but should be treated like a variable and
thus such reshuffling should be limited to preserve program order.

There doesn't seem to be any __attribute__() that can do this, and
currently our only recourse seems to be to add either volatile, which
seems overly strict as Mathieu said, or a fake input like Will did.
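
For reference, Will's fake-input trick looks roughly like this (quoting
from memory, so the details may be slightly off):

  static inline unsigned long __my_cpu_offset(void)
  {
          unsigned long off;
          register unsigned long *sp asm ("sp");

          /*
           * The "Q" (*sp) operand makes the asm appear to read memory
           * through sp, so it hazards against barrier()'s "memory"
           * clobber without having to be volatile.
           */
          asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));

          return off;
  }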


