[ltt-dev] [PATCH 3/4] Add native ARM port for armv7l
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Wed Jun 16 19:57:55 EDT 2010
* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Wed, Jun 16, 2010 at 05:23:15PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> > > Add native support for armv7l. Other variants of ARM will likely require
> > > separate ports.
> > >
> > > Signed-off-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> > > ---
> > > configure.ac | 4 +++
> > > urcu/arch_armv7l.h | 59 ++++++++++++++++++++++++++++++++++++++++++++
> > > urcu/uatomic_arch_armv7l.h | 48 +++++++++++++++++++++++++++++++++++
>
> [ . . . ]
>
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +/* xchg */
> > > +#define uatomic_xchg(addr, v) __sync_lock_test_and_set(addr, v)
> > > +
> > > +/* cmpxchg */
> > > +#define uatomic_cmpxchg(addr, old, _new) \
> > > + __sync_val_compare_and_swap(addr, old, _new)
> > > +
> > > +/* uatomic_add_return */
> > > +#define uatomic_add_return(addr, v) __sync_add_and_fetch(addr, v)
> >
> > So, do we end up trusting that gcc got the memory barriers right in the ARM
> > __sync_() primitives ? That sounds unlikely.
> >
> > I'd vote for surrounding these primitives with smp_mb().
>
> On ARM, my current belief is that the primitives other than
> __sync_synchronize() and __sync_lock_release() are set up correctly.
>
> However, I must defer to Paolo and Uli on this.
There is nothing like a quick test to see the result:
With an arm-linux-cs2009q1-203sb1 scratchbox compiler (gcc 4.3.3, provided by
Nokia for the OMAP3):
arm-none-linux-gnueabi-gcc-4.3.3 (Sourcery G++ Lite 2009q1-203) 4.3.3
I compile, with
/scratchbox/compilers/arm-linux-cs2009q1-203sb1/bin/arm-none-linux-gnueabi-gcc-4.3.3 -mcpu=cortex-a9 -mtune=cortex-a9 -O2 -o armtest armtest.c
the following program:
int a;

int
f()
{
	__sync_val_compare_and_swap(&a, 4, 1);
	//__sync_lock_test_and_set(&a, 1);
	//__sync_add_and_fetch(&a, 1);
	//__sync_synchronize();
}

int main()
{
	f();
}
and get:
/scratchbox/compilers/arm-linux-cs2009q1-203sb1/bin/arm-none-linux-gnueabi-objdump -S armtest
[...]
000083cc <f>:
83cc: e59f0008 ldr r0, [pc, #8] ; 83dc <f+0x10>
83d0: e3a01004 mov r1, #4 ; 0x4
83d4: e3a02001 mov r2, #1 ; 0x1
83d8: ea000305 b 8ff4 <__sync_val_compare_and_swap_4>
83dc: 00011524 .word 0x00011524
[...]
00008ff4 <__sync_val_compare_and_swap_4>:
8ff4: e92d41f0 push {r4, r5, r6, r7, r8, lr}
8ff8: e59f8034 ldr r8, [pc, #52] ; 9034 <__sync_val_compare_and_swap_4+0x40>
8ffc: e1a06000 mov r6, r0
9000: e1a05001 mov r5, r1
9004: e1a07002 mov r7, r2
9008: e5964000 ldr r4, [r6]
900c: e1a00005 mov r0, r5
9010: e1550004 cmp r5, r4
9014: e1a01007 mov r1, r7
9018: e1a02006 mov r2, r6
901c: 1a000002 bne 902c <__sync_val_compare_and_swap_4+0x38>
9020: e12fff38 blx r8
9024: e3500000 cmp r0, #0 ; 0x0
9028: 1afffff6 bne 9008 <__sync_val_compare_and_swap_4+0x14>
902c: e1a00004 mov r0, r4
9030: e8bd81f0 pop {r4, r5, r6, r7, r8, pc}
9034: ffff0fc0 .word 0xffff0fc0
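For readers who prefer C to ARM assembly, the loop in
__sync_val_compare_and_swap_4 can be read roughly as the sketch below. The
names cas_helper_t, standin_helper and cas_loop_sketch are made up for
illustration and appear nowhere in gcc or liburcu. The routine reached through
"blx r8" sits at 0xffff0fc0, which matches the documented location of the ARM
Linux kernel user helper __kuser_cmpxchg; its body is not part of this listing,
so the sketch takes it as a function-pointer parameter and uses a non-atomic
stand-in.

```c
/* Hypothetical type for the routine reached via "blx r8"; the argument
 * order follows r0, r1, r2 as set up in the listing. */
typedef int (*cas_helper_t)(int oldval, int newval, volatile int *ptr);

/* Stand-in helper so the sketch runs anywhere. NOT atomic; it only mimics
 * the return convention visible in the listing (0 = stored, nonzero = retry). */
static int standin_helper(int oldval, int newval, volatile int *ptr)
{
	if (*ptr != oldval)
		return 1;
	*ptr = newval;
	return 0;
}

/* C reading of the instructions at 9008..902c. */
static int cas_loop_sketch(volatile int *ptr, int oldval, int newval,
			   cas_helper_t helper)
{
	int cur;

	for (;;) {
		cur = *ptr;			/* ldr r4, [r6] */
		if (cur != oldval)		/* cmp r5, r4 ; bne 902c */
			break;
		if (helper(oldval, newval, ptr) == 0)	/* blx r8 ; cmp r0, #0 */
			break;
		/* nonzero return: bne 9008, reload and try again */
	}
	return cur;				/* mov r0, r4 */
}
```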
In the generated code above, sadly, the appropriate memory barriers are
missing, and even the appropriate ldrex/teq/strexeq sequence is missing. So not
only is this incorrect in terms of memory barriers, but also in terms of
atomicity. Argh. I don't know whether a compiler more recent than 4.3.3 would
do better, but I am starting to think that it would be wise to stay far away
from the gcc __sync_*() primitives, on ARM at least.
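As a rough illustration of the "surround these primitives with smp_mb()" idea
raised earlier in this thread, a sketch could look like the following. The
sketch_*() names are hypothetical, not the liburcu API, and the smp_mb()
definition is an assumption: an explicit "dmb" when building for ARM (which
requires an ARMv7 target), and the compiler builtin elsewhere. This is not the
patch under review.

```c
/* Assumed full barrier; the real liburcu definition may differ. */
#if defined(__arm__)
#define smp_mb()	__asm__ __volatile__("dmb" ::: "memory")
#else
#define smp_mb()	__sync_synchronize()
#endif

/* Hypothetical wrappers: a full barrier before and after each primitive.
 * They rely on gcc statement expressions and __typeof__. */
#define sketch_cmpxchg(addr, old, _new)					\
	({								\
		__typeof__(*(addr)) _ret;				\
		smp_mb();						\
		_ret = __sync_val_compare_and_swap(addr, old, _new);	\
		smp_mb();						\
		_ret;							\
	})

#define sketch_xchg(addr, v)						\
	({								\
		__typeof__(*(addr)) _ret;				\
		smp_mb();						\
		_ret = __sync_lock_test_and_set(addr, v);		\
		smp_mb();						\
		_ret;							\
	})

#define sketch_add_return(addr, v)					\
	({								\
		__typeof__(*(addr)) _ret;				\
		smp_mb();						\
		_ret = __sync_add_and_fetch(addr, v);			\
		smp_mb();						\
		_ret;							\
	})
```

The barriers only address ordering; they do nothing for atomicity if the
underlying primitive itself is not atomic.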
Thanks,
Mathieu
>
> > Thanks,
> >
> > Mathieu
> >
> > > +
> > > +#ifdef __cplusplus
> > > +}
> > > +#endif
> > > +
> > > +#include <urcu/uatomic_generic.h>
> > > +
> > > +#endif /* _URCU_ARCH_UATOMIC_ARMV7L_H */
> > > --
> > > 1.7.0.6
> > >
> >
> > --
> > Mathieu Desnoyers
> > Operating System Efficiency R&D Consultant
> > EfficiOS Inc.
> > http://www.efficios.com
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com