[ltt-dev] [PATCH 3/4] Add native ARM port for armv7l

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Wed Jun 16 19:57:55 EDT 2010


* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Wed, Jun 16, 2010 at 05:23:15PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> > > Add native support for armv7l.  Other variants of ARM will likely require
> > > separate ports.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> > > ---
> > >  configure.ac               |    4 +++
> > >  urcu/arch_armv7l.h         |   59 ++++++++++++++++++++++++++++++++++++++++++++
> > >  urcu/uatomic_arch_armv7l.h |   48 +++++++++++++++++++++++++++++++++++
> 
> [ . . . ]
> 
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif 
> > > +
> > > +/* xchg */
> > > +#define uatomic_xchg(addr, v) __sync_lock_test_and_set(addr, v)
> > > +
> > > +/* cmpxchg */
> > > +#define uatomic_cmpxchg(addr, old, _new) \
> > > +	__sync_val_compare_and_swap(addr, old, _new)
> > > +
> > > +/* uatomic_add_return */
> > > +#define uatomic_add_return(addr, v) __sync_add_and_fetch(addr, v)
> > 
> > So, do we end up trusting that gcc got the memory barriers right in the ARM
> > __sync_() primitives ? That sounds unlikely.
> > 
> > I'd vote for surrounding these primitives with smp_mb().
> 
> On ARM, my current belief is that the primitives other than
> __sync_synchronize() and __sync_lock_release() are set up correctly.
> 
> However, I must defer to Paolo and Uli on this.

There is nothing like a quick test to see the result:

With an arm-linux-cs2009q1-203sb1 scratchbox compiler (gcc 4.3.3, provided by
Nokia for the Omap3):

arm-none-linux-gnueabi-gcc-4.3.3 (Sourcery G++ Lite 2009q1-203) 4.3.3

I compile with

/scratchbox/compilers/arm-linux-cs2009q1-203sb1/bin/arm-none-linux-gnueabi-gcc-4.3.3 -mcpu=cortex-a9 -mtune=cortex-a9 -O2 -o armtest armtest.c

the following program:

int a;

int
f()
{
        __sync_val_compare_and_swap(&a, 4, 1);
        //__sync_lock_test_and_set(&a, 1);
        //__sync_add_and_fetch(&a, 1);
        //__sync_synchronize();
}

int main()
{
        f();
}

and get:

/scratchbox/compilers/arm-linux-cs2009q1-203sb1/bin/arm-none-linux-gnueabi-objdump -S armtest

[...]


000083cc <f>:
    83cc:       e59f0008        ldr     r0, [pc, #8]    ; 83dc <f+0x10>
    83d0:       e3a01004        mov     r1, #4  ; 0x4
    83d4:       e3a02001        mov     r2, #1  ; 0x1
    83d8:       ea000305        b       8ff4 <__sync_val_compare_and_swap_4>
    83dc:       00011524        .word   0x00011524

[...]

00008ff4 <__sync_val_compare_and_swap_4>:
    8ff4:       e92d41f0        push    {r4, r5, r6, r7, r8, lr}
    8ff8:       e59f8034        ldr     r8, [pc, #52]   ; 9034 <__sync_val_compare_and_swap_4+0x40>
    8ffc:       e1a06000        mov     r6, r0
    9000:       e1a05001        mov     r5, r1
    9004:       e1a07002        mov     r7, r2
    9008:       e5964000        ldr     r4, [r6]
    900c:       e1a00005        mov     r0, r5
    9010:       e1550004        cmp     r5, r4
    9014:       e1a01007        mov     r1, r7
    9018:       e1a02006        mov     r2, r6
    901c:       1a000002        bne     902c <__sync_val_compare_and_swap_4+0x38>
    9020:       e12fff38        blx     r8
    9024:       e3500000        cmp     r0, #0  ; 0x0
    9028:       1afffff6        bne     9008 <__sync_val_compare_and_swap_4+0x14>
    902c:       e1a00004        mov     r0, r4
    9030:       e8bd81f0        pop     {r4, r5, r6, r7, r8, pc}
    9034:       ffff0fc0        .word   0xffff0fc0

Sadly, the appropriate memory barriers are missing here, and even the
appropriate ldrex/teq/strexeq sequence is missing. So this is incorrect not
only in terms of memory barriers, but also in terms of atomicity. Argh. I
don't know whether a compiler more recent than 4.3.3 would do better, but I am
starting to think that it would be wise to stay far away from the gcc
__sync_*() primitives. For ARM at least.
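
For illustration, here is roughly what surrounding the primitives with
smp_mb(), as suggested earlier in this thread, could look like. This is only a
sketch: the sketch_* names are made up for this example and are not from the
patch, the explicit "dmb" is assumed to be the right full barrier on ARMv7,
and the fallback to __sync_synchronize() is there only so the file builds on a
non-ARM host. It deals with ordering only; it does nothing about atomicity if
the underlying __sync_*() primitive is itself broken.

```c
#include <assert.h>

/*
 * Hypothetical full barrier (name not from the patch). On ARMv7-A this emits
 * an explicit "dmb"; on other targets it falls back to __sync_synchronize()
 * so that the sketch still compiles there.
 */
#if defined(__ARM_ARCH_7A__)
#define sketch_smp_mb()	__asm__ __volatile__("dmb" ::: "memory")
#else
#define sketch_smp_mb()	__sync_synchronize()
#endif

/* xchg: returns the previous value, with a full barrier on each side. */
#define sketch_xchg_mb(addr, v)						\
	({								\
		__typeof__(*(addr)) _ret;				\
		sketch_smp_mb();					\
		_ret = __sync_lock_test_and_set(addr, v);		\
		sketch_smp_mb();					\
		_ret;							\
	})

/* cmpxchg: returns the previous value, with a full barrier on each side. */
#define sketch_cmpxchg_mb(addr, old, _new)				\
	({								\
		__typeof__(*(addr)) _ret;				\
		sketch_smp_mb();					\
		_ret = __sync_val_compare_and_swap(addr, old, _new);	\
		sketch_smp_mb();					\
		_ret;							\
	})

/* add_return: returns the new value, with a full barrier on each side. */
#define sketch_add_return_mb(addr, v)					\
	({								\
		__typeof__(*(addr)) _ret;				\
		sketch_smp_mb();					\
		_ret = __sync_add_and_fetch(addr, v);			\
		sketch_smp_mb();					\
		_ret;							\
	})

static int sketch_counter;

/* Exercises each wrapper once; returns 0 if the return values are as expected. */
static int sketch_demo(void)
{
	sketch_counter = 4;
	/* cmpxchg returns the old value (4) and stores 1 because old matched. */
	if (sketch_cmpxchg_mb(&sketch_counter, 4, 1) != 4 || sketch_counter != 1)
		return -1;
	/* xchg returns the previous value (1) and stores 7. */
	if (sketch_xchg_mb(&sketch_counter, 7) != 1 || sketch_counter != 7)
		return -1;
	/* add_return returns the new value (7 + 3). */
	if (sketch_add_return_mb(&sketch_counter, 3) != 10)
		return -1;
	return 0;
}
```

Calling sketch_demo() from main() checks only the single-threaded return
values. It says nothing about whether the generated code is actually atomic or
correctly ordered on a given compiler, so looking at the objdump output, as
above, is still needed.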

Thanks,

Mathieu


> 
> > Thanks,
> > 
> > Mathieu
> > 
> > > +
> > > +#ifdef __cplusplus 
> > > +}
> > > +#endif
> > > +
> > > +#include <urcu/uatomic_generic.h>
> > > +
> > > +#endif /* _URCU_ARCH_UATOMIC_ARMV7L_H */
> > > -- 
> > > 1.7.0.6
> > > 
> > 
> > -- 
> > Mathieu Desnoyers
> > Operating System Efficiency R&D Consultant
> > EfficiOS Inc.
> > http://www.efficios.com

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



