[lttng-dev] [PATCH 2/7] Use gcc __atomic builtins for <urcu/uatomic.h> implementation

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Tue Mar 21 16:26:29 EDT 2023


On 2023-03-20 15:38, Duncan Sands via lttng-dev wrote:
> Hi Mathieu,
> 
>> While OK for the general case, I would recommend that we immediately 
>> implement something more efficient on x86 32/64 which takes into 
>> account that __ATOMIC_ACQ_REL atomic operations are implemented with 
>> LOCK prefixed atomic ops, which imply the barrier already, leaving the 
>> before/after_uatomic_*() as no-ops.
> 
> maybe first check whether the GCC optimizers merge them.  I believe some 
> optimizations of atomic primitives are allowed and implemented, but I 
> couldn't say which ones.
> 
> Best wishes, Duncan.

Tested on godbolt.org with:

int a;

void fct(void)
{
     (void) __atomic_add_fetch(&a, 1, __ATOMIC_RELAXED);
     __atomic_thread_fence(__ATOMIC_SEQ_CST);
}

x86-64 gcc 12.2 -O2 -std=c11:

fct:
         lock add        DWORD PTR a[rip], 1
         lock or         QWORD PTR [rsp], 0
         ret
a:
         .zero   4

x86-64 clang 16.0.0 -O2 -std=c11:

fct:                                    # @fct
         lock            inc     dword ptr [rip + a]
         mfence
         ret
a:
         .long   0

So neither gcc nor clang folds the redundant fence into the preceding 
LOCK-prefixed instruction today, hence the need for an x86-specific 
implementation.
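
To illustrate the idea, here is a minimal sketch of what such an 
x86-specific path could look like. All sketch_* names are hypothetical and 
are not the actual liburcu macros. It assumes gcc or clang, and relies on 
the fact that LOCK-prefixed read-modify-write instructions already act as 
full memory barriers on x86, so only a compiler barrier is kept around the 
atomic operation there.

```c
/* Hypothetical sketch, not the liburcu implementation. */
#if defined(__i386__) || defined(__x86_64__)
/*
 * On x86 the LOCK-prefixed instruction emitted for the atomic add is
 * already a full hardware barrier. Only prevent the compiler from
 * reordering memory accesses across it.
 */
# define sketch_mb_before_uatomic()	__asm__ __volatile__ ("" : : : "memory")
# define sketch_mb_after_uatomic()	__asm__ __volatile__ ("" : : : "memory")
#else
/* Generic fallback: emit a full fence on each side. */
# define sketch_mb_before_uatomic()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
# define sketch_mb_after_uatomic()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

static int sketch_counter;

/* Increment sketch_counter with full-barrier semantics on both sides. */
static inline void sketch_inc_full_barrier(void)
{
	sketch_mb_before_uatomic();
	(void) __atomic_add_fetch(&sketch_counter, 1, __ATOMIC_RELAXED);
	sketch_mb_after_uatomic();
}
```

With this shape, the x86 build should emit only the LOCK-prefixed 
instruction, without the extra "lock or" or "mfence" seen above, while 
other architectures keep the explicit fences.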

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


