[lttng-dev] [PATCH 2/7] Use gcc __atomic builtins for <urcu/uatomic.h> implementation
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Tue Mar 21 16:26:29 EDT 2023
On 2023-03-20 15:38, Duncan Sands via lttng-dev wrote:
> Hi Mathieu,
>
>> While OK for the general case, I would recommend that we immediately
>> implement something more efficient on x86 32/64 which takes into
>> account that __ATOMIC_ACQ_REL atomic operations are implemented with
>> LOCK prefixed atomic ops, which imply the barrier already, leaving the
>> before/after_uatomic_*() as no-ops.
>
> maybe first check whether the GCC optimizers merge them. I believe some
> optimizations of atomic primitives are allowed and implemented, but I
> couldn't say which ones.
>
> Best wishes, Duncan.
Tested on godbolt.org with:
int a;

void fct(void)
{
	(void) __atomic_add_fetch(&a, 1, __ATOMIC_RELAXED);
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

x86-64 gcc 12.2 -O2 -std=c11:
fct:
        lock add        DWORD PTR a[rip], 1
        lock or QWORD PTR [rsp], 0
        ret
a:
        .zero   4

x86-64 clang 16.0.0 -O2 -std=c11:
fct:                                    # @fct
        lock inc        dword ptr [rip + a]
        mfence
        ret
a:
        .long   0

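So neither gcc nor clang merges the fence into the preceding LOCK-prefixed
instruction today: each emits a second full barrier (a dummy "lock or" for
gcc, "mfence" for clang). Hence the need for an x86-specific implementation.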
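
As a rough sketch of what such an x86-specific variant could look like
(the sketch_* names are hypothetical, not the actual liburcu macros, and
this is not the code from the patch series): on x86 the barriers around
the read-modify-write reduce to compiler barriers, and other
architectures keep the explicit fence. Note that the x86 case still needs
a compiler barrier on both sides, because a relaxed atomic does not
prevent the compiler itself from moving surrounding accesses across it.

```c
/*
 * Hypothetical helpers for illustration only.
 * Assumption: on x86, every atomic read-modify-write is emitted as a
 * LOCK-prefixed instruction, which already acts as a full memory barrier.
 */
#if defined(__i386__) || defined(__x86_64__)
/* Compiler barrier only: the LOCK prefix provides the hardware ordering. */
# define sketch_mb_before_rmw()	__asm__ __volatile__ ("" ::: "memory")
# define sketch_mb_after_rmw()	__asm__ __volatile__ ("" ::: "memory")
#else
/* Generic fallback: explicit full fence on each side of the operation. */
# define sketch_mb_before_rmw()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
# define sketch_mb_after_rmw()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Atomically add v to *addr with full-barrier semantics, return the new value. */
static int sketch_add_return_mb(int *addr, int v)
{
	int ret;

	sketch_mb_before_rmw();
	ret = __atomic_add_fetch(addr, v, __ATOMIC_RELAXED);
	sketch_mb_after_rmw();
	return ret;
}
```

With something along these lines, the x86 output for the test case above
would be expected to contain only the single LOCK-prefixed instruction,
without the extra "lock or" or "mfence".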
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com