[ltt-dev] cli/sti vs local_cmpxchg and local_add_return
Nick Piggin
nickpiggin at yahoo.com.au
Tue Mar 17 02:05:35 EDT 2009
On Tuesday 17 March 2009 12:32:20 Mathieu Desnoyers wrote:
> Hi,
>
> I am trying to get access to some non-x86 hardware to run some atomic
> primitive benchmarks for a paper on LTTng I am preparing. That should be
> useful to argue about performance benefit of per-cpu atomic operations
> vs interrupt disabling. I would like to run the following benchmark
> module on CONFIG_SMP :
>
> - PowerPC
> - MIPS
> - ia64
> - alpha
>
> usage :
> make
> insmod test-cmpxchg-nolock.ko
> insmod: error inserting 'test-cmpxchg-nolock.ko': -1 Resource temporarily unavailable
> dmesg (see dmesg output)
>
> If some of you would be kind enough to run my test module provided below
> and provide the results of these tests on a recent kernel (2.6.26~2.6.29
> should be good) along with their cpuinfo, I would greatly appreciate it.
>
> Here are the CAS results for various Intel-based architectures :
>
> Architecture        | Speedup                     | CAS          | Interrupts
>                     | (cli + sti) / local cmpxchg | local | sync | Enable (sti) | Disable (cli)
> -----------------------------------------------------------------------------------------------
> Intel Pentium 4     |                        5.24 |    25 |   81 |           70 |            61
> AMD Athlon(tm)64 X2 |                        4.57 |     7 |   17 |           17 |            15
> Intel Core2         |                        6.33 |     6 |   30 |           20 |            18
> Intel Xeon E5405    |                        5.25 |     8 |   24 |           20 |            22
>
> The benefit expected on PowerPC, ia64 and alpha should principally come
> from removed memory barriers in the local primitives.
Benefit versus what? I think all of those architectures can do SMP
atomic compare exchange sequences without barriers, can't they?
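
As a sanity check on the quoted table, the speedup column can be reproduced as (sti + cli) / local cmpxchg. Below is a minimal Python sketch with the figures copied from the table. The names RESULTS, speedup and speedup_table are illustrative only and not part of the benchmark module, and the units are whatever the benchmark reported.

```python
# Figures copied from the quoted table.
# Each entry: (local cmpxchg, sync cmpxchg, sti, cli)
RESULTS = {
    "Intel Pentium 4": (25, 81, 70, 61),
    "AMD Athlon(tm)64 X2": (7, 17, 17, 15),
    "Intel Core2": (6, 30, 20, 18),
    "Intel Xeon E5405": (8, 24, 20, 22),
}


def speedup(local_cas, sti, cli):
    """Cost of one interrupt disable/enable pair relative to one local cmpxchg."""
    return (cli + sti) / local_cas


def speedup_table(results):
    """Return the speedup per architecture, rounded to two decimals."""
    return {
        arch: round(speedup(local, sti, cli), 2)
        for arch, (local, _sync, sti, cli) in results.items()
    }


if __name__ == "__main__":
    for arch, ratio in speedup_table(RESULTS).items():
        print(f"{arch:<20} {ratio:.2f}")
```

The sync cmpxchg column is carried along but unused here, since the speedup only compares the cli/sti pair against the local variant.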