[ltt-dev] UST clock rdtsc vs clock_gettime
David Goulet
david.goulet at polymtl.ca
Wed Jul 7 11:28:01 EDT 2010
On 10-07-06 03:39 PM, Nils Carlson wrote:
> Cool, so the measurements came through...
>
I've retested the UST per-event time with the commit made a few days ago
that fixes the custom probes and cache-line alignment. Here are the results
for the TSC counter and clock_gettime (test run 1000 times on an i7):
rdtsc:
  Average:            0.000000242229708 sec (242.22971 nsec)
  Standard deviation: 0.000000001663147 sec (1.66315 nsec)
clock_gettime:
  Average:            0.000000272516616 sec (272.51662 nsec)
  Standard deviation: 0.000000002340784 sec (2.34078 nsec)
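For anyone who wants to compare the raw cost of the two clock sources
outside of UST, a minimal sketch could look like the following. This is not
the benchmark that produced the numbers above (those are per-event times
measured through UST). The function names are made up for illustration, and
CLOCK_MONOTONIC is an assumption about which clock to compare against.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds from clock_gettime; CLOCK_MONOTONIC is an assumption here. */
static uint64_t read_clock_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Raw TSC value in cycles (not nanoseconds) on x86; falls back elsewhere. */
static uint64_t read_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return read_clock_ns();
#endif
}

/* Average wall-clock cost, in ns, of one call to fn over iters calls. */
static double avg_cost_ns(uint64_t (*fn)(void), unsigned long iters)
{
    volatile uint64_t sink = 0; /* keeps the calls from being optimized out */
    uint64_t start, end;
    unsigned long i;

    assert(iters > 0);
    start = read_clock_ns();
    for (i = 0; i < iters; i++)
        sink += fn();
    end = read_clock_ns();
    return (double)(end - start) / (double)iters;
}
```

Calling avg_cost_ns(read_tsc, 1000000) and avg_cost_ns(read_clock_ns, 1000000)
gives the two averages to compare. On older glibc versions, clock_gettime
requires linking with -lrt.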
> What I would like to see is the automatic detection of whether the rdtsc
> instruction is usable,
> a test for this already exists in the kernel and the question is whether
> this info is currently exported
> or whether we need to submit a patch to export it.
>
From userspace, testing this would be a syscall via prctl, right? The
problem is that the information is needed at compile time. Right now, the
__i386__ and __x86_64__ defines are tested. At compile time, it would be
great to have something like a TSC_AVAILABLE define and then compile the
right function (either clock_gettime or rdtsc).
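As a rough sketch of both halves of this: prctl(PR_GET_TSC) exists on
Linux/x86 and reports whether the process is allowed to execute rdtsc, but
it is a runtime query and says nothing about whether the counters are
synchronized between CPUs. TSC_AVAILABLE is the hypothetical define
mentioned above (for example passed as -DTSC_AVAILABLE by a build-time
check); tsc_permitted and trace_clock_read are illustrative names, not
existing UST functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <sys/prctl.h>

/*
 * Runtime query: 1 if this process may execute rdtsc, 0 if rdtsc would
 * raise SIGSEGV, -1 if the kernel cannot tell us (non-x86, old kernel).
 */
static int tsc_permitted(void)
{
#ifdef PR_GET_TSC
    int state = 0;

    if (prctl(PR_GET_TSC, &state) != 0)
        return -1;
    return state == PR_TSC_ENABLE;
#else
    return -1;
#endif
}

/*
 * Compile-time selection. TSC_AVAILABLE is a hypothetical define; without
 * it, or on a non-x86 target, the clock_gettime version is compiled.
 */
#if defined(TSC_AVAILABLE) && (defined(__i386__) || defined(__x86_64__))
static inline uint64_t trace_clock_read(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo; /* cycles, not nanoseconds */
}
#else
static inline uint64_t trace_clock_read(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts); /* clock id is an assumption */

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
#endif
```

Note that the two variants do not return the same unit (cycles versus
nanoseconds), so anything reading the trace would need to know which one
was compiled in.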
However, there are some consistency issues with using the TSC, for example
between the per-CPU counters... so I think we need to be very careful about
that, even if the per-event time is about 30 ns lower and much more _stable_
(see the standard deviation).
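One hedged way to be careful here on Linux is to look for the constant_tsc
and nonstop_tsc flags (mentioned in my original message quoted below) in
/proc/cpuinfo before trusting the TSC. The sketch below assumes the usual
"flags : ..." line format of /proc/cpuinfo on x86; the function names are
made up. The flags indicate a constant tick rate and a counter that does
not stop in deep sleep states; they are a hint, not a guarantee that the
counters are synchronized between CPUs.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if flag appears as a whole word in line, 0 otherwise. */
static int line_has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' ||
                     p[-1] == ':';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' ||
                   p[n] == '\n';

        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}

/*
 * Returns 1 if every "flags" line in /proc/cpuinfo lists flag, 0 if at
 * least one CPU lacks it, -1 if the file is unreadable or has no "flags"
 * line. Lines longer than the buffer are not handled in this sketch.
 */
static int all_cpus_have_flag(const char *flag)
{
    char line[8192];
    int seen = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) != 0)
            continue;
        seen = 1;
        if (!line_has_flag(line, flag)) {
            fclose(f);
            return 0;
        }
    }
    fclose(f);
    return seen ? 1 : -1;
}
```

A caller could then use rdtsc only when both
all_cpus_have_flag("constant_tsc") and all_cpus_have_flag("nonstop_tsc")
return 1, and fall back to clock_gettime otherwise.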
David
> Then we should probably start looking at a simple choosing mechanism,
> probably a function pointer?
>
> /Nils
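The function-pointer mechanism Nils suggests could be sketched roughly as
follows. All names here (clock_read_fn, trace_clock, trace_clock_init) are
hypothetical and not part of UST, and the tsc_usable argument stands in for
whatever detection ends up being used. An indirect call has a cost of its
own, which has not been measured here and may eat into the gain.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t (*clock_read_fn)(void);

/* Nanoseconds from clock_gettime; CLOCK_MONOTONIC is an assumption. */
static uint64_t read_clock_gettime(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Raw TSC in cycles on x86; falls back to clock_gettime elsewhere. */
static uint64_t read_rdtsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return read_clock_gettime();
#endif
}

/* Safe default until trace_clock_init() has run. */
static clock_read_fn trace_clock = read_clock_gettime;

/* Pick the clock source once, e.g. at library initialization. */
static void trace_clock_init(int tsc_usable)
{
    trace_clock = tsc_usable ? read_rdtsc : read_clock_gettime;
}
```

After trace_clock_init() has been called once, every event would simply
call trace_clock() to get its timestamp.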
> On Jul 6, 2010, at 8:12 PM, David Goulet wrote:
>
>> Hey,
>>
>> After some talks with Nils from Ericsson, there were some questions
>> about using the TSC counter instead of clock_gettime in include/ust/clock.h
>>
>> I ran some tests after the meeting and was quite surprised by the
>> overhead of clock_gettime.
>>
>> On an average run ...
>> WITH clock_gettime : ~ 266ns per event
>> WITH rdtsc instruction : ~ 235ns per event
>>
>> And it is systematic... I'm getting stable results with rdtsc, with a
>> standard deviation of ~2ns.
>>
>> As little as I know about the TSC, one thing is for sure: with SMP, it
>> becomes much more "fragile" to rely on it, because we have no assurance
>> of coherent counters between CPUs, and the CPU scaling policy also
>> matters (ondemand is the default on Ubuntu now). New CPUs support the
>> constant_tsc and nonstop_tsc flags, but still only a small range of them.
>>
>> Right now, UST forces the use of clock_gettime even on i386 or
>> x86_64.
>> Should a change be considered?
>>
>> Thanks
>> David