[ltt-dev] UST clock rdtsc vs clock_gettime
David Goulet
david.goulet at polymtl.ca
Wed Jul 7 12:51:25 EDT 2010
ERRATA :
clock_gettime :
Average : 0.000000272516616 sec, 272.51662 nsec
Standard Deviation : 0.000000008640484 sec , 8.64048 nsec
Sorry, bad copy-paste... The clock_gettime variation is actually quite large.
On 10-07-07 11:28 AM, David Goulet wrote:
> On 10-07-06 03:39 PM, Nils Carlson wrote:
>> Cool, so the measurements came through...
>>
>
> I've retested the UST per-event time with the new commit made a few days
> ago, which fixes the custom probes and cache line alignment. Here are the
> results for the TSC counter and clock_gettime (test run 1000 times on an i7):
>
> rdtsc :
> Average : 0.000000242229708 sec, 242.22971 nsec
> Standard Deviation : 0.000000001663147 sec , 1.66315 nsec
>
> clock_gettime :
> Average : 0.000000272516616 sec, 272.51662 nsec
> Standard Deviation : 0.000000002340784 sec , 2.34078 nsec
>
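For reference, an average and a spread like the ones quoted above can be computed from the per-run timings roughly as follows. This is only a sketch: `run_stats` and `summarize` are hypothetical names, and it assumes the sample (n - 1) variance, which may not be what the original test script used.

```c
#include <stddef.h>

/* Summary of a series of per-event timings (any unit, e.g. nanoseconds). */
struct run_stats {
    double mean;
    double variance; /* sample variance, divides by (n - 1): an assumption */
};

/* Welford's online algorithm: one pass, numerically stable. */
static struct run_stats summarize(const double *samples, size_t n)
{
    struct run_stats s = { 0.0, 0.0 };
    double m2 = 0.0; /* running sum of squared deviations from the mean */
    size_t i;

    for (i = 0; i < n; i++) {
        double delta = samples[i] - s.mean;
        s.mean += delta / (double)(i + 1);
        m2 += delta * (samples[i] - s.mean);
    }
    if (n > 1)
        s.variance = m2 / (double)(n - 1);
    return s;
}
```

The standard deviation is the square root of `variance` (for example with sqrt() from <math.h>, linked with -lm).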
>> What I would like to see is the automatic detection of whether the rdtsc
>> instruction is usable,
>> a test for this already exists in the kernel and the question is whether
>> this info is currently exported
>> or whether we need to submit a patch to export it.
>>
>
> From userspace, testing this would be a syscall via prctl, right? The
> problem is that the answer is needed at compile time. Right now, the
> __i386__ and __x86_64__ defines are tested. At gcc compile time, it would
> be great to have something like a TSC_AVAILABLE define and then compile
> the right function (either clock_gettime or rdtsc).
>
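A minimal sketch of the compile-time selection described above. TSC_AVAILABLE is the hypothetical define from the message and would have to be set by the build system (e.g. -DTSC_AVAILABLE); `trace_clock_read` is an illustrative name, not the actual function in include/ust/clock.h, and CLOCK_MONOTONIC is an assumed clock id.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

#if defined(TSC_AVAILABLE) && (defined(__i386__) || defined(__x86_64__))
/* Raw time stamp counter. The unit is CPU cycles, not nanoseconds. */
static inline uint64_t trace_clock_read(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
#else
/* Fallback: monotonic time in nanoseconds through clock_gettime(). */
static inline uint64_t trace_clock_read(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
#endif
```

Note that the two variants return different units (cycles versus nanoseconds), so a trace consumer would need to know which one was compiled in.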
> However, there are some consistency issues with the TSC, for example
> between the counters of different CPUs... so I think we need to be very
> careful about that even if it is 30ns faster per event and much more
> _stable_ (see the standard deviation).
>
> David
>
>> Then we should probably start looking at a simple choosing mechanism,
>> probably a function pointer?
>>
>> /Nils
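A rough sketch of the function pointer idea, under some assumptions: the names (`clock_read_fn`, `select_clock`, ...) are hypothetical, CLOCK_MONOTONIC is an assumed clock id, and the check uses prctl(PR_GET_TSC), which on Linux/x86 only reports whether RDTSC is permitted for the calling process. It says nothing about the counters being synchronized between CPUs, so it is not a complete usability test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

typedef uint64_t (*clock_read_fn)(void);

/* Portable source: monotonic time in nanoseconds. */
static uint64_t read_clock_gettime(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

#if defined(__i386__) || defined(__x86_64__)
/* x86 source: raw TSC value, in cycles. */
static uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Nonzero if the kernel lets this process execute RDTSC. */
static int tsc_permitted(void)
{
#if defined(__linux__) && defined(PR_GET_TSC)
    int mode = 0;

    return prctl(PR_GET_TSC, &mode) == 0 && mode == PR_TSC_ENABLE;
#else
    return 0;
#endif
}
#endif

/* Pick the clock source once, at startup; callers keep the pointer. */
static clock_read_fn select_clock(void)
{
#if defined(__i386__) || defined(__x86_64__)
    if (tsc_permitted())
        return read_tsc;
#endif
    return read_clock_gettime;
}
```

The indirect call adds a small cost to every event, so it would have to be measured against the 30ns or so that rdtsc saves.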
>> On Jul 6, 2010, at 8:12 PM, David Goulet wrote:
>>
>>> Hey,
>>>
>>> After some talks with Nils from Ericsson, there were some questions
>>> about using the TSC counter instead of clock_gettime in include/ust/clock.h
>>>
>>> I ran some tests after the meeting and was quite surprised by the
>>> overhead of clock_gettime.
>>>
>>> On an average run ...
>>> WITH clock_gettime : ~ 266ns per event
>>> WITH rdtsc instruction : ~ 235ns per event
>>>
>>> And it is systematic... I'm getting stable results with rdtsc, with a
>>> standard deviation of ~2ns.
>>>
>>> From the little I know about the TSC, one thing is sure: with SMP, it
>>> becomes much more "fragile" to rely on it, because we have no assurance
>>> of coherent counters between CPUs, and CPU frequency scaling (ondemand
>>> is the default policy on Ubuntu now) is another concern. New CPUs
>>> support the constant_tsc and nonstop_tsc flags, but still only a small
>>> range of them.
>>>
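On Linux/x86 these flags show up on the "flags" line of /proc/cpuinfo, so a runtime check could look roughly like the sketch below. The helper names are hypothetical, and it assumes that the first "flags" line is representative of all CPUs and fits in the line buffer.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in `line`. */
static int has_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts && ends)
            return 1;
        p += len;
    }
    return 0;
}

/*
 * Returns 1 if both constant_tsc and nonstop_tsc are listed, 0 if not,
 * and -1 if /proc/cpuinfo cannot be read or has no "flags" line.
 */
static int tsc_is_invariant(void)
{
    char line[4096];
    int result = -1;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            result = has_flag(line, "constant_tsc") &&
                     has_flag(line, "nonstop_tsc");
            break;
        }
    }
    fclose(f);
    return result;
}
```

Even when both flags are present, this does not prove that the counters are synchronized between CPUs or sockets.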
>>> Right now, UST forces the use of clock_gettime even if i386 or
>>> x86_64 is used.
>>> Should a change be considered?
>>>
>>> Thanks
>>> David
>>>
>>> _______________________________________________
>>> ltt-dev mailing list
>>> ltt-dev at lists.casi.polymtl.ca
>>> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>>
>