7 votes
1 answer
299 views

How to use plain RDTSC without using asm?

I want to use RDTSC in Rust to benchmark how many ticks my function takes. There's a built-in `std::arch::x86_64::_rdtsc`, but alas it always translates into: `rdtsc; shl rdx, 32; or rax, rdx` ...
Daniil Tutubalin
1 vote
0 answers
339 views

Is the x86 TSC really invariant?

Intel x86 introduced the Invariant TSC. But how is its invariance maintained? Section 18.17.4 of the Intel SDM, "Invariant Time-Keeping", states: The invariant TSC is based on the invariant timekeeping hardware ...
wang fuqiang
1 vote
0 answers
139 views

Is the Invariant Timestamp Counter synchronised between cores of the same CPU?

I'm interested in how Invariant TSC behaves on a multi-core CPU, on a classic PC with a single physical CPU. The only thing I could find is that its frequency is constant and the same for all CPU ...
lokains
  • 137
4 votes
0 answers
530 views

Why do fixed-iteration register decrement loops exhibit jitter even without interrupts or frequency changes?

I aimed to understand the characteristics of hardware counters (e.g., Intel x86's rdtsc) by testing a piece of code with a fixed execution time. The test code uses a fixed number of register ...
foool
  • 1,582
0 votes
0 answers
148 views

__faststorefence over _ReadWriteBarrier

I've come across something interesting. I've noticed that when I use __faststorefence in my timing function, I get more consistent results compared to when I use _ReadWriteBarrier. Here's the basic ...
daniel
  • 55
2 votes
1 answer
144 views

Overhead-free monitor code on an AMD CPU significantly increases the total synchronization duration

I am conducting a test to measure the message synchronization latency between different cores of a CPU. Specifically, I am measuring how many clock cycles it takes for CPU2 to detect changes in the ...
foool
  • 1,582
0 votes
0 answers
33 views

rdtsc delta to nanosecond conversion [duplicate]

Recently, I have been trying to run some performance analysis on my program. I want to measure the latency of some functions in CPU ticks and later convert the delta to nanoseconds. (I intentionally am ...
Hedgehog
  • 115
0 votes
0 answers
92 views

What am I doing wrong in measuring execution time using RDTSC and CPUID instructions? [duplicate]

I'm a C++ student and I was assigned to measure the time of memory allocation of 1,000,000 integers, using the CPUID and RDTSC instructions in inline assembly for the G++ compiler. Here is the code I came up ...
Kudor
  • 69
0 votes
0 answers
248 views

My GCC __rdtsc() method of calculating clock cycles gives me a totally different result than Microsoft inline-assembly rdtsc

So my goal is to test how many clock cycles it takes to allocate one million integers with the "malloc" function in C++. I have 2 programs, one in Visual Studio, which calls RDTSC from ...
Kudor
  • 69
5 votes
2 answers
237 views

Clang optimizes out RDTSC asm blocks thinking the repeated block yields the same as the previous block. Is this legal?

Suppose we have some repetitions of the same asm that contains RDTSC, such as volatile size_t tick1; asm ( "rdtsc\n" // Returns the time in EDX:EAX. "shl $32,...
sandthorn
  • 2,898
1 vote
0 answers
315 views

How to use `__rdtsc` properly with GCC on x86 Linux

On Linux, GCC has the intrinsic function __rdtsc to measure CPU cycles. So I don't need to use inline asm code, which I am not familiar with. On the other hand, when reading posts ...
doraemon
  • 2,594
1 vote
0 answers
294 views

rdtsc under WSL2 lacks nonstop_tsc?

I'm doing some microbenchmarking of short code snippets using the __rdtsc intrinsic inside WSL2 on Windows 11. I'm noticing that there is a lot more variance in the results than I'm used to from ...
Joseph Garvin
0 votes
1 answer
1k views

Best practice of using __rdtsc

I am new to systems programming, and I have some doubts about how to use __rdtsc. Here is a quote from Microsoft Learn: Generates the rdtsc instruction, which returns the processor time stamp. The ...
chenzhongpu
  • 6,975
1 vote
0 answers
53 views

Why aren't mfence;RDTSCP timings closer to equal across repeated calls after warm-up? +-2% variation over ~3300 TSC counts

I am writing code to compare the execution times of different versions of a bigint add function, on AMD FX(tm)-8350 Eight-Core Processor 4.00 GHz. I need help to make sense of the machine behaviour ...
DannyPeet
  • 179
1 vote
0 answers
553 views

Configure qemu KVM-SVM to not emulate rdtscp and get valid timestamp

I am trying to measure the cycle count of an instruction in a VM. My code looks like this: start = rdtscp(); //complex_sequence_of_instructions end = rdtscp(); //complex_sequence_of_instructions ...
cryptobeginner
