
PC Processors

jmricher70
Adept I

Different execution times for same program with huge variations

Hi everyone,

I am facing a strange problem. I am using a Ryzen 5 3600 and a test that counts the occurrences of each letter from 'a' to 'z' in a string of 256000 ASCII characters. I run this count 100000 times on the same string.

I didn't notice the problem on Intel processors, but on AMD I see several different behaviours when I run the test 10 times:

- most of the time I get nearly the same execution time, around 4.60 seconds
- sometimes I get 8 or 9 seconds
- I can even get 14 seconds!

Here are, for example, the results of 10 runs (run number, then execution time in seconds):

 

1 14.84 
2 4.58 
3 8.87 
4 4.63 
5 4.58 
6 4.58 
7 4.59 
8 7.54 
9 4.58 
10 4.58 

Basically, the code does this:

 

 

align 16
.while:
	movzx	eax, byte [rdi + rcx]	; eax = s[i]
	sub	eax, 'a'		; eax = s[i] - 'a'
	inc	dword [rbx + rax * 4]	; ++letters[s[i] - 'a']
	inc	ecx			; ++i
	cmp	ecx, esi		; if (i != size)
	jne	.while			;	goto .while
.end_while:
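
For readers less familiar with assembly, the loop above does roughly what the following Python sketch does. It assumes, as the assembly does, that the string contains only the lowercase letters 'a' to 'z', since there is no bounds check on the index. The function name `count_letters` is only for illustration and is not taken from the archive.

```python
def count_letters(s):
    """Return a 26-entry histogram of the letters 'a'..'z' in the bytes object s.

    Assumption: s contains only lowercase ASCII letters, as in the test string.
    """
    letters = [0] * 26
    for byte in s:
        # Same as: ++letters[s[i] - 'a']
        letters[byte - ord("a")] += 1
    return letters


if __name__ == "__main__":
    counts = count_letters(b"hello")
    # Print the count for the letter 'l'
    print(counts[ord("l") - ord("a")])
```

Every iteration is a load, a subtraction and a read-modify-write on one of 26 counters, so the whole working set (the string plus a 104-byte table) is small.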

I can provide the code to execute if needed.

My question is: why do I sometimes get such big differences? Does it come from the cache? From branch prediction? From some other kind of penalty?

Best regards,

Jean-Michel

 

7 Replies
benman2785
Big Boss

Compile it for Linux and it will work fine ;)

The main reason for this behaviour is how Windows handles AMD hardware.

PC: R7 2700X @PBO + RX 580 4G (1500MHz/2000MHz CL16) + 32G DDR4-3200CL14 + 144hz 1ms FS P + 75hz 1ms FS
Laptop: R5 2500U @30W + RX 560X (1400MHz/1500MHz) + 16G DDR4-2400CL16 + 120Hz 3ms FS

I forgot to say that this is under Linux: Ubuntu 20.04, kernel 5.4.0-58-generic.

Ah, thanks. Now I have to reproduce it myself :/

Can you upload the script somewhere and send me the link via PM?


Hi, the source code is available as an archive here:

asm_vowels.zip

To run the test, please do the following:

 

unzip asm_vowels.zip
cd asm_vowels_64
make clean && make configure && make

 

Then run the following script. You may need to run it several times (sometimes just once) before the problem appears:

./test_methods.sh

You will get something like this:

#;average;    min;    max; stddev;method                             
-----------------------------------------------------------------------------
 4;  5.636;  4.580;  9.660;  1.988;cv_letters_asm                     

After 10 executions the average is 5.636 seconds, but the individual execution times range from 4.580 s to 9.660 s, which gives a large standard deviation of 1.988.
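
To make the columns concrete, here is a small Python sketch that computes the same four statistics for the ten timings listed in my first post (not the run shown just above, so the numbers differ). The function name `summarize` is only illustrative, and I am assuming a population standard deviation here; the script in the archive may compute it differently.

```python
import statistics

# Execution times in seconds, copied from the ten runs in the first post.
times = [14.84, 4.58, 8.87, 4.63, 4.58, 4.58, 4.59, 7.54, 4.58, 4.58]


def summarize(samples):
    """Return average, min, max and standard deviation of a list of timings."""
    return {
        "average": statistics.fmean(samples),
        "min": min(samples),
        "max": max(samples),
        # Assumption: population standard deviation (divide by n, not n - 1).
        "stddev": statistics.pstdev(samples),
    }


summary = summarize(times)
# Print in the same semicolon-separated layout as the script output.
print(";".join(f"{summary[key]:7.3f}" for key in ("average", "min", "max", "stddev")))
```

A few slow runs are enough to pull the average well above the typical 4.58 s, which is why the min and max columns are more informative here than the average.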

 

 


I could also reproduce the same problem on an AMD Ryzen 7 1700X:

4; 11.264;  4.770; 26.380;  9.894;cv_letters_asm

So the execution times range from 4.77 s to 26.38 s!


OK, I will test it, but it will take some time (as I am pretty lazy this year xD).


Thank you for your help. I am not in a hurry, I just want to know the reason for the problem. I tried AMDuProf, but it behaved strangely: it told me the problem is in a function that I am not even using for this test!
