nibal
Challenger

Memory corruption in latest crimson driver 15.302?

Using Ubuntu 14.04 and valgrind:

==00:00:01:30.014 4949== Invalid write of size 8
==00:00:01:30.014 4949==    at 0x4C2F5F3: memcpy@GLIBC_2.2.5 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:01:30.014 4949==    by 0xB3B6154: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.014 4949==    by 0xB3B899A: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.014 4949==    by 0xB3BB911: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB3C5F98: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB3C6667: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB3C6838: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB329CFB: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB35182C: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB351BD6: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB2F2DAC: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB2F312C: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB29115E: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0xB30115B: ??? (in /usr/lib/libamdocl64.so)
==00:00:01:30.015 4949==    by 0x60BA181: start_thread (pthread_create.c:312)
==00:00:01:30.015 4949==    by 0x63CA47C: clone (clone.S:111)
==00:00:01:30.015 4949==  Address 0x7f126ed63000 is not stack'd, malloc'd or (recently) free'd

This could be a false positive, but I'm getting some unexplained crashes :(


Accepted Solutions
german
Staff

Re: Memory corruption in latest crimson driver 15.302?

1. The reallocation of the freqs array looks broken. The code below:

    freqs[fidx].hz = sig.hz;
    freqs[fidx++].ts = ts;
    if (fidx >= maxfreqs) {
        maxfreqs += 16;
        freqs = realloc(freqs, maxfreqs);
    }

passes an element count to realloc(), which expects a size in bytes. It should be something like:

    if (fidx == (maxfreqs-1)) {
        maxfreqs += 16;
        freqs = realloc(freqs, maxfreqs * sizeof(freq_t));
    }
    freqs[fidx].hz = sig.hz;
    freqs[fidx++].ts = ts;
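
For reference, here is a minimal self-contained sketch of the same growth pattern. The layout of freq_t, the freq_list wrapper and the freq_append helper are assumptions made for illustration; the thread does not show the real definitions. It grows the array before writing, sizes the request in bytes, and keeps the old pointer if realloc fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type; the real freq_t is not shown in the thread. */
typedef struct {
    double hz;
    double ts;
} freq_t;

/* Hypothetical container bundling the thread's freqs, fidx and maxfreqs. */
typedef struct {
    freq_t *freqs;
    size_t  fidx;      /* number of entries in use */
    size_t  maxfreqs;  /* number of entries allocated */
} freq_list;

/* Append one entry, growing the array by 16 elements when it is full.
 * Returns 0 on success, -1 if realloc fails (the old array stays valid). */
static int freq_append(freq_list *list, double hz, double ts)
{
    assert(list != NULL);
    if (list->fidx >= list->maxfreqs) {
        size_t newmax = list->maxfreqs + 16;
        /* realloc takes a size in bytes, so scale by the element size. */
        freq_t *grown = realloc(list->freqs, newmax * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->freqs = grown;
        list->maxfreqs = newmax;
    }
    list->freqs[list->fidx].hz = hz;
    list->freqs[list->fidx].ts = ts;
    list->fidx++;
    return 0;
}
```

Starting from a zero-initialized freq_list, repeated calls to freq_append(&list, hz, ts) grow the storage in steps of 16 elements. Assigning the result of realloc to a temporary first avoids leaking the original block when the allocation fails.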

2. You call run_fft() with pass=8, and that accesses the cl_event ndr that was already destroyed on pass=7.

On pass=7 you destroy ndr:

    if (pass == MAXPASS - 1) {
        if ((err = waitForEventAndRelease(&ndr)) != SUCCESS)

Then on pass=8 you access the destroyed object and corrupt memory:

    if (pass && (err = waitForEventAndRelease(&ndr)) != SUCCESS)
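
One common way to guard against this kind of double release is to clear the handle as soon as it has been released, so that a later pass sees an empty handle instead of a dangling one. The sketch below illustrates the idea with a stand-in event type so that it compiles without the OpenCL headers; fake_event, fake_event_create, wait_and_release_once and release_calls are hypothetical names, not part of OpenCL or of the code discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for an OpenCL event object; this is NOT the real cl_event. */
typedef struct fake_event_s { int live; } *fake_event;

static int release_calls = 0; /* counts actual releases, for illustration */

static fake_event fake_event_create(void)
{
    fake_event ev = malloc(sizeof *ev);
    if (ev != NULL)
        ev->live = 1;
    return ev;
}

/* Wait for and release the event at most once, then clear the handle so a
 * later call on the same variable is a harmless no-op.
 * Returns 0 on success or if there was nothing to release. */
static int wait_and_release_once(fake_event *ev)
{
    assert(ev != NULL);
    if (*ev == NULL)
        return 0;          /* already released on an earlier pass */
    /* With real OpenCL, clWaitForEvents and clReleaseEvent would go here. */
    (*ev)->live = 0;
    free(*ev);
    release_calls++;
    *ev = NULL;            /* prevents a second release of the same object */
    return 0;
}
```

With this pattern, calling wait_and_release_once(&ndr) on two consecutive passes releases the object only once. The alternative is to restructure the pass logic so the release is reached on exactly one code path.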

14 Replies
nibal
Challenger

Re: Memory corruption in latest crimson driver 15.302?

Actually, this is much worse than I thought. This is real. The corruption also existed in Catalyst 15.201, 15.101 and everything in between. Not only did it destabilize the ocl part of the program, it also affected everything else it came into contact with in the same program. Please fix this urgently. Is there a place to download older Catalyst versions?

I will have to comment out all ocl parts and stop linking to the libraries until it is fixed.

nou
Exemplar

Re: Memory corruption in latest crimson driver 15.302?

gstoner
Staff

Re: Memory corruption in latest crimson driver 15.302?

Hi Nibal

I have the team looking into this. I will get back to you by the end of the week.

Greg

nibal
Challenger

Re: Memory corruption in latest crimson driver 15.302?

Hi Greg,

And thanks for helping out.

This is a tough corruption to track. Since it is very reproducible on my system, I will try to narrow it down to specific ocl calls and update the ticket.

BR

Nikos

gstoner
Staff

Re: Memory corruption in latest crimson driver 15.302?

What I need is: which motherboard, processor, system BIOS version and GPU you have; if possible, the VBIOS number for the GPU; and which OS and version (for Linux, the kernel version) you are running. Also, if you have a test app that causes the issue, please send it to us.

greg

nibal
Challenger

Re: Memory corruption in latest crimson driver 15.302?

My info so far:

Motherboard: Gigabyte Technology Co., Ltd. 970A-UD3P

BIOS: UEFI DualBIOS, American Megatrends Inc. version: F1

CPU: AMD FX(tm)-8320 Eight-Core Processor, @1.4 GHz

GPU: AMD Radeon (TM) R9 270, Pitcairn, Curacao Pro, Platform ID: 0x7f7227b45a18 (as reported by clinfo)

OS: Ubuntu 14.04 x64, 3.13.0-49 generic

ocl SDK: 3.0, working ocl 1.2

Working on a test app (need to reboot).

BR

Nikos

nibal
Challenger

Re: Memory corruption in latest crimson driver 15.302?

Using printfs and the valgrind output, I was able to bracket the Invalid write between the NDRangeKernel call and the completion of the kernel.

But here is the catch: it happens only the first time the kernel is executed.

My kernel is a slightly modified version of the kernel in your FFT sample.

Unfortunately, validating your FFT sample will take more time. Each time I run it through valgrind, it crashes my PC. Running FFT alone does not crash my PC, but I do not run it for long, so the corruption may not show. I will have to compile the latest valgrind from sources and retest.

The pattern suggests that this is not specific to the kernel itself (otherwise it would appear on every kernel pass), but general to the kernel mechanism. I hope it can be reproduced with any kernel. I'm compiling with the default (ocl 1.2).

BR,

Nikos

nibal
Challenger

Re: Memory corruption in latest crimson driver 15.302?

It doesn't show in your FFT sample. I will have to create a test app with my kernel.

nibal
Challenger

Re: Memory corruption in latest crimson driver 15.302?

Hi Greg,

Please use the attached fft.tgz to recreate the problem. val.out includes 2 more Invalid reads that were not in the original valgrind report, with full stack traces. You might want to check them, too. Instructions for recreating the bug:

-> tar -xzvf fft.tgz      // This will create a directory fft/ with the sources
-> cd fft
-> make db
-> fft                    // Optional. This terminates with a core dump on my system. Be careful: on yours it could crash your PC
-> make clean
-> make db
-> script
-> valgrind fft           // Best to use the latest valgrind, 3.11.0, built from sources; otherwise it might crash your PC. Can be interrupted with <ctrl-C>, but in my case it core dumps before I get the chance to and generates vgcore.<pid>
-> exit                   // Ends the script session

Let me know if you can recreate the problem.

TIA

Nikos
