After overclocking GTX980: Tasks take LONGER to complete?!?

Rodario
Rodario
Joined: 15 Apr 14
Posts: 6
Credit: 97163057
RAC: 0
Topic 198299

I recently enabled the XMP1 profile on my new-ish gaming (and BOINC) rig and also decided to moderately overclock my GTX 980 STRIX DC2OC.

GPU clock: 1291 MHz -> 1450 MHz
Memory clock: 7010 MHz -> 7700 MHz

It's running completely stable and 3DMark gives me a ~10% higher benchmark score now, as expected.

While CPU tasks are now completing slightly faster, as expected from the slight XMP CPU overclock, I've noticed that Einstein@Home's BRP4G (Arecibo) tasks now take around 34 minutes to complete, up from 28 before the OC.

This has left me a bit perplexed, as I expected those tasks to finish slightly faster now, certainly not slower.

GPU-Z tells me everything is in order: the card sticks to the new 1450 MHz GPU clock, is not being throttled by any ceiling, and the memory clock is stuck at 6010 MHz, but that has always been the case with Einstein tasks.

Does anyone have any insight as to why this might be? Thanks in advance.

Zalster
Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

After overclocking GTX980: Tasks take LONGER to complete?!?

How many work units are you crunching on the CPU?

Rodario
Rodario
Joined: 15 Apr 14
Posts: 6
Credit: 97163057
RAC: 0

RE: How many work units are

Quote:
How many work units are you crunching on the CPU?

11 of 12 threads are used for CPU crunching; 1 is left free for GPU tasks and a responsive system (I noticed Explorer lag with the 100% usage setting). This was also the case before.

Zalster
Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Before all of this started,

Before all of this started, were you always crunching on 11 of 12 cores?

I only ask because it helps give a point of reference.

Next, are you using a max_project_concurrent xml to limit the number of tasks, or are you using the web preferences like "use at most 90%" (an example, I didn't calculate the real value)?

Just throwing this out there without knowing your answer, but have you tried reducing to 10/12 cores to see if the times improve? If they do, then you know you need that extra core to help out the GPU.

Rodario
Rodario
Joined: 15 Apr 14
Posts: 6
Credit: 97163057
RAC: 0

RE: Before all of this

Quote:

Before all of this started, were you always crunching on 11 of 12 cores?

I only ask because it helps give a point of reference.

I used to run it at 100% (12/12) at ~28 min/WU for a while, until the Explorer lag annoyed me; then I reduced it to 11/12 (WUs still around 28 min) and the lag went away, so I kept it there. This was about a month before the OC.

Quote:
Next, are you using a max_project_concurrent xml to limit the number of tasks, or are you using the web preferences like "use at most 90%" (an example, I didn't calculate the real value)?

BOINC Manager preferences. I set it to max 95% (11/12 ≈ 91.7%, plus some headroom to spare for Einstein). It worked as intended until recently. Just to clarify: there are 11 CPU WUs running (stable at 8% each in Task Manager), plus the Einstein GPU WU, which fluctuates between 1-2% there. The total is 93-94%.

Quote:
Just throwing this out there without knowing your answer, but have you tried reducing to 10/12 cores to see if the times improve? If they do, then you know you need that extra core to help out the GPU.

I did not. Seeing as 12/12 was also no issue, this did not occur to me.

Couple of extra details I just noticed/realized:

- I very recently updated the BOINC Manager to 7.6.9 from 7.1.x, though that was still two days before the OC

- The client executed a CPU benchmark after activating XMP, after which all ETAs increased (instead of decreasing, right?)

- I went to check my completed tasks on the account website, and I think all WUs completed after the client upgrade (but before the OC) increased in run time by about 25% (the CPU ones decreased a bit again post-XMP). E.g. pre-7.6.9: 1000 s; post-7.6.9: 1250 s; post-XMP: 1175 s (something along that ratio).

- Watching the GPU-Z log for a while, I notice sporadic drops in GPU load. Around every 10-40 s there's a 1-2 s gap where it drops from 85% to 0. Interestingly, the memory clock spikes from 6010 to 7700 MHz during those gaps, while bus load goes from about 25% to 15% and TDP from 55% to 35%. I guess if we added all those second-long pauses up, they might amount to something. (This is not an overheating issue.)

Zalster
Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

RE: BOINC Manager

Quote:
BOINC Manager preferences. I set it to max 95% (11/12 ≈ 91.7%, plus some headroom to spare for Einstein). It worked as intended until recently. Just to clarify: there are 11 CPU WUs running (stable at 8% each in Task Manager), plus the Einstein GPU WU, which fluctuates between 1-2% there. The total is 93-94%.

OK, if my memory serves correctly, BOINC can't really do 95%; it will either round up or round down. In this case, it is probably rounding down to that 92%.

That is probably what it is doing, which means the GPU is sharing some CPU time with those other CPU work units. It would also explain why the GPU intermittently stops crunching: the CPU work units are utilizing all of the available resources. That would be why it's taking longer for the GPU to finish its work.

That's just a guess of course.

Could you reduce "use at most" down to 83.33%? Yes, but I think you would run into the same effect.
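For reference, the "use at most" percentage can also be set locally with a global_prefs_override.xml in the BOINC data directory, which overrides the website preferences. A minimal sketch (the client rounds the resulting core count down):

```xml
<!-- global_prefs_override.xml, placed in the BOINC data directory -->
<global_preferences>
    <!-- 83.33% of 12 threads: the client rounds down to 10 usable threads -->
    <max_ncpus_pct>83.33</max_ncpus_pct>
</global_preferences>
```

The client picks this up via Options -> Read local prefs file, or on restart.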

(This is my opinion)

I actually have mine set at "use 100%" and then created a cc_config.xml with a setting that lets me crunch both CPU and GPU work units without the "hesitation" you are seeing on the GPU work units. In your case I would set the value at 12: 11 for the CPU and 1 for the GPU. If you added up all the CPU usage by those work units, it would be less than 100%. Of course, if you do anything else on the computer at the same time, it might slow down, in which case you just lower the value to 11.
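Assuming the option meant here is <ncpus> (an assumption on my part), the cc_config.xml would look something like this, placed in the BOINC data directory:

```xml
<!-- cc_config.xml, placed in the BOINC data directory -->
<cc_config>
    <options>
        <!-- tell the client to treat the machine as having 12 usable CPUs -->
        <ncpus>12</ncpus>
    </options>
</cc_config>
```

The client re-reads this via Options -> Read config files, or on restart.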

That's my 2 cents.

Let's see what others say. I'm sure they will have more options for you to look at.

MAGIC Quantum Mechanic
MAGIC Quantum M...
Joined: 18 Jan 05
Posts: 1715
Credit: 1091898245
RAC: 1121943

You should try maybe 10/12 or

You should try maybe 10/12 or less and see, since running both the GPU BRP6 tasks and the CPU tasks at the same time pushes your processor to 100%, and that will slow down or stop the GPUs for a few seconds.

I found that out a long time ago on my 8-core, and again when trying to use all 3 cores on a 3-core machine while running BRP6 at 2x.

As soon as I quit running a CPU task and left one core free, the GeForce 560 Ti OC went back to normal.

And on my 8-core, it did not like running several FGRP4s at the same time as the GPU at 2x (and a pair of vLHC tasks).

That was really topping out the CPU and RAM percentages (checking the Windows Task Manager performance tab).

Gary Roberts
Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110558904115
RAC: 32826105

RE: Does anyone have any

Quote:
Does anyone have any insight as to why this might be? Thanks in advance.


Have you seen this thread?

I don't have any high end Maxwell2 GPUs so have no experience and I read the linked thread only as an interested bystander :-). However I'm wondering if you are seeing something along the lines of what was reported and discussed there.

I have low to mid range older nvidia GPUs, 550Ti, 650 and 650Ti and I do run all of those without any free CPU cores for support. If I free up a CPU core, I gain a little from faster GPU crunch times but only of the order of what I lose (or even less) from one CPU task less. I believe the gain in GPU throughput may be higher for higher end and more recent architecture cards. I think you should try 10/12 in addition to 11/12 to see what works best. You really do have to experiment to find the optimum combination.

I don't think you specifically said, but it appears you are running single GPU tasks of both available types - BRP4G and BRP6. You should be aware that there is a beta app for BRP6 which uses newer CUDA 5.5 libs and which has about 20-25% better performance on your type of GPU. In order to get this app, you have to change your preferences to allow beta test apps. I've been getting around 15-20% gain on Kepler series GPUs so it's certainly worthwhile.

If you are in the experimenting mood and you're keen on best performance, you could also try running concurrent GPU tasks as a means of further improving output. You will need to free more CPU cores but the GPU output will more than make up for that. When running concurrent GPU tasks, your GPU will show a higher and more stable load with little tendency for the load to drop drastically at any point. Of course, you will generate extra heat so you need good cooling. Even my lowly 650s can handle 3x concurrency with no free CPU cores. You could try running 3x with 3 CPU cores free and establish the performance. You could then reduce to 2 free cores and see what happens.
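Running concurrent GPU tasks as described here is normally configured with an app_config.xml in the Einstein project folder. A sketch; the app name einsteinbinary_BRP6 is my assumption and should be checked against the entries in client_state.xml:

```xml
<!-- app_config.xml in projects/einstein.phys.uwm.edu/ -->
<app_config>
    <app>
        <name>einsteinbinary_BRP6</name>
        <gpu_versions>
            <!-- 0.33 of a GPU per task: three tasks run concurrently -->
            <gpu_usage>0.33</gpu_usage>
            <!-- reserve one CPU core per GPU task -->
            <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```

Use 0.5 for 2x concurrency; the client applies the file after Options -> Read config files.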

If you want maximum throughput, the BRP6 cuda55 beta app is the way to go. Before changing anything, you should read the linked thread fully and see if anything there can explain your drop in performance.

Cheers,
Gary.

Rodario
Rodario
Joined: 15 Apr 14
Posts: 6
Credit: 97163057
RAC: 0

RE: I actually have mine

Quote:

I actually have mine set at "use 100%" and then created a cc_config.xml with a setting that lets me crunch both CPU and GPU work units without the "hesitation" you are seeing on the GPU work units. In your case I would set the value at 12: 11 for the CPU and 1 for the GPU. If you added up all the CPU usage by those work units, it would be less than 100%. Of course, if you do anything else on the computer at the same time, it might slow down, in which case you just lower the value to 11.

Thank you, I'll try that (just as soon as I figure out how to).

Quote:


Have you seen this thread?

I don't have any high end Maxwell2 GPUs so have no experience and I read the linked thread only as an interested bystander :-). However I'm wondering if you are seeing something along the lines of what was reported and discussed there.

Thanks, I'll try the inspector tip from that link, but I'm doubtful that will correct the "pausing" issue.

Quote:

If you are in the experimenting mood and you're keen on best performance, you could also try running concurrent GPU tasks as a means of further improving output. You will need to free more CPU cores but the GPU output will more than make up for that. When running concurrent GPU tasks, your GPU will show a higher and more stable load with little tendency for the load to drop drastically at any point. Of course, you will generate extra heat so you need good cooling. Even my lowly 650s can handle 3x concurrency with no free CPU cores. You could try running 3x with 3 CPU cores free and establish the performance. You could then reduce to 2 free cores and see what happens.

I would definitely do that, if this were a dedicated BOINC machine, but one of the reasons I chose einstein@home as my GPU project is that it doesn't put 100% load on the GPU, so I can leave it running while browsing or watching videos (or sleeping in the same room).

In the meantime I've done some testing and troubleshooting, and I've made an interesting observation. It appears it's not the GPU overclock that's causing this issue, but the XMP profile. Counterintuitive, right? Let me walk you through it.

I reset the GPU to stock clocks for an hour and observed the change in WU completion time and the load graph in GPU-Z. Tasks now completed about two minutes slower, and I could still see the irregular load drops in GPU-Z.

Next, I disabled the XMP profile. CPU tasks were a bit slower than with it enabled, but einstein WUs were back to a 28min/WU completion rate with the GPU at stock, and about 26.5min with my OC settings.

Right now, I have both XMP1 and GPU OC enabled again and the originally described issue persists, so this is reproducible.

As for system stability and OC viability: I ran two passes of Memtest with 0 errors and an hour of Prime95 in max heat mode with no issues (and a very acceptable max CPU temp of 68°C), I'm not getting any FPS drops in games, and I'm reliably getting a 3DMark score of ~12500 in Fire Strike (up from 11500 without OC/XMP). This, to me, means the issue is isolated to Einstein@Home, or to BRP4/6 tasks, or to the P2 state in general.

Since everything works as intended and expected for the primary purpose of this PC (gaming), I'll keep the XMP/OC settings. Hopefully I'll find a solution to the crunching problem, but if not, I'll just try to accept the minor drop in einstein RAC.

Thanks for your tips guys, I'll report back with my findings.

Rodario
Rodario
Joined: 15 Apr 14
Posts: 6
Credit: 97163057
RAC: 0

RE: RE: I actually have

Quote:
Quote:

I actually have mine set at "use 100%" and then created a cc_config.xml with a setting that lets me crunch both CPU and GPU work units without the "hesitation" you are seeing on the GPU work units. In your case I would set the value at 12: 11 for the CPU and 1 for the GPU. If you added up all the CPU usage by those work units, it would be less than 100%. Of course, if you do anything else on the computer at the same time, it might slow down, in which case you just lower the value to 11.

Thank you, I'll try that (just as soon as I figure out how to).

OK, I read up on this, and if I'm not mistaken, that only works as a per-application setting. So it'd work if I only ran one CPU project, but not if I run many different ones at the same time.

I tried fiddling with the percentages to get 10/12, 9/12 and so forth, but the problem persists no matter how many cores are working. This also makes sense to me, as the GPU task has a higher priority than CPU tasks and should therefore not be limited by them.

Just to be sure, I ran JUST einstein@home for a while (without any CPU tasks active): No change.

Quote:
Quote:


Have you seen this thread?

I don't have any high end Maxwell2 GPUs so have no experience and I read the linked thread only as an interested bystander :-). However I'm wondering if you are seeing something along the lines of what was reported and discussed there.

Thanks, I'll try the inspector tip from that link, but I'm doubtful that will correct the "pausing" issue.

Welp, now I know how to get my memory clock up to the desired speed in P2, so that's cool. Unfortunately, this did not resolve the issue.

I'm out of ideas.

Zalster
Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

My apologies, actually I

My apologies, I actually stated the wrong xml file.

You can do it in an app_config.xml.

I have one that I can post here, but it's going to take me some time to post it.

I can also tell you how to make an xml file and how to unhide the folders to place it in, but I'm supposed to leave to meet family soon, so it may be 4-5 hours before I can get around to it.
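As a rough sketch in the meantime, an app_config.xml of the kind described goes in the project folder and can cap concurrent tasks per app and project-wide. The app name below is an assumption; check the <name> entries in client_state.xml (and note <project_max_concurrent> requires a recent client):

```xml
<!-- app_config.xml in projects/einstein.phys.uwm.edu/ -->
<app_config>
    <!-- cap the total number of Einstein tasks running at once -->
    <project_max_concurrent>12</project_max_concurrent>
    <app>
        <!-- assumed CPU app name; check client_state.xml -->
        <name>einstein_S6CasA</name>
        <!-- leave one slot free so the GPU task always has CPU support -->
        <max_concurrent>11</max_concurrent>
    </app>
</app_config>
```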

As for the P2 changes: be careful when you go up to 3505 MHz in Nvidia Inspector.

Sometimes it doesn't like it and will crash the graphics driver. I have/had several 980s.

This is especially true when you run more than one work unit at a time on the graphics card.

If you are insistent on doing so, then I would suggest 3.3 GHz.

That is what the 980 Tis are normally set at.

As per Nvidia.

They chose 3 GHz because they knew that at 3.5 there would be data corruption or failures. Not a problem for gaming, but it is for crunching.

Since that initial release I think they have done some fine tuning, determined that 3.3 was a safe value, and set all subsequent GPUs to that value.

I'll check back here later and see about getting you that app_config.xml.

Zalster
