No Work Sent (see Scheduler Log) Error?

StarCastle
Joined: 7 Aug 09
Posts: 4
Credit: 3448939
RAC: 954
Topic 198364

Suddenly no work is being sent and I see the following message:

1/6/2016 6:27:04 PM | Einstein@Home | see scheduler log messages on http://einstein5.aei.uni-hannover.de/EinsteinAtHome/host_sched_logs/11547/11547884

The page does not exist, so I'm not sure where to look.

Is anyone else having this issue, or does anyone have a resolution?

Thanks!

Peter

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110690262141
RAC: 32577230

Quote:

Suddenly no work is being sent and I see the following message:

1/6/2016 6:27:04 PM | Einstein@Home | see scheduler log messages on http://einstein5.aei.uni-hannover.de/EinsteinAtHome/host_sched_logs/11547/11547884

The Page does not exist so not sure where to look about this.


Go to your computers list on the website and click the link at the far right under the "Last contact" heading. This is a record of what happened during the last exchange with the scheduler. Here is an excerpt:

2016-01-06 23:36:15.0720 [PID=14757]   Request: [USER#xxxxx] [HOST#11547884] [IP xxx.xxx.xxx.157] client 7.6.9
2016-01-06 23:36:15.1475 [PID=14757]    [send] effective_ncpus 3 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2016-01-06 23:36:15.1475 [PID=14757]    [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2016-01-06 23:36:15.1475 [PID=14757]    [send] Not using matchmaker scheduling; Not using EDF sim
2016-01-06 23:36:15.1475 [PID=14757]    [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2016-01-06 23:36:15.1475 [PID=14757]    [send] Intel GPU: req 691200.00 sec, 1.00 instances; est delay 0.00
....
....


From the above, your host is not requesting CPU work (0.00 seconds) but is requesting Intel GPU work (691200 seconds, which is 8 days' worth for one instance - rather a *lot*).
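If you want to pull the request sizes out of a log like this without reading it by eye, something along these lines should work. It is only a sketch: the regular expression is inferred from the two "req" lines in the excerpt above, the function names are made up for illustration, and other scheduler versions may format these lines differently.

```python
import re

# The two request lines copied from the scheduler log excerpt above.
LOG_EXCERPT = """\
2016-01-06 23:36:15.1475 [PID=14757]    [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2016-01-06 23:36:15.1475 [PID=14757]    [send] Intel GPU: req 691200.00 sec, 1.00 instances; est delay 0.00
"""

# Assumption: pattern inferred from the lines above only, not from any
# documented log format.
REQUEST_RE = re.compile(
    r"\[send\]\s+(?P<resource>[^:]+): req (?P<seconds>[\d.]+) sec, "
    r"(?P<instances>[\d.]+) instances"
)

SECONDS_PER_DAY = 86400


def parse_work_requests(log_text):
    """Return {resource name: requested seconds} for each 'req' line found."""
    requests = {}
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            requests[match.group("resource")] = float(match.group("seconds"))
    return requests


def seconds_to_days(seconds):
    """Convert a request size in seconds to days."""
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for resource, seconds in parse_work_requests(LOG_EXCERPT).items():
        print(f"{resource}: {seconds:.0f} sec = {seconds_to_days(seconds):.1f} days")
```

A request of zero seconds for a resource means the client did not ask for that kind of work at all in that exchange.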

The last task you completed was an FGRP4 CPU task. If your host is no longer requesting these, the likely cause is a preference change that excludes that particular science run, or perhaps CPU work in general. Have you made any preference changes recently?

Are you trying to get Intel GPU work? That's a bit of a can of worms unless you really know what you are doing :-).

There are several existing threads that go through the problems, particularly which driver versions produce results that will validate when returned. You should research those threads if you intend to use your integrated GPU.

Cheers,
Gary.

archae86
Joined: 6 Dec 05
Posts: 3146
Credit: 7086474931
RAC: 1321148

Quote:
From the above, your host is not requesting CPU work (0.00 seconds)


So do you have other BOINC projects enabled on your machine? Work on board for the other(s) might have your copy of BOINC deciding you don't need more Einstein CPU work just now. It is your host that is not requesting work, not Einstein that is denying it CPU work.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110690262141
RAC: 32577230

Yes indeed, maybe Einstein's resource share is low enough for BOINC not to be interested in requesting more Einstein CPU work yet.

I (sort of) discounted that because only one FGRP4 task is still showing, and it was sent and returned quite a while ago, so there is no real evidence that Einstein was being overly favoured in the recent past. Of course, if a lot of projects are active, the resource share of each will be quite low anyway, so it wouldn't take many tasks from one particular project for BOINC to stop requesting work from that project for quite a while.

The other thing I should have pointed out in my initial response was that the large GPU task request indicated cache settings that seem inappropriate for the number of projects to which the machine is attached - 12 all told. This will exacerbate the situation where one project may end up supplying far too much work at one point, leading to high priority mode crunching and then an extended period of not being able to request more work for that project due to resource share constraints. Depending on how many of the 12 are actually active, a work cache setting of much more than a day or two is probably asking for trouble.
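To put rough numbers on that, here is a back-of-the-envelope sketch. It is not BOINC's actual scheduling algorithm. It assumes all 12 projects are active with equal resource shares, that the 8-day GPU request reflects the cache setting, and it uses the 3 CPUs reported as effective_ncpus in the log; none of those assumptions is confirmed for this host, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 86400


def equal_share_seconds(cache_days, instances, active_projects):
    """Device-seconds of cached work per project if the cache were split evenly.

    Illustrative arithmetic only: real resource shares are rarely equal and
    BOINC does not divide the cache this simply.
    """
    total_seconds = cache_days * SECONDS_PER_DAY * instances
    return total_seconds / active_projects


if __name__ == "__main__":
    # Assumed values: 3 CPUs (from the log), 12 attached projects, all active.
    for cache_days in (8, 2, 1):
        per_project = equal_share_seconds(cache_days, instances=3, active_projects=12)
        print(f"{cache_days}-day cache: about {per_project / 3600:.0f} CPU-hours per project")
```

The point is that any single fetch that fills a multi-day request from one project hands that project many times its even share, and BOINC then has to hold back from that project until the others catch up.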

Cheers,
Gary.

StarCastle
Joined: 7 Aug 09
Posts: 4
Credit: 3448939
RAC: 954

Thanks Gary,

I reduced my work cache to 2 days and will see what happens (will reduce further if needed).

The other projects don't seem to be having issues, but each is different, I guess.
