Tasks staying in memory despite Suspended activity

Guy
Joined: 20 Jan 06
Posts: 25
Credit: 58110033
RAC: 241293
Topic 229001

Hello,

I'm using Windows 10 x64.  BOINC is v 7.20.2 (x64) with VirtualBox. (VirtualBox has today been updated to v7).

During the last few days, Einstein tasks have stayed in memory when Activity is Suspended.

I have the Einstein Computing preferences set to "No" for "Leave non-GPU tasks in memory while suspended?". (I've toggled this and restarted to no effect.)  The ONLY effective method of retrieving my RAM is to Exit BOINC altogether but when BOINC is started again and activity is resumed, the same MDGWs tasks start but all progress is lost....

For example:
Even though BOINC activity is suspended, the app -

einstein_O3MD1_1.03_windows_x86_64__GW-SSE2.exe

has 5 instances in the "Details" tab of Windows Task Manager, each one takes 1.85 GB of RAM but they are not running.  They are linked to the following workunits:

h1_0942.60_O3aC01Cl0In0__O3MD1V2a_VelaJr1_943.00Hz_264_0_0
h1_0942.60_O3aC01Cl0In0__O3MD1V2a_VelaJr1_943.00Hz_265_0_0
h1_0942.60_O3aC01Cl0In0__O3MD1V2a_VelaJr1_943.00Hz_266_0_0
h1_0942.60_O3aC01Cl0In0__O3MD1V2a_VelaJr1_943.00Hz_267_0_0
h1_0942.60_O3aC01Cl0In0__O3MD1V2a_VelaJr1_943.00Hz_268_0_0

This is a problem because I can't use the PC for anything else even while BOINC is suspended, with 12 of my 16 GB of RAM used up.    Also oddly, after using some other large app, like a web browser, the Task Manager shows the Einstein tasks eventually reducing their RAM usage to sub 100kB each - but the Task Manager RAM usage pane (and the overall system performance) shows the main RAM is still used up. The odd thing being that none of the items detailed in the Task Manager can account for this RAM use. A Windows 10 error?

Please help.

Thank you,
Guy

GWGeorge007
Joined: 8 Jan 18
Posts: 2851
Credit: 4730331765
RAC: 3285941

Hello Guy,

The most obvious question I have is: have you tried rebooting your computer?  And I don't mean "restart" your computer, but shut it down completely, wait for a minute, and then do a cold startup.  See if that 'fixes' it.

Many times Windows does stupid things, which is why I have switched to Linux Ubuntu.  But, that's a different story.

Try shutting it down, like I said, and if it is still doing the same thing, get back to us and we'll see what we can do.

.....[edit].....

I hope I caught you before you shut down.  Shutting the computer down without closing BOINC first can sometimes have an effect on BOINC too.  Be sure to turn off BOINC and VirtualBox before you shut down.  You can restart BOINC when you boot back up.

George

Proud member of the Old Farts Association

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5852
Credit: 111054327895
RAC: 34805786

Guy wrote:
I'm using Windows 10 x64.  BOINC is v 7.20.2 (x64) with VirtualBox. (VirtualBox has today been updated to v7).

You have older hardware (4 core, 8 thread CPU) and some of the available resources would be allocated to your virtual machine setup (how much memory is tied up with that?).

You have 2 working GPUs, a discrete GTX 650 and an internal Intel GPU.  Both of these are being used to run GPU tasks which are not subject to the "non-GPU tasks in memory" setting.  You appear to be running every possible search that Einstein offers.  Your machine would appear to be quite overloaded.

You probably should try reducing the number of cores that BOINC is allowed to use.  Also check the amount of memory that BOINC can use when the machine is running other stuff.  These are preferences you can use to reduce the load on your machine.

You should consider disabling one of the two GPUs.  My guess is that using the Intel GPU is significantly impacting any CPU tasks you have running.  You should perhaps reduce the different searches that run.  The FGRPB1G search on the GTX 650, whilst very slow, should not impact as much on main memory use.  Experiment with the different CPU searches (one task for each individual search) to see what has least effect on your other uses of the machine. Once you find the best performing CPU search, try adding tasks one at a time until performance starts to suffer.  Don't try to run on all available threads.

Guy wrote:
Also oddly, after using some other large app, like a web browser, the Task Manager shows the Einstein tasks as having reduced their RAM usage ...

That is expected behaviour.  BOINC runs the science apps at a lower priority so that your normal work can get the resources when needed.  The real issue is that you are probably trying to run too many BOINC jobs which are all fighting for insufficient resources.

You really do have to experiment to see what works best.  It is also more complicated if you run other projects besides Einstein.  Good luck with optimising your machine's performance.

Cheers,
Gary.

hadron
Joined: 27 Jan 23
Posts: 18
Credit: 12378488
RAC: 43408

Guy wrote:

I have the Einstein Computing preferences set to "No" for "Leave non-GPU tasks in memory while suspended?". (I've toggled this and restarted to no effect.)  The ONLY effective method of retrieving my RAM is to Exit BOINC

My first guess is that you are running using preferences you set locally, and you have "Leave non-GPU tasks in memory....." checked there.

I'm running Linux, so I can't be sure if this is the exact sequence to follow in Windows. Hopefully it will be.

In the BOINC manager, go to Options/Computing Preferences/Disk and Memory. The "Leave non-GPU...." box should be unchecked to do what you want.
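
If you'd rather check the file than the dialog, here is a rough Python sketch. It assumes the local preferences are stored in global_prefs_override.xml in the BOINC data directory (on Windows that is usually C:\ProgramData\BOINC, but treat the path as an assumption) and that the setting appears there as a leave_apps_in_memory element holding 0 or 1. The function name and the sample text are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def leave_apps_in_memory(xml_text: str) -> Optional[bool]:
    """Return the local 'leave non-GPU tasks in memory' setting, or None if absent.

    Assumes the override file stores the flag as
    <leave_apps_in_memory>0</leave_apps_in_memory> (or 1).
    """
    root = ET.fromstring(xml_text)
    node = root.find(".//leave_apps_in_memory")
    if node is None or node.text is None:
        return None  # no local override for this setting
    return node.text.strip() not in ("0", "")


# Hypothetical file content, for illustration only.
sample = """<global_preferences>
   <leave_apps_in_memory>1</leave_apps_in_memory>
</global_preferences>"""

print(leave_apps_in_memory(sample))
```

To run it against the real file, read the file's text and pass it to the function. If it reports True, the local setting is overriding the "No" chosen on the website.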

 

Guy
Joined: 20 Jan 06
Posts: 25
Credit: 58110033
RAC: 241293

Setting CPUs to 50% (just 4 cores in this case) has made an immediate difference to the problem of loss of progress.  The search apps quit out of memory properly now.  It works fine!
Thanks for that.

I've stopped pulling iGPU tasks but until the ones already dl'd finish I won't see a difference.

Thank you all very much for your encouragement and the benefit of your insight!
Guy

Scrooge McDuck
Joined: 2 May 07
Posts: 876
Credit: 16327820
RAC: 5142

Guy wrote:

[...] but when BOINC is started again and activity is resumed, the same MDGWs tasks start but all progress is lost....

I would like to add that the O3MD1 CPU app only writes approx. 32 to 64 checkpoints. See progress in stderr.txt in the workunit's slot directory.

e.g: (from stderr.txt)

[...]

2023-01-24 02:15:50.7761 (19028) [normal]: Cpt:0,  total:32,  sky:1/1,  f1dot:1/32

--> This WU will checkpoint (Cpt) 32 times.

different WU example:

2023-01-29 21:13:12.4062 (11524) [normal]: Cpt:0,  total:63,  sky:1/1,  f1dot:1/63

--> will checkpoint 63 times...

With runtimes of (in my case) 18-24 hours, the app eventually runs almost an hour between two checkpoints. If you exit BOINC in between or restart the computer, computation since last checkpoint is lost. The same will happen, if a task is suspended (manually or by BOINC's scheduler) and checkbox "leave non-GPU tasks in memory" is unchecked. That's the tradeoff: leaving task(s) in memory vs. losing 'some' computation already done. See also in the BOINC manager in task details of a certain task the entry:  "CPU time since last checkpoint".
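
To put a number on that tradeoff, here is a small Python sketch that pulls the checkpoint total out of a stderr.txt line like the ones quoted above and estimates the gap between checkpoints. It assumes checkpoints are spread evenly over the runtime, which nothing in the log guarantees, and the function names are hypothetical.

```python
import re
from typing import Optional

# Matches the "Cpt:0,  total:32" part of the log line quoted above.
CPT_LINE = re.compile(r"Cpt:(\d+),\s*total:(\d+)")


def checkpoint_total(stderr_text: str) -> Optional[int]:
    """Return the total number of checkpoints announced in the log, if any."""
    match = CPT_LINE.search(stderr_text)
    return int(match.group(2)) if match else None


def minutes_between_checkpoints(total: int, runtime_hours: float) -> float:
    """Average gap between checkpoints, assuming they are evenly spaced."""
    return runtime_hours * 60.0 / total


line = "2023-01-24 02:15:50.7761 (19028) [normal]: Cpt:0,  total:32,  sky:1/1,  f1dot:1/32"
total = checkpoint_total(line)
if total is not None:
    # 24.0 is the upper end of the runtimes mentioned above.
    print(total, minutes_between_checkpoints(total, 24.0))
```

The result is also a rough upper bound on the work lost when a task is dropped from memory between two checkpoints.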

[EDIT:]

The app also logs computation progress in stderr.txt, in lines like the following:

...........c
...........c
...........c

A new line begins after each checkpoint (c).
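
Counting those lines tells you how many checkpoints a task has already written. A minimal Python sketch, assuming every completed checkpoint shows up as a line of dots ending in "c" as in the excerpt above (the function name is made up):

```python
import re

# A row of progress dots terminated by the checkpoint marker "c".
CHECKPOINT_ROW = re.compile(r"^\.*c$")


def count_checkpoints(stderr_text: str) -> int:
    """Count the lines that end with a checkpoint marker."""
    return sum(
        1
        for row in stderr_text.splitlines()
        if CHECKPOINT_ROW.match(row.strip())
    )


# Three finished checkpoints plus one row still in progress.
sample = "...........c\n...........c\n...........c\n......"
print(count_checkpoints(sample))
```

Comparing this count with the "total:" value from the Cpt line gives a rough idea of how far the task has got and how much would be lost by exiting now.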
