Computation errors

markj
Joined: Jun 21 08
Posts: 3
Credit: 1,466,133
RAC: 0

Dear All,
I am seeing quite a few "Computation Errors" lately, and they occur after nearly one hour of processing, which is annoying.
Any idea what the problem may be?
I am using Mac OS X on Intel, but I am not sure whether the problem is platform-specific, as I don't have other computers running Malariacontrol.
Greetings,
markj

Profile mikey
Joined: Mar 23 07
Posts: 4382
Credit: 5,361,193
RAC: 1,084


Those of us using pure Windows machines just went through this problem, because the Malaria project used a machine with the newer .NET components on it to make the units. That meant that you and I needed them too, but the new versions of the workunits don't need them. I am crunching units that show 6.24 under the Application tab in the BOINC Manager. What are you using? If that is not it, I have no other ideas, sorry!

I have posted a link to your thread in this thread: https://malariacontrol.net/forum_thread.php?id=883

Hopefully this will get your problem solved.

Chris Sutton
Joined: Nov 10 05
Posts: 297
Credit: 4,941,683
RAC: 0

The 'nearly one hour' you mention is very close to the default BOINC preference for switching between applications (60 minutes).

The reported error about problems reading the checkpoint file strengthens this hypothesis.

I didn't look too deeply, but it appears to affect all tasks for the given workunit, so the problem is more than likely related to the input data.

Until the project analyses this and possibly makes changes to the input data, your options will most likely be limited to:
. change the "switch between applications" setting to more than the expected processing time for the task, so that the task is never switched out and completes in a single run;
. suspend all projects except MCDN on the affected hosts, dedicating the box to MCDN so that its units hopefully run to completion without being switched out.
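
For anyone who would rather set the first option outside the preferences dialog, here is a minimal sketch in Python. It assumes the client honours a `global_prefs_override.xml` file in the BOINC data directory and that `cpu_scheduling_period_minutes` is the element behind the "switch between applications every N minutes" preference; check both against the documentation for your client version. The helper names, the 1.5 safety margin and the 80-minute task length are made up for illustration.

```python
import xml.etree.ElementTree as ET


def switch_minutes_for(expected_task_minutes: float, margin: float = 1.5) -> int:
    """Pick a switch interval comfortably longer than one task's expected run time.

    The margin is an arbitrary illustrative safety factor, not a BOINC value.
    """
    return int(expected_task_minutes * margin) + 1


def build_prefs_override(switch_minutes: int) -> str:
    """Return the XML text for a preferences override with only the switch interval set."""
    root = ET.Element("global_preferences")
    # Assumed element name for "switch between applications every N minutes".
    period = ET.SubElement(root, "cpu_scheduling_period_minutes")
    period.text = str(switch_minutes)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 80 minutes is a hypothetical task length; use what your own tasks report.
    minutes = switch_minutes_for(80)
    print(build_prefs_override(minutes))
```

The printed XML would be saved as `global_prefs_override.xml` in the BOINC data directory, after which the client has to re-read its preferences (restarting the client should do it). The sketch only builds the text and deliberately does not write the file, so nothing on the host changes until you decide to copy it over.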

markj
Joined: Jun 21 08
Posts: 3
Credit: 1,466,133
RAC: 0

Thanks for the replies. I don't think the switching is the problem, as units also fail before 60 minutes, at 55 minutes. I'll still try updating the BOINC client (I'm currently on 6.6.20 and I see 6.6.36 is recommended now) and setting the switching interval to 2 hours, to see if that solves things.
markj

Profile mikey
Joined: Mar 23 07
Posts: 4382
Credit: 5,361,193
RAC: 1,084


Actually downgrading to 6.4.7 might be even better, the 6.6.? versions all have issues with the scheduler. If you only run one project it is fine, but if you run multiple projects a lot of users are having issues. NOT all users are having problems, just a bunch of them. If you use CUDA, i.e. use your GPU to crunch, then you will need at least version 6.5.0 of BOINC to do that through BOINC.

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 149,220
RAC: 17

Actually downgrading to 6.4.7 might be even better, the 6.6.? versions all have issues with the scheduler.

A good example of "How misinformation gets into the world."
If you want to explain things, explain them correctly. Ever since 6.6.20, BOINC has contained separate CPU and GPU schedulers, built into the client.

Most of the problems with these have now been fixed. What people are falling over at this time is that BOINC 6.6.38, the latest version available for Windows platforms at least, will now and then request GPU work from projects that have no GPU application. This is only done in case the project installs a GPU application from one day to the next, so that people who have a GPU and want that project to use it will get work. It's just a simple check, nothing broken.

One thing fixed in 6.6.38 is that result uploads are now grouped together per project, meaning that when a project goes down, there is a single retry timer for uploading those results.

Furthermore, the problem of "won't finish in time" has been fixed in 6.6.38.

The biggest problem with new BOINC versions is that people expect:
a) that those bugs are fixed.
b) that despite those bug fixes, the way BOINC works, which they are now accustomed to, won't change.

That's the wrong expectation. Bug fixes will at times change the behaviour of the software. If you can't agree with that, do not run the newer software but stay on something older.

Between 6.4 and 6.6 the way that debts are calculated changed. Warnings went up on several forums about this, and I eventually added it to the release notes. Did people heed the warning? No. They found their BOINC working differently than before and so decided it was broken. That the older versions may have been broken is something that can't be true. Newer versions with bug fixes will inevitably break something that worked before... or so the reasoning goes.

If you only run one project it is fine, but if you run multiple projects a lot of users are having issues. NOT all users are having problems, just a bunch of them.

You are contradicting yourself here. First it's a lot, then it's just a bunch.
It's not that many. It's mostly people posting at Seti, and the vocal ones number about 10. Out of a user base of 327 thousand active BOINC users, that is neither a lot nor a bunch.

Just panic posting. I have a problem so I need to tell everyone not to use this BOINC.

If you use CUDA, i.e. use your GPU to crunch, then you will need at least version 6.5.0 of BOINC to do that through BOINC.

6.4.5, 6.4.6 and 6.4.7 do CUDA as well. They, as well as 6.5.0 by the way, do not have separate CPU and GPU schedulers, so all work will be requested by the CPU, which in some cases may leave the GPU without work because of CPU debt problems on the project.

But since we are in the Macintosh forum: there are no working 64-bit library sets yet for the latest drivers. The drivers may enable CUDA on Leopard, but without those libraries it won't work. BOINC for the Mac does have CUDA detection built in, though. So as soon as Nvidia releases those libraries, all will be well.
____________
Jord.

BOINC FAQ Service



Copyright © 2013 africa@home