Reporting issues and FAQ


hardy
Volunteer moderator
Project administrator
Project developer
Joined: Feb 18 09
Posts: 141
Credit: 54,376
RAC: 129

The computation on malariacontrol.net is very CPU and memory intensive, and because of its nature (heavy floating point computation and random number usage) is somewhat sensitive to the system it is running on. This means that getting occasional invalid results or computing errors is unfortunately normal.

If, however, you have frequent problems, please do report them! Please check for other reports of the same error in recent threads, but please don't assume your error is the same as someone else's just because it is from the same platform. If your issue doesn't appear to be covered already in a recent thread, please start a new thread, including the error message in the title if possible.

Please check for errors within the "stderr out" of your result(s). Note that there are quite often one or two warning messages in the output, and that most of the time these are not relevant to the current issue. Look instead for messages like "Exception", "error", "exited with ...".
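
If you have a long "stderr out" to go through, a small script can pick out the interesting lines for you. The following is only a rough sketch in Python: it assumes you have pasted the output into a string, and the function name and the sample warning line are made up for illustration. The three search terms are the ones mentioned above.

```python
import re

# Search terms taken from the advice above; matching is case-insensitive
# so that "Error" and "error" are both caught.
FAILURE_PATTERN = re.compile(r"exception|error|exited with", re.IGNORECASE)


def find_failure_lines(stderr_text):
    """Return the lines of a pasted "stderr out" that look like real failures."""
    return [line for line in stderr_text.splitlines()
            if FAILURE_PATTERN.search(line)]


# Hypothetical sample output, for illustration only.
sample = """\
33191 RC
Warning: some unrelated notice
Exception: effectiveEIR is not finite: ...
"""

for line in find_failure_lines(sample):
    print(line)
```

Lines that match are good candidates for the title of a new thread. Anything the script skips may still matter, so treat it as a first pass rather than a verdict.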

You may also want to check whether your result was ever continued from a checkpoint. This has caused a few errors in the past, although most of the problems we have had with checkpoints have been squashed lately. A message like the following is written every time a checkpoint is read (the number is the time within the simulation; RC stands for "Read Checkpoint"):

33191 RC
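
To count how often a result was resumed, you could search the output for lines of this shape. This sketch assumes every checkpoint message looks exactly like the single example above (a whole number, a space, then "RC" on a line of its own); the real output may vary, and the function name is hypothetical.

```python
import re

# Assumed format, based only on the one example line "33191 RC".
CHECKPOINT_READ = re.compile(r"^\s*(\d+)\s+RC\s*$")


def checkpoint_read_times(stderr_text):
    """Return the simulation times at which a checkpoint was read."""
    times = []
    for line in stderr_text.splitlines():
        match = CHECKPOINT_READ.match(line)
        if match:
            times.append(int(match.group(1)))
    return times


times = checkpoint_read_times("33191 RC\nsome other line\n")
print(f"Resumed from a checkpoint {len(times)} time(s): {times}")
```

An empty list suggests the result ran through without being resumed, in which case checkpoints are unlikely to be the cause.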


Please also check below for some common problems.

Lastly, we wish you fun computing and thank you for your contributions!


Memory usage

The latest models we've been running on malariacontrol.net (since summer 2010) tend to use quite a bit of RAM: up to around 400 MB per workunit. Depending on your computer and what other applications you run, this can sometimes be a problem. For example, if you run four workunits at once on a quad-core processor with only 2 GB of RAM, the OpenMalaria application could use around three quarters of your computer's memory, leaving very little for the operating system and other applications! In this case data tends to get swapped to the hard drive to free up some RAM. This allows your computer to continue to operate, but the resulting slowdowns can be huge.

Solutions:
1. Restrict malariacontrol.net to running on fewer processors (or switch altogether to another BOINC application).
2. Upgrade your computer with extra RAM.
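
To get a feel for the numbers, here is a back-of-the-envelope calculation in Python. It is only a sketch: the 400 MB figure is the rough upper bound quoted above, 2 GB is taken as 2048 MB, and the 1024 MB set aside for the operating system and other applications is an assumption you should adjust for your own machine. Both function names are made up for this example.

```python
def memory_fraction(workunits, mb_per_workunit=400, total_ram_mb=2048):
    """Fraction of total RAM used if every workunit reaches its peak usage."""
    return workunits * mb_per_workunit / total_ram_mb


def max_workunits(total_ram_mb, mb_per_workunit=400, reserved_mb=1024):
    """Workunits that fit after reserving RAM for everything else.

    reserved_mb is an assumed allowance, not a project recommendation.
    """
    usable = max(total_ram_mb - reserved_mb, 0)
    return usable // mb_per_workunit


# The quad-core, 2 GB example from the post above.
print(f"Four workunits use about {memory_fraction(4):.0%} of 2 GB")
print(f"Workunits that fit comfortably: {max_workunits(2048)}")
```

With these assumptions, a 2 GB machine is better off running two workunits at a time rather than four.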

We are also planning to reduce memory usage in a future release.


Exception: effectiveEIR is not finite: ...

A few people have reported seeing this error in almost all work units. Please see this thread for more information.





Copyright © 2013 africa@home