Posts by michaelT
1) Message boards : Unix/Linux : 36 hours of computing (and increasing) ? (Message 22235)
Posted 14 days ago by michaelT
Differences between tasks could be due to 2 factors :
- models : we're simulating various combinations of models, which need different computation times. It's quite difficult to estimate the computation time across all model combinations. The method we're using for the moment to estimate flops is to take the average flops of the workunits we have already received. But we're working on a new solution to estimate flops more precisely.
- stochasticity : with a different set of parameters the simulation could go in a different direction and, for example, have more infections to simulate, which implies more computation time.
Regarding the SETI@home task, this isn't related. Projects are treated independently by BOINC.
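To illustrate the averaging approach mentioned above, here is a minimal sketch. It is not the project's actual code: the function name, the fallback value and the sample numbers are all hypothetical.

```python
from statistics import mean


def estimate_flops(completed_flops, default=1e12):
    """Estimate the FLOPs of a new workunit from already-returned ones.

    completed_flops: measured FLOP counts of returned workunits (hypothetical input).
    default: fallback used when no workunit has come back yet (assumed value).
    """
    if not completed_flops:
        return default
    # A plain average: workunits far from the mean will be badly estimated,
    # which is why runtimes can differ a lot from the prediction.
    return mean(completed_flops)


if __name__ == "__main__":
    history = [2.0e12, 4.0e12]  # made-up measurements
    print(estimate_flops(history))
```

With a single average for all model combinations, a task that uses a heavier combination will simply run longer than estimated.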
2) Message boards : Number crunching : Memory leak (Message 21612)
Posted 68 days ago by michaelT
Thanks ce6423, we will have a look at that and check what's happening.
It could be that, with the set of generated parameters, the number of infections is exploding, and with it the number of objects created...
3) Message boards : Number crunching : MD5 error downloading Densities AND wrong size 2012-12-29 (Message 21460)
Posted 82 days ago by michaelT
Happy new year too.
MmMmMmM ... Strange :
1) Doing an md5sum on the file directly on the server gave me 54ea34d38d96c311122642aec045bc40 so the file you have is correct.
2) Same thing for scenario_29.xsd : the size you have is the correct one : 141319
Maybe ... the information you got from the server was somehow corrupted during the transfer, which could explain the difference between the MD5s and also the file size.
First, a quick and dirty fix : could you try to do a reset of the malariacontrol project in the BOINC client ?
If it doesn't work :
- did it happen again between the date of the post and today ?
- is it for all the workunits you're getting or only some of them ?
- could you copy/paste new references ? The ones you wrote have already been purged.
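For anyone who wants to repeat the check on their own machine, here is a small sketch that computes the MD5 and size of a local file and compares them with expected values. The function names are hypothetical, and the path and expected values are whatever you take from the server messages.

```python
import hashlib
import os


def md5_and_size(path, chunk_size=1 << 20):
    """Return (hex MD5 digest, size in bytes) of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), os.path.getsize(path)


def matches(path, expected_md5, expected_size):
    """True if the local file has the expected MD5 and size."""
    actual_md5, actual_size = md5_and_size(path)
    return actual_md5 == expected_md5.lower() and actual_size == expected_size
```

If `matches` returns True for the file in the project directory, the file on disk is fine and the problem is more likely in the information received during the transfer.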
4) Message boards : Number crunching : openMalaria test version v6.65 (Message 20196)
Posted 187 days ago by michaelT
Great it worked :)
For the message, no worries, it's a warning due to the experiment configuration we set up, but we're expecting it. It will be taken into account during the analysis of the results.
5) Message boards : Number crunching : Branch A wu_1163_* tasks failing (Message 20178)
Posted 188 days ago by michaelT
The sampling of parameter 16 went into the wrong value space. We're currently modifying the workunit generator code to include boundaries in the sampling space ( ]a,b[ ).
Until the new version is in production, generation of the wu_1163_* workunits has been stopped.
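As an illustration of sampling inside an open interval ]a,b[, here is a minimal sketch. It assumes a uniform distribution and a hypothetical function name; the real generator and the distribution used for parameter 16 are not shown here.

```python
import random


def sample_open_interval(a, b, rng=random):
    """Draw a value strictly between a and b (both bounds excluded)."""
    if not a < b:
        raise ValueError("need a < b")
    while True:
        x = rng.uniform(a, b)
        # uniform() may return an endpoint, so reject those draws explicitly.
        if a < x < b:
            return x


if __name__ == "__main__":
    print(sample_open_interval(0.0, 1.0))
```

Rejecting the endpoints keeps the value away from the boundaries, where a model may divide by zero or otherwise misbehave.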
6) Message boards : Number crunching : openMalaria test version v6.65 (Message 20170)
Posted 189 days ago by michaelT
Some news from the battlefront : all the beta workunits have been sacrificed (canceled). There are still some of them fighting with various errors ... but it will take a little bit of time until the whole stack is cleaned and all the workunits are buried ...
So you will still see some errors about scenario_30.xsd for a couple of them. But we will soon come back with brand new "working" workunits.
7) Message boards : Number crunching : openMalaria test version v6.65 (Message 20162)
Posted 189 days ago by michaelT
Dammit ... I think I need holidays or to buy new glasses ...
By mistake (for Beta), while changing the xml back to the original one, I removed the </file_ref> xml tag in the xml doc. XML fixed. ...
Sorry for all the mess. :( :( :( :( :( :(
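A missing closing tag like this can be caught before publishing with a simple well-formedness check. The sketch below is only an illustration: the function name is hypothetical, the sample documents are made up, and it assumes the xml doc has a single root element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Made-up documents: the second one lacks the </file_ref> closing tag.
    good = "<workunit><file_ref><file_number>0</file_number></file_ref></workunit>"
    bad = "<workunit><file_ref><file_number>0</file_number></workunit>"
    print(is_well_formed(good))  # True
    print(is_well_formed(bad))   # False
```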
8) Message boards : Number crunching : openMalaria test version v6.65 (Message 20110)
Posted 192 days ago by michaelT
We stopped the generation of new workunits for beta and we will wait until all the workunits are back.
What happened is that :
- We replaced the old scenario_30.xsd file with the new one and launched a sample set of a batch to test.
- Clients which didn't have workunits from beta downloaded the new version.
- But some clients were still running workunits from beta, so they still had the old scenario_30.xsd file in the project directory. We expected that when the client got new workunits, scenario_30.xsd would be replaced with the new one because the checksums of the old and new files were different, but that was not the case... The old files stayed and that's where the mess started.
So we have to wait until all the workunits from beta are back, so that the scenario_30.xsd file will be deleted from the project folder ( "<sticky/>" is not in the xml doc so it won't be kept). Then everything will be fixed and we can restart generating new workunits.
9) Message boards : Number crunching : openMalaria test version v6.65 (Message 20098)
Posted 193 days ago by michaelT
We removed the <copy_file/> from the xml doc to see whether this would replace the old scenario_30.xsd file with the new one on the client side, but it was a bad idea :( ... we set it back as it was before.
The problem is that we have a new version of scenario_30.xsd which replaces an older version. In theory the new one should replace the old one each time you download a new workunit, because it creates a slot directory where the workunit files are downloaded. The old file should only be kept if you specify "<sticky/>" in the xml doc.
10) Message boards : Number crunching : openMalaria test version v6.65 (Message 20086)
Posted 194 days ago by michaelT
1. wuVFNEW_* : the xml attribute baseFactor was set to a value < 0 while a value within the range [0,1] was expected. The xml file was validated by the xsd, but openMalaria does additional checks on value ranges, which is why this error was not detected before. In addition, before submitting a new batch, we test it on our computers and then send a sample to BOINC before sending the whole batch. Here it was the sample.
I'll check with the scientist who created the experiment what happened.
2. wuVFNEWcov_* : maybe a file cache problem. We updated openMalariaBeta with a new version which is backward compatible and replaced scenario_30.xsd with a new version on our server. So the new workunits are sent with the new xsd files and not the old ones. However, it seems that it was not replaced on all clients. I'll check if it's a problem of server cache or BOINC client cache.
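Regarding point 1, a range check like the one openMalaria performs can also be run on a scenario file before it is submitted. This is only a sketch, not openMalaria's own check: the function name and the element names in the sample are hypothetical, and only the attribute name baseFactor and the range [0,1] come from the description above.

```python
import xml.etree.ElementTree as ET


def out_of_range(xml_text, attr="baseFactor", low=0.0, high=1.0):
    """Return (tag, value) for every element whose attr lies outside [low, high]."""
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter():
        if attr in elem.attrib:
            value = float(elem.attrib[attr])
            if not low <= value <= high:
                problems.append((elem.tag, value))
    return problems


if __name__ == "__main__":
    # Made-up scenario fragment; element names are hypothetical.
    sample = '<scenario><a baseFactor="-0.2"/><b baseFactor="0.5"/></scenario>'
    print(out_of_range(sample))  # [('a', -0.2)]
```

An empty list means every baseFactor found is inside the range, so the file would pass this particular check.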