IRC logs of #boinc for Saturday, 2013-12-21

03:46 <dddh> half of the running tasks are 32bit

03:46 <dddh> "sixtrack_lin32_4465_pni.linux"

03:47 <dddh> others are 64bit: "sixtrack_lin64_4465_pni.linux"

04:12 <Tank_Master> 64bit clients will be given 32bit tasks if there's not enough 64bit work

04:20 <dddh> oh

06:36 *** microdroid has joined #boinc

13:51 * dddh is waiting for his second GTX580 to arrive

16:52 <Myrth> hi, i'm noob, my boinc tasks have all finished, but it doesn't seem like i'm getting new ones

17:01 <dddh> Myrth: windows?

17:33 <Myrth> dddh: yep

17:43 <kdavyd> Hi guys, does anyone know of projects that have a large or very large data working set size? In the TB range.

17:44 <kdavyd> Need to find something disk I/O intensive to generate long-term load, and might as well be useful while i do it.

17:45 <efc> If you mean what the project has internally, I'm sure seti@home is there. But nobody is sending out WUs in that range (I hope)

17:47 <kdavyd> efc: Not necessarily WU size, could just be a need for large amount of scratch space, more than fits in RAM.

17:47 <kdavyd> But yes, the goal is to drive local I/O

17:48 <kdavyd> CPU/memory can scale accordingly, i have large systems.

17:49 <efc> I see, don't know of any offhand. Everybody assumes some minimum amount of RAM and refuses to run below that.

17:50 <efc> seems like there might be some hash collision research like that

17:51 <kdavyd> ok, thanks. I'll keep looking, maybe just try a bunch of projects and see what their disk impact is

17:52 <efc> to be fair i've only run a few projects. you may get better opinions if you stick around. But generally, lots of disk activity is what they are trying to avoid.

17:53 <efc> People work on rainbow tables and I would think they would have a very large disk-based index to help look for collisions. Those always struck me as a bit nefarious, though.

17:54 <kdavyd> yeah, I can see how you'd want to avoid that in a volunteer environment. But that's exactly what i have a lot of

17:54 <efc> seti@home is of course doing serious crunching on the server end, many gigs of data being split up into WUs, but no way for you to get into that

17:55 <kdavyd> Maybe some genetic sequencing projects will do something like that - i do know commercial sequencers generate a crapload of data.

17:56 <efc> Map processing also, like google earth/maps, but again they hide it on the backend

18:01 <kdavyd> right

18:03 <kdavyd> I suppose i could do this in a bunch of VMs and overprovision memory, causing them to swap... but that seems counter-productive

18:07 <efc> compiling large projects hits the disk quite a bit, might compile some linux kernels.

18:10 <kdavyd> yeah, could always do that, but i'm looking for something more useful

18:15 <kdavyd> and it still doesn't hit the disk all that much. For a kernel compile, you only need a few gigs, and it's all gonna end up cached anyway

18:24 <kdavyd> aha, looks like I found one that might work - the clean energy project

18:24 <kdavyd> People complain it's too I/O intensive

18:36 <Myrth> hi, i'm not getting any new tasks - so i guess we're done?

19:09 <Myrth> how can i add new projects to my account?

23:41 <kdavyd> efc: just wanted to report back. 4 Clean Energy Project jobs hammer the crap out of a 15k drive. ~200 of them should be a nice load test for a storage box then.

23:41 <efc> wow, cool

23:41 <efc> And I'll remember not to run that one hehe

23:41 <efc> reads, writes, any idea?

23:42 <kdavyd> Yeah, don't. It's so bad they had to throttle it to one job per client, and the setting to increase # of jobs is hidden well

23:42 <kdavyd> At the storage layer, 99% writes

23:43 <kdavyd> Looks like 1M blocksize - not ideal for me, but i can work around that.

23:46 <kdavyd> It's very intermittent/spiky I/O too, which is nice. Worst case scenario for storage.

