February 12, 2011

PoD at CERN's lxplus. Part 2.

Anar here...

PoD at CERN's lxplus. Part 1.

Today I've been testing a new beta (v3.2.61.geeb0) of PoD and, as always, I have been playing with it on CERN's lxplus (LSF), just to check whether this version behaves properly on AFS.

It is really surprising to see how effectively CERN's LSF works. At GSI we have a special preemptive queue for PoD, in order to provide as much interactivity as possible. Other PoD users and I always get PoD workers very fast at GSI. But at CERN I used the standard "1nh" queue, which was full of pending jobs, and since my share should be good on CERN's LSF (I hardly use the cluster, only running my tests from time to time), I got my PoD jobs through within just 40 seconds - almost a record :)
So, simply put, I got a dynamic PROOF cluster of 37 workers in just 40 seconds ;)
Ufff... Obviously CERN's LSF fair share works perfectly.
If I used LSF at CERN more intensively, I would get my workers up and running a bit later, I guess. It depends on how intensively I used it. There is no magic - just fair share. This is why, if you want to provide maximum interactivity for PoD users on your cluster, you need to tune your cluster a bit or set up a dedicated queue. This is a good trade for a fast dynamic PROOF cluster, which gives resources back to batch users as soon as nobody is using it.
Anyway, as we can see, even without any special pre-configuration of the cluster's resource management system, PoD is very fast and more than usable ;)

I would be very grateful if somebody at CERN who uses LSF more intensively than I do would test PoD and report back how fast he/she gets PoD workers online.

BTW, since PoD is now in a redesign stage, I wouldn't recommend that PoD users use the PoD CLI instead of the PoD GUI. I am currently working on a new GUI, which will also allow working with remote PoD servers and will reflect the latest developments in PoD.

Some screenshots of my CERN tests of today:

1. start PoD server:

2. submit PoD jobs to CERN's LSF:

Just ~40! seconds later:
3. check how many workers we already have and which workers they are:

4. we have our dynamic PROOF cluster and now we can process a PROOF analysis as usual:


Here is another test I did. Just for the sake of demonstration, I used the "date" command to show the current time. Of course, I also have the PoD logs as evidence :D

1. at 11:41:14 I requested 40 PoD workers:

2. at 11:41:50 I got my first 18 workers.

3. at 11:42:10 I got all requested workers:
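
The waiting times follow directly from the three timestamps in the steps above. Here is a minimal Python sketch of that arithmetic; the helper name `elapsed_seconds` is just an illustration and not part of PoD, and it assumes all timestamps fall on the same day:

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> int:
    """Return the number of seconds between two HH:MM:SS timestamps.

    Assumes both timestamps belong to the same day (no midnight rollover).
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps taken from the "date" output in the steps above.
requested = "11:41:14"    # 40 workers requested
first_batch = "11:41:50"  # first 18 workers online
all_workers = "11:42:10"  # all requested workers online

print(elapsed_seconds(requested, first_batch))  # 36
print(elapsed_seconds(requested, all_workers))  # 56
```

So the first batch arrived after 36 seconds and the full set after 56 seconds, i.e. a bit under a minute.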

So, in the second test it took me ~36 seconds to get the first half of the workers, and within ~1 min my last requested worker was online.
Actually, I could start my analysis as soon as I had a reasonable number of workers, for example more than 30, and that was just about 40 seconds after I requested them. The rest of the workers are connected to PoD automatically as soon as they are online and ready. Also, at any time I can submit more workers, and they will be connected to my cluster automatically as well ;) PoD is very flexible.
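
The "start as soon as enough workers are online" idea can be sketched as a small polling loop. This is only an illustration, not something PoD ships: `wait_for_workers` and the `count_workers` callable are hypothetical names, and in a real setup `count_workers` would have to wrap whatever you use to ask PoD for the current number of workers (as in step 3 of the first test).

```python
import time
from typing import Callable


def wait_for_workers(
    count_workers: Callable[[], int],  # hypothetical: returns workers online now
    minimum: int,
    timeout_s: float = 120.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until at least `minimum` workers are online, then return the count.

    Raises TimeoutError if the threshold is not reached within `timeout_s`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        online = count_workers()
        if online >= minimum:
            return online
        if time.monotonic() >= deadline:
            raise TimeoutError(f"only {online} of {minimum} workers online")
        sleep(poll_s)


# Demo with a fake counter that replays the numbers from the second test:
# nothing at first, then 18 workers, then all 40.
samples = iter([0, 18, 40])
ready = wait_for_workers(lambda: next(samples), minimum=30, sleep=lambda s: None)
print(ready)  # 40
```

Workers that come online after the threshold is reached are not lost; as described above, they join the cluster automatically, so the threshold only decides when you stop waiting.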


Meet us on http://pod.gsi.de
