To: "Naomi B. Schmidt"
Cc: vkumar@MIT.EDU, wdc@MIT.EDU, nschmidt@MIT.EDU
Subject: Re: An Idea On Long Jobs
In-Reply-To: Your message of "Mon, 27 Jan 1997 16:13:02 EST."
             <9701272113.AA26581@mozart.MIT.EDU>
Date: Mon, 27 Jan 1997 16:44:53 EST
From: Mike Barker

:) I really appreciate your giving some thought to a pet project of
:) mine. What you suggest certainly has its merits, but my main
:) concern is that it does not scale.

I guess I don't understand what this means. If there are more users
who want to run long jobs, we'd need to add more cpus. It scales the
same way having public workstations scales--you increase the pool of
machines if you need to.

The signup and admin tasks would be complicated by growth. However, if
all we need is to automate that, it may be a much simpler job than
trying to solve the "multiple user" problem.

:) Also, what happens if someone
:) signs up for N hours, and after that time the job is within a few
:) minutes of completion? Does the job get bumped? Does the next
:) person in line just wait?

I certainly wasn't planning on trying to keep the time usage tight.
I'd say if someone isn't finished when they expected to be, they would
alert the administrator and leave it running. We don't have to tell
the next person in line exactly when they will get a system, and I
wouldn't. They get the next system that comes free, whichever one
that is.

:) What happens if one's turn comes up at
:) 3 AM? Is there queuing software that starts the job at that time,
:) or does one have to get out of bed, traipse to a cluster, remotely
:) log in to the machine in question and start it up? And what if a
:) user finishes early at 3 AM? Does the administrator get beeped?

Again, I wasn't suggesting any queuing software. If the user wants to
start at 3 AM, they can (and I'm sure many of our students would).
I'd suggest that if we want to allow for night-time operations, we
need to use a group that has 24 hour coverage.
We can also simply declare that we turn over machines during the day
(so the smallest "assignment" of time would be 24 hours). I don't
think the solutions to these problems should be technical (e.g.
queuing software).

:)
:) Maybe a small group of us can get together and toss some ideas
:) around to see if some of my objections can be answered.
:)
:) Thanks again for giving some time and thought to this. We all
:) know that it will eventually again rear its head, and we shouldn't
:) stop thinking about how to solve this problem.
:)
:) Naomi