Run a list of commands on a set of SSH nodes. With a bit of optional parametrization.
You can modify the queue (`queue.yaml`) while Pegasus is running.

To use Pegasus, populate `hosts.yaml` and `queue.yaml`, and run Pegasus.

Pegasus will remove one entry at a time from the top of `queue.yaml` and move it to `consumed.yaml` as it begins to execute it.
Run four Python commands using two nodes.
```console
$ cargo run -- q  # stands for Queue
```
Run identical commands for multiple nodes.
```console
$ cargo run -- b  # stands for Broadcast
```
Split nodes into sub-nodes that run commands in parallel. In this example, four SSH connections are kept, and four commands run in parallel.

When parametrizing nodes, just make sure you specify the `hostname` key.
You can use these parameters in your commands. By the way, the templating engine is Handlebars.
Four sub-nodes and four jobs. So all jobs will start executing at the same time.
If you can parametrize nodes, why not commands?
This results in exactly the same jobs as the example above.

When parametrizing commands, just make sure you specify the `command` key.
How many commands will execute in Queue mode?
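Note that although `echo bye from {{ hostname }}` doesn't really use the `low` or `high` parameters, it will run `2 * 2 = 4` times regardless.

The answer is `1 + 2 * 2 * 2 = 9`: one plain command, plus two parametrized commands that each run once for every combination of the two `low` and two `high` values.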
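The count can be checked with a small sketch. It assumes, as the note about `echo bye from {{ hostname }}` suggests, that a parametrized entry expands to every combination of its `command` list and its parameter lists. This is a model of the counting only, not Pegasus's implementation. The `expand` helper and the queue contents are hypothetical, apart from `echo bye from {{ hostname }}` and the `low`/`high` parameter names.

```python
from itertools import product

def expand(entry):
    """Expand one queue entry into concrete (command, params) jobs.

    Assumption: a plain string is a single job, and a mapping yields one job
    per combination of its `command` list and every other parameter list.
    """
    if isinstance(entry, str):
        return [(entry, {})]
    names = [key for key in entry if key != "command"]
    jobs = []
    for command in entry["command"]:
        for values in product(*(entry[name] for name in names)):
            jobs.append((command, dict(zip(names, values))))
    return jobs

# Hypothetical queue: one plain entry plus one parametrized entry.
queue = [
    "echo hi from {{ hostname }}",
    {
        "command": [
            "seq {{ low }} {{ high }}",
            "echo bye from {{ hostname }}",
        ],
        "low": [1, 2],
        "high": [3, 4],
    },
]

jobs = [job for entry in queue for job in expand(entry)]
print(len(jobs))  # 9, i.e. 1 + 2 * 2 * 2
```

Host parameters do not appear in the count because they decide where a job runs, not how many jobs exist.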
`queue.yaml` is actually the queue. Pegasus removes the first entry in `queue.yaml` whenever there's a free host available. If you delete entries before Pegasus pulls them, they will not execute. If you add entries to `queue.yaml`, they will execute.
Think about when the number of remaining commands is less than the number of free nodes. Without a way to submit more jobs to Pegasus, those free nodes will stay idle until all the commands finish and you start a fresh new instance of Pegasus.
By providing a way to add to the queue while commands are still running, users may achieve higher node utilization. Being able to delete from the queue is just a byproduct; adding to the queue is the key feature.
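To make the utilization argument concrete, here is a toy model rather than Pegasus's actual scheduler. It assumes every command takes exactly one time step and that each host pulls the first remaining queue entry at every step. The `simulate` function and its arguments are hypothetical names.

```python
from collections import deque

def simulate(initial, num_hosts, additions=None):
    """Return (idle host slots, consumed entries) for a toy run.

    Toy assumption: every command takes one step, and at each step every
    host pulls the first remaining entry, if there is one.
    `additions` maps a step number to entries appended before that step.
    """
    queue = deque(initial)
    additions = additions or {}
    idle, step = 0, 0
    consumed = []
    while queue or any(s >= step for s in additions):
        queue.extend(additions.get(step, []))
        for _ in range(num_hosts):
            if queue:
                consumed.append(queue.popleft())  # consumed from the top
            else:
                idle += 1  # a free host with nothing to run
        step += 1
    return idle, consumed

# Three jobs on two hosts: one host sits idle in the second step.
print(simulate(["job1", "job2", "job3"], num_hosts=2)[0])  # 1

# Adding a job while the first two are running fills the idle slot.
print(simulate(["job1", "job2", "job3"], num_hosts=2, additions={1: ["job4"]})[0])  # 0
```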
Use Lock mode to modify `queue.yaml`. Lock mode will lock `queue.yaml` and launch a command line editor for you.

```console
$ cargo run -- l --editor nvim  # l stands for Lock
```

Editor priority is `--editor` > `$EDITOR` > `vim`.
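The editor priority can be read as a simple fallback chain. The sketch below is illustrative only: `resolve_editor` is a hypothetical helper, and treating an empty `$EDITOR` as unset is an assumption, since the behavior in that case is not stated here.

```python
import os

def resolve_editor(cli_editor=None, env=None):
    """Pick the editor: the --editor value, then $EDITOR, then vim."""
    env = os.environ if env is None else env
    # Assumption: an empty value counts as "not set" at every level.
    return cli_editor or env.get("EDITOR") or "vim"

print(resolve_editor("nvim", {"EDITOR": "nano"}))  # nvim
print(resolve_editor(None, {"EDITOR": "nano"}))    # nano
print(resolve_editor(None, {}))                    # vim
```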
When you save and exit, the queue lock is released and Pegasus is allowed access to `queue.yaml`.
Enable daemon mode, and Pegasus will not terminate even if `queue.yaml` is empty. It will wait for you to populate `queue.yaml` again and then execute the new entries.

```console
$ cargo run -- q --daemon
```
`queue.yaml`

This is the queue file. Entries in `queue.yaml` are consumed from the top, one by one. Also, entries are consumed only when a new host is available to execute new commands. Consumed entries are immediately appended to `consumed.yaml` in "canonical form", where every entry has a `command` key. Thus you might do something like `tail -n 2 consumed.yaml > queue.yaml` to re-execute your previous single-line command.

As mentioned earlier, always use the Lock Mode when you need to modify `queue.yaml`.
In broadcast mode, hosts are kept in sync with each other. That is, the next command is fetched from `queue.yaml` and executed on all hosts when all the hosts are done executing the previous command.
Consider the following situation:

| | fast-host | slow-host |
|---|---|---|
| command1 | success | success |
| command2 | success | fail! |
| command3 | success | |
| command4 | running | |

In this case, we would want to prepend an undo command for `command2` (e.g., `rm -rf repo || true`) and restart from that, but `fast-host` is already far ahead, making things complicated. Thus, especially when you're terraforming nodes with Pegasus, keeping hosts in sync should be beneficial.
There is also a `-e` or `--error-aborts` flag in Broadcast Mode, which aborts Pegasus automatically when a host fails on a command.
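The lock-step behavior and the abort flag can be modeled in a few lines. This is a sketch of the idea, not Pegasus's code: `broadcast`, `run`, and `fake_run` are hypothetical names, and continuing past a failure when the flag is off is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def broadcast(commands, hosts, run, error_aborts=False):
    """Run each command on all hosts, moving on only when every host is done.

    `run(host, command)` is a caller-supplied function returning True on
    success. Returns the list of commands that were started.
    """
    started = []
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        for command in commands:
            started.append(command)
            # list() blocks until every host has finished this command.
            results = list(pool.map(lambda host: run(host, command), hosts))
            if error_aborts and not all(results):
                break  # a host failed, so stop before the next command
    return started

# Hypothetical runner: slow-host fails on command2, as in the table above.
def fake_run(host, command):
    return not (host == "slow-host" and command == "command2")

hosts = ["fast-host", "slow-host"]
commands = ["command1", "command2", "command3", "command4"]
print(broadcast(commands, hosts, fake_run, error_aborts=True))
# ['command1', 'command2']
```

Because no host starts `command3` before every host has finished `command2`, an undo command can be prepended and the run restarted from a state all hosts share.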
Pegasus tries to implement graceful termination upon ctrl-c. The following happens:

- Pegasus `break`s right before attempting to fetch from `queue.yaml`.
- `queue.yaml` will not change and new commands will not start executing.

To clean up manually, run `killall pegasus; killall ssh; rm -rf .ssh-connection*`.