7.2. Precip

Precip is a flexible experiment management API for running experiments on clouds. Precip was developed for use on FutureGrid infrastructures such as OpenStack, Eucalyptus (>=3.2), and Nimbus, as well as on commercial clouds such as Amazon EC2. The API lets you provision resources, then run commands on, and copy files to and from, subsets of instances identified by tags. The goal is an API that is flexible and simple to use from the Python scripts that control your experiments.

The API does not require any special images, which makes it easy to get going. Any basic Linux image will work. More complex images can be used if your experiment requires them, or you can use the experiment API to run bootstrap scripts on the instances to install and configure the required software.

A concept which simplifies interacting with the API is instance tagging. When you start an instance, you can add arbitrary tags to it. The instance also gets a set of default tags. API methods such as running a remote command or copying files all use tags to specify which instances you want to target.

Precip also handles ssh keys and security groups automatically. This is done to make sure the experiment management does not interfere with your existing cloud setup. The first time you use Precip, a directory called ~/.precip will be created. Inside this directory, an ssh keypair will be created and used for accessing instances. On clouds which support it, the keypair is automatically registered as ‘precip’, and a ‘precip’ security group is created. If your experiment requires more ports to be open, you can use the cloud interface to add those ports to the precip security group.

Precip is a fairly new API, and if you have questions or suggestions for improvements, please contact pegasus-support@isi.edu .

7.2.1. Installation

If you want to use the India or Sierra FutureGrid resources to manage your experiment, Precip is available on the interactive login nodes via modules: module load precip/0.1.

You can also install Precip on your own machine. Prerequisites are the Paramiko and Boto Python modules. The Python source package and RPMs are available at: http://pegasus.isi.edu/static/precip/software/ .

7.2.2. API

provision(image_id, instance_type='m1.small', count=1, tags=None)

Provisions a new instance. Note that this method starts the provisioning cycle, but does not block until the instance has finished booting. To block on instance creation and booting, see wait() .

Parameters:

  • image_id - the id of the image to instantiate
  • instance_type - the type of instance. This is infrastructure specific, but usually follows the Amazon EC2 model with m1.small, m1.large, and so on.
  • count - number of instances to create
  • tags - these are used to manipulate the instance later. Use this to create logical groups of your instances.
wait(tags=[], timeout=600)

Barrier for all instances matching the tags argument. This method will block until the instances have finished booting and are accessible via their external hostnames.

Parameters:

  • tags - tags specifying the subset of instances to block on. The default value is [] , which means wait for all instances.
  • timeout - timeout in seconds for the instances to boot. If the timeout is reached, an ExperimentException is raised. The default is 600 seconds.
deprovision(tags)

Deprovisions (terminates) the instances matching the tags argument.

Parameters:

  • tags - tags specifying the subset of instances to deprovision.
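
The provision, wait, and deprovision calls described above fit together as a simple lifecycle. The following is a minimal sketch rather than a listing from the Precip distribution: the helper name run_experiment, the 'test' tag, and the image id are hypothetical, and a stand-in object from the Python standard library takes the place of a real experiment object so that the sketch runs without cloud credentials.

```python
from unittest.mock import MagicMock


def run_experiment(exp, image_id, work):
    """Provision one instance, wait for it, do the work, always clean up."""
    exp.provision(image_id, instance_type="m1.small", count=1, tags=["test"])
    try:
        # provision() returns immediately; wait() blocks until boot completes
        exp.wait(tags=["test"], timeout=600)
        return work(exp)
    finally:
        # Terminate the instances even if wait() or the work step raised
        exp.deprovision(["test"])


# Dry run against a stand-in object; with Precip installed, pass a real
# experiment object instead. The image id below is a placeholder.
exp = MagicMock()
result = run_experiment(exp, "ami-00000000", lambda e: "done")
```

Placing deprovision() in a finally clause means that a boot timeout or a failed command does not leave instances running.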
list(tags)

Returns a list of details about the instances matching the tags. The details include instance id, hostnames, and tags.

Parameters:

  • tags - tags specifying the subset of instances to give information on. If you want details on all current instances, use [].

Returns:

  • List of dictionaries, one for each instance.
get_public_hostnames(tags)

Provides a list of public hostnames for the instances matching the tags. The public hostnames can be provided to other instances in order to let the instances know about each other.

Parameters:

  • tags - tags specifying the subset of instances.

Returns:

  • A list of public hostnames
get_private_hostnames(tags)

Provides a list of private hostnames for the instances matching the tags. The private hostnames can be provided to other instances in order to let the instances know about each other.

Parameters:

  • tags - tags specifying the subset of instances.

Returns:

  • A list of private hostnames
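
One way to let instances know about each other is to turn the returned hostname list into a host file. The sketch below assumes only what is stated above, namely that get_private_hostnames(tags) returns a list of hostnames; the helper name hostfile_text, the 'worker' tag, and the addresses are made up for illustration, and a stand-in object replaces the real experiment.

```python
from unittest.mock import MagicMock


def hostfile_text(exp, tags):
    """Return the private hostnames of matching instances, one per line."""
    hosts = exp.get_private_hostnames(tags)
    return "\n".join(hosts) + "\n"


# Dry run with a stand-in object and made-up hostnames.
exp = MagicMock()
exp.get_private_hostnames.return_value = ["10.0.0.5", "10.0.0.6"]
text = hostfile_text(exp, ["worker"])
```

The resulting text could be written to a local file and then distributed to the instances.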
get(tags, remote_path, local_path, user="root")

Transfers a file from a set of remote machines matching the tags, and stores the file locally. If more than one instance matches the tags, an instance id will be appended to the local_path.

Parameters:

  • tags - tags specifying the subset of instances to transfer the file from.
  • remote_path - the path of the file on the remote instance
  • local_path - the local path to transfer to
  • user - remote user. If not specified, the default is ‘root’
put(tags, local_path, remote_path, user="root")

Transfers a local file to a set of remote machines matching the tags.

Parameters:

  • tags - tags specifying the subset of instances to transfer the file to.
  • local_path - the local path to transfer from
  • remote_path - the path on the remote instance to store the file as
  • user - remote user. If not specified, the default is ‘root’
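
Note that put() and get() take their path arguments in opposite order. A minimal sketch of staging an input file and collecting a result, using the signatures above; the helper name stage_and_collect, the 'worker' tag, and all file paths are hypothetical, and a stand-in object replaces the real experiment.

```python
from unittest.mock import MagicMock


def stage_and_collect(exp, tags, local_in, remote_in, remote_out, local_out):
    """Upload an input file to matching instances, then download a result file."""
    # put() takes the local path first, then the remote path
    exp.put(tags, local_in, remote_in, user="root")
    # ... the experiment itself would run here, for example via run() ...
    # get() takes the remote path first, then the local path
    exp.get(tags, remote_out, local_out, user="root")


# Dry run against a stand-in object with made-up paths.
exp = MagicMock()
stage_and_collect(exp, ["worker"], "input.dat", "/tmp/input.dat",
                  "/tmp/result.dat", "result.dat")
```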
run(tags, cmd, user="root", check_exit_code=True)

Runs a command on the instances matching the tags. The commands are run in series, on one instance after the other.

Parameters:

  • tags - tags specifying the subset of instances to run the command on.
  • cmd - the command to run
  • user - remote user. If not specified, the default is ‘root’. If you need to run commands as another user, you will have to make sure that user accepts the ssh key in ~/.precip/ .
  • check_exit_code - If set to True (default), commands returning non-zero exit codes will result in an ExperimentException being raised.

Returns:

  • A list of lists, containing exit_code[], stdout[] and stderr[] for the commands run
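
The return value can be regrouped per instance. The sketch below assumes that the result unpacks as three parallel lists (exit codes, stdout, stderr) with one entry per instance, which is one reading of the description above; the helper name summarize and the sample data are made up.

```python
def summarize(result):
    """Pair up exit codes, stdout and stderr, one record per instance.

    Assumes result holds three parallel lists, one entry per instance.
    """
    exit_codes, outs, errs = result
    return [
        {"exit_code": code, "stdout": out, "stderr": err}
        for code, out, err in zip(exit_codes, outs, errs)
    ]


# Made-up return value shaped like the description above.
sample = [[0, 1], ["ok\n", ""], ["", "command not found\n"]]
rows = summarize(sample)
failed = [row for row in rows if row["exit_code"] != 0]
```

Inspecting failures this way only makes sense when run() is called with check_exit_code=False; with the default, a non-zero exit code raises an ExperimentException instead.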

copy_and_run(tags, local_script, args=[], user="root", check_exit_code=True)

Copies a script from the local machine to the remote instances and executes the script. The script is run in series, on one instance after the other.

Parameters:

  • tags - tags specifying the subset of instances to run the script on.
  • local_script - the local script to run
  • args - arguments for the script
  • user - remote user. If not specified, the default is ‘root’. If you need to run commands as another user, you will have to make sure that user accepts the ssh key in ~/.precip/ .
  • check_exit_code - If set to True (default), commands returning non-zero exit codes will result in an ExperimentException being raised.

Returns:

  • A list of lists, containing exit_code[], stdout[] and stderr[] for the commands run

The basic methods above are the same across all the cloud infrastructures. What differs is the constructors, as each infrastructure handles initialization slightly differently. For example, a new OpenStack experiment can be created using the EC2_* environment provided automatically by FutureGrid.
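
A minimal sketch of collecting those settings from the environment follows. The variable names EC2_URL, EC2_ACCESS_KEY, and EC2_SECRET_KEY, the helper name openstack_args, and the class name OpenStackExperiment with its argument order are assumptions that this section does not confirm; check them against the installed Precip version.

```python
import os


def openstack_args(env=None):
    """Collect endpoint and credentials from EC2_* environment variables.

    The variable names are assumptions, not taken from the Precip manual.
    """
    env = os.environ if env is None else env
    names = ("EC2_URL", "EC2_ACCESS_KEY", "EC2_SECRET_KEY")
    missing = [name for name in names if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


# With Precip installed, the constructor call might look like this
# (class name and argument order are assumptions):
#   from precip import OpenStackExperiment
#   exp = OpenStackExperiment(*openstack_args())
```

Failing early with the list of missing variables is easier to diagnose than an authentication error from the cloud endpoint.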

For Amazon EC2, you have to specify the region, endpoint, and access/secret keys. Note that you are not required to use environment variables for your credentials, but separating the credentials from the code prevents them from being checked in to source control systems.

7.2.3. Examples

7.2.3.3. Setting up a Condor pool and running a Pegasus workflow

This is a more complex example in which a small Condor pool is set up and then a Pegasus workflow is run and benchmarked. The Precip script is similar to what we have seen before, but it has two groups of instances: one master, acting as the Condor central manager, and a set of Condor worker nodes.
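
The two-group setup can be sketched with the API methods from the previous section. This is an illustrative outline under stated assumptions, not the original example script: the helper name setup_pool, the 'master' and 'worker' tag names, the image id, and the idea of passing the master's private hostname to the bootstrap script as its first argument are all hypothetical. A stand-in object replaces the real experiment so the sketch runs without a cloud.

```python
from unittest.mock import MagicMock


def setup_pool(exp, image_id, workers, bootstrap="./bootstrap.sh"):
    """Start one master and several workers, then bootstrap both groups."""
    exp.provision(image_id, count=1, tags=["master"])
    exp.provision(image_id, count=workers, tags=["worker"])
    exp.wait(tags=[], timeout=600)  # [] means wait for all instances

    # Assumption: the bootstrap script learns where the central manager is
    # from the master's private hostname, passed as its first argument.
    master = exp.get_private_hostnames(["master"])[0]
    exp.copy_and_run(["master"], bootstrap, args=[master])
    exp.copy_and_run(["worker"], bootstrap, args=[master])
    return master


# Dry run against a stand-in object with a made-up hostname and image id.
exp = MagicMock()
exp.get_private_hostnames.return_value = ["master.internal"]
master = setup_pool(exp, "ami-00000000", workers=2)
```

Once the pool is up, the workflow itself would be submitted on the master with run(), and the instances released with deprovision().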

We also need a bootstrap.sh which sets up the instances:
