Monday, March 18, 2013

Playing around with remote software and data access

Playing around with SkeletonKey, Parrot, and Chirp

One of the projects I've been working on is called SkeletonKey.  SkeletonKey is a tool that lets people create scripts that run their applications in an environment with remote software and data access.  As an example, suppose you're interested in analyzing temperature data from the last 100 years and generating some graphics from it.  You could download all of the data to your computer and then run a program to go through all of the text.  After waiting a while, you'd get your results back.

Or, if you had access to a cluster, you could split the task up, submit it to a few hundred cores, and get the results back much more quickly than you would by running the application on your personal computer.  The only problem is that you may have terabytes of data, and your application may be a few gigabytes in size.  You'd rather not have to transfer all of this over to the cluster and then convince the administrators to install your application so that you can use it.
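To make the "split the task up" part concrete, here's a minimal sketch of dividing a range of years into roughly equal chunks, one per job.  The function name and the specific years are just assumptions for illustration; how you'd actually submit each chunk depends on your cluster.

```python
def split_years(start, end, n_jobs):
    """Split the inclusive range [start, end] into contiguous chunks.

    Returns a list of (first_year, last_year) tuples, at most n_jobs long.
    Any leftover years are spread over the first few chunks.
    """
    total = end - start + 1
    n_jobs = min(n_jobs, total)          # never create empty chunks
    base, extra = divmod(total, n_jobs)
    chunks = []
    first = start
    for i in range(n_jobs):
        size = base + (1 if i < extra else 0)
        chunks.append((first, first + size - 1))
        first += size
    return chunks


if __name__ == "__main__":
    # Hypothetical example: 100 years of data spread over 4 jobs.
    for first, last in split_years(1913, 2012, 4):
        print(first, last)
```

Each tuple could then become the argument list for one job on the cluster.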

That's where Parrot and Chirp come in.  Chirp allows you to export data from your system to remote computers over the internet.  Parrot lets you run your application in an environment that intercepts local file access and transparently turns it into remote network access.  This is all done in user space, so you can even use Parrot to run a shell script that then runs an application that's actually located elsewhere.  So if you can run a shell script and access the web, you can run an application and get read/write access to data from remote sources without having to install a bunch of libraries and binaries or transfer large amounts of data before you can do your work.
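As a rough illustration of what that looks like from the application's side: under Parrot, data on a Chirp server shows up as an ordinary-looking path under /chirp/.  The sketch below just builds such a path and the command line that would run a program under parrot_run.  The host name, file names, and helper functions are all made up for the example; check the Parrot and Chirp documentation for the details of your setup.

```python
import shlex


def chirp_path(host, remote_path, port=None):
    """Build a /chirp/<server>/<path> name of the kind Parrot resolves
    against a Chirp server.  host and remote_path are placeholders."""
    server = f"{host}:{port}" if port else host
    return f"/chirp/{server}/{remote_path.lstrip('/')}"


def parrot_command(program, *args):
    """Wrap a command so that it would run under parrot_run."""
    return ["parrot_run", program, *args]


# Hypothetical server and script names, for illustration only.
data_file = chirp_path("data.example.org", "/temps/1913.txt")
cmd = parrot_command("python", "analyze.py", data_file)
print(shlex.join(cmd))
```

The point is that analyze.py just opens what looks like a local file; it doesn't need to know anything about the network.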

SkeletonKey works with Parrot and Chirp to generate a shell script that will do all the legwork for you.  That is, you give SkeletonKey a simple ini file containing your configuration, and it'll generate a shell script that downloads your application and the appropriate Parrot binaries and then runs your application in a Parrot environment with all of the software and data access you may want.
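To give a feel for the idea (this is not SkeletonKey's actual code or config format), here's a toy sketch that reads an ini file and writes out a shell script along those lines.  The section names, keys, URLs, and the layout of the unpacked Parrot tarball are all assumptions made up for the example.

```python
import configparser
import shlex

# Hypothetical configuration; the real SkeletonKey ini format may differ.
EXAMPLE_INI = """
[Application]
url = http://example.org/myapp.tar.gz
command = ./myapp/run.sh

[Parrot]
url = http://example.org/parrot.tar.gz
"""


def generate_script(ini_text):
    """Turn the ini text into a shell script that downloads Parrot and the
    application, unpacks both, and runs the application under parrot_run."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    app = config["Application"]
    parrot = config["Parrot"]
    lines = [
        "#!/bin/sh",
        "set -e",
        f"curl -sSfL -o parrot.tar.gz {shlex.quote(parrot['url'])}",
        "tar xzf parrot.tar.gz",
        f"curl -sSfL -o app.tar.gz {shlex.quote(app['url'])}",
        "tar xzf app.tar.gz",
        # Assumes the Parrot tarball unpacks to ./parrot/bin/parrot_run.
        f"./parrot/bin/parrot_run {app['command']}",
    ]
    return "\n".join(lines) + "\n"


script = generate_script(EXAMPLE_INI)
print(script)
```

The generated script is the only thing that has to be shipped to the remote machine; everything else gets pulled in when it runs.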

I have more information on how well this performs compared to accessing your files or data locally, but that'll need to wait until the next entry.