LONI Pipeline Processing Environment

The LONI Pipeline is a free workflow application aimed primarily at neuroimaging researchers. With the LONI Pipeline, users can quickly create workflows that take advantage of the best neuroimaging tools available.

Features

If there are tools that you need that can only work on another operating system, you can install a Pipeline server on that computer and connect from your client to do processing and analysis remotely.
If you don't want to define your own modules, you can connect to the LONI Pipeline server and take advantage of all the modules we have defined for use in our lab. This way, you also get the benefit of all your processing being done on our 600 CPU computing grid.
If you have a grid at your disposal, the Pipeline can exploit parallelism in your workflow and process large datasets in about the time it takes to process a single one.
Putting together workflows only requires knowledge of your tools and your goal, instead of programming languages and scripts.

Description

The LONI Pipeline is a simple graphical environment for constructing complex scientific analyses of data. It provides a visually intuitive interface to data analysis while also allowing for diverse programs to interact seamlessly. The Pipeline allows researchers to share their methods of analysis with each other easily and provides a simple platform for distributing new programs, as well as program updates, to the desired community. The environment also takes advantage of supercomputing environments by automatically parallelizing data-independent programs in a given analysis whenever possible. Finally, the LONI Pipeline can run in a client-server mode, allowing access to compute servers running analysis software that benefits from a dedicated machine with vast computational resources.

Visit the Pipeline website.

System Requirements

Windows

Version: 6.3
Size: 27 MB
OS: NA
Processor: Not processor-specific
Memory: Less than 64 MB for client - 100 MB + X (X denotes variables like client load, etc.)
Software: Java 1.5 or higher

Linux

Version: 6.3
Size: 26 MB
OS: NA
Processor: Not processor-specific
Memory: Less than 64 MB for client - 100 MB + X (X denotes variables like client load, etc.)
Software: Java 1.5 or higher

Mac

Version: 6.3
Size: 27 MB
OS: NA
Processor: Not processor-specific
Memory: Less than 64 MB for client - 100 MB + X (X denotes variables like client load, etc.)
Software: Java 1.5 or higher

Installation

Setting up a Client:
-------------------
Download the jar file from the Downloads page. Make sure Java 1.5 or above is installed on your local system. Launch it using the following command ::

java -jar Pipeline.jar

Setting up a Server:
-------------------
Setting up a Pipeline server
- Introduction
- Installation
  - Requirements
  - Downloading
  - Starting the server
- Configuration
  - Hostname
  - Temp file location
  - Port number
  - Maximum Simultaneous Jobs
  - Server Library
- Authentication
- Adding module definitions

Introduction

Setting up your own Pipeline server is a great way to remotely take advantage of the power of a cluster or just a dedicated computer with many helpful programs installed on it. More importantly, you can enable many people to take advantage of all this power through the easy-to-use interface of the Pipeline client.
Installation
- Requirements
  
  The Pipeline server can run on any system that is supported by JDK 1.5 or higher, so the first thing to do is head over to Sun and download the latest JRE/JDK. If you run the server on Windows, you will not be able to use privilege escalation (you might not even need/want it), but all other features are available.
  
  The amount of memory required varies based on the load you will expect on the server, but for a reference point, the Pipeline server running on pipelinev4.loni.ucla.edu has been set to accept a max load of 620 jobs, and its memory footprint hovers between 50-300MB depending on the load and garbage collection scheme.
- Downloading
  
  Head over to the Pipeline download page and download the latest version of the program. The server and the client are both in the same jar file, so you only need to change the Main entry point when starting up the server. Extract the contents of the download to the location you want to install the server at.
- Starting the server
  
  Now let's start the server for the first time. Get to a prompt and switch to the directory where you copied the Pipeline.jar and lib directory and type:
  $ java -classpath Pipeline.jar server.Main
  
  Assuming you have java in your path, you should have received the following message back in your terminal window:
  Server Started on port 8000
  
  That's not enough to have a fully functional server yet, but we're a step closer, so go ahead and break out of the process by hitting Ctrl-C.
Configuration

Next, we need to set up our preferences file. When you ran the server for the first time, it should have created a directory where all your preferences and logs will be stored. Depending on your operating system, you can find this directory in one of the following locations:
Linux/Unix - $HOME/.pipeline/
OS X - $HOME/Library/Preferences/Pipeline/
Windows - %HOME%\Application Data\LONI\pipeline\
Windows Vista - %HOME%\AppData\LONI\Pipeline\

Open up your favorite text editor, and paste in the following sample preferences file:

pipelinev4.loni.ucla.edu
/ifs/tmp/

Save the file out as "preferences.xml"
- Hostname
  
  This element specifies the hostname of the computer that you want the server to run on. Despite its name, it requires the fully qualified domain name of the computer, not just the hostname. For example, "mycomputer" would be a hostname, whereas "mycomputer.labname.university.edu" would be a fully qualified domain name.
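
To check which name a machine reports for itself before writing it into the preferences, something like the following sketch may help. The class name `FqdnCheck` and the helper `looksFullyQualified` are hypothetical and not part of the Pipeline. The helper is only a rough heuristic (at least two non-empty dot-separated labels), and `getCanonicalHostName()` may return a bare IP address when the name cannot be resolved.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class FqdnCheck {

    /**
     * Rough heuristic (an assumption, not a Pipeline rule): a fully qualified
     * name has at least two non-empty labels separated by dots.
     * Note that a dotted IP address also passes this check.
     */
    static boolean looksFullyQualified(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // Tolerate a single trailing dot, as in "host.example.edu."
        String trimmed = name.endsWith(".") ? name.substring(0, name.length() - 1) : name;
        if (!trimmed.contains(".")) {
            return false;
        }
        for (String label : trimmed.split("\\.", -1)) {
            if (label.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Ask the JVM for the canonical name of the local machine.
        String canonical = InetAddress.getLocalHost().getCanonicalHostName();
        System.out.println(canonical + " looks fully qualified: " + looksFullyQualified(canonical));
    }
}
```

If the printed name has no domain part, the machine's DNS or hosts configuration probably needs attention before the name is usable in the preferences.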
- Temp file location
  
  This element specifies where all intermediate files for all the executed programs are stored on the computer. The Pipeline server will create a directory structure under that location that looks like:
  
  /username/timestamp/
  
  Where username is the user that is running the server and timestamp is the time at which each workflow gets translated before execution. Inside each of those 'timestamp' folders will be all the intermediate files produced by executables from submitted workflows. Depending on the number of users using your server and the kind of work they do, this directory can balloon up very quickly.
- Port number
  
  If no port number is specified in the preferences, the server will attempt to listen on port 8000. If you want to change the port number, set the corresponding element in your preferences.xml file:
  
  pipelinev4.loni.ucla.edu
  8001
  /ifs/tmp/
- Maximum Simultaneous Jobs
  
  As your server becomes busier, users will at times submit more jobs at once than your server has the capacity to handle. To prevent your system or cluster from grinding to a halt, you can set the maximum number of simultaneous jobs in the preferences. By default, the Pipeline server sets this value to the number of cores/CPUs available in your computer. For example, a computer with two quad-core processors will have a maximum of 8 simultaneous jobs. If you want to change this (for example, because you have a DRMAA-enabled grid available), you can set this preference to any value you want.
  
  pipelinev4.loni.ucla.edu
  8001
  /ifs/tmp/
  620
  
  Note that this will not reject jobs submitted by users after the limit has been reached. It will just queue them until a slot becomes available for execution. For grid setups, you should probably set the limit a little higher than the number of compute nodes available to you, because submitting through DRMAA takes a non-negligible amount of time, and it's best to keep your compute nodes crunching at all times.
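
The queue-rather-than-reject behavior described above can be illustrated with a small, self-contained sketch. This is not the Pipeline's actual scheduler. It assumes a plain fixed-size thread pool from `java.util.concurrent`, and the class name `JobLimitSketch` and its methods are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class JobLimitSketch {

    /** Default described in the text: one job slot per available core/CPU. */
    static int defaultMaxJobs() {
        return Runtime.getRuntime().availableProcessors();
    }

    /**
     * Submits jobCount dummy jobs to a pool with maxJobs slots.
     * Returns {peak number of jobs running at once, number of jobs completed}.
     */
    static int[] runJobs(int maxJobs, int jobCount) {
        // A fixed pool queues extra submissions instead of rejecting them.
        ExecutorService pool = Executors.newFixedThreadPool(maxJobs);
        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        final AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < jobCount; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // stand-in for a real executable
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { peak.get(), completed.get() };
    }

    public static void main(String[] args) {
        System.out.println("Default limit on this machine: " + defaultMaxJobs());
        int[] result = runJobs(2, 10);
        System.out.println("Peak concurrency: " + result[0] + ", completed: " + result[1]);
    }
}
```

With a limit of 2 and 10 submitted jobs, every job eventually runs, but never more than 2 at the same time.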
- Server Library
  
  When Pipeline client users connect to a server, the client syncs up the library of module definitions available on that server. The location of that library on the server is specified by the ServerLibraryLocation element in the preferences. By default, the location is set to one of the following locations (based on OS), so you don't need to specify this preference if you're happy with it:
  Linux/Unix - $HOME/documents/Pipeline/ServerLib/
  OS X & Windows Vista - $HOME/Documents/Pipeline/ServerLib
  Windows - %HOME%\Application Data\LONI\pipeline\
  When the server starts up, it reads in all the .pipe files in the ServerLibraryLocation directory (and all its subdirectories) and monitors it for changes/additions in any of the files while it runs. Simply put all the module definitions that you want to make available to users into this directory, and when clients connect they will obtain a copy of the library on their local system. If you add/delete/change any of the definitions in this directory, the server will automatically see the change (no restart required) and synchronize clients again when they reconnect.
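
As an illustration of the startup scan described above, the following sketch collects every .pipe file under a library directory, including subdirectories. It is not the server's own code and covers only the initial scan, not the ongoing monitoring. It assumes `java.nio.file` (Java 7 or later, which is newer than the Java 1.5 baseline mentioned earlier). The class name `ServerLibraryScan` and the file names used in the demo are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ServerLibraryScan {

    /** Returns all regular files ending in ".pipe" under libraryRoot, recursively. */
    static List<Path> findPipeFiles(Path libraryRoot) {
        try (Stream<Path> paths = Files.walk(libraryRoot)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".pipe"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a throwaway library with two module definitions and one unrelated file. */
    static int demoCount() {
        try {
            Path root = Files.createTempDirectory("ServerLib");
            Path sub = Files.createDirectories(root.resolve("registration"));
            Files.write(root.resolve("skullstrip.pipe"), new byte[0]); // hypothetical module
            Files.write(sub.resolve("align.pipe"), new byte[0]);       // hypothetical module
            Files.write(root.resolve("notes.txt"), new byte[0]);       // ignored: not a .pipe file
            return findPipeFiles(root).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Module definitions found: " + demoCount());
    }
}
```

A similar listing run against your own library directory is a quick way to confirm which definitions clients should receive when they connect.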
Authentication

Now that you've configured everything, it's time to set up authentication so you can actually let users into your server. The Pipeline authenticates users using the Java Authentication and Authorization Service (JAAS), which allows the server operator to authenticate usernames and passwords against any type of system they want. When a user connects to the server, the Pipeline tries to create a new LoginContext and, if the creation is successful, calls its login() method. If the login succeeds, the user is allowed to continue; otherwise the user is disconnected from the server with an "Authentication Failed" message.

In order for the Pipeline server to successfully create a LoginContext, we need to write a little code in Java that handles the authentication scheme. This essentially boils down to 1) implementing the LoginModule interface, 2) packaging the class into a jar file, and 3) making sure its contents are available in the classpath of the server when you launch it. For steps 1 and 2, I will redirect you to the excellent documentation provided by Sun on how to complete those tasks.
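
For orientation, step 1 might look roughly like the sketch below. It is a minimal example, not the LONI module. It assumes the server supplies a CallbackHandler that answers the standard NameCallback and PasswordCallback, which the text above does not confirm. The class name `ExampleLoginModule` and the hard-coded "demo"/"secret" credentials are placeholders for a real lookup against your own user store.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.FailedLoginException;
import javax.security.auth.login.LoginException;
import javax.security.auth.spi.LoginModule;

public class ExampleLoginModule implements LoginModule {
    private CallbackHandler handler;
    private boolean succeeded;

    @Override
    public void initialize(Subject subject, CallbackHandler callbackHandler,
                           Map<String, ?> sharedState, Map<String, ?> options) {
        this.handler = callbackHandler;
    }

    @Override
    public boolean login() throws LoginException {
        if (handler == null) {
            throw new LoginException("No CallbackHandler available");
        }
        NameCallback name = new NameCallback("username: ");
        PasswordCallback password = new PasswordCallback("password: ", false);
        try {
            // Ask the caller (assumed here to be the server) for the credentials.
            handler.handle(new Callback[] { name, password });
        } catch (IOException | UnsupportedCallbackException e) {
            LoginException failure = new LoginException("Could not collect credentials");
            failure.initCause(e);
            throw failure;
        }
        char[] supplied = password.getPassword(); // returns a copy
        password.clearPassword();
        succeeded = checkCredentials(name.getName(), supplied);
        if (!succeeded) {
            throw new FailedLoginException("Authentication Failed");
        }
        return true;
    }

    /** Placeholder check; replace with a lookup against LDAP, a database, etc. */
    static boolean checkCredentials(String user, char[] password) {
        return "demo".equals(user)
                && password != null
                && Arrays.equals("secret".toCharArray(), password);
    }

    @Override
    public boolean commit() { return succeeded; }

    @Override
    public boolean abort() { succeeded = false; return true; }

    @Override
    public boolean logout() { succeeded = false; return true; }
}
```

Keeping the credential check in its own method makes it easy to swap in a real backend without touching the JAAS plumbing.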

Once you've got your jar file, you need to create a configuration file to reference the LoginModule inside your jar file. So fire up your favorite text editor and type the following:
/** Login Configuration for the Pipeline **/

PipelineLogin { edu.ucla.loni.pipeline.security.LONILoginModule required debug=true; };

In your configuration file, you should replace "edu.ucla.loni.pipeline.security.LONILoginModule" with the fully qualified class name of the LoginModule you implemented. Now save the file as pipeline_security.config in the same directory where you placed the Pipeline.jar file and start up the server.
$ java -Djava.security.auth.login.config=pipeline_security.config -classpath Pipeline.jar server.Main

As you can see, we're setting the system property java.security.auth.login.config to pipeline_security.config, so when the Pipeline tries to create a LoginContext, JAAS will check this property for a filename, go into the file and read in the class name of the LoginModule specified. Using reflection, it'll load the class and return it to the Pipeline.
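
The caller's side of this flow can be sketched as follows. This is an illustration of the standard JAAS calls, not the Pipeline's source. To stay self-contained it supplies the configuration programmatically instead of reading pipeline_security.config, and it uses a trivial stand-in module. The names `LoginContextSketch`, `DemoModule`, and `authenticate`, and the "demo"/"secret" credentials, are all hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.FailedLoginException;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;
import javax.security.auth.spi.LoginModule;

public class LoginContextSketch {

    /** Trivial stand-in module; a real one would consult an actual user store. */
    public static class DemoModule implements LoginModule {
        private CallbackHandler handler;

        public void initialize(Subject subject, CallbackHandler callbackHandler,
                               Map<String, ?> sharedState, Map<String, ?> options) {
            handler = callbackHandler;
        }

        public boolean login() throws LoginException {
            NameCallback name = new NameCallback("username: ");
            PasswordCallback pw = new PasswordCallback("password: ", false);
            try {
                handler.handle(new Callback[] { name, pw });
            } catch (Exception e) {
                throw new LoginException(e.toString());
            }
            char[] supplied = pw.getPassword();
            if ("demo".equals(name.getName()) && supplied != null
                    && "secret".equals(new String(supplied))) {
                return true;
            }
            throw new FailedLoginException("Authentication Failed");
        }

        public boolean commit() { return true; }
        public boolean abort() { return true; }
        public boolean logout() { return true; }
    }

    /** Creates a LoginContext for "PipelineLogin" and reports whether login() succeeded. */
    static boolean authenticate(final String user, final char[] password) {
        // Programmatic equivalent of a one-entry login configuration file.
        Configuration config = new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] {
                    new AppConfigurationEntry(DemoModule.class.getName(),
                            AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                            Collections.<String, Object>emptyMap())
                };
            }
        };
        // Hands the submitted credentials to whichever module asks for them.
        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(user);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(password);
                } else {
                    throw new UnsupportedCallbackException(cb);
                }
            }
        };
        try {
            LoginContext context = new LoginContext("PipelineLogin", new Subject(), handler, config);
            context.login(); // throws LoginException on failure
            return true;
        } catch (LoginException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("demo/secret accepted: " + authenticate("demo", "secret".toCharArray()));
        System.out.println("demo/nope accepted: " + authenticate("demo", "nope".toCharArray()));
    }
}
```

When the configuration comes from the java.security.auth.login.config system property instead, the two-argument `LoginContext(String, CallbackHandler)` constructor is the usual choice, and the entry name must match the one in the configuration file.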

Purpose

A New Way

The LONI Pipeline is a workflow application that allows users to easily describe their executables (i.e., create a module) and connect them together to create complex analyses, all without having to code a single line in a scripting language.

Once you've created a module for use in the LONI Pipeline, you can save it into your personal library and reuse it in other workflows you create by simply dragging and dropping it in.

Cross Platform Compatible
Because the LONI Pipeline is written in Java, you can work in whatever operating system suits you best. If there are tools that you need that can only work on another operating system, you can install a Pipeline server on that computer and connect from your client to do processing and analysis remotely.

If you don't want to define your own modules, you can connect to the LONI Pipeline server and take advantage of all the modules we have defined for use in our lab. When you create a workflow using our modules, you also get the benefit of all your processing being done on our 600 CPU computing grid. All you need to gain access to it is a LONI account.

DRMAA Grid Support

If you have a grid at your disposal, the Pipeline is the perfect tool for taking advantage of all that processing power. The Pipeline can exploit parallelism in your workflow, and make processing of large data sets take about as much time as a single one. All of this without any additional work required by the user to specify how to break up the workflow.

Simple User Interface

With the clean and easy to use interface of the Pipeline, you can free your mind to think about research problems instead of system administration. Putting together workflows only requires knowledge of your tools and your goal, instead of programming languages and scripts. While your workflow is executing you can see exactly what step the Pipeline is currently on, and even see the output of a particular step when it finishes instead of waiting for the entire workflow to finish.