
MeDICi: Middleware for Data-Intensive Computing

The MeDICi Integration Framework (MeDICi, or MIF, for short) is a middleware platform for building complex, high-performance analytical applications. These applications typically comprise a pipeline of software components, each of which performs some analysis on incoming data and passes its results to the next step in the pipeline. Pipelines are commonplace in many application domains, including science, engineering and business, so the applicability of MeDICi is broad. We've used MeDICi to build pipelines that handle network traffic for cybersecurity in near real time, and pipelines that move large data sets between scientific instruments and petascale data archives.
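To make the pipeline idea concrete, here's a minimal sketch in plain Java. It does not use the MIF API; the class name `SimplePipeline` and the three toy stages are assumptions made up purely for illustration. Each stage analyzes its input and hands the result to the next stage.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; this is NOT the MIF API.
final class SimplePipeline {

    // Chains the stages so that each stage's output becomes the next stage's input.
    static <T> Function<T, T> chain(List<Function<T, T>> stages) {
        Function<T, T> pipeline = Function.identity();
        for (Function<T, T> stage : stages) {
            pipeline = pipeline.andThen(stage);
        }
        return pipeline;
    }

    // Runs one input through three toy stages: trim, upper-case, tag.
    static String demo(String input) {
        List<Function<String, String>> stages = List.of(
                String::trim,
                String::toUpperCase,
                s -> s + "!");
        return chain(stages).apply(input);
    }

    public static void main(String[] args) {
        System.out.println(demo("  packet data ")); // prints "PACKET DATA!"
    }
}
```

In a real deployment the stages would typically be separate programs, possibly on different machines, rather than functions in one process.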

The core MeDICi API is Java-based, and at run time it uses Mule as a lightweight, fast and reliable container. In addition, MeDICi is designed specifically to address two difficult aspects of building pipeline-based applications, namely:

  1. Pipeline creation. The software components in a pipeline have usually been created by different programmers using various programming languages, and each may have particular hardware platform dependencies. MeDICi provides features that make it easier to create pipelines from heterogeneous, distributed components. It also supports buffering of data between components and executing multiple concurrent instances of a component to improve performance.
  2. Handling large data. Passing large data sets between pipeline components can kill application performance. MeDICi provides features that give pipeline designers choices on how to pass data in pipelines to maximize application performance.
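The buffering and concurrent-instance ideas from the first item can be sketched with the standard `java.util.concurrent` library. This is not how MIF implements them, and it does not use the MIF API; the class `BufferedStageSketch`, the buffer capacity of 4, and the sentinel value are all assumptions chosen for illustration. A bounded queue sits between a producer and several identical worker instances of one component.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; this is NOT the MIF API.
final class BufferedStageSketch {

    // Sentinel telling a worker that no more data will arrive (assumed never to be real input).
    private static final int END_OF_STREAM = Integer.MIN_VALUE;

    // Feeds inputs through a bounded buffer to `workers` concurrent instances
    // of a "square it" component, then adds up the partial results.
    static int sumOfSquares(List<Integer> inputs, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("need at least one worker");
        }
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(4); // buffer between components
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> partials = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                partials.add(pool.submit(() -> {
                    int sum = 0;
                    // take() blocks while the buffer is empty.
                    for (int v = buffer.take(); v != END_OF_STREAM; v = buffer.take()) {
                        sum += v * v;
                    }
                    return sum;
                }));
            }
            for (int v : inputs) {
                buffer.put(v); // put() blocks while the buffer is full
            }
            for (int i = 0; i < workers; i++) {
                buffer.put(END_OF_STREAM); // one sentinel per worker
            }
            int total = 0;
            for (Future<Integer> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("pipeline stage failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(List.of(1, 2, 3, 4), 2)); // prints 30
    }
}
```

The bounded queue means a fast producer is slowed down rather than allowed to exhaust memory, and adding workers lets a slow component keep up without changing the code around it.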

Here's a simple depiction of a pipeline in MeDICi.

[Figure: MeDICi Integration Framework Overview]

The MeDICi Integration Framework (MIF) has been used in several applications, and is open source and available for download. This wiki contains documentation that describes the overall architecture of MIF and how to use the MIF API to build application pipelines. There's also a list of papers that we've published.

We're also starting to analyze the performance of the MIF container. See here for MIF Benchmark Test Details. This is a work in progress.

MeDICi is funded by Pacific Northwest National Laboratory's Data Intensive Computing Initiative. See here for details on the MeDICi people.

index.html.txt · Last modified: 2010/09/21 13:41 by iango