Driving Scientific Applications by Data in Distributed Environments

Joel Saltz1, Umit Catalyurek1, Tahsin Kurc1, Mike Gray1, Shannon Hastings1, Steve Langella1, Sivaramakrishnan Narayanan1, Ryan Martino2, Steven Bryant2, Malgorzata Peszynska2, Mary Wheeler2, Alan Sussman3, Michael Beynon3, Christian Hansen3, Don Stredney4, Dennis Sessanna4

1Department of Biomedical Informatics, The Ohio State University

2Center for Subsurface Modeling, The University of Texas at Austin

3Department of Computer Science, University of Maryland

4Interface Laboratory, The Ohio Supercomputer Center

Abstract. Traditional simulation-based applications that explore a parameter space, whether to understand a physical phenomenon or to optimize a design, are rapidly overwhelmed by data volume when large numbers of simulations with different parameters are carried out. Optimizing reservoir management through simulation-based studies, in which large numbers of realizations are sought using detailed geological descriptions, is one such application. In this paper, we describe a software architecture to facilitate large-scale simulation studies involving ensembles of long-running simulations and the analysis of vast volumes of output data. This architecture is built on top of two frameworks we have developed: IPARS and DataCutter. These frameworks make it possible to implement tools and applications that run large-scale simulations and that generate and investigate terabyte-scale datasets efficiently.

LNCS 2660, pp. 355-364.
