Hasham, K., Delgado Peris, A., Anjum, A., Evans, D., Gowdy, S., Hernandez, J. M., Huedo, E., Hufnagel, D., van Lingen, F. and McClatchey, R.
CMS workflow execution using intelligent job scheduling and data access strategies.
IEEE Transactions on Nuclear Science, 58 (3).
Accepted Version
Publisher's URL: http://dx.doi.org/10.1109/TNS.2011.2146276
Complex scientific workflows can process large amounts of data using thousands of tasks. The turnaround times of these workflows are often affected by various latencies, such as the resource discovery, scheduling and data access latencies of the individual workflow processes or actors. Minimizing these latencies improves the overall execution time of a workflow and thus leads to a more efficient and robust analysis environment. In this paper, we discuss an example of a pilot-job-based infrastructure (as used in the CMS Tier0 analysis workflow at CERN), together with intelligent data reuse and job execution strategies that minimize its scheduling, queuing, execution and data access latencies. These strategies have helped to achieve significant gains in the overall turnaround time of the workflow.
|Additional Information:||© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.|
|Uncontrolled Keywords:||data cache, grid, latency, pilot jobs, workflow|
|Faculty/Department:||Faculty of Environment and Technology > Department of Computer Science and Creative Technologies|
Professor R. McClatchey
|Deposited On:||15 Nov 2010 14:42|
|Last Modified:||13 Aug 2013 10:03|