The Grid 2: Blueprint for a New Computing Infrastructure (2nd edition)

  • Malcolm Atkinson
  • Ann L. Chervenak
  • Peter Kunszt
  • Inderpal Narang
  • Norman W. Paton
  • Dave Pearson
  • Arie Shoshani
  • Paul Watson

Chapter: Scientific Data Federation, in The Grid 2: Blueprint for a New Computing Infrastructure (2nd edition)

Published by Morgan Kaufmann | 2003

Astronomy is a wonderful Grid application because datasets are inherently distributed and yet form a fairly uniform corpus. In particular:

  • The astronomy community has a fairly unified taxonomy, vocabulary, and codified definition of metrics and units [24].
  • Modern data is carefully peer reviewed and collected with rigorous statistical and scientific standards.
  • Data provenance is tracked, and derived data sets are curated fairly carefully.
  • Most data is publicly available and will remain available for the foreseeable future.
  • Even though old data is much less precise than current data, old data is essential when studying time-varying phenomena.
  • Each astronomy archive covers part of the electromagnetic spectrum for a period of time and a subset of the celestial sphere.
  • All the archives are from the same sky and the same celestial objects, although different observations are made at different times.

Increasingly, astronomers perform multispectral studies or temporal studies combining data related to the same objects from multiple instruments and archives. Cross-comparison is possible because data are well documented and schematized with a common reference frame, and have clear provenance.