Data Management

Your data is stored on STRW data storage media: the local disk of your workstation, the direct-attached disks of a server, or the bulk storage devices. As part of the data management cycle you need to keep track of your data. At the end of your employment at the Observatory, either copy your data to your new institute or provide the computer group with a list of pointers so that they can store your data in a safe archive.
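A list of pointers can be as simple as one line per data directory with its file count and total size. The following is a minimal sketch of how such a list might be produced. The function names and the output layout are assumptions made for illustration, not a format prescribed by the computer group.

```python
import os
from pathlib import Path


def summarize(root):
    """Return (number of files, total size in bytes) below root."""
    n_files = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # count only real files, not links to data elsewhere
            n_files += 1
            total_bytes += path.stat().st_size
    return n_files, total_bytes


def pointer_list(roots):
    """Build one tab-separated line per directory: path, file count, size."""
    lines = []
    for root in roots:
        n_files, total_bytes = summarize(root)
        lines.append(f"{Path(root).resolve()}\t{n_files} files\t{total_bytes} bytes")
    return "\n".join(lines)
```

Calling pointer_list() with the directories that hold your data returns text that can be pasted into an e-mail or saved to a file.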

Sources of data

There are three main sources of data in the astronomical community at the Observatory:

  • Data from the big international observatories
  • Data generated in the Sackler and Optical labs
  • Data produced by theoretical work and simulations

For these three categories, three different scenarios are in place with regard to data management.

Data from the big international observatories

At the big international observatories data acquisition takes place in a controlled fashion. Data processing is done in the headquarters computing centers, and the data are stored and archived at the storage facilities of those sites. In some cases the data are proprietary for at most one year, after which they become public.

Each international observatory has a portal from which observational data can be downloaded, so the processed 'raw' data are always available to any user. Once data have been downloaded for further processing, it is the user's responsibility to keep track of the data and to preserve their integrity. Most of this functionality is already available in the standard data reduction packages provided by the big observatories. When data are processed with personal code, however, users need to keep track of the data products and the data integrity themselves. The Linux operating system can help through its file and directory ownership and access rights.
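For data processed with personal code, one possible way to track integrity is to record a checksum for every downloaded file and to remove write permission from the originals. The sketch below assumes SHA-256 checksums and a simple in-memory manifest. The function names are hypothetical and this is not a procedure prescribed by the Observatory.

```python
import hashlib
import stat
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map the relative path of each file under root to its digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Return relative paths that are missing or whose digest differs."""
    root = Path(root)
    bad = []
    for name, digest in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != digest:
            bad.append(name)
    return bad


def make_read_only(root):
    """Remove write permission for user, group and others on every file."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in Path(root).rglob("*"):
        if path.is_file():
            path.chmod(path.stat().st_mode & ~write_bits)
```

A typical use would be to call build_manifest() right after downloading, store the result (for example as JSON) next to the data, call make_read_only() on the raw files, and run changed_files() before each new processing run.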

Depending on the processing unit, data can be stored on the local disks of your workstation or on the disks attached to the server that is used for processing. This provides the best data-to-CPU throughput. Data stored on the local workstation should be placed on the /data2 device, as this is always a RAID (1 or 5) type of disk. There are no backups for those disks. Local disks in servers are always of a RAID type (5 through 60). Again, no backups are made for those disks. It is therefore important that you keep the original data processing software safe, as it may be needed to reproduce intermediate data products in case of accidental loss.
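Because intermediate products on these disks are not backed up, it can help to record which script and which inputs produced each product. The following is a minimal sketch, assuming a JSON sidecar file written next to the product. The ".prov.json" suffix, the field names and the function names are illustrative assumptions.

```python
import hashlib
import json
import time
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(product, script, inputs, parameters):
    """Write <product>.prov.json describing how the product was made."""
    product = Path(product)
    record = {
        "product": product.name,
        "created_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        # Digest of the script identifies the exact version that was run.
        "script": {"path": str(script), "sha256": file_digest(script)},
        "inputs": [{"path": str(p), "sha256": file_digest(p)} for p in inputs],
        "parameters": parameters,
    }
    sidecar = product.with_name(product.name + ".prov.json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True))
    return sidecar
```

If a product is lost, the sidecar (kept together with the processing software in a backed-up location) shows which script version, inputs and parameters are needed to regenerate it.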

In general, final data products are published and made available to the community. This can be done in several ways, depending on the size of the data or the type of product. For small data sets the Centre de Données astronomiques de Strasbourg (CDS) is the standard place when publishing in European journals. For larger data sets one might opt to make the data available through a project website (which can be hosted by the Observatory) or through the associated big international observatory.

Data generated in the Sackler and Optical labs

Data produced by theoretical work and simulations

linux/datamanagement.1511786716.txt.gz · Last modified: 2017/11/27 12:45 by deul