Scientific Data Format Information FAQ

Archive-name: sci-data-formats
Last-modified: 13 Oct 1995

Recent changes:

  ==within last two weeks==
Format changes to document
Correction to CDF information
Corrected URL of JCAMP standard
New URL for GeoTIFF information

  ==within last four weeks==
Added DLG-3 information
Added DEM information
Corrected location of CIF information

This is the FAQ for the newsgroup sci.data.formats.  Contents:

 1) How to use this document
 2) How to get a current copy of this document
 3) Resources for format information
 4) Resources for visualization software information
 5) How to use the data retrieval methods
 6) Why isn't my favorite format on this list?

Each (major) section has a "Subject:" line, so you can search on the
subject title above to find the section quickly.
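
If you keep a plaintext copy of this document, the "Subject:" lines can
also be located with a few lines of code.  The following is only a sketch
in Python; the function name and the sample text are illustrative
assumptions, not part of the FAQ.

```python
def find_sections(text):
    """Return (line_number, title) pairs for every "Subject:" line."""
    sections = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("Subject: "):
            sections.append((number, line[len("Subject: "):].strip()))
    return sections


# A tiny stand-in for the FAQ text, used only to demonstrate the function.
sample = """Intro text
Subject: 1) How to use this document
body
Subject: 2) How to get a current copy of this document
"""

for number, title in find_sections(sample):
    print(number, title)
```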

This article is copyright (c) 1995 by Ilana Stern.  It may be freely
distributed provided that this copyright notice and the information
on retrieving a current copy are not removed. 

Comments, corrections, or additions should be sent to Ilana Stern


Subject: 1) How to use this document

Most FAQ (Frequently Asked Questions) documents list many questions and 
their answers.  This FAQ is (mostly) devoted to answering only one question:

"Where can I find documentation and software for [X] data format?" 

As the amount of information available over the networks has been 
increasing, so have the methods by which this information can be obtained.  
No longer is direct usage of FTP the only, or even the most frequent, method 
of obtaining data;  we now have Gopher, Wais, and WWW, as well as many
site-specific interfaces.  Because the information itself may be accessible
in many different ways, this FAQ will identify resources in terms of
URLs (Uniform Resource Locators).  This will also help us convert this
FAQ to a hypertext document, so that it can be used with a WWW browser
to go directly to any of the listed sources.

Here's a glossary, so you can decode the URLs if necessary to reach 
the sites: 

	<URL:ftp://[directory/[filename]]>	ftp site
	<URL:http://[directory/[filename]]>	www server
	<URL:telnet://>				telnet site
	<URL:gopher://[directory/[filename]]>	gopher server
	<URL:wais://>				wais server
	<>				newsgroup

So, for example, if a document is available at
it means that you should ftp to, and the information is
in the top-level directory.
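
Decoding a URL by hand can also be done programmatically.  The sketch
below uses Python's standard urllib.parse module to split a URL into the
retrieval method, the host to connect to, and the path on that host.
The scheme names "ftp", "http", and "news" are the standard ones and are
assumed here; the host ftp.example.org and the function name decode_url
are placeholders for illustration, not real resources.

```python
from urllib.parse import urlsplit

# Retrieval method for each URL scheme, using the glossary's labels.
METHODS = {
    "ftp": "ftp site",
    "http": "www server",
    "telnet": "telnet site",
    "gopher": "gopher server",
    "wais": "wais server",
    "news": "newsgroup",
}


def decode_url(url):
    """Split a URL into (retrieval method, host, path)."""
    parts = urlsplit(url)
    method = METHODS.get(parts.scheme, "unknown")
    return method, parts.hostname, parts.path


# ftp.example.org is a placeholder host, not a real archive site.
print(decode_url("ftp://ftp.example.org/pub/formats/readme"))
```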

If you don't know what these information retrieval methods are, see
the section "How to use the data retrieval methods".


Subject: 2) How to get a current copy of this document

If you are reading this document after 11 Oct 1995, you are reading an 
outdated copy. A current copy of this document can be obtained by anonymous 
FTP to <URL:>.
If you don't know what FTP is, see the section "How to use the data retrieval methods".

If you can't use FTP, send email to with
	send /pub/usenet/news.answers/sci-data-formats
  as the only text in the message (leave the subject blank).

A current somewhat hypertext version of this document can be obtained 
from <URL:>.
Real hypertext versions are available at
and (for European users in particular) at
If you would like to archive this FAQ in either hypertext or plaintext
format, and want to receive a new copy automatically at every update, 
please send me email.


Subject: 3) Resources for format information

1) CDF
2) FITS
3) GRIB
4) HDF
5) netCDF
6) VICAR
7) PDS
8) Miscellaneous graphics formats
9) SAIF
10) SDTS
11) HDS
12) MedFileS
13) CXF
14) JCAMP
15) CIF
16) OpenMath
17) GeoTIFF
18) DLG-3
19) DEM

Format: 1) CDF

  CDF (Common Data Format) is a library and toolkit for storing, manipulating,
and accessing multi-dimensional data sets.  The basic component of CDF is a 
software programming interface that is a device-independent view of the 
CDF data model.  
  All software and related information, including a FAQ and hypertext User's
Guide with searchable WAIS index, are available from the WWW site:
  A user's guide and software are on <URL:> 
for VMS and <URL:> for all others.
  A recent paper for CDF is available from 
  A mailing list,, exists for discussion of CDF.
To subscribe, please send email to "" with the
command "SUBSCRIBE cdf-users" in the body of your message.
  Questions can be directed to
  A client-server software layer called CSCDF, which can be used with 
the CDF library to provide applications access to remote CDF datasets,
can be obtained from its author, Hillel Steinberg, by email at

Format: 2) FITS

  FITS (Flexible Image Transport System) is the standard data interchange 
and archival format of the worldwide astronomy community.  The NOST Standard 
and User's Guide, some software, and test files are available from 
  The site <URL:> has other software and a different 
set of test files, and electronic copies of FITS proposals that are under 
development or in the international approval process.  Archives of the 
USENET newsgroups, sci.astro.fits (which is devoted 
to discussion of FITS), and others that are of interest to astronomers can 
be found here.  This site is also accessible via ftp at
  The "FITS Support Office", which contains many useful documents and links
to other information, is at 
  A WAIS index that can be searched for FITS information is at 
  If you've searched all these resources and still have questions, you 
can direct them to

Format: 3) GRIB

  GRIB (GRIdded Binary) is the World Meteorological Organization (WMO) 
standard for gridded meteorological data.  Unfortunately it is still not 
very "standard", as some organizations use their own versions.  A format 
description for WMO GRIB, and software to read general GRIB grids, can be 
found at <URL:>.  The format 
description can also be found at 
  If you need GRIB to read ECMWF data, the above format description, along 
with the ECMWF-specific parameter table, and a list of differences between
the WMO and the ECMWF versions of GRIB, is in 
<URL:>.  Read code can be 
found in <URL:>.
  If all else fails, contact Ilana Stern at

Format: 4) HDF

  HDF (Hierarchical Data Format) is a self-defining file format for transfer 
of various types of data between different machines. The HDF library contains 
interfaces for storing and retrieving compressed or uncompressed raster images 
with palettes, and an interface for storing and retrieving n-dimensional 
scientific datasets together with information about the data, such as labels, 
units, formats, and scales for all dimensions.
  Source code and documentation are on <URL:>.  
Some general information on HDF, including a FAQ, is available from
  The HDF WWW information server, with links to the above plus an in-progress 
HTML reference manual, is on <URL:>.

Format: 5) netCDF

  NetCDF (Network Common Data Form) is an interface for scientific data 
access which implements a machine-independent, self-describing, extendible 
file format.  All netCDF information is available via the WWW site
  Source code and documentation for the netCDF data access library 
are available from <URL:>.
  A FAQ is available from 
<URL:> or in text from 
  Past netCDF support inquiries have been archived and can be searched from
  The netCDF User's Guide is available as a hypertext (HTML) document
from <URL:>,
in compressed PostScript at 
<URL:>, or in source form 
with the netCDF source distribution.
  A recent paper (Jenter and Signell, 1992) which provides a good introduction 
to netCDF is available as <URL:>.  
  A visual browser for netCDF format data files is available from
  A mailing list,, exists for discussion of the 
netCDF interface, and for announcements of netCDF news:  to subscribe, send 
a message to containing the line:
"subscribe netcdfgroup".  The archives of netcdfgroup are available from 
<URL:>, and can be searched 
at <URL:wais://>.
  For more information, contact

Format: 6) VICAR 

  VICAR (Video Image Communication and Retrieval) is a collection of image
processing programs supported by the Multimission Image Processing
Laboratory (MIPL) at the Jet Propulsion Laboratory (JPL), for use
in manipulating and analyzing spacecraft images.  The image format
used by VICAR programs, and for all or most data from JPL-managed
missions, is referred to as VICAR format.  An independent third-party
description of the VICAR image format is available at
  A much more comprehensive and official description of the VICAR
image format was recently spotted at
<URL:>.  Contact for more information.

Format: 7) PDS

  In recent years, the Planetary Data System (PDS) has been responsible
for archiving space mission data on CD-ROM media, using its own
self-describing data format, variously known as PDS or ODL (Object
Description Language).  At least some of the current projects (e.g.
Magellan, Galileo) are using the PDS format as a "pointer" to detached
VICAR-format imagery on the mission CD-ROM volumes. 
  The PDS Standards Reference Document can be found at
<URL:>.  For more information,

Format: 8) Miscellaneous graphics formats

  These formats for storing graphics files -- TIFF, GIF, JPEG, FLI, CGM,
and so on -- are more properly discussed in the newsgroup  
A small amount of documentation on these and other graphics formats is on 
<URL:>;  other archive sites
are <URL:>, and
  The site <URL:> has information
on the MPEG format.
  The FAQ and resource file have more information on where
to find read and conversion programs for these formats.  You can find
them at <URL:>.
  A good (hardcopy) reference for graphics formats is  _Graphics
File Formats_, by David C. Kay and John R. Levine (Windcrest Books,
ISBN 0-8306-3060-0, about US$30.00 in paperback).
  See section 17 for information on GeoTIFF.

Format: 9) SAIF

  SAIF (Spatial Archive and Interchange Format) is a Canadian standard
for the exchange of geographic data.  It uses an object-oriented data 
model, and consists of definitions of the underlying building blocks, 
including tuples, sets, lists, enumerations, and primitives.
  A company has formed to provide tools and training for the SAIF data
standard.  Safe Software may be contacted by email at
or by phone at either (604) 241-4424 or (604) 583-2016.  They maintain a 
WWW page for SAIF at <URL:>
which will be continually updated. 
  The SAIF specification is also available by FTP at
<URL:> and
  There is a SAIF Mailing List:  send email to "" with 
the subject "SAIF Request" to be added to the list.

Format: 10) SDTS

  SDTS (Spatial Data Transfer Standard) is a Federal standard (Federal 
Information Processing Standard (FIPS) 173) for transfer of geologic and 
other spatial data.  Documentation and examples are available from the USGS
at <URL:> (for WWW users;
this is an HTML interface to the ftp site, which can also be accessed
directly, although not as nicely, at <URL:>).
  For more information, contact

Format: 11) HDS

  HDS (Hierarchical Data System) is a freely available database system.
It is particularly suited to the storage of large multi-dimensional arrays 
(with their ancillary data) where efficiency of access is a requirement.
It is presently used in astronomy, for storing (in particular) images, 
spectra and time series.
  Documentation, and information on obtaining the source code, are
available at <URL:> or in a LaTeX document at

Format: 12) MedFileS

  The Medical File Standard (MedFileS) is a global project coordinated 
via the internet to provide a standard for the recording of clinical medical 
information.  Anyone may participate in the project or obtain the current 
standard by e-mail to "".  Information is obtained by 
sending commands in the subject line of e-mail messages.  The command 
"send distrib." will provide a full description of the e-mail distribution
system.  The command "send overview." will provide a document detailing the 
MedFileS project.
  NOTE:  an attempt on 19 Dec 1994 to obtain MedFileS failed.

Format: 13) CXF

  CXF provides representation of chemical substances and queries, 
including atoms, fragments, molecules, and reactions.  Also available are 
various substance types, including organics, inorganics, polymers, salts, 
hydrates, multi-component mixtures and biosequences.  
  The specification is available at <URL:>.
  For more information, interested users should contact Thomas Steckert 
( or Joseph Mockus (  Questions and 
comments also are welcome.  

Format: 14) JCAMP

  JCAMP is a draft standard for spectral data (IR & NMR) and chemical stuff
which is related to netCDF.  Some references:
  JCAMP-DX for NMR, A. N. Davies, P. Lampen, Applied Spectroscopy,
1993, 47, 1093-1099;
  A proposed European Implementation of the JCAMP-DX Format, D. N. Rutledge, 
P. Mcintyre, Chemometrics and Intelligent Laboratory Systems, 1992, 16, 95-101;
  JCAMP-DX, A standard format for exchange of infrared-spectra in computer 
readable form, J. G. Grasselli, Pure and Applied Chemistry, 1991, 63, 1781-1792;
  JCAMP-CS, A standard exchange format for chemical-structure information, 
J. Gasteiger, B. M. P. Hendriks, P. Hoever, C. Jochum, H. Somberg, Applied 
Spectroscopy, 1991, 45, 4-11.
  Also, see the December 1994 issue of Applied Spectroscopy.
  A viewer is at <URL:>
  The mass spectrometry standard is available at
<URL:> ( 

Format: 15) CIF

  CIF (Crystallographic Information File) is becoming standard in the
crystallography world and related fields:

Format: 16) OpenMath

  The OpenMath effort aims at developing a standard exchange format for 
mathematical objects (such as formulae processed by computer algebra systems).
The OpenMath home page is located at

Format: 17) GeoTIFF

  A new set of TIFF tag extensions for georeferencing raster data within
TIFF 6.0, GeoTIFF, was announced in July 1995.  Information is available
at <URL:> and
specifications and source code are available via ftp at
<URL:>.  A mailing list for discussion
of the development of this standard is;  to
subscribe send email to with 
subscribe geotiff  your-name-here
as the body of the message.

Format: 18) DLG-3

  The Digital Line Graph (DLG) format is used by USGS to store geographical
vector data.  Documentation on this format is available at

Format: 19) DEM

  A Digital Elevation Model (DEM) consists of a sampled array of elevations 
for ground positions that are normally at regularly spaced intervals.  
<URL:> has information
about this format (along with data availability) from the USGS.
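
The idea of a regularly spaced elevation array can be sketched in a few
lines of Python.  This is not the USGS DEM file layout, which is described
in the documentation referenced above; the class name, field names, and
sample values are all hypothetical and only illustrate how a row and
column index map back to a ground position when the spacing is regular.

```python
from dataclasses import dataclass


@dataclass
class ElevationGrid:
    """A regularly spaced grid of elevation samples (hypothetical layout)."""
    origin_x: float      # ground x coordinate of column 0
    origin_y: float      # ground y coordinate of row 0
    spacing: float       # distance between neighboring samples
    elevations: list     # list of rows, each a list of elevation values

    def ground_position(self, row, col):
        """Return the ground (x, y) of the sample at (row, col)."""
        return (self.origin_x + col * self.spacing,
                self.origin_y + row * self.spacing)

    def sample(self, row, col):
        """Return (x, y, elevation) for one grid cell."""
        x, y = self.ground_position(row, col)
        return x, y, self.elevations[row][col]


# Made-up values for illustration only.
grid = ElevationGrid(origin_x=500000.0, origin_y=4400000.0, spacing=30.0,
                     elevations=[[1610.0, 1612.5], [1608.0, 1611.0]])
print(grid.sample(1, 1))
```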


Subject: 4) Resources for visualization software information

  Many visualization software packages exist which are intended to be 
used with data in one or more of these standard formats.  Here are 
pointers to some lists of information about this software.  (Note that
this is somewhat outside the scope of this document, which is really
only intended to discuss data formats, but I think this will be
useful to many.)
  Brief descriptions and pointers to software that can be used with
netCDF are at <URL:>.
  A page of links to many scientific visualization and graphics software
packages is at <URL:>.
  A page of links to both graphics software and various scientific data 
format descriptions is at <URL:>.
  An article comparing several scientific visualization techniques and
packages is available at <URL:>.


Subject: 5) How to use the data retrieval methods

  This section only describes FTP and telnet in any detail;  for other
methods, FTP sites are given, so you can get information on them yourself.

How to use FTP

FTP (File Transfer Protocol) allows transfer of files between two computers
which are on the Internet.  To access the FTP areas listed here, at your
system prompt type "ftp" followed by the name of the desired system.  For 
example, to access you'd type


Use "anonymous" as your login and your email address as the password (if

[Note: quotes ("like this") are used to set off names of directories and
files, or commands you'd type, and are not part of these names.]

Not all FTP systems accept the same commands, but here's a list of the
most useful:

	ls      list files in the current directory.
	cd      change directory, e.g. "cd wx" changes to the wx directory.
	binary  sets binary mode.  Use for retrieving non-text files.
	ascii   sets ascii mode (the default).  Use for retrieving text.
	get     retrieves a file, e.g. "get readme" gets a file called readme.
	bye     exits FTP.
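
The same session can be scripted.  Here is a minimal sketch using the
method names of Python's standard ftplib.FTP class (login, cwd,
retrbinary, quit); the function name fetch_file, the host
ftp.example.org, and the directory and file names are placeholders, and
the function assumes it is handed an already-connected FTP object.

```python
import io


def fetch_file(ftp, email, directory, filename):
    """Retrieve one file through an already-connected ftplib.FTP-style object."""
    buffer = io.BytesIO()
    ftp.login("anonymous", email)      # "anonymous" login, email as password
    ftp.cwd(directory)                 # same as the "cd" command
    # Binary-mode transfer, the scripted equivalent of "binary" then "get".
    ftp.retrbinary("RETR " + filename, buffer.write)
    ftp.quit()                         # same as "bye"
    return buffer.getvalue()


# Usage, assuming a reachable host (ftp.example.org is a placeholder):
#   from ftplib import FTP
#   data = fetch_file(FTP("ftp.example.org"), "you@some.where", "wx", "readme")
```

Passing the connection in as an argument keeps the function easy to try
out against a stand-in object before pointing it at a real site.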

If you can't seem to connect to the site, check to see if it is a telnet
site.  If it is, follow the instructions in the following section instead.

If you can't FTP from your site, use one of the following ftp-by-mail servers:     

Send an e-mail message to the closest address, with the lines:

	reply your_address@some.where     <- with your email address
	connect         <- for example
	cd datasets/ds111.2/software
	get access_sun.f

For complete instructions, send a one-line message reading "help" to the
server.  Please don't ask me for help!
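
If you send such requests often, the message body can be generated.  This
sketch in Python only mirrors the four example lines above; the function
name ftpmail_request and the host ftp.example.org are placeholders, and
individual servers may accept different commands, so check the server's
own "help" reply first.

```python
def ftpmail_request(reply_to, host, directory, filenames):
    """Build the body of an ftp-by-mail request, one command per line."""
    lines = ["reply " + reply_to, "connect " + host, "cd " + directory]
    lines.extend("get " + name for name in filenames)
    return "\n".join(lines) + "\n"


# ftp.example.org is a placeholder; substitute the archive you want.
print(ftpmail_request("your_address@some.where", "ftp.example.org",
                      "datasets/ds111.2/software", ["access_sun.f"]))
```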

How to use telnet

Type "telnet" followed by the name or IP number of the desired system.  These
publicly accessible systems generally allow you to log in but put you in
a restricted shell, from which only a certain menu of commands is available.
The description for the site will include the login to use.

If you can't seem to connect to the site, re-check its description in the
document; if it's an FTP site, follow the instructions in the previous 
section instead.

Gopher information

Available by ftp at

Wais information

Available by ftp at

WWW information

Available by ftp at
WWW is so easy to use that you might as well just hop in and try it, so 
ask your sysadmin if you have a WWW browser such as NCSA Mosaic or 


Subject: 6) Why isn't my favorite format on this list?

If you don't see a format you're interested in here, it could be one
of three reasons.  First of all, there are a lot of formats which are
out of the scope of this newsgroup:  it ain't named *sci*.data.formats
for nuthin', you know.  Formats used in commercial spreadsheet and
word-processing software aren't scientific data formats, and aren't
discussed in this group.

Second, it may be that nobody has given the FAQ organizer any information
on sources for information on that format.  So ask the newsgroup -- and
if you do get a response, please let me know what it is!

Finally, you may ask on the net, and hear nothing, because the 
data format description just *isn't* publicly available.  For most 
scientific data formats, this is a Bad Thing, and most archivists and
scientists *want* to have their format information available.  If
you have such information, but don't have resources to make it
available, please ask around and see if you can get it into an FTP
area or other resource.  Please don't publicize private or proprietary
formats without the permission of the author, though.

/\       Backcountry skiing is for anarchists and coyote angels.  Your feet 
  \_][     get cold and no one admires your new outfit.  [C. L. Rawlins]
      \__Ilana Stern | |


Last Update March 27 2014 @ 02:12 PM