NAME
dds_user - Data Dictionary System, Generic Application
Guide
SYNOPSIS
Data Dictionary System, Generic Application Guide
DESCRIPTION
Most application programs that use DDS share certain
features. This man page describes these features and how
they are used with applications.
Note: These conventions are recommended, but are not manda-
tory. Some DDS applications may deviate from them, to
achieve their objective.
DDS applications are controlled by definitions (name, value
pairs). These definitions may come from the command line, a
parameter file, or a graphical user interface. GUI specific
conventions are detailed in another document.
Control definitions fall into three categories: input, out-
put, and algorithm control. Algorithm control is specific
to each application and associated definitions should be
documented with it. Definitions that control input and out-
put are common to most DDS applications. These are
described below, in a non-GUI context.
Control definitions are kept in a dictionary named par:.
Each processing step has its own, independent par:. It con-
tains definitions from the command line and parameter file.
COMMAND LINE
Applications are controlled by command line arguments. These
arguments are copied into a temporary file and interpreted
as a dictionary. For example, the following command line
executes a DDS application (bridge) with several arguments.
The contents of the temporary dictionary are shown below.
bridge foo in_data= line23.usp in_format= usp out= bar
DDS_Rev= 2.2a
********************************************
******************* TITLE ******************
********************************************
cmd_title= bridge, convert seismic data
cmd_name= bridge
cmd_user= zrls13
cmd_date= Tue Sep 19 09:53:16 1995
cmd_host= gpss83
cmd_cwd= /home/gpsa/zrls13/test
cmd_pid= 22889
cmd_args= foo
in_data=line23.usp
in_format=usp
out= bar
The first group of definitions in this temporary dictionary
is created automatically. It identifies the processing
step, user name, when it started, and where it was run. The
second group of definitions preserves the command line
arguments. These definitions are accumulated in the output
history, along with the input dictionary. This provides a
comprehensive history that can be used to evaluate
processing or to repeat it.
Care must be taken when definitions are placed on the command
line. Special characters must be protected from shell
interpretation. These include (#, ", ', *, ?, $, (, ), <,
>, |, &). Shell documentation should be consulted, if these
characters are used on the command line.
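As an illustration of protecting special characters, the following Python
sketch quotes each argument with the standard shlex module before a
command line is handed to a POSIX shell. The argument list is a
hypothetical example built from definitions shown in this page; it is not
produced by DDS itself.

```python
import shlex

# Hypothetical argument list. The "*" in the map expression would be
# expanded by the shell as a filename wildcard if left unprotected.
args = [
    "bridge",
    "in_data=", "line23.usp",
    "map:usp:segy.GrpDatum=", "10", "*", "RfSrEl",
]

# shlex.quote() adds quotes only to arguments that need them.
command = " ".join(shlex.quote(arg) for arg in args)
print(command)

# Splitting the quoted command the way a POSIX shell would
# recovers the original arguments unchanged.
assert shlex.split(command) == args
```

Only the "*" argument is quoted here; the other arguments contain no
characters that the shell treats specially.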
PARAMETER FILE
Control definitions can be placed in a parameter file, and
referenced from the command line. This circumvents problems
with the shell and special characters. It also provides an
alternative to (re-)typing long command lines. The follow-
ing could be placed in a file named foo.par. DDS applica-
tions could reference this parameter dictionary by defining
"par= foo.par" on the command line.
in_data= classic.usp
in_format= usp
map:usp:segy.GrpDatum= 10 * RfSrEl
out= bar
out_format= segy
Definitions on the command line have precedence over the
same name within a parameter dictionary. For example, nomi-
nal values can be placed in a parameter file, and one attri-
bute can be varied on the command line with each execution.
In the following example the command line overrides the
definition of in_data=, provided in the parameter file.
bridge par= foo.par in_data= test.usp
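The precedence rule can be sketched in Python. This is a simplified
model, not the DDS parser: parse_definitions is a hypothetical helper
that assumes every definition name is a whitespace-delimited token
ending in "=", and it ignores quoting and comments.

```python
def parse_definitions(text):
    """Parse 'name= value ...' text into a dict (simplified sketch)."""
    defs = {}
    name = None
    for token in text.split():
        if token.endswith("="):
            # A token ending in "=" starts a new definition.
            name = token[:-1]
            defs[name] = ""
        elif name is not None:
            # Any other token extends the current value.
            defs[name] = (defs[name] + " " + token).strip()
    return defs


# Contents of foo.par, as shown above.
par_file = """
in_data= classic.usp
in_format= usp
map:usp:segy.GrpDatum= 10 * RfSrEl
out= bar
out_format= segy
"""

# Arguments from the command line "bridge par= foo.par in_data= test.usp".
command_line = "par= foo.par in_data= test.usp"

# Start from the parameter file, then let the command line override it.
merged = parse_definitions(par_file)
merged.update(parse_definitions(command_line))

print(merged["in_data"])    # value taken from the command line
print(merged["in_format"])  # value taken from the parameter file
```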
INPUT CONTROL
Input data is described by an input dictionary. If the dic-
tionary exists, it is defined as follows:
in= {dict_filename | stdin: | fdn:}
This dictionary describes the binary input data and its
processing history. If in= stdin:, then shell operators
(< and |) may be used to redirect input. If not defined,
the dictionary defaults to stdin:. The default is
ignored if standard input is connected to a terminal.
It is also ignored if par: defines format or in_format.
This allows stdin: to be used for binary data, when pip-
ing from a non-DDS (usp, segy, disco) application. The
fdn: mnemonic identifies an input file descriptor.
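The default rules for in= can be summarized in a short Python sketch.
The function name and its arguments are hypothetical; the logic only
restates the rules given above and is not taken from the DDS source.

```python
def resolve_input_dictionary(par, stdin_is_terminal):
    """Return the input dictionary name, or None if there is none.

    par is a dict of control definitions (a model of par:).
    """
    if "in" in par:
        # An explicit in= definition always wins.
        return par["in"]
    if stdin_is_terminal:
        # The stdin: default is ignored on a terminal.
        return None
    if "format" in par or "in_format" in par:
        # stdin is left free for binary data from a non-DDS application.
        return None
    return "stdin:"


print(resolve_input_dictionary({}, stdin_is_terminal=False))
print(resolve_input_dictionary({"in_format": "usp"}, stdin_is_terminal=False))
print(resolve_input_dictionary({"in": "foo"}, stdin_is_terminal=True))
```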
Sometimes an input dictionary does not exist for the binary
data. This is typical when the data was created by a non-
DDS application. In this case, the following must be
defined in par:
in_format= {usp | segy | disco | cube | ...} attributes ...
This overrides the input data format. If the in diction-
ary exists, it should define format. If not, the user
must provide a hint by defining in_format (no default).
The user may need to define a filename for the binary input
data. For example:
in_data= {data_filename | stdin: | fdn:}
This overrides the filename for the binary input data.
If the in dictionary exists, it should define data. If
not, the user must define in_data, unless the default
(stdin:) is acceptable. Shell operators (< and |) may
be used to redirect input, when in_data= stdin:. The
fdn: mnemonic identifies an input file descriptor.
Hint: Avoid using shell "< file" redirection. Neither the
application nor processing history can see the actual
filename used by the Shell. Define in= or in_data= instead.
This hint also applies to output redirection. Shell pipes
introduce the same uncertainty, but they cannot easily be
avoided.
OUTPUT CONTROL
An output dictionary is used to describe the output binary
data. It also contains the processing history accumulated
with the data.
out= {dict_filename | stdout: | tmpxxx: | fdn:}
The output history is written to this dictionary. It
contains a copy of the processing history (input diction-
ary) and definitions introduced by this step. Shell
operators (> and |) may be used to redirect output, when
out= stdout:. The dictionary defaults to stdout:. The
default is ignored if standard output is connected to a
terminal. It is also ignored if par: defines out_data.
This allows stdout: to be used for binary data, when pip-
ing into a non-DDS (usp, segy, disco) application. The
tmpxxx: mnemonic creates a temporary disk file that is
automatically removed. The fdn: mnemonic identifies an
output file descriptor.
The end user may choose the format for the binary output
data. Perhaps the next processing step requires a different
standard format than the input.
out_format= {usp | segy | disco | cube | ...} attributes ...
This overrides the format for output data. The output
format is the same as the input, unless explicitly
changed.
The end user may choose the filename for the binary output
data. It may be on another file system that provides speed,
capacity, or special backup.
out_data= {data_filename | stdout: | dict: | tmpxxx: | fdn:}
This overrides the filename for binary output. If
out_data isn't defined, a name is derived automatically.
The data name is based upon the name of the out diction-
ary and media. If the dictionary is a regular file, then
the data is output to another regular file. The default
filename is DATA_PATH/dict_name.fmt_name. If the dic-
tionary is not a regular file (i.e. pipe, tape, ...), or
if the out_data value is dict:, then binary data is
attached to the output dictionary, i.e. they share the
same file, pipe stream, or tape media. The tmpxxx:
mnemonic creates a temporary disk file that is automati-
cally removed. The fdn: mnemonic identifies an output
file descriptor.
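The rules for deriving the output data name might be modeled as
follows. This Python sketch is an interpretation of the text above, not
the DDS implementation. The function and argument names are
hypothetical, and it assumes that dict_name means the base name of the
out dictionary and that DATA_PATH defaults to the current directory.

```python
import posixpath


def default_out_data(out_dict, fmt_name, dict_is_regular_file,
                     out_data=None, data_path="."):
    """Return the binary output name, or "dict:" if data is attached."""
    if out_data is not None and out_data != "dict:":
        # An explicit out_data definition overrides everything else.
        return out_data
    if out_data == "dict:" or not dict_is_regular_file:
        # Data shares the same file, pipe, or tape as the dictionary.
        return "dict:"
    # Regular file: DATA_PATH/dict_name.fmt_name
    base = posixpath.basename(out_dict)
    return posixpath.join(data_path, base + "." + fmt_name)


print(default_out_data("bar", "segy", True, data_path="/scratch"))
print(default_out_data("stdout:", "usp", False))
print(default_out_data("bar", "usp", True, out_data="/dev/null"))
```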
AUXILIARY INPUT AND OUTPUT
Application programs may support auxiliary input and output.
Auxiliary data is controlled by definitions analogous to
"in", and "out". The dictionary for auxiliary data may be
defined by names like "vel", "Vp", "Vs", "den", or "snap".
An application may allow the end user to override the format
and data names. For example, "vel", "vel_format", and
"vel_data" may be accepted. Auxiliary dictionaries and data
typically do not use stdin: and stdout:.
AXIS DEFINITIONS
Binary data is projected into a Cartesian coordinate system.
DDS supports from one to nine axes. Each axis is identified
by name. The dictionary which describes the binary data
must define the number of dimensions and their names. It
must also define the maximum size for each dimension. The
first dimension varies most rapidly, like Fortran arrays.
This example defines three dimensions, their names and
sizes:
axis= t x shot
size.t= 1000
size.x= 96
size.shot= 2500
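To illustrate "first dimension varies most rapidly", this Python sketch
computes the linear position of a sample from its zero-based indices,
using the sizes defined above. It is a simplified model: it counts
samples rather than bytes and ignores any header fields stored with each
trace. The function name is hypothetical.

```python
axis = ["t", "x", "shot"]
size = {"t": 1000, "x": 96, "shot": 2500}


def sample_offset(index):
    """Linear offset (in samples) of a zero-based index per axis."""
    offset = 0
    stride = 1
    for name in axis:
        # The first axis has stride 1, so it varies most rapidly.
        offset += index[name] * stride
        stride *= size[name]
    return offset


# Moving one step along x skips a full trace of 1000 samples.
print(sample_offset({"t": 0, "x": 1, "shot": 0}))
# Moving one step along shot skips 96 traces of 1000 samples each.
print(sample_offset({"t": 0, "x": 0, "shot": 1}))
```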
Each axis may have additional attributes. Nominal attri-
butes include origin, delta, units, and sort. Additional
attributes may be required for a particular DDS application.
delta.t= 4
delta.x= 50
origin.x= 0.
units.t= msec
units.x= feet
sort.x= TrcNum
sort.shot= SoPtNm
Abbreviations for axis and unit names should be consistent
and meaningful. This requires the cooperation of end users
(and applications!). Recommended names are described in
dds_expert(1). If you deviate from these conventions, the
DDS virus will plague you until corrected.
Line headers for standard seismic formats specify some axis
parameters. None of them provide sufficient information to
define three dimensions with nominal attributes. Standard
formats are grossly inadequate for describing hypercubes
with higher dimensions.
Output line headers are constructed from axis definitions.
Information is inherently lost, because of header limita-
tions.
Input line header information is merged into axis defini-
tions. Precedence is given to control parameters defined in
par:. This allows the end user to override an incorrect
header value. If not specified, line header information is
honored. If not available, the input history can provide a
default.
LINE HEADERS
Line headers for standard seismic formats have many fields.
They can be described using DDS notation for binary struc-
tures. Each line header field is assigned a symbolic name,
type, position, size, alignment, and descriptive text.
These are described in man pages dds_usp(1), dds_segy(1),
and dds_disco(1).
Input line header information is converted into definitions.
A definition name is constructed from the format and field
name. For example, the SEGY field ReelNum becomes
"segy_ReelNum= nnn". Precedence is given to control parame-
ters defined in par:. This allows the end user to override
an incorrect header value. If not specified, line header
information is honored. If not available, the input history
can provide a default.
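This three-level precedence (par:, then the line header, then the input
history) can be modeled with Python's collections.ChainMap, which
searches its mappings in order. The values below are made up, and the
names segy_LineNum and segy_JobNum are hypothetical field names used only
for illustration.

```python
from collections import ChainMap

# Highest precedence: control definitions supplied by the end user.
par = {"segy_ReelNum": "7"}
# Next: values converted from the input line header.
line_header = {"segy_ReelNum": "3", "segy_LineNum": "12"}
# Lowest: defaults found in the input history.
history = {"segy_LineNum": "99", "segy_JobNum": "4"}

resolved = ChainMap(par, line_header, history)

print(resolved["segy_ReelNum"])  # par: overrides the header value
print(resolved["segy_LineNum"])  # header value is honored
print(resolved["segy_JobNum"])   # history provides the default
```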
Output line headers are constructed from current defini-
tions. If nothing is defined, a default value (zero, or
blank) is assumed.
What's the bottom line? Line header information for a par-
ticular format is accurately preserved and reconstructed via
dictionaries. Extraneous definitions are minimized, through
the use of defaults. Values that were "hidden" in a binary
field, can be examined in dictionaries by mortals. Explicit
definitions can supplement or correct erroneous input header
values.
Well, almost... The current version of DDS does not
preserve and reconstruct the USP HLH (Historical Line
Header). The same is true for Disco processing history.
Perhaps the next release of DDS will do it.
TRACE HEADER FIELDS
The structure of traces can be described, similar to line
headers above. This allows standard and generic trace for-
mats to be specified with the same DDS notation. Seamless
Seismic Data Access is facilitated by this common framework.
All fields in a trace are created equal; DDS does not
discriminate between headers and samples. Traces may
contain many fields, or only one. The seismic Samples field,
if it exists, normally has the size defined by the first
axis. Operations that can be applied to headers can also
be applied to Samples, and vice versa. Well, almost...
Field map definitions only support scalar expressions. The
next DDS release should support vectors too.
Fields may have any binary data type supported by DDS.
Types are not limited to those supported by the host com-
puter. The list includes character, unsigned, integer,
float, and complex. Well, almost... complex support is
currently kludged as komplex. Binary representations
include IEEE, Cray, IBM, ASCII, and EBCDIC. Supported types
are described in dds_expert(1). Well, almost... Some com-
binations of binary modes and host computers are pending
conversion software.
Each field has a constant element count. If the count is
greater than one, the field is a vector. If equal to one,
it is a scalar. If zero, no storage is reserved for a value.
If the count is minus one, it corresponds to the first axis
size.
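The element count rules can be restated as a small Python sketch. The
function names are hypothetical and the code only mirrors the four cases
listed above.

```python
def element_count(count, first_axis_size):
    """Resolve a declared count into an actual number of elements."""
    if count == -1:
        # Minus one means "use the size of the first axis".
        return first_axis_size
    if count < 0:
        raise ValueError("count must be -1, 0, or positive")
    return count


def field_kind(count, first_axis_size):
    """Classify a field as 'empty', 'scalar', or 'vector'."""
    n = element_count(count, first_axis_size)
    if n == 0:
        return "empty"   # no storage is reserved
    if n == 1:
        return "scalar"
    return "vector"


print(field_kind(1, 1000))
print(field_kind(-1, 1000))
print(element_count(-1, 1000))
```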
TRACE FIELD MAPPING
Fields can be converted from one trace format to another.
Conversion is controlled with map definitions. Default maps
exist between all formats. They are adequate for most, but
not all, field conversions. The end user should understand
what the defaults are, and override them as needed.
What is the default map from format "A" into "B"?
dds_rosetta(1) summarizes the default map between standard
formats. It provides a quick and dirty view of what's hap-
pening. Detailed descriptions of formats and default maps
are in dds_usp(1), dds_segy(1), dds_disco(1), and
dds_format(1).
Defaults are not a substitute for knowledge. The end user
must know what is available in the input header. This
includes the field name, what it should represent, and
whether the value is accurate or erroneous. The end user
must know what is wanted in the output header. If the input
is accurate, the default map is probably ok. If not, the
end user must correct the problem with an explicit map
definition.
How can I override the map from format "A" into "B"?
A map definition looks like "map:in_fmt:out_fmt.out_field=
expression". Map definition names are prefixed by map:.
in_fmt and out_fmt name the input and output format to which
the definition applies. out_field names a field in the out-
put format. An "*" may be used for any of these items, to
match any name. expression is an algebraic expression. It
may contain constants, field names, operators, and
parentheses. The syntax is described in dds_map(1). Example
map definitions:
comment= map from SEGY format, to USP field ...
map:segy:usp.TrcNum= CdpTrcNum
map:segy:usp.SrRcMX= 0.5 * (SrcX + GrpX)
map:segy:usp.SrRcMY= 0.5 * (SrcY + GrpY)
comment= map from Disco format, to USP field ...
map:disco:usp.TrcNum= SEQNO
map:disco:usp.SrPtXC= field("SHT-X", void)
map:disco:usp.SrPtYC= field("SHT-Y", void)
comment= map from any format, to USP field ...
map:*:usp.GrpElv= 0.0
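The structure of a map definition name, including the "*" wildcard, can
be sketched in Python. The helper names are hypothetical. The sketch only
reports whether a definition applies to a given conversion; it makes no
claim about which definition DDS prefers when several apply.

```python
def parse_map_name(name):
    """Split 'map:in_fmt:out_fmt.out_field' into its three items."""
    prefix, in_fmt, rest = name.split(":", 2)
    if prefix != "map":
        raise ValueError("not a map definition: " + name)
    out_fmt, out_field = rest.split(".", 1)
    return in_fmt, out_fmt, out_field


def applies(name, in_fmt, out_fmt, out_field):
    """True if the map definition matches the requested conversion."""
    pattern = parse_map_name(name)
    wanted = (in_fmt, out_fmt, out_field)
    # "*" in any position matches any name.
    return all(p == "*" or p == w for p, w in zip(pattern, wanted))


print(parse_map_name("map:segy:usp.SrRcMX"))
print(applies("map:*:usp.GrpElv", "segy", "usp", "GrpElv"))
print(applies("map:segy:usp.TrcNum", "disco", "usp", "TrcNum"))
```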
READ and WRITE PROCESSING
The end user may request special read and write processing.
Traces can be dropped, or padded. If the trace is dead,
then samples can be zeroed. Index fields can be forced to
increment, or synchronized with header changes. This
feature is controlled by definitions named read_usp,
read_segy, read_disco, and write_usp, write_segy,
write_disco. They are described in dds_usp(1), dds_segy(1),
and dds_disco(1).
Example: SEGY may contain seismic and auxiliary traces.
The field TrcIdCode should distinguish between them. An
application may want only the seismic traces, only the
auxiliary traces, or both. If "read_segy= drop_aux", the
auxiliary traces are dropped. If the value is "-drop_aux",
they are retained.
DEBUG SUPPORT
Messages are printed to standard error, when problems are
detected. Nominal processing can also be monitored. The
contents of each trace read or written can be printed. The
real names of temporary files can be printed, and their con-
tents preserved for inspection.
The end user can define "dds_debug= keywords" to control
this feature. Recognized values are shown below. Keyword
groups are mutually exclusive. The keyword read (or write)
may be added to limit the dump to input (or output) traces.
keyword             max   description
buf                 n     all fields
diff                n     only field changes
diff2               n     secondary sort changes
list "f1 f2 ..."    1     tabular field list
usp segy disco      1     standard tabular lists
If "dds_debug= buf 8" is defined, all fields in all traces
are printed. Vector field dumps are limited to 8 elements.
This prints a lot of (unwanted) information.
If "dds_debug= read diff" is defined, only input traces are
printed. By default, vector field dumps are limited to 30
elements. The name and value of all fields in the first
trace are printed. After the first, fields are printed only
if they differ from the previous trace. This makes
interesting information more visible, by eliminating clutter
from constant fields. The end user can replace uncertainty
with knowledge, when faced with new data.
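The idea behind the diff keyword can be sketched in Python. This is a
model of the behavior described above, not the DDS dump code: traces are
represented as plain dicts, the header values are made up, and the
output format is invented for illustration.

```python
def diff_dump(traces):
    """List 'name= value' lines, skipping fields that did not change."""
    lines = []
    previous = None
    for number, trace in enumerate(traces, start=1):
        for name, value in trace.items():
            # Everything is shown for the first trace; afterwards
            # only fields that differ from the previous trace.
            if previous is None or previous.get(name) != value:
                lines.append(f"trace {number}: {name}= {value}")
        previous = trace
    return lines


traces = [
    {"RecNum": 1, "TrcNum": 1, "SrcLoc": 100},
    {"RecNum": 1, "TrcNum": 2, "SrcLoc": 100},
    {"RecNum": 2, "TrcNum": 1, "SrcLoc": 125},
]

for line in diff_dump(traces):
    print(line)
```

In this made-up input the second trace contributes a single line,
because only TrcNum changed.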
Hint: If only the dump is wanted, define "out_data=
/dev/null" to dispose of binary data.
If the read keyword is eliminated, both input and output
traces are printed. This can be used to quality control
field mapping from one format to another. An input trace is
displayed, followed by the corresponding output trace. Some
applications may dump traces by ensemble (records, or 2D
planes).
If diff2 is used, more clutter is eliminated. Traces are
compared to two previous values. Constant fields are
printed, only if they start changing. Varying fields are
printed, only if they become constant. This technique
highlights field changes at secondary sort (record) boun-
daries. All fields are printed from the first two traces,
to provide a reference point.
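One possible reading of the diff2 rule is sketched below in Python. It
assumes a field is "changing" when it differs from the previous trace,
and prints a field only when that state flips relative to the trace
before. This interpretation and the function name are assumptions; the
actual DDS logic may differ in detail.

```python
def diff2_dump(traces):
    """Return (trace number, field, value) for fields diff2 would show."""
    shown = []
    for i, trace in enumerate(traces):
        for name, value in trace.items():
            if i < 2:
                # The first two traces are printed in full.
                show = True
            else:
                was_changing = traces[i - 1][name] != traces[i - 2][name]
                is_changing = value != traces[i - 1][name]
                # Show only transitions: constant -> varying or back.
                show = was_changing != is_changing
            if show:
                shown.append((i + 1, name, value))
    return shown


# Two records of three traces each (made-up header values).
traces = [{"RecNum": r, "TrcNum": t} for r in (1, 2) for t in (1, 2, 3)]

for entry in diff2_dump(traces):
    print(entry)
```

With this input, TrcNum varies on every trace and is never shown after
the first two traces, while RecNum is shown only around the record
boundary.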
The list keyword is used to select fields for a tabular
dump. Selected fields are printed on one line for each
trace. Only the first value from a vector field is
displayed. The keywords usp, segy, and disco are predefined
lists. If multiple standards are requested, a compromise is
made to limit print line width. Example:
dds_debug= usp
comment= "usp", as an explicit list
dds_debug= list "RecNum TrcNum RecInd DphInd
SrcLoc SoPtNm SrcPnt LinInd DstSgn StaCor"
Standard error will scroll off the screen, unless
redirected. Redirection syntax is dependent upon the Unix
shell. These examples assume the C shell is used to execute
bridge.
bridge par= file >& log
bridge par= file |& more
bridge par= file |& xcat
bridge par= file |& tee log
(bridge par= file | cmd2 ...) |& tee log
Hint: ctrl-c can be used to interrupt execution, after suf-
ficient data is printed.
These examples assume the Bourne shell is used.
bridge par= file 2> log
bridge par= file 2>&1 | more
bridge par= file 2>&1 | xcat
bridge par= file 2>&1 | tee log
(bridge par= file | cmd2 ...) 2>&1 | tee log
DEVICE ATTRIBUTES
Physical devices (tape and disk drives) have special attri-
butes. For example, tape drives have labels, density,
lengths, and block sizes. Attributes supported by DDS are
described by dds_device(1). The end user is normally con-
cerned with only a few keyword attributes. They are label,
dsn, cram, merge, online, and offline.
Device attributes are controlled by device definitions.
This example assumes multiple reels of tape are input.
Traces from all reels are concatenated together. If a reel
has multiple files, they are merged.
in_data= /dev/rst0
/dev/rst0= merge label "RE2201 RE2202 RE2203"
FORMAT ATTRIBUTES
Binary data formats have a name and other attributes. The
attributes control variations within a basic format. Recog-
nized attributes are described by dds_usp(1), dds_segy(1),
dds_disco(1), and dds_format(1).
Example format definition with explicit attributes:
in_format= segy charisma
out_format= segy float 4 ieee
DICTIONARY FILENAMES
Dictionary filenames have a few constraints. They should
not terminate with ":" (colon). These names are reserved
for mnemonic dictionaries. Filenames must not contain "=",
white space, or control characters. Users may follow addi-
tional conventions. For example, dictionaries that describe
binary data may be prefixed by "H" (History).
By default, disk files are created in the current working
directory.
DATA FILENAMES
Binary data filenames have a few constraints. They should
not terminate with ":" (colon). These names are reserved
for mnemonic data streams. Filenames must not contain "=",
white space, or control characters.
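The filename constraints, which are the same for dictionaries and
binary data, can be checked with a few lines of Python. The function
name is hypothetical and the check simply restates the rules above.

```python
def filename_problems(name):
    """Return a list of constraint violations (empty if acceptable)."""
    problems = []
    if name.endswith(":"):
        problems.append("ends with ':' (reserved for mnemonics)")
    if "=" in name:
        problems.append("contains '='")
    if any(ch.isspace() for ch in name):
        problems.append("contains white space")
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
        problems.append("contains a control character")
    return problems


print(filename_problems("line7.usp"))
print(filename_problems("stdin:"))
print(filename_problems("my data=1"))
```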
By default, data filenames are suffixed by their format,
e.g. line7.usp.
By default, disk files are created in the current working
directory. This default may be changed by defining
"DATA_PATH= /alternate/path". This allows data to be
diverted to special file systems. They may provide special
capacity, speed, or backup support.
BUGS
Contact Randy Selzler, APR x3413, zrls13@amoco.trc.com
SEE ALSO
dds(1)          end user overview
dds_user(1)     (this) generic guide for DDS applications
dds_apps(1)     list of current DDS applications
dds_usp(1)      USP seismic format description
dds_segy(1)     SEGY seismic format description
dds_disco(1)    Disco seismic format description
dds_rosetta(1)  Rosetta stone for standard formats
dds_map(1)      algebraic notation for field maps
dds_device(1)   tape and disk device attributes
dds_format(1)   generic DDS format description
dds_type(1)     DDS binary data types
dds_expert(1)   advanced topics for end users
dds_help(1)     what to do, when things go wrong
AUTHOR
R. L. Selzler, APR, Tulsa
COPYRIGHT
copyright 2001, Amoco Production Company
All Rights Reserved
an affiliate of BP America Inc.