NAME
ufh - examine and modify an SIS data stream
SYNOPSIS
ufh script
DESCRIPTION
Ufh accepts as input an SIS data stream (read from stdin)
and a file containing a program written in shuffle, ufh's
internal scripting language. Ufh usually produces as output
an SIS data stream that has been modified in accordance
with the script program (but see TYPICAL USAGE, below).
STRUCTURE OF THIS MANUAL PAGE
Ufh is a relatively complex program and has a relatively
complex manual "page." The remainder of this document is
divided into:
TYPICAL USAGE
The roles in which ufh is most often used.
SCRIPT STRUCTURE
The major features of a ufh script.
SHUFFLE
The syntax and semantics of ufh's internal scripting
language.
Some Administrative Sections
Notes on bugs and planned improvements,
acknowledgements, etc.
OTHER AVAILABLE DOCUMENTATION
We have compiled a set of tutorial examples and sample
scripts, and we strongly recommend that you look at them.
Look in ~usp/doc/ufh or ask your local USP guru.
TYPICAL USAGE
One use of ufh is as a data filter in an SIS data stream.
In this role, ufh accepts an SIS data element (either a line
header or a record), alters it as instructed by a shuffle
script, and sends the data onward via stdout to either
another process or a file.
A somewhat different role is as the end of an SIS data
stream. In this mode, ufh reads an SIS data stream but it
emits some other kind of information, typically formatted
text and numbers.
It is also, of course, practicable to use ufh as a sort of
C-like basic interpreter.
Each of these usages is demonstrated by an example in the
example collection mentioned above.
SCRIPT STRUCTURE
A script file contains a series of function definitions
written in shuffle, an internal C-like language.
Each function definition has the structure
    func FunctionName() {
        ... shuffle statements ...
    }
    func AnotherFunctionName() {
        ... shuffle statements ...
    }
where FunctionName is the function's name. Functions can
return values of any type. There can be any number of func-
tions in a script.
Ufh recognizes four special function names, each of which is
called in response to some state of the data read from stan-
dard input.
Begin()
is called before anything is read from standard input.
OnLineHeader()
is called when a line header is read. The contents of
the line header are available in the global variable LH.
Ufh expects a normal SIS stream: there must be exactly
one line header present and it must be the first item in
the stream.
OnTrace()
is called whenever a trace has been read (a trace
consists of a trace header and the trace time series).
The contents of the trace are available in the global
variable Tr.
End()
is called after end-of-file has been encountered on
standard input.
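The four special functions can be seen together in a minimal
pass-through filter. This is only a sketch built from the
descriptions on this page: it assumes that ufh forwards nothing
to stdout unless the script calls output() (described under
Input/Output), and the variable name ntraces is made up for the
illustration.

```shuffle
# Copy the SIS stream unchanged and count the traces seen.
func Begin() {
    ntraces = 0;             # globals are created by assignment
}
func OnLineHeader() {
    output(LH);              # forward the line header to stdout
}
func OnTrace() {
    ntraces = ntraces + 1;
    output(Tr);              # forward each trace to stdout
}
func End() {
    print("traces copied: ", ntraces);    # print() writes to stderr
}
```

Because print() writes to stderr, the count does not contaminate
the SIS data leaving on stdout.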
SHUFFLE
Names in Shuffle
Variable and function names follow the same rules as C
identifiers, except that names are case-independent. The
only place shuffle retains alphabetic case is in the
contents of strings.
Variables and Types
Shuffle variables are always regular (named) variables.
Regular variables are always global and are created simply
by being assigned to.
The type of a variable is simply the type of whatever was
last assigned to it. The possible contents of a variable
are:
Nothing       both a value and a type; it is the value of a
              variable which has not been given a value,
double        a double-precision floating-point value
              (the only kind of number),
string        a sequence of ASCII characters (internally
              terminated by an ASCII zero),
LineHeader    an SIS line header,
Trace         an SIS trace (consisting of a trace header
              and a trace time series),
stream        a pointer to an input or output stream
              (such as stdout or a value returned by
              fopen() or popen()), or
array         an array of values of any of these types.
An array's elements need not all have the same type. An
array element may itself contain an array. Array subscripts
start at zero (like C) and go up (although you are free to
pretend that they start at one). Referring to a subscripted
element (such as x[10] or z[i][j][k]) causes all intermedi-
ate elements which do not already exist to be created and
assigned the empty type, Nothing.
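The typing and array rules above might be exercised as follows.
This is a sketch based only on the rules stated in this section;
the variable names are arbitrary.

```shuffle
func Begin() {
    v = 3;                 # v holds a double
    v = "three";           # v now holds a string
    a[3] = 2.5;            # creates a[0] through a[3]
    a[1] = "label";        # elements need not share a type
    b[2][0] = 7;           # b[2] is itself an array
    print(size(a));        # number of elements in a
    print(size(a[0]));     # a[0] is still Nothing, so size() gives 0
}
```

Since the comparison operators accept only numbers, testing
size(x) against 0 is one way to detect a Nothing value.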
Function Arguments
Function arguments are disabled at present. We have the
necessary mechanisms in place but have not yet settled on a
syntax. This shortcoming will be rectified and shuffle will
support argument-passing in a versatile, simple way.
Comments
Shuffle supports three styles of comments:
shell from # to end-of-line,
C everything from /* to */, and
C++ everything from // to end-of-line.
Comments cannot be nested. Don't look for trouble.
Syntax
Shuffle uses semicolons and curly braces in the same way C
does. All simple statements should end with semicolons.
Block statements should be wrapped in curly braces (and do
not have a terminating semicolon). Block statements include
function definitions.
The operators `=', `*', `/', `-', and `+' are available for
operating on numeric values. Also available are `%' (for
modulo) and `^' (for raised-to-the-power-of). The increment
and decrement operators, `++' and `--', are available in
both their postfix and prefix forms.
Only `=' and `+' are available for strings, with `+' denot-
ing concatenation. Strings cannot be mixed with other types
at present; thus the expression "3" + 7 is illegal (and will
be caught by the compiler whenever possible).
The comparison operators `==', `>', `<', `>=', `<=', and
`!=' are implemented for numeric comparisons. No other
types may be used as arguments to these operators at
present.
There is no comma operator. There are no pointers and there
are no address-related operators.
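A short sketch of the operators described above. Because strings
and numbers cannot be mixed, the number is converted with
inttostr() (see Miscellaneous Functions) before concatenation.
The variable names are illustrative only.

```shuffle
func Begin() {
    n = 7;
    r = n % 3;             # modulo
    p = 2 ^ 10;            # raised to the power of
    n++;                   # postfix increment; ++n is also accepted
    label = "trace " + inttostr(n, "%d");    # convert, then concatenate
    if (r != 0) {          # comparisons take numeric operands only
        print(label, p);
    }
}
```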
Function Definition
Functions are identified by a leading func. Functions must
be defined before they are used. There are three different
ways to return from a function (same as C):
return(expr) return the value expr,
return return the value 0, or
fall off the end equivalent to return(0).
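Since function arguments are not yet available, a helper
function has to take its inputs from global variables. The
following sketch assumes nothing beyond what this page
describes; the names Hypotenuse, hx and hy are invented for the
example.

```shuffle
# Defined first, because functions must be defined before use.
func Hypotenuse() {
    return(sqrt(hx * hx + hy * hy));    # reads the globals hx and hy
}
func Begin() {
    hx = 3;
    hy = 4;
    print(Hypotenuse());
}
```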
Control Flow
Shuffle supports C's if(test) and if(test)...else con-
structs. Conditionals may be nested and stacked exactly as
in C.
Shuffle also supports while(test) and the three-element
for(init;test;incr) looping control constructs.
Break and continue are not yet supported.
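Without break, a loop can be ended early by clearing a flag that
the loop test examines. The sketch below uses only constructs
listed on this page; the variable names are arbitrary.

```shuffle
func Begin() {
    limit = 100;
    i = 0;
    searching = 1;
    while (searching == 1) {
        if (i * i > limit) {
            searching = 0;     # the loop ends at the next test
        } else {
            i++;
        }
    }
    print(i);                  # smallest i whose square exceeds limit
    for (k = 0; k < 3; k++) {
        print(k);
    }
}
```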
Storage Management
Storage is managed automatically and all assignments are
assignments by value, which means that a new copy of the
data is used. There is no notion of pointers or of explicit
memory management by the user.
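Assignment by value can be illustrated with a small sketch.
Under the rule stated above, changing the copy should leave the
original untouched.

```shuffle
func Begin() {
    a[0] = 1;
    b = a;              # b receives its own copy of the array
    b[0] = 99;          # changes b only
    print(a[0], b[0]);  # a[0] still holds 1
}
```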
Member Access
The LineHeader and Trace types support a structure-like con-
vention for access to their individual data elements
(including the trace data samples). If, for example, lineH
holds a line header, lineH.NumSmp provides (read or write)
access to the number-of-samples field.
String-valued entries are accessible just as numeric ones
are, with the qualification that assignment into a header
field occurs from right-to-left. If, for example, we assign
a string shorter than the SIS field into that field, the
string will appear right-adjusted and padded on the left
with blanks. (This unusual convention is ransom to his-
tory.)
In addition to the header fields, the trace samples can be
accessed through a pseudo-array. If trace holds a trace,
trace.Series[i] will access the ith sample.
See the example collection mentioned earlier.
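As a sketch of member access, the following script reports the
peak absolute amplitude of each trace while passing the data
through. It assumes that the line header's NumSmp field gives
the number of samples in every trace's Series; the variable
names are made up.

```shuffle
func OnLineHeader() {
    nsamp = LH.NumSmp;         # samples per trace, from the line header
    output(LH);
}
func OnTrace() {
    peak = 0;
    for (i = 0; i < nsamp; i++) {
        a = abs(Tr.Series[i]);
        if (a > peak) {
            peak = a;
        }
    }
    print(peak);               # report on stderr
    output(Tr);
}
```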
Built-in Elementary Functions
sin(x)
cos(x)
atan(x)
log(x)
log10(x)
exp(x)
sqrt(x)
int(x)
nint(x)
abs(x)
Miscellaneous Functions
time() returns the current wall clock time in (double)
seconds since 0:00 GMT, January 1, 1970. Granu-
larity is system dependent (see gettimeofday(2)).
random() returns a random number in (I think) the half-open
interval [0,1) (see random(3)).
sbreak() returns the top of the process' data area (as a
number). (Only useful purpose known to me is to
check for memory leaks while debugging ufh.)
exit(x) causes ufh to terminate and return the value x to
the shell that invoked it. (By convention exit(0)
denotes success and any other value indicates
failure. A script which exits by falling off the
end returns 0.)
strlen(s) returns the number of characters in the string s
(actually a synonym for size()).
size(x) returns the size of x (this might be useful in
detecting bad header sizes, etc). If x is a
string the returned value is the length of the
string (the trailing nul is not counted). If x is
an SIS object the returned value is the size in
bytes. If x is an array of floating-point numbers
(such as an SIS trace) the returned value is the
number of entries in the array. If x is an array
the returned value is the number of elements.
Otherwise, the returned value is 0 if x has the
type Nothing and 1 in all the remaining cases.
strtonum(s)
interprets the contents of s (which had better be
a string) as a number and returns its value. If s
does not at least begin with a legitimate numeric
value, this function will break the program.
floattostr(f, fmt)
converts the value of f, as a floating-point
number, into a string using the printf(3) format
in the string fmt. Fmt should contain a
floating-point format string such as "%g", etc.
inttostr(i, fmt)
converts the value of i, as an integer, into a
string using the printf(3) format in the string
fmt. Fmt should contain an integer format string
such as "%d", etc.
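The conversion functions might be combined as in this sketch.
The format strings are ordinary printf(3) formats chosen for
illustration.

```shuffle
func Begin() {
    n = strtonum("42.5");            # string to number
    s = floattostr(n, "%8.3f");      # number to string
    t = inttostr(42, "%04d");        # integer-style formatting
    print("value=" + s + " id=" + t);    # strings may be concatenated
    print(strlen(s));
}
```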
Input/Output
output(obj)
writes obj to stdout; obj must be either a
LineHeader or a Trace.
print(a,b,...)
writes formatted forms of the arguments a, b, etc
to stderr.
popen(cmd, mode)
opens a pipeline to the process cmd in the direc-
tion (reading or writing) specified by mode, and
returns a file pointer value suitable for passing
to fprint, etc. Cmd is executed to create the
target process and can be any legal sh(1) command
string. Mode should be "w" for writing to the cmd
and "r" for reading from the cmd (the quotes are
required).
pclose(stream)
closes stream (which must be a value returned by
an earlier call of popen()). This call closes the
i/o stream and waits for the remote process to
exit (see popen(3)).
fopen(filename, mode)
opens the file specified by the path filename for
i/o in the direction specified by the string mode
and returns a file pointer value suitable for
passing to fprint (below). Mode should be one of
"r", "w", "a", "r+", "w+", or "a+" where the
quotes are required (the first two are by far the
most common); see fopen(3) for more details.
fclose(stream)
closes stream (which must be a value returned by
an earlier call of fopen).
fflush(stream)
writes any buffered data to stream.
fprint(stream, a, b, ...)
writes formatted forms of a, b, etc, to stream.
Popen() and pclose() provide very substantial flexibility to
scripts. The user should carefully note, however, that many
programs (such as xgraph(1) which is used in one of the
examples) do not do anything interesting until they have
read everything available from stdin. These programs will
not see end-of-file on stdin until the shuffle script has
called pclose().
A child process which is popen'd in mode "w" shares stdout
with the parent process (ufh in this case). Thus a child
process invoked in this manner can be used as an output
filter for the parent process. Values written by the child
process to its stdout will emerge as though from the parent
process' stdout. A similar mechanism works for read pipe-
lines.
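The following sketch shows ufh used as the end of an SIS stream,
feeding one number per trace to a child process. Several details
are assumptions rather than documented behavior: the command
"sort -n" is only an example, the "\n" escape is assumed to work
in string literals as it does in C, and fprint() is assumed not
to add a newline of its own.

```shuffle
func Begin() {
    sorter = popen("sort -n", "w");     # example command only
}
func OnLineHeader() {
    nsamp = LH.NumSmp;                  # nothing is forwarded in this role
}
func OnTrace() {
    stats = trstats(Tr);
    fprint(sorter, stats[1], "\n");     # maximum sample of this trace
}
func End() {
    pclose(sorter);     # the child sees end-of-file only now
}
```

The child's sorted output appears on ufh's stdout, since a
process opened in mode "w" shares stdout with ufh.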
Built-in Trace Operations
trstats(tr)
computes some useful statistics about the trace
tr. The return value of trstats is an array of values.
Starting with the 0th member, this array holds
the minimum of the sample values, the maximum of
the sample values, the average of the sample
values, and the average of the squares of the sam-
ple values.
tradd(tr, offset)
returns a new trace which differs from the input
trace, tr, by having the value offset added to all
of its samples. The contents of tr are not
changed.
trset(tr, value)
returns a new trace which differs from the input
trace, tr, by having value assigned to all of its
samples. The contents of tr are not changed.
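As a sketch of how these operations combine, the script below
removes the mean from every trace and reports a few statistics.
The index meanings follow the trstats() description above; the
variable names are illustrative.

```shuffle
func OnLineHeader() {
    output(LH);
}
func OnTrace() {
    stats = trstats(Tr);       # [0] min, [1] max, [2] mean, [3] mean square
    rms = sqrt(stats[3]);
    print("min ", stats[0], " max ", stats[1], " rms ", rms);
    centered = tradd(Tr, 0 - stats[2]);    # subtract the mean
    output(centered);          # Tr itself is unchanged
}
```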
Predefined Values
pi 3.14159265358979323846
enatural 2.71828182845904523536
gamma 0.57721566490153286060 (Euler's constant - for the
snobs in the audience)
rad2deg 57.29577951308232087680 (degrees per radian)
golden 1.61803398874989484820 (the golden mean - for the
esthetes)
nothing the empty value, a pile of ashes, an Amoco career.
hardware a string name for the hardware upon which we are
running (such as "sun", "cray").
MISSING FEATURES THAT WILL BE SUPPLIED SOMEDAY
Shuffle has no access to command line arguments.
There is no file-include mechanism. There should be one
that searches default directories, etc.
The op= operators are missing.
Break and continue are missing.
Function arguments are not available.
There is no provision for user error handling.
There is no way to access the historical line header.
Comparison operators should be extended to string and empty
types.
ACKNOWLEDGEMENTS
Ufh is descended from a programmable calculator, hoc, dis-
cussed in The Unix Programming Environment by Brian W. Ker-
nighan and Rob Pike (1984, Prentice-Hall). Using hoc as a
starting point was very helpful.
BUGS
Is it better to have a script language that is very similar
to a language one knows (like C) but not identical, or is it
better to have something entirely different (like lisp)?
Ufh won't do everything anyone can dream up.
SEE ALSO
scan(1)
AUTHOR
Martin L. Smith
COPYRIGHT
copyright 2001, Amoco Production Company
All Rights Reserved
an affiliate of BP America Inc.