NAME
splitr - split up combined data set into separate streams
SYNOPSIS
splitr [ -Nntap ] [ [ -O[],-O[],...,-O[]otap ] [ -F[]ftap ] ]
[ -nntr ] [ -bnblk ] [ -R ] [ -Y ] [ -rm ] [ -T ] [ -p ]
[ -norig ] [ -V ] [ -? ]
DESCRIPTION
splitr splits a combined data set into separate sets for
further processing.
splitr gets both its data and its parameters from command
line arguments and from special, otherwise unused line and
trace header positions. These arguments specify the input,
up to 1000 output data set names, and verbose printout if
desired. In addition an input data set can be split
accordingly. Note: splitr cannot be pipelined on output
except in IKP.
There are 3 basic modes of operation: (1) the input data
stream is routed in rotary fashion into the output streams,
either record-by-record or trace-by-trace, or the input
stream is duplicated onto the available output streams
(multi-pronged "Y"), or the input stream is partitioned such
that a consecutive portion of the data is sent to each
output stream; (2) a rolling chunk of records is sent to the
available outputs on an add-1-drop-1 basis; (3) each input
record is split into equal portions and each portion is sent
to a separate stream.
splitr is the ideal tool in XIKP for setting up parallel
processing streams. Within XIKP, splitr is run through
scripts labelled splitr1toN under the "Trace Editting"
section. Virtually all the output options have reverse
counterparts in program gather. A maximum of 1000 files is
allowed, but most computer systems have built-in limits
considerably smaller than this; beware of these limitations.
Command line arguments
-N ntap
Enter the input data set name or file immediately after
typing -N. This input file should include the complete
path name if the file resides in a different directory.
Example -N/b/vsp/dummy tells the program to look for
file 'dummy' in directory 'vsp' stored on the 'b' disk.
-O[],-O[],...,-O[] otap(i)
Enter each output data set name or file immediately
after typing -O. i.e. -O[otap(i)], i=1,1000... Option-
ally...
-F ftap
-R Option 1: Enter the command line argument '-R' to do a
rotary function on records: each successive input
record is read and then passed to an output stream; the
next record is read and passed to the next stream.
This process continues cycling through the available
output streams record by record until the entire input
data set has been read. This process can be exactly
undone in gather (or the mergeNto1 processes in XIKP)
by flagging the "records back-to-back" option.
The data can be gathered back together using a mergn with
one of the record options.
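The rotary routing described for -R can be pictured with a
short Python sketch. It is an illustration only, not splitr's
implementation: the function name rotary_split and the use of
plain list items as records are assumptions made for the
example. The block argument mimics the -b nblk setting
(default 1), which sends more than one record at a time to
each stream.

```python
def rotary_split(records, n_outputs, block=1):
    """Deal records to n_outputs streams in rotation (hypothetical helper)."""
    streams = [[] for _ in range(n_outputs)]
    for i, rec in enumerate(records):
        # Each run of `block` consecutive records goes to one stream,
        # then the rotation moves on to the next stream.
        streams[(i // block) % n_outputs].append(rec)
    return streams


# Seven records dealt to three output streams, one record at a time.
streams = rotary_split(list(range(1, 8)), n_outputs=3)
```

With seven records and three outputs the first stream receives
records 1, 4 and 7, so the streams need not end up the same
length.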
-rm Option 1: Enter the command line argument '-rm' to do a
rotary function on traces: each successive input trace
is read and then passed to an output stream; the next
trace is read and passed to the next stream. This
process continues cycling through the available output
streams trace by trace until the entire input data set
has been read. The data in each output stream will be
one-trace records. This process can be exactly undone
in gather (or the mergeNto1 processes in XIKP) by
flagging the "multiplex traces" and "restore original
number of traces/rec" options.
The data can be gathered back together using a mergn with
one of the record options or the multiplex option.
-T Enter the command line argument '-T' to break each
input trace up into approximately equal time chunks and
send these chunks to the successive output data
streams. The output data streams will each have the
same number of traces/rec and number of records as the
input data set; the number of samples per trace will be
different.
-p Option 1: Enter the command line argument '-p' to par-
tition the input data stream into as many output files
as specified by -O[] command line entries or IKP con-
nections. The total input records will be divided by
the number of outputs with leftovers apportioned to the
leading output streams; those numbers of records will
then be written to the output streams starting from the
leftmost -O[] or IKP socket. This process can be
exactly undone in gather (or the mergeNto1 processes in
XIKP) by flagging one of the "input data sets put one
after the other" options (the first assumes line
headers on the individual data streams; the second
assumes a line header only on the first stream).
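The record counts produced by -p can be worked out by hand; the
Python sketch below shows one way to do it. It is not taken
from the splitr source: the names partition_counts and
partition are hypothetical, and it assumes the leftover records
are handed out one apiece to the leading streams, which is one
reading of "leftovers apportioned to the leading output
streams".

```python
def partition_counts(n_records, n_outputs):
    """Number of records each output receives (hypothetical helper)."""
    base, extra = divmod(n_records, n_outputs)
    # Assumption: each of the first `extra` streams takes one leftover record.
    return [base + 1 if k < extra else base for k in range(n_outputs)]


def partition(records, n_outputs):
    """Cut records into consecutive runs, leftmost output first."""
    parts, start = [], 0
    for count in partition_counts(len(records), n_outputs):
        parts.append(records[start:start + count])
        start += count
    return parts


counts = partition_counts(10, 3)
parts = partition(list(range(1, 11)), 3)
```

Under that assumption ten input records over three outputs
give counts of 4, 3 and 3, and concatenating the parts in
output order restores the original sequence.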
-Y Option 1: Enter the command line argument '-Y' to do a
simple "Y" function, i.e. duplicate the input data set
on the available output streams. The form is to send
out all input trace 1's over the available streams,
followed by all trace 2's, followed by all trace 3's,
.... The effect is to generate multiple parallel
streams. The data can be gathered together using
gather (or the mergeNto1 processes in XIKP) which will
assemble the streams in groups of traces coming from
the streams (trace 1's, trace 2's, etc). The user
will have to run an editt to reconfigure the data into
contiguous data sets.
-ro Data is output to the streams in a rolling buffer of
nroll records (see option 2 below).
-b nblk
Option 1: in rotary record mode (-R) this allows more
than one record to be sent to each output stream at a
time. The default = 1. Option 2: this governs the
number of records in the rolling buffer. Once the
buffer is filled it is dumped to the current output
stream. Then a record is read from the input stream and
added to the end of the buffer while the first record
in the buffer is dropped. This buffer is then dumped to
the next output stream. For each input record each
output stream will have nroll output records. Since the
partition is centered on the input record (i.e. for
nroll=5 input record 11 will result in records 9, 10,
11, 12, 13 being output) the ends of the data will have
partitions that have dead records either at the start
(for input records 1, 2, ...) or at the end (for
records ..., N-1, N). Since there is no communication
between splitr and gather except through the line
header any value of nblk in the split process must also
be filled out on the gather command line (-n nrec). The
default state will cause this to automatically happen.
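The centered rolling buffer is easier to follow with a small
Python sketch. This is an illustration under stated
assumptions, not splitr's code: the function name
rolling_windows is hypothetical, nroll is taken to be odd so
that the window can be centered, and dead records are
represented by None.

```python
def rolling_windows(records, nroll, dead=None):
    """One nroll-record window per input record, centered on that record."""
    half = nroll // 2  # assumes nroll is odd
    n = len(records)
    windows = []
    for center in range(n):
        first = center - half
        # Positions that fall off either end of the data become dead records.
        window = [records[j] if 0 <= j < n else dead
                  for j in range(first, first + nroll)]
        windows.append(window)
    return windows


# Thirteen input records numbered 1..13, rolling buffer of 5 records.
windows = rolling_windows(list(range(1, 14)), nroll=5)
```

Here windows[10], the window for input record 11, holds records
9 through 13 as in the example above, while the first and last
windows are padded with dead records. In splitr each successive
window is dumped to the next output stream in turn.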
-n ntr
Option 3: A nonzero entry causes the input data to be
split record-by-record into m data sets each with ntr
traces/record. ntr must divide into the input
traces/record evenly or the program aborts. If this
entry is zero splitr assumes the input has been created
by mergn in which case all the info needed is contained
in the headers. This is especially useful in dealing
with multicomponent data which has been multiplexed in
doublets or triplets. This process can be exactly
undone in gather (or the mergeNto1 processes in XIKP)
by flagging "input records put in super blocks" option.
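A minimal Python sketch of the record split follows. It is not
splitr's implementation: the name split_record is hypothetical,
a record is modelled as a list of traces, and the sketch
assumes each portion is a run of ntr consecutive traces, which
the text does not state outright.

```python
def split_record(record, ntr):
    """Split one record into m = len(record) // ntr pieces of ntr traces."""
    if ntr <= 0 or len(record) % ntr != 0:
        # Mirrors the abort when ntr does not divide the traces/record.
        raise ValueError("ntr must divide the input traces/record evenly")
    # Assumption: each piece is a block of ntr consecutive traces.
    return [record[k:k + ntr] for k in range(0, len(record), ntr)]


# A six-trace record split into three pieces of two traces each.
pieces = split_record(["t1", "t2", "t3", "t4", "t5", "t6"], ntr=2)
```

Piece i of every input record would then go to output stream i,
so each output keeps the input's record count but carries only
ntr traces per record.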
-norig
Enter the command line argument '-norig' to avoid
storing the input # rec and # trc/rec in the output line
header (in words OrNREC & OrNTRC). Sometimes these
values get corrupted and it's better to follow the
downstream gather or whatever with a utop ... -L[] -R[]
-h0OrNTRC=[] -h1OrNREC=[] so that these header words
are exactly what you think they are.
-V Enter the command line argument '-V' to get additional
printout.
-? Enter the command line argument '-?' to get online
help. The program terminates after the help screen is
printed.
SEE ALSO
gather
COPYRIGHT
copyright 2001, Amoco Production Company
All Rights Reserved
an affiliate of BP America Inc.