Original description in English:
csplit - split files based on context
Manual, book: POSIX Programmer's Manual
csplit [-ks][-f prefix][-n number] file arg1 ... argn
The csplit utility shall read the file named by the file operand,
write all or part of that file into other files as directed by the
arg operands, and write the sizes of the files.
The csplit utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.
The following options shall be supported:
- -f prefix
- Name the created files prefix00,
prefix01, ..., prefixn. The default is xx00
... xxn. If the prefix argument would create a
filename exceeding {NAME_MAX} bytes, an error shall result, csplit
shall exit with a diagnostic message, and no files shall be created.
- -k
- Leave previously created files intact. By default,
csplit shall remove created files if an error occurs.
- -n number
- Use number decimal digits to form filenames for the
file pieces. The default shall be 2.
- -s
- Suppress the output of file size messages.
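As an illustration of how -f, -n and -s combine, the following sketch splits a small file into pieces named part000, part001 and part002. The file name input.txt, its contents and the scratch directory are assumptions made for the example; they are not part of the specification.

```shell
# Work in a scratch directory so the example leaves nothing behind elsewhere.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical six-line input file.
printf '%s\n' one two three four five six > input.txt

# Split before line 3 and before line 5.
# -f sets the filename prefix, -n the number of digits in the suffix,
# and -s suppresses the file size messages.
csplit -s -f part -n 3 input.txt 3 5

# Lists part000 (lines 1-2), part001 (lines 3-4) and part002 (lines 5-6).
ls part*
```

Without -f and -n the same command would name the pieces xx00, xx01 and xx02.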
The following operands shall be supported:
- file
- The pathname of a text file to be split. If file is
'-', the standard input shall be used.
The operands arg1 ... argn can be a combination of the following:
- /rexp/[offset]
- A file shall be created using the content of the lines from the current line
up to, but not including, the line that results from the evaluation of the
regular expression with offset, if any, applied. The regular
expression rexp shall follow the rules for basic regular
expressions described in the Base Definitions volume of
IEEE Std 1003.1-2001, Section 9.3, Basic Regular
Expressions. The application shall use the sequence "\/"
to specify a slash character within the rexp. The optional offset
shall be a positive or negative integer value representing a number of
lines. A positive integer value can be preceded by '+'. If the
selection of lines from an offset expression of this type would
create a file with zero lines, or one with greater than the number of
lines left in the input file, the results are unspecified. After the
section is created, the current line shall be set to the line that results
from the evaluation of the regular expression with any offset applied. If
the current line is the first line in the file and a regular expression
operation has not yet been performed, the pattern match of rexp
shall be applied from the current line to the end of the file. Otherwise,
the pattern match of rexp shall be applied from the line following
the current line to the end of the file.
- %rexp%[offset]
- Equivalent to /rexp/[offset], except that no
file shall be created for the selected section of the input file. The
application shall use the sequence "\%" to specify a
percent-sign character within the rexp.
- line_no
- Create a file from the current line up to (but not
including) the line number line_no. Lines in the file shall be
numbered starting at one. The current line becomes line_no.
- {num}
- Repeat operand. This operand can follow any of the operands
described previously. If it follows a rexp type operand, that
operand shall be applied num more times. If it follows a
line_no operand, the file shall be split every line_no
lines, num times, from that point.
An error shall be reported if an operand does not reference a line between the
current position and the end of the file.
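The interaction of the %rexp%, /rexp/ and {num} operands can be sketched as follows. The file book.txt, its heading convention and the chap prefix are assumptions chosen for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: a preface line followed by three short "chapters",
# each introduced by a line starting with "# ".
printf '%s\n' 'preface' '# one' 'a' '# two' 'b' '# three' 'c' > book.txt

# '%^# %'  skip everything before the first heading; no file is created
# '/^# /'  create a file ending just before the next heading
# '{1}'    apply the preceding /^# / operand one more time
csplit -s -f chap book.txt '%^# %' '/^# /' '{1}'

# chap00 and chap01 hold the first two chapters; the remaining input
# (the third chapter) ends up in chap02.
ls chap*
```

Because each search after the first starts on the line following the current line, the heading that begins a piece is not matched again as the end of that same piece.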
The standard input, when used, is subject to the same requirement as any
other input file: the input file shall be a text file.
The following environment variables shall affect the execution of csplit:
- LANG
- Provide a default value for the internationalization
variables that are unset or null. (See the Base Definitions volume of
IEEE Std 1003.1-2001, Section 8.2, Internationalization
Variables for the precedence of internationalization variables used to
determine the values of locale categories.)
- LC_ALL
- If set to a non-empty string value, override the values of
all the other internationalization variables.
- LC_COLLATE
- Determine the locale for the behavior of ranges, equivalence classes, and
multi-character collating elements within regular expressions.
- LC_CTYPE
- Determine the locale for the interpretation of sequences of
bytes of text data as characters (for example, single-byte as opposed to
multi-byte characters in arguments and input files) and the behavior of
character classes within regular expressions.
- LC_MESSAGES
- Determine the locale that should be used to affect the
format and contents of diagnostic messages written to standard error.
- NLSPATH
- Determine the location of message catalogs for the
processing of LC_MESSAGES.
On an asynchronous event (such as a signal), if the -k option is specified,
created files shall be retained. Otherwise, the default action occurs.
Unless the -s option is used, the standard output shall consist of one
line per file created, with a format as follows:
"%d\n", <file size in bytes>
The standard error shall be used only for diagnostic messages.
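A small sketch of the size messages on standard output, assuming a hypothetical three-line file whose line lengths are easy to count by hand:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# "ab\n" and "cd\n" are 3 bytes each; "efgh\n" is 5 bytes.
printf 'ab\ncd\nefgh\n' > input.txt

# Split before line 3: xx00 receives lines 1-2 (6 bytes),
# xx01 receives line 3 (5 bytes). One size is written per created file.
sizes=$(csplit input.txt 3)
printf '%s\n' "$sizes"
```

Capturing standard output this way is one means of checking the pieces in a script; adding -s would leave the variable empty.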
The output files shall contain portions of the original input file; otherwise,
unchanged.
Extended description: None.
The following exit values shall be returned:
- 0
- Successful completion.
- >0
- An error occurred.
By default, created files shall be removed if an error occurs. When the
-k option is specified, created files shall not be removed if an error
occurs.
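The difference can be observed with a deliberately failing operand. This is a sketch under assumptions: the input file and the pattern are made up, and which partial pieces remain with -k beyond the first one may depend on the implementation.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' one two three four > input.txt

# Default: the second operand never matches, so csplit fails and removes
# the piece it had already written.
status_default=0
csplit -s input.txt 3 '/no such line/' 2>/dev/null || status_default=$?
if [ -e xx00 ]; then kept_default=yes; else kept_default=no; fi

# With -k the same failure leaves the created files in place.
status_keep=0
csplit -s -k input.txt 3 '/no such line/' 2>/dev/null || status_keep=$?
if [ -e xx00 ]; then kept_with_k=yes; else kept_with_k=no; fi

printf 'default: exit %s, xx00 kept: %s\n' "$status_default" "$kept_default"
printf 'with -k: exit %s, xx00 kept: %s\n' "$status_keep" "$kept_with_k"
```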
The following sections are informative.
Application usage: None.
- 1.
- This example creates four files, cobol00 ...
cobol03:
csplit -f cobol file '/procedure division/' /par5./ /par16./
After editing the split files, they can be recombined by concatenating
them, in order, back into the original file. Note that doing so overwrites
the original file.
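The recombination command is not reproduced in this copy of the text. A sketch consistent with the description, assuming the four pieces cobol00 through cobol03 from the command above and a made-up stand-in for the COBOL source, would be:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the COBOL source used in the example.
printf '%s\n' 'identification division' 'procedure division' \
    'par5.' 'move a to b' 'par16.' 'stop run' > file
cp file file.orig

csplit -s -f cobol file '/procedure division/' /par5./ /par16./

# Concatenate the pieces in name order; this overwrites the original file.
cat cobol0[0-3] > file

# With unedited pieces the result is identical to the original.
cmp file file.orig && echo 'file restored'
```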
- 2.
- This example would split the file after the first 99 lines,
and every 100 lines thereafter, up to 9999 lines; this is because lines in
the file are numbered from 1 rather than zero, for historical reasons:
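The command for this example is not reproduced in this copy. An invocation consistent with the description would be csplit -k file 100 {99}. The sketch below runs it on a hypothetical 250-line file; -s and the redirection of standard error are added here only to keep the output short.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 250-line input, one number per line.
awk 'BEGIN { for (i = 1; i <= 250; i++) print i }' > file

# Split before line 100, then repeat 99 more times at 100-line intervals.
# This input runs out long before that, so csplit reports an error;
# -k keeps the pieces that were written.
status=0
csplit -s -k file 100 '{99}' 2>/dev/null || status=$?

# xx00 holds the first 99 lines, xx01 the next 100.
wc -l xx00 xx01
```

With an input of 9999 lines or more, the same operands would produce one 99-line piece followed by 99 pieces of 100 lines each, which is where the figure of 9999 comes from.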
- 3.
- Assuming that prog.c follows the C-language coding
convention of ending routines with a '}' at the beginning of the
line, this example creates a file containing each separate C routine (up
to 21) in prog.c:
csplit -k prog.c '%main(%' '/^}/+1' {20}
The -n option was added to extend the range of filenames that could be
handled.
Consideration was given to adding a -a flag to use the alphabetic
filename generation used by the historical split utility, but the
functionality added by the -n option was deemed to make alphabetic
naming unnecessary.
Future directions: None.
See also: sed, split
Portions of this text are reprinted and reproduced in electronic form from IEEE
Std 1003.1, 2003 Edition, Standard for Information Technology -- Portable
Operating System Interface (POSIX), The Open Group Base Specifications Issue
6, Copyright (C) 2001-2003 by the Institute of Electrical and Electronics
Engineers, Inc and The Open Group. In the event of any discrepancy between
this version and the original IEEE and The Open Group Standard, the original
IEEE and The Open Group Standard is the referee document. The original
Standard can be obtained online at http://www.opengroup.org/unix/online.html.