Original description in English:
file - determine file type
Manual, book: POSIX Programmer's Manual
file [-dh][-M file][-m file] file ...
file -i [-h] file ...
The file utility shall perform a series of tests in sequence on each
specified file in an attempt to classify it:
- 1.
- If file does not exist, cannot be read, or its file
status could not be determined, the output shall indicate that the file
was processed, but that its type could not be determined.
- 2.
- If the file is not a regular file, its file type shall be
identified. The file types directory, FIFO, socket, block special, and
character special shall be identified as such. Other
implementation-defined file types may also be identified. If file
is a symbolic link, by default the link shall be resolved and file
shall test the type of file referenced by the symbolic link. (See the
-h and -i options below.)
- 3.
- If the length of file is zero, it shall be
identified as an empty file.
- 4.
- The file utility shall examine an initial segment of
file and shall make a guess at identifying its contents based on
position-sensitive tests. (The answer is not guaranteed to be correct; see
the -d, -M, and -m options below.)
- 5.
- The file utility shall examine file and make
a guess at identifying its contents based on context-sensitive default
system tests. (The answer is not guaranteed to be correct.)
- 6.
- The file shall be identified as a data file.
If file does not exist, cannot be read, or its file status could not be
determined, the output shall indicate that the file was processed, but that
its type could not be determined.
If file is a symbolic link, by default the link shall be resolved and
file shall test the type of file referenced by the symbolic link.
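The first three steps and the final fallback can be approximated with the POSIX test utility alone. The following is only a rough sketch: classify_sketch is a hypothetical helper name, it skips the content-guessing steps 4 and 5 entirely, and it does not print the link contents for symbolic links.

```shell
# Hypothetical helper: approximates steps 1-3 and the step 6 fallback.
# It never looks at file contents, so steps 4 and 5 are not modelled.
classify_sketch() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        kind='symbolic link'        # dangling link, handled as if -h were given
    elif [ ! -e "$f" ]; then
        kind='cannot open'          # step 1: the file does not exist
    elif [ -d "$f" ]; then
        kind='directory'            # step 2: file type tests
    elif [ -p "$f" ]; then
        kind='fifo'
    elif [ -S "$f" ]; then
        kind='socket'
    elif [ -b "$f" ]; then
        kind='block special'
    elif [ -c "$f" ]; then
        kind='character special'
    elif [ ! -r "$f" ]; then
        kind='cannot open'          # regular file that cannot be read
    elif [ ! -s "$f" ]; then
        kind='empty'                # step 3: zero length
    else
        kind='data'                 # step 6: nothing more specific is known
    fi
    printf '%s: %s\n' "$f" "$kind"
}
```

Because test follows symbolic links, a link to an existing file is reported by the type of its target, which matches the default behavior described above.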
The file utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines,
except that the order of the -m, -d, and -M options shall be significant.
The following options shall be supported by the implementation:
- -d
- Apply any position-sensitive default system tests and
context-sensitive default system tests to the file. This is the default if
no -M or -m option is specified.
- -h
- When a symbolic link is encountered, identify the file as a
symbolic link. If -h is not specified and file is a symbolic
link that refers to a nonexistent file, file shall identify the
file as a symbolic link, as if -h had been specified.
- -i
- If a file is a regular file, do not attempt to classify the
type of the file further, but identify the file as specified in the STDOUT
section.
- -M file
- Specify the name of a file containing position-sensitive
tests that shall be applied to a file in order to classify it (see the
EXTENDED DESCRIPTION). No position-sensitive default system tests nor
context-sensitive default system tests shall be applied unless the
-d option is also specified.
- -m file
- Specify the name of a file containing position-sensitive
tests that shall be applied to a file in order to classify it (see the
EXTENDED DESCRIPTION).
If the -m option is specified without specifying the -d option or the
-M option, position-sensitive default system tests shall be applied
after the position-sensitive tests specified by the -m option. If the
-M option is specified with the -d option, the -m option, or both, or the
-m option is specified with the -d option, the concatenation of the
position-sensitive tests specified by these options shall be applied in the
order specified by the appearance of these options. If a -M or -m file
option-argument is -, the results are unspecified.
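One way to read the ordering rules is as a list of test sources built from the command line. The sketch below is an interpretation, not normative text: test_order is a hypothetical function, the labels default-position and default-context are made up for illustration, and placing the context-sensitive tests last follows the rationale given later in this page.

```shell
# Hypothetical helper: print the order in which test sources would be tried.
# "default-position" and "default-context" are illustrative labels only.
test_order() {
    OPTIND=1
    order=
    have_d=0
    have_M=0
    while getopts dM:m: opt; do
        case $opt in
            d) order="$order default-position"; have_d=1 ;;
            M) order="$order $OPTARG"; have_M=1 ;;
            m) order="$order $OPTARG" ;;
            *) return 2 ;;
        esac
    done
    # -m alone (or no option at all): default position tests come last.
    if [ "$have_d" -eq 0 ] && [ "$have_M" -eq 0 ]; then
        order="$order default-position"
    fi
    # Context-sensitive defaults run after every position-sensitive test,
    # and are dropped when -M is given without -d.
    if [ "$have_d" -eq 1 ] || [ "$have_M" -eq 0 ]; then
        order="$order default-context"
    fi
    printf '%s\n' "${order# }"
}
```

For example, test_order -d -m mymagic lists the default position-sensitive tests before mymagic, while test_order -M mymagic lists mymagic alone.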
The following operand shall be supported:
- file
- A pathname of a file to be tested.
Standard input: not used.
The file can be any file type.
The following environment variables shall affect the execution of file:
- LANG
- Provide a default value for the internationalization
variables that are unset or null. (See the Base Definitions volume of
IEEE Std 1003.1-2001, Section 8.2, Internationalization
Variables for the precedence of internationalization variables used to
determine the values of locale categories.)
- LC_ALL
- If set to a non-empty string value, override the values of
all the other internationalization variables.
- LC_CTYPE
- Determine the locale for the interpretation of sequences of
bytes of text data as characters (for example, single-byte as opposed to
multi-byte characters in arguments and input files).
- LC_MESSAGES
- Determine the locale that should be used to affect the
format and contents of diagnostic messages written to standard error and
informative messages written to standard output.
- NLSPATH
- Determine the location of message catalogs for the
processing of LC_MESSAGES .
Asynchronous events: default.
In the POSIX locale, the following format shall be used to identify each
operand, file specified:
"%s: %s\n", <file>, <type>
The values for <type> are unspecified, except that in the POSIX
locale, if file is identified as one of the types listed in the
following table, <type> shall contain (but is not limited to)
the corresponding string, unless the file is identified by a
position-sensitive test specified by a -M or -m option. Each
space shown in the strings shall be exactly one <space>.
Table: File Utility Output Strings
If file is: | <type> shall contain the string: | Notes |
Nonexistent | cannot open | |
Block special | block special | 1 |
Character special | character special | 1 |
Directory | directory | 1 |
FIFO | fifo | 1 |
Socket | socket | 1 |
Symbolic link | symbolic link to | 1 |
Regular file | regular file | 1,2 |
Empty regular file | empty | 3 |
Regular file that cannot be read | cannot open | 3 |
Executable binary | executable | 4,6 |
ar archive library (see ar) | archive | 4,6 |
Extended cpio format (see pax) | cpio archive | 4,6 |
Extended tar format (see ustar in pax) | tar archive | 4,6 |
Shell script | commands text | 5,6 |
C-language source | c program text | 5,6 |
FORTRAN source | fortran program text | 5,6 |
Regular file whose type cannot be determined | data | |
- Notes:
- 1.
- This is a file type test.
- 2.
- This test is applied only if the -i option is
specified.
- 3.
- This test is applied only if the -i option is not
specified.
- 4.
- This is a position-sensitive default system test.
- 5.
- This is a context-sensitive default system test.
- 6.
- Position-sensitive default system tests and
context-sensitive default system tests are not applied if the -M
option is specified unless the -d option is also specified.
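Because the output format is "%s: %s\n" and a pathname may itself contain a colon and a space, a script is safer stripping the known operand from the front of the line than splitting on the first colon. A minimal sketch, with type_of_line as a hypothetical helper name:

```shell
# Hypothetical helper: recover <type> from one output line.
# $1 = the operand exactly as it was passed to file, $2 = the output line.
type_of_line() {
    printf '%s\n' "${2#"$1: "}"
}
```

The result can then be searched for one of the strings in the table, keeping in mind that implementations may add text before or after it.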
In the POSIX locale, if file is identified as a symbolic link (see the
-h option), the following alternative output format shall be used:
"%s: %s %s\n", <file>, <type>, <contents of link>
If the file named by the file operand does not exist, cannot be read, or
the type of the file named by the file operand cannot be determined,
this shall not be considered an error that affects the exit status.
The standard error shall be used only for diagnostic messages.
Output files: none.
A file specified as an option-argument to the -m or -M options
shall contain one position-sensitive test per line, which shall be applied to
the file. If the test succeeds, the message field of the line shall be printed
and no further tests shall be applied, with the exception that tests on
immediately following lines beginning with a single '>' character
shall be applied.
Each line shall be composed of the following four <blank>-separated
fields:
- offset
- An unsigned number (optionally preceded by a single
'>' character) specifying the offset, in bytes, of the
value in the file that is to be compared against the value field of
the line. If the file is shorter than the specified offset, the test shall
fail.
If the offset begins with the character '>', the test
contained in the line shall not be applied to the file unless the test on the
last line for which the offset did not begin with a '>' was
successful. By default, the offset shall be interpreted as an unsigned
decimal number. With a leading 0x or 0X, the offset shall be
interpreted as a hexadecimal number; otherwise, with a leading 0, the
offset shall be interpreted as an octal number.
- type
- The type of the value in the file to be tested. The type
shall consist of the type specification characters c , d ,
f , s , and u , specifying character, signed decimal,
floating point, string, and unsigned decimal, respectively.
The type string shall be interpreted as the bytes from the file starting
at the specified offset and including the same number of bytes
specified by the value field. If insufficient bytes remain in the file
past the offset to match the value field, the test shall fail.
The type specification characters d, f, and u can be
followed by an optional unsigned decimal integer that specifies the number of
bytes represented by the type. The type specification character f can
be followed by an optional F, D, or L, indicating that
the value is of type float, double, or long double, respectively. The type
specification characters d and u can be followed by an optional
C, S, I, or L, indicating that the value is of type
char, short, int, or long, respectively.
The default number of bytes represented by the type specifiers
d, f, and u shall correspond to their respective C-language types
as follows. If the system claims conformance to the C-Language Development
Utilities option, those specifiers shall correspond to the default sizes used
in the c99 utility. Otherwise, the default sizes shall be
implementation-defined.
For the type specifier characters d and u, the default number of
bytes shall correspond to the size of a basic integer type of the
implementation. For these specifier characters, the implementation shall
support values of the optional number of bytes to be converted corresponding
to the number of bytes in the C-language types char, short, int, or long.
These numbers can also be specified by an application as the characters
C, S, I, and L, respectively. The byte order used when interpreting numeric
values is implementation-defined, but shall correspond to the order in which
a constant of the corresponding type is stored in memory on the system.
For the type specifier f, the default number of bytes shall correspond
to the number of bytes in the basic double precision floating-point data type
of the underlying implementation. The implementation shall support values of
the optional number of bytes to be converted corresponding to the number of
bytes in the C-language types float, double, and long double. These numbers
can also be specified by an application as the characters F, D, and L,
respectively.
All type specifiers, except for s, can be followed by a mask specifier
of the form &number. The mask value shall be AND'ed with the value
of the input file before the comparison with the value field of the
line is made. By default, the mask shall be interpreted as an unsigned decimal
number. With a leading 0x or 0X, the mask shall be interpreted as an unsigned
hexadecimal number; otherwise, with a leading 0, the mask shall be interpreted
as an unsigned octal number.
The strings byte, short, long, and string shall also
be supported as type fields, being interpreted as dC, dS, dL, and s,
respectively.
- value
- The value to be compared with the value from the
file.
If the specifier from the type field is s or string, then
interpret the value as a string. Otherwise, interpret it as a number. If the
value is a string, then the test shall succeed only when a string value
exactly matches the bytes from the file.
If the value is a string, it can contain the following sequences:
- \character
The backslash-escape sequences as specified in
the Base Definitions volume of IEEE Std 1003.1-2001, Table 5-1,
Escape Sequences and Associated Actions ( '\\' , '\a' ,
'\b' , '\f' , '\n' , '\r' , '\t' ,
'\v' ). The results of using any other character, other than an octal
digit, following the backslash are unspecified.
- \octal
Octal sequences that can be used to represent
characters with specific coded values. An octal sequence shall consist of a
backslash followed by the longest sequence of one, two, or three octal-digit
characters (01234567). If the size of a byte on the system is greater than 9
bits, the valid escape sequence used to represent a byte is
implementation-defined.
By default, any value that is not a string shall be interpreted as a signed
decimal number. Any such value, with a leading 0x or 0X, shall be interpreted
as an unsigned hexadecimal number; otherwise, with a leading zero, the value
shall be interpreted as an unsigned octal number.
If the value is not a string, it can be preceded by a character indicating the
comparison to be performed. Permissible characters and the comparisons they
specify are as follows:
- =
The test shall succeed if the value from the
file equals the value field.
- <
The test shall succeed if the value from the
file is less than the value field.
- >
The test shall succeed if the value from the
file is greater than the value field.
- &
The test shall succeed if all of the set bits
in the value field are set in the value from the file.
- ^
The test shall succeed if at least one of the
set bits in the value field is not set in the value from the
file.
- x
The test shall succeed if the file is large
enough to contain a value of the type specified starting at the offset
specified.
- message
- The message to be printed if the test succeeds. The
message shall be interpreted using the notation for the
printf formatting specification; see printf() . If the
value field was a string, then the value from the file shall be the
argument for the printf formatting specification; otherwise, the
value from the file shall be the argument.
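The numeric parts of a magic line can be modelled with shell arithmetic, which already treats a leading 0x as hexadecimal and a leading 0 as octal. The two functions below are hypothetical helpers written for illustration. They assume the value has already been read from the file and masked, and they treat a value field with no comparison character as an equality test, which the text above implies but does not state.

```shell
# Hypothetical helper: print "<continuation flag> <offset in decimal>".
parse_offset() {
    raw=$1
    cont=0
    case $raw in
        '>'*) cont=1; raw=${raw#'>'} ;;
    esac
    printf '%s %s\n' "$cont" "$(($raw))"
}

# Hypothetical helper: $1 = number read from the file (after any mask),
# $2 = numeric value field such as 070707, '>0' or '&0x80'.
# Returns 0 when the test succeeds, non-zero otherwise.
magic_compare() {
    got=$1
    field=$2
    case $field in
        x)    return 0 ;;              # caller has a value, so the file is long enough
        '='*) op='=' ;;
        '<'*) op='<' ;;
        '>'*) op='>' ;;
        '&'*) op='&' ;;
        '^'*) op='^' ;;
        *)    op='='; field="=$field" ;;   # assumed default: equality
    esac
    want=$((${field#?}))               # decimal, 0x hex, or leading-0 octal
    case $op in
        '=') [ "$got" -eq "$want" ] ;;
        '<') [ "$got" -lt "$want" ] ;;
        '>') [ "$got" -gt "$want" ] ;;
        '&') [ $(( $got & $want )) -eq "$want" ] ;;   # all set bits present
        '^') [ $(( $got & $want )) -ne "$want" ] ;;   # at least one bit missing
    esac
}
```

For instance, a byte of 0x9d masked with 0x80 and compared against '>0' succeeds, which corresponds to a type field of byte&0x80 with a value field of >0.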
The following exit values shall be returned:
- 0
- Successful completion.
- >0
- An error occurred.
Consequences of errors: default.
The following sections are informative.
The file utility can only be required to guess at many of the file types
because only exhaustive testing can determine some types with certainty. For
example, binary data on some implementations might match the initial segment
of an executable or a tar archive.
Note that the table indicates that the output contains the stated string.
Systems may add text before or after the string. For executables, as an
example, the machine architecture and various facts about how the file was
link-edited may be included. Note also that on systems that recognize shell
script files starting with "#!" as executable files, these
may be identified as executable binary files rather than as shell scripts.
Determine whether an argument is a binary executable file:
file "$1" | grep -Fq executable &&
printf "%s is executable.\n" "$1"
The -f option was omitted because the same effect can (and should) be
obtained using the xargs utility.
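A list of names held in a file can be handed to a utility through xargs. The wrapper below is a small sketch with a hypothetical name, each_listed, and it inherits the usual xargs caveat that blanks and quote characters inside names are interpreted by xargs itself.

```shell
# Hypothetical helper: run a command on every name listed in a file.
# $1 = file containing the names, remaining arguments = command to run.
each_listed() {
    list=$1
    shift
    xargs "$@" < "$list"
}
```

With a list called names.txt, each_listed names.txt file would classify every listed pathname, much as the historical -f option did.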
Historical versions of the file utility attempt to identify the following
types of files: symbolic link, directory, character special, block special,
socket, tar archive, cpio archive, SCCS archive, archive library, empty,
compress output, pack output, binary data, C source, FORTRAN source,
assembler source, nroff/troff/eqn/tbl source, troff output, shell script,
C shell script, English text, ASCII text, various executables, APL workspace,
compiled terminfo entries, and CURSES screen images. Only those types that are
reasonably well specified in POSIX or are directly related to POSIX utilities
are listed in the table.
Historical systems have used a "magic file" named /etc/magic to
help identify file types. Because it is generally useful for users and scripts
to be able to identify special file types, the -m flag and a portable
format for user-created magic files have been specified. No requirement is made
that an implementation of file use this method of identifying files,
only that users be permitted to add their own classifying tests.
In addition, three options have been added to historical practice. The -d
flag has been added to permit users to cause their tests to follow any default
system tests. The -i flag has been added to permit users to test
portably for regular files in shell scripts. The -M flag has been added
to permit users to ignore any default system tests.
The IEEE Std 1003.1-2001 description of default system tests and
the interaction between the -d, -M, and -m options did
not clearly indicate that there were two types of "default system
tests". The "position-sensitive tests" determine file types by
looking for certain string or binary values at specific offsets in the file
being examined. These position-sensitive tests were implemented in historical
systems using the magic file described above. Some of these tests are now
built into the file utility itself on some implementations so the
output can provide more detail than can be provided by magic files. For
example, a magic file can easily identify a core file on most
implementations, but cannot name the program file that dropped the core. A
magic file could produce output such as:
/home/dwc/core: ELF 32-bit MSB core file SPARC Version 1
but by building the test into the file utility, you could get output such as:
/home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'
These extended built-in tests are still to be treated as position-sensitive
default system tests even if they are not listed in /etc/magic or any
other magic file.
The context-sensitive default system tests were always built into the
file utility. These tests looked for language constructs in text files
trying to identify shell scripts, C, FORTRAN, and other computer language
source files, and even plain text files. With the addition of the -m
and -M options the distinction between position-sensitive and
context-sensitive default system tests became important because the order of
testing is important. The context-sensitive system default tests should never
be applied before any position-sensitive tests even if the -d option is
specified before a -m option or -M option due to the high
probability that the context-sensitive system default tests will incorrectly
identify arbitrary text files as text files before position-sensitive tests
specified by the -m or -M option would be applied to give a more
accurate identification.
Leaving the meaning of -M - and -m - unspecified allows an
existing prototype of these options to continue to work in a
backwards-compatible manner. (In that implementation, -M - was roughly
equivalent to -d in IEEE Std 1003.1-2001.)
The historical -c option was omitted as not particularly useful to users
or portable shell scripts. In addition, a reasonable implementation of the
file utility would report any errors found each time the magic file is
read.
The historical format of the magic file was the same as that specified by the
Rationale in the ISO POSIX-2:1993 standard for the offset, value, and
message fields; however, it used less precise type
fields than the format specified by the current normative text. The new type
field values are a superset of the historical ones.
The following is an example magic file:
0 short 070707 cpio archive
0 short 0143561 Byte-swapped cpio archive
0 string 070707 ASCII cpio archive
0 long 0177555 Very old archive
0 short 0177545 Old archive
0 short 017437 Old packed data
0 string \037\036 Packed data
0 string \377\037 Compacted data
0 string \037\235 Compressed data
>2 byte&0x80 >0 Block compressed
>2 byte&0x1f x %d bits
0 string \032\001 Compiled Terminfo Entry
0 short 0433 Curses screen image
0 short 0434 Curses screen image
0 string <ar> System V Release 1 archive
0 string !<arch>\n__.SYMDEF Archive random library
0 string !<arch> Archive
0 string ARF_BEGARF PHIGS clear text archive
0 long 0x137A2950 Scalable OpenFont binary
0 long 0x137A2951 Encrypted scalable OpenFont binary
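To see how a primary test and its '>' continuation lines combine, the compress entry above can be reproduced by hand with the od utility. This is a sketch under stated assumptions: describe_compress is a hypothetical function, and joining the messages with single spaces is a guess, since the format does not say how consecutive messages are separated.

```shell
# Hypothetical helper: hand-evaluates these three lines from the example:
#   0  string     \037\235  Compressed data
#   >2 byte&0x80  >0        Block compressed
#   >2 byte&0x1f  x         %d bits
describe_compress() {
    f=$1
    # First three bytes as unsigned decimal numbers, one word per byte.
    set -- $(od -An -tu1 -N3 "$f")
    # Primary test: bytes 0 and 1 must be \037 (31) and \235 (157).
    [ "${1:-}" = 31 ] && [ "${2:-}" = 157 ] || return 1
    msg='Compressed data'
    # Continuation tests fail silently when the file has no byte at offset 2.
    if [ -n "${3:-}" ]; then
        [ $(( $3 & 0x80 )) -gt 0 ] && msg="$msg Block compressed"
        msg="$msg $(( $3 & 0x1f )) bits"
    fi
    printf '%s\n' "$msg"
}
```

The function returns a non-zero status when the primary test fails, in which case a real implementation would move on to the next line of the magic file.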
The use of a basic integer data type is intended to allow the implementation to
choose a word size commonly used by applications on that architecture.
Future directions: none.
ar, ls, pax
Portions of this text are reprinted and reproduced in electronic form from IEEE
Std 1003.1, 2003 Edition, Standard for Information Technology -- Portable
Operating System Interface (POSIX), The Open Group Base Specifications Issue
6, Copyright (C) 2001-2003 by the Institute of Electrical and Electronics
Engineers, Inc and The Open Group. In the event of any discrepancy between
this version and the original IEEE and The Open Group Standard, the original
IEEE and The Open Group Standard is the referee document. The original
Standard can be obtained online at http://www.opengroup.org/unix/online.html .