If open or with-open-file gets a non-complete :external-format argument ef-spec, then the system decides which external format to use by calling the function guess-external-format.
The default behavior of guess-external-format is as follows:
1. If ef-spec is :default, this finds a match based on the filename; or (if that fails) looks in the Emacs-style (-*-) attribute line for an option called ENCODING or EXTERNAL-FORMAT; or (if that fails) chooses from amongst likely encodings by analysing the bytes near the start of the file; or (if that fails) uses a default encoding. Otherwise ef-spec's name is assumed to name an encoding and this encoding is used.
2. If ef-spec does not include the :eol-style parameter, it then also analyses the start of the file for byte patterns indicating the end-of-line style, and uses a default end-of-line style if no such pattern is found.
The file in this example was written by a Windows program which writes the Byte Order Mark at the start of the file, indicating that it is Unicode (UCS-2) encoded. The routine in step 1 above detects this:
(set-default-character-element-type 'simple-char)
=>
SIMPLE-CHAR
(with-open-file (ss "C:/temp/unicode-notepad.txt")
  (stream-external-format ss))
=>
(:UNICODE :LITTLE-ENDIAN T :EOL-STYLE :CRLF)
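To make the byte analysis in steps 1 and 2 more concrete, the following is a minimal sketch in portable Common Lisp of how a Byte Order Mark and an end-of-line pattern might be recognised in the first bytes of a file. It is not the LispWorks implementation of guess-external-format. The function names read-leading-bytes, sniff-bom and sniff-eol-style, the 256-byte limit, the :lf fallback and the returned keywords are all assumptions made for illustration, and the keywords are not LispWorks external format names.

```lisp
;; Illustrative sketch only; all names here are hypothetical and the
;; logic is not taken from LispWorks.

(defun read-leading-bytes (path &optional (limit 256))
  "Return a vector holding up to LIMIT bytes from the start of the file PATH."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let* ((buffer (make-array limit :element-type '(unsigned-byte 8)))
           (end (read-sequence buffer in)))
      (subseq buffer 0 end))))

(defun sniff-bom (bytes)
  "Return a keyword naming the Byte Order Mark at the start of BYTES, or NIL.
UTF-32 marks are not distinguished in this sketch."
  (flet ((starts-with (&rest prefix)
           (and (>= (length bytes) (length prefix))
                (every #'= prefix bytes))))
    (cond ((starts-with #xEF #xBB #xBF) :utf-8)
          ((starts-with #xFF #xFE) :utf-16-little-endian)
          ((starts-with #xFE #xFF) :utf-16-big-endian)
          (t nil))))

(defun sniff-eol-style (bytes &optional (default :lf))
  "Guess the end-of-line style from the first line terminator in BYTES.
Assumes one byte per character (for example ASCII or UTF-8 text).
Returns DEFAULT when BYTES contains no terminator."
  (let ((pos (position-if (lambda (b) (or (= b 13) (= b 10))) bytes)))
    (cond ((null pos) default)                    ; no CR or LF seen
          ((= (aref bytes pos) 10) :lf)           ; bare LF
          ((and (< (1+ pos) (length bytes))
                (= (aref bytes (1+ pos)) 10))
           :crlf)                                 ; CR followed by LF
          (t :cr))))                              ; bare CR
```

A file like the one in the example above starts with the bytes #xFF #xFE, so (sniff-bom (read-leading-bytes "C:/temp/unicode-notepad.txt")) would be expected to classify it as little-endian in this sketch. The end-of-line check as written would need to step two bytes at a time to handle such a 16-bit encoding.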
The behavior of guess-external-format is configurable via the variables *file-encoding-detection-algorithm* and *file-eol-style-detection-algorithm*. See the manual pages for details.