Name | Synopsis | Description | Options | Operands | Examples | Exit Status | Attributes | See Also | Notes
The auto_ef utility identifies the encoding of a given file. It judges the encoding by using iconv code conversions, in particular by checking whether a given code conversion of the file succeeds, and by performing frequency analysis on the character sequences that appear in the file.
(ASCII)
(JIS)
(Japanese EUC)
(Japanese PC Kanji, CP932, Shift JIS)
(UTF-8)
(Korean EUC)
(Unified Hangul)
(ISO-2022 Korean)
(ISO-2022 CN/CN-EXT)
(Simplified Chinese EUC, GB2312)
(Simplified Chinese GB18030/GBK)
(BIG5)
(Traditional Chinese EUC)
(Hong Kong BIG5)
(West European, etc)
(East European, etc)
(Cyrillic, etc)
(Arabic)
(Greek)
(Hebrew)
(windows-1250, corresponding to ISO-8859-2)
(windows-1251, corresponding to ISO-8859-5)
(windows-1252, corresponding to ISO-8859-1)
(windows-1253, corresponding to ISO-8859-7)
(windows-1255, corresponding to ISO-8859-8)
(corresponding to iso-8859-5)
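The conversion-success idea described above can be illustrated with a minimal sketch. It is not auto_ef's implementation: Python's built-in codecs stand in for iconv, and the candidate list and the function name possible_encodings are assumptions made for this example only.

```python
# Illustrative only: Python codecs stand in for iconv code conversions.
# The candidate list is a hypothetical subset, not auto_ef's actual list.
CANDIDATES = ["ascii", "utf-8", "euc_jp", "shift_jis", "euc_kr", "big5"]


def possible_encodings(data: bytes, candidates=CANDIDATES):
    """Return the candidates under which `data` decodes without error."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # conversion failed, so this encoding is ruled out
        matches.append(name)
    return matches


# Japanese text encoded as EUC-JP: ASCII and UTF-8 are ruled out,
# but more than one multibyte encoding may still survive.
sample = "日本語のテキスト".encode("euc_jp")
print(possible_encodings(sample))
```

Several candidates can survive such a test (plain ASCII text decodes under every encoding listed in the sketch), which suggests why a success-or-failure check alone is not enough and a frequency analysis is also used to rank the survivors.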
By default, auto_ef returns the single most likely encoding for the text in the specified file. To get all possible encodings for the file, use the -a option.
Also by default, auto_ef uses the fastest process to examine the file. To get a more accurate result, use the -l option.
Use -e encoding_list to make auto_ef examine the data with only a limited set of encodings.
The following options are supported:
Show usage message.
With -e encoding_list, auto_ef examines the data only with the specified encodings. For example, when "ko_KR.euc:ko_KR.cp949" is specified as encoding_list, auto_ef examines the text only with CP949 and ko_KR.euc. Without this option, auto_ef examines the text with all encodings. Multiple encodings can be specified by separating them with colons.
With -a, show all possible encodings in order of likelihood, with scores in the range 0.0 to 1.0. A higher score means a higher likelihood. For example:
% auto_ef -a test_file
eucJP 0.89
zh_CN.euc 0.40
ko_KR.euc 0.01
Without this option, only the encoding with the highest score is shown.
With -l level, specify the level of judgment. The value of level can be 0, 1, 2, or 3. Level 3 produces the best result but can be slow. Level 0 is the fastest, but the result can be less accurate than at higher levels. The default is level 0.
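A script that consumes the -a output might parse the scores as in the following sketch. It assumes one whitespace-separated "encoding score" pair per line, which is an assumption about the layout; the notes for this page state that the -a output format is Evolving, so a real parser should tolerate changes. The function name parse_scores is hypothetical.

```python
def parse_scores(output: str):
    """Parse 'encoding score' lines into (encoding, score) pairs, best first."""
    results = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, score = fields
        try:
            results.append((name, float(score)))
        except ValueError:
            continue  # second field was not a number
    # Highest score first, matching "in order of likelihood".
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Sample text taken from the -a example on this page.
sample_output = """eucJP 0.89
zh_CN.euc 0.40
ko_KR.euc 0.01
"""
ranked = parse_scores(sample_output)
print(ranked[0])  # ('eucJP', 0.89)
```

Skipping lines that do not match, rather than raising an error, is a deliberate choice here because the output format is not guaranteed to stay fixed.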
% auto_ef file
% auto_ef -l 2 file
% auto_ef -e "eucJP:ko_KR.euc" file
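When auto_ef is driven from a script, the command lines above could be assembled as in this sketch. It uses only the -e, -a, and -l options described on this page. The helper name build_auto_ef_command and its parameters are hypothetical, and the sketch only builds the argument list; it does not run auto_ef.

```python
import shlex


def build_auto_ef_command(path, encodings=None, level=None, show_all=False):
    """Build an argument list for auto_ef from the options on this page."""
    argv = ["auto_ef"]
    if encodings:
        # Multiple encodings are joined with colons, e.g. "eucJP:ko_KR.euc".
        argv += ["-e", ":".join(encodings)]
    if show_all:
        argv.append("-a")
    if level is not None:
        if level not in (0, 1, 2, 3):
            raise ValueError("level must be 0, 1, 2, or 3")
        argv += ["-l", str(level)]
    argv.append(path)
    return argv


cmd = build_auto_ef_command("file", encodings=["eucJP", "ko_KR.euc"], level=2)
print(shlex.join(cmd))
```

Returning a list rather than a single string lets the caller pass it directly to a process-spawning API without shell quoting concerns.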
See attributes(5) for descriptions of the following attributes:
ATTRIBUTE TYPE | ATTRIBUTE VALUE
---|---
Interface Stability | Stable with exception
Availability | SUNWautoef
libauto_ef(3LIB), auto_ef(3AUTO_EF), International Language Environments Guide
The interface stability of the output format when the -a option is specified is Evolving. Other interfaces are Stable.