Choose Import Character Encoding
On the Scan & Upload File page of the Import Assistant, choose the character encoding for your CSV import. The encoding you choose should match the encoding in which the file you want to import was saved.
Encoding schemes generally represent common characters, such as ‘a’, ‘A’, and ‘0’, in the same way. They differ for special characters that programs may generate automatically as you type, such as typographic quotation marks, and for characters with diacritical marks. For example, ‘e’ with an acute accent (é) is stored as the byte 0xE9 in Windows 1252 but as 0x8E in MacRoman.
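As a rough illustration of this difference, the following Python sketch encodes the same characters under a few encodings. The codec names (`cp1252`, `latin_1`, `mac_roman`, `utf-8`) are Python standard-library names, assumed here to correspond to the NetSuite options of similar name; they are not NetSuite identifiers.

```python
# Illustration only: how the same character maps to different bytes.
# Codec names are Python's, assumed to match the similarly named NetSuite options.
CODECS = ["cp1252", "latin_1", "mac_roman", "utf-8"]

def encode_all(text: str) -> dict[str, bytes]:
    """Return the bytes produced for `text` by each codec in CODECS."""
    return {codec: text.encode(codec) for codec in CODECS}

plain = encode_all("a")          # the single byte 0x61 in every codec
accented = encode_all("\u00e9")  # e with acute accent: bytes differ by codec

for codec in CODECS:
    print(f"{codec:10} a={plain[codec].hex()}  e-acute={accented[codec].hex()}")

# Reading MacRoman bytes as Windows 1252 silently yields a different character.
misread = "\u00e9".encode("mac_roman").decode("cp1252")
print(repr(misread))
```

The last two lines show why the choice matters: a mismatched encoding does not always raise an error, it can simply import the wrong character.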
NetSuite supports the following types of character encoding:
- (Unicode) UTF-8 encoding: UTF-8 is the most widely used encoding for international users importing a CSV file created in a language other than U.S. English.
  If you intend to use this format, ensure that your file contains valid UTF-8 characters. You may have to use a third-party editor to convert your file to UTF-8 before importing it into NetSuite. Some editors add a byte order mark (BOM) to the beginning of the file to indicate that it is UTF-8 encoded. UTF-8 encoded CSV files are imported regardless of whether the BOM is present.
  On Windows, you can use Notepad to convert your CSV file to UTF-8. Open the file in Notepad, click File > Save As, select UTF-8 from the Encoding dropdown list in the Save As window, and then click Save.
- Western (Windows 1252) encoding: Typically, NetSuite users create CSV files for import in Microsoft Excel, which uses Windows 1252 character encoding. Western (Windows 1252) is the default for the U.S. edition.
- Western (ISO-8859-1) encoding: ISO-8859-1 encodes what is commonly referred to as “Latin alphabet no. 1,” consisting of 191 characters from the Latin script. This character encoding is used throughout the Americas, Western Europe, Oceania, and much of Africa. It is also commonly used in most standard romanizations of East Asian languages.
- Chinese Simplified (GB18030): GB18030 is the registered Internet name for the official character set of the People's Republic of China (PRC), superseding GB2312. The character set is formally called “Chinese National Standard GB 18030-2005: Information technology -- Chinese coded character set.” GB abbreviates Guójiā Biāozhǔn, which means “national standard” in Chinese. (Description cited from Wikipedia.) Chinese Simplified (GB18030) is the default when the Chinese language preference has been selected.
- Chinese Simplified (GBK) encoding extension: Used in mainland China and Singapore.
- Traditional Chinese (Big5) encoding: Typically used in Taiwan (Province of China), Hong Kong, and Macao.
- Japanese (Shift-JIS) encoding: Shift-JIS character encoding is the most widely used format for Japanese users intending to import a CSV file. Japanese (Shift-JIS) is the default for the Japanese edition and when the Japanese language preference has been selected.
- (Western) MacRoman: CSV files created in Excel running on Macs use MacRoman character encoding.
- Korean (ISO-2022-KR) encoding: Character code structure for Korean text (ISO standard).
- Korean (EUC-KR) encoding: One of the more widely used legacy character encodings (extended UNIX code).
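If you prefer a script to an editor, the conversion to UTF-8 described under the UTF-8 item can be sketched in Python. This is a minimal sketch and not a NetSuite tool: the function names `has_utf8_bom` and `convert_to_utf8`, and the default assumption that the source file was saved as Windows 1252, are hypothetical choices made for illustration.

```python
import codecs
from pathlib import Path

def has_utf8_bom(path: Path) -> bool:
    """Return True if the file starts with the UTF-8 byte order mark."""
    with path.open("rb") as f:
        return f.read(3) == codecs.BOM_UTF8

def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Re-encode a CSV file as UTF-8 without a BOM.

    `source_encoding` is an assumption about how the file was saved.
    A wrong guess either raises UnicodeDecodeError or produces wrong characters.
    """
    # newline="" leaves the CSV's original line endings untouched.
    with src.open("r", encoding=source_encoding, newline="") as fin:
        text = fin.read()
    with dst.open("w", encoding="utf-8", newline="") as fout:
        fout.write(text)
```

A call such as `convert_to_utf8(Path("customers.csv"), Path("customers_utf8.csv"))` (hypothetical file names) writes a UTF-8 copy and leaves the original file unchanged. The sketch omits the BOM because, as noted above, the import accepts UTF-8 files with or without it.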
If invalid characters prevent a file from being processed for import, you receive an error message and can then correct the file. For more information, see CSV Import Error Reporting.
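Before uploading, it can help to check locally that a file decodes cleanly in the encoding you plan to select. The sketch below shows one hypothetical way to do this in Python. The mapping from the option labels on this page to Python codec names is an assumption, the function name `first_invalid_line` is invented for this example, and passing the check does not guarantee that NetSuite accepts the file.

```python
from typing import Optional

# Assumed mapping from the option labels on this page to Python codec names.
PYTHON_CODECS = {
    "(Unicode) UTF-8": "utf-8-sig",  # utf-8-sig accepts files with or without a BOM
    "Western (Windows 1252)": "cp1252",
    "Western (ISO-8859-1)": "latin_1",
    "Chinese Simplified (GB18030)": "gb18030",
    "Chinese Simplified (GBK)": "gbk",
    "Traditional Chinese (Big5)": "big5",
    "Japanese (Shift-JIS)": "shift_jis",
    "(Western) MacRoman": "mac_roman",
    "Korean (ISO-2022-KR)": "iso2022_kr",
    "Korean (EUC-KR)": "euc_kr",
}

def first_invalid_line(data: bytes, option: str) -> Optional[int]:
    """Return the 1-based line number of the first undecodable byte, or None.

    Assumes lines end with "\n" or "\r\n".
    """
    try:
        data.decode(PYTHON_CODECS[option])
    except UnicodeDecodeError as err:
        # err.object is the byte string the codec was decoding and err.start
        # is the offset of the first bad byte within it.
        return err.object.count(b"\n", 0, err.start) + 1
    return None
```

For example, `first_invalid_line(Path("customers.csv").read_bytes(), "(Unicode) UTF-8")` (with `Path` imported from `pathlib` and a hypothetical file name) returns `None` when the whole file is valid UTF-8. Single-byte encodings such as ISO-8859-1 and MacRoman assign a character to every byte value, so this check is most useful for UTF-8 and the East Asian encodings.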