There are thousands of languages in the world. All these languages are different in the words and syntax they use, and many of these languages use different alphabets and character sets. As a global product, Oracle gives you the ability to support many different languages and character sets. Oracle supports language-dependent data and language-independent functions.
- Language-dependent Data
Data stored in an Oracle database is meant to persist, so it has to be encoded consistently. Oracle therefore uses one particular character set to store the data, and this character set determines the binary representation of every character in the database. Oracle can store the characters whether a language uses a relatively small alphabet that fits in a single byte per character, like English, or a more complex representation that needs multiple bytes for each character, such as Chinese or Japanese.
- Language-independent Functions and Support
The language of the data in the database is independent of the language used by an end user. You may have a situation where an end user wants to store data in one language, yet requires the use of functions or the delivery of error messages in a different language. Even more common is the scenario where a single database is used by individuals who understand different languages. Oracle allows each user to specify their own language preference. A single Oracle database can, for instance, deliver error messages in many different languages simultaneously. Although the stored representation of the data is fixed by the character set, the language in which that data is used is not. In the next lesson, you will be introduced to the parameters used to determine how Oracle handles national language characters in the runtime environment.
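As a rough illustration of per-user language preferences, the sketch below switches the message language for a single session. It assumes the German message files are installed on the server; the division by zero is only a convenient way to provoke an error.

```sql
-- One session asks for German messages
-- (assumption: the German message files are installed on the server)
ALTER SESSION SET NLS_LANGUAGE = 'GERMAN';

-- Deliberately raise ORA-01476 (divide by zero);
-- the message text follows the session language
SELECT 1/0 FROM DUAL;

-- A different session against the same database can keep its own setting
ALTER SESSION SET NLS_LANGUAGE = 'AMERICAN';
SELECT 1/0 FROM DUAL;
```

The error number stays the same in both cases; only the message text changes with the session setting.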
Oracle's National Language Support (NLS) is a comprehensive framework that enables the database to handle multiple languages, character sets, and regional data formats. This allows Oracle databases to store, process, and retrieve data in various languages and cultural conventions. Here's an overview of how Oracle implements NLS and supports foreign languages:
- Character Sets and Encoding
  - Oracle supports a wide range of character sets, including single-byte (e.g., ASCII, ISO-8859-1) and multi-byte character sets (e.g., UTF-8, UTF-16).
  - Unicode (UTF-8 and UTF-16) is particularly important for global applications, as it can represent virtually all characters from all languages.
  - The database character set determines how text data is stored, while the national character set is used for specific data types like `NCHAR`, `NVARCHAR2`, and `NCLOB`.
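To see which character sets a given database actually uses, and how character count and byte count can differ, something like the following sketch may help. The sample string is arbitrary, and the byte count depends on the database character set.

```sql
-- Database and national character sets are recorded in the data dictionary
SELECT parameter, value
FROM   nls_database_parameters
WHERE  parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');

-- LENGTH counts characters, LENGTHB counts bytes.
-- UNISTR builds the string from code points (U+00E9 is "é"), and TO_CHAR
-- converts it to the database character set before measuring.
SELECT LENGTH(TO_CHAR(UNISTR('caf\00E9')))  AS char_count,
       LENGTHB(TO_CHAR(UNISTR('caf\00E9'))) AS byte_count
FROM   DUAL;
```

In a database whose character set is AL32UTF8, the second query should report four characters but five bytes, because "é" takes two bytes in UTF-8.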
- Locale-Specific Settings
  - Oracle uses NLS parameters to control locale-specific behavior. These parameters can be set at different levels:
    - Database level: Default settings for the entire database.
    - Instance level: Settings for a specific database instance.
    - Session level: Settings for a specific user session.
    - SQL statement level: Overrides for individual SQL statements.
  - Common NLS parameters include:
    - `NLS_LANGUAGE`: Controls language for messages and day/month names.
    - `NLS_TERRITORY`: Determines default date, number, and currency formats.
    - `NLS_DATE_FORMAT`: Specifies the default date format.
    - `NLS_CURRENCY`: Defines the currency symbol.
    - `NLS_SORT`: Specifies the linguistic sort order for character data.
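The levels listed above can be inspected through data dictionary views, and a statement-level override can be passed directly to a conversion function. The following is a minimal sketch; the choice of German for the override is arbitrary.

```sql
-- Database-level defaults
SELECT parameter, value FROM nls_database_parameters;

-- Instance-level settings (from the initialization parameters)
SELECT parameter, value FROM nls_instance_parameters;

-- Settings in effect for the current session
SELECT parameter, value FROM nls_session_parameters;

-- Statement-level override: the third argument of TO_CHAR applies
-- to this one call only and leaves the session settings untouched
SELECT TO_CHAR(SYSDATE, 'fmDay, DD Month YYYY', 'NLS_DATE_LANGUAGE = GERMAN') AS german_date
FROM   DUAL;
```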
- Linguistic Sorting and Comparison
  - Oracle provides linguistic sorting (collation) to handle language-specific rules for sorting and comparing text data.
  - For example, in traditional Spanish, "ch" is treated as a single letter and sorted between "c" and "d."
  - The `NLS_SORT` parameter allows you to specify the collation sequence (e.g., `BINARY`, `FRENCH`, `SPANISH`, or `XSPANISH` for the traditional Spanish rules).
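The effect of a linguistic sort can be tried without changing any session setting by using `NLSSORT` in the `ORDER BY` clause. The word list below is made up for illustration, and `XSPANISH` is used on the assumption that the traditional Spanish rules are wanted.

```sql
-- Hypothetical word list built inline, so no table is needed
WITH words AS (
  SELECT 'cama'  AS w FROM DUAL UNION ALL
  SELECT 'chico' AS w FROM DUAL UNION ALL
  SELECT 'cuna'  AS w FROM DUAL UNION ALL
  SELECT 'dado'  AS w FROM DUAL
)
SELECT w
FROM   words
-- Replace XSPANISH with BINARY to compare against plain code-point order
ORDER  BY NLSSORT(w, 'NLS_SORT = XSPANISH');
```

Under the traditional Spanish rules "chico" should sort after "cuna", since "ch" is collated as a letter of its own after "c"; with `BINARY` it sorts between "cama" and "cuna".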
- Date, Time, and Number Formats
  - Oracle adapts date, time, and number formats to match regional conventions. For example:
    - Dates can be displayed as `DD/MM/YYYY` in the UK or `MM/DD/YYYY` in the US.
    - Decimal separators can be a period (.) or comma (,), depending on the locale.
  - These formats are controlled by NLS parameters like `NLS_DATE_FORMAT`, `NLS_NUMERIC_CHARACTERS`, and `NLS_TIMESTAMP_FORMAT`.
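A short sketch of how these parameters interact with format models follows. The number and the format strings are arbitrary examples; `G` and `D` in a number format stand for the group and decimal separators currently in effect.

```sql
-- First character is the decimal separator, second the group separator
ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.';

-- G and D pick up the session separators (comma as decimal mark here)
SELECT TO_CHAR(1234567.89, '9G999G999D99') AS session_format FROM DUAL;

-- The same separators can be overridden for a single call
SELECT TO_CHAR(1234567.89, '9G999G999D99',
               'NLS_NUMERIC_CHARACTERS = ''.,''') AS call_format
FROM   DUAL;

-- Default display format for TIMESTAMP values in this session
ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'DD/MM/YYYY HH24:MI:SS.FF3';
SELECT TO_CHAR(LOCALTIMESTAMP) AS ts_text FROM DUAL;
```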
- Error Messages and User Interface
  - Oracle provides translated error messages and user interface elements in multiple languages.
  - The `NLS_LANGUAGE` parameter determines the language used for error messages and other database-generated text.
- Support for Multilingual Applications
  - Oracle supports multilingual applications through features like:
    - Unicode data types (`NCHAR`, `NVARCHAR2`, `NCLOB`) for storing text in multiple languages.
    - Globalization Development Kit (GDK): A set of tools and APIs to help developers build globalized applications.
    - Locale Builder: A tool for creating custom locale definitions.
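As a small, hedged example of the Unicode data types, the table below stores names in the national character set. The table and column names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table; name_local is stored in the national character set
CREATE TABLE product_names (
  product_id NUMBER PRIMARY KEY,
  name_local NVARCHAR2(100)
);

-- The N prefix marks a national character set literal
INSERT INTO product_names (product_id, name_local) VALUES (1, N'Widget');

-- UNISTR builds a string from Unicode code points (U+6771 U+4EAC, "Tokyo" in Japanese),
-- which avoids depending on the client character set for the literal
INSERT INTO product_names (product_id, name_local) VALUES (2, UNISTR('\6771\4EAC'));

-- LENGTH reports characters for the stored values
SELECT product_id, name_local, LENGTH(name_local) AS char_count
FROM   product_names;
```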
- Data Conversion
  - Oracle provides utilities for migrating databases to different character sets, such as the `ALTER DATABASE CHARACTER SET` command and the `CSSCAN` tool in earlier releases, and the Database Migration Assistant for Unicode (DMU) in Oracle Database 12c and later.
  - The `NLS_LANG` environment variable (used by Oracle clients) ensures proper character set conversion between the client and the database.
- Time Zone Support
  - Oracle supports time zone-aware data types like `TIMESTAMP WITH TIME ZONE` and `TIMESTAMP WITH LOCAL TIME ZONE`.
  - This is crucial for applications that operate across multiple time zones.
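The difference between the two time zone-aware types is easiest to see side by side. The sketch below uses a hypothetical table and arbitrary region names.

```sql
-- Hypothetical table contrasting the two types
CREATE TABLE meeting_times (
  meeting_id NUMBER PRIMARY KEY,
  starts_at  TIMESTAMP WITH TIME ZONE,        -- keeps the original zone
  logged_at  TIMESTAMP WITH LOCAL TIME ZONE   -- normalized, shown in the session zone
);

INSERT INTO meeting_times (meeting_id, starts_at, logged_at)
VALUES (1,
        TIMESTAMP '2023-10-10 09:00:00 Europe/Paris',
        TIMESTAMP '2023-10-10 09:00:00 Europe/Paris');

-- Change the session time zone, then read the row back
ALTER SESSION SET TIME_ZONE = 'America/New_York';

SELECT starts_at,                                   -- still reported in Europe/Paris
       logged_at,                                   -- converted to the session time zone
       starts_at AT TIME ZONE 'Asia/Tokyo' AS tokyo_time
FROM   meeting_times;
```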
- Documentation and Resources
  - Oracle provides extensive documentation on NLS, including the Oracle Database Globalization Support Guide, which covers best practices for implementing multilingual support.
Example of NLS Usage
```sql
-- Set session-level NLS parameters
ALTER SESSION SET NLS_LANGUAGE = 'FRENCH';
ALTER SESSION SET NLS_TERRITORY = 'FRANCE';
ALTER SESSION SET NLS_DATE_FORMAT = 'DD/MM/YYYY';
-- Query with locale-specific settings
-- (the fm modifier suppresses the blank padding Oracle adds to day and month names)
SELECT TO_CHAR(SYSDATE, 'fmDay, DD Month YYYY') FROM DUAL;
-- Example output when run on 10 October 2023: "Mardi, 10 Octobre 2023"
```
By leveraging NLS, Oracle ensures that applications can operate seamlessly in a global environment, accommodating diverse languages, cultural conventions, and regional preferences.
The NLS_LENGTH_SEMANTICS parameter still exists and is supported in Oracle 19c. This parameter controls whether the length semantics for `CHAR` and `VARCHAR2` data types are measured in bytes or characters. The two possible values are:
- `BYTE`: The length of the column is specified in bytes. This is the default value.
- `CHAR`: The length of the column is specified in characters, which is useful in multibyte character set environments.
Using `NLS_LENGTH_SEMANTICS` can be particularly important in applications that handle multilingual data, where the difference between byte length and character length can have significant implications for storage and processing.
You can change this parameter at the session level using `ALTER SESSION`, allowing flexibility depending on the needs of the specific session or operation.
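The following sketch shows both ways of choosing length semantics: explicitly per column, and implicitly through the session parameter. The table and column names are hypothetical.

```sql
-- Explicit qualifiers override the session setting
CREATE TABLE length_demo (
  name_bytes VARCHAR2(10 BYTE),  -- at most 10 bytes
  name_chars VARCHAR2(10 CHAR)   -- at most 10 characters, whatever their byte size
);

-- Columns declared without a qualifier follow the session setting
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;

CREATE TABLE length_demo2 (
  name_default VARCHAR2(10)      -- character semantics because of the ALTER SESSION above
);

-- CHAR_USED shows B (byte) or C (character) for each column
SELECT table_name, column_name, data_length, char_length, char_used
FROM   user_tab_columns
WHERE  table_name IN ('LENGTH_DEMO', 'LENGTH_DEMO2');
```

Spelling out `BYTE` or `CHAR` in the column definition keeps a script independent of whatever session setting happens to be in effect when it runs.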
NLS_LENGTH_SEMANTICS

| Property | Description |
| --- | --- |
| Parameter type | String |
| Syntax | `NLS_LENGTH_SEMANTICS = string` (example: `NLS_LENGTH_SEMANTICS = 'CHAR'`) |
| Default value | `BYTE` |
| Modifiable | `ALTER SESSION` |
| Range of values | `BYTE` or `CHAR` |