Messages that contain characters outside of the GSM character set require Unicode encoding and are limited to 70 characters per message instead of the expected 160. This is a frustrating limitation for many languages.
Messages longer than 70 characters can, of course, be sent, but they are sent as multipart (segmented) messages and reassembled by the receiving client. If a message that contains Unicode characters is longer than 70 characters, it is broken into segments of 67 characters for sending. Sending a 160-character message therefore requires 3 SMS messages (160 divided by 67, rounded up) if the message contains characters that are not part of the GSM character set.
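The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not how NowSMS itself counts segments: the function name `sms_segments` is made up for this example, the 153-character limit for multipart GSM 7-bit segments is the usual figure but is not stated above, and the sketch ignores the rule that an escape pair or a surrogate pair must not be split across two segments.

```python
import math

# GSM 7-bit default alphabet (basic table): each character costs one septet.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: sent as an escape plus the character, so two septets each.
GSM_EXTENSION = "^{}\\[~]|€\f"


def sms_segments(text):
    """Return (encoding, segment_count) for a message (hypothetical helper)."""
    if all(ch in GSM_BASIC or ch in GSM_EXTENSION for ch in text):
        septets = sum(2 if ch in GSM_EXTENSION else 1 for ch in text)
        if septets <= 160:
            return ("GSM 7-bit", 1)
        # Assumption: 153 septets per segment once a concatenation header is added.
        return ("GSM 7-bit", math.ceil(septets / 153))
    # Anything else forces Unicode (UCS-2): count 16-bit code units.
    units = len(text.encode("utf-16-be")) // 2
    if units <= 70:
        return ("Unicode", 1)
    return ("Unicode", math.ceil(units / 67))


# A 160-character message with a single Turkish "ş" needs three segments.
print(sms_segments("a" * 159 + "ş"))
```

Replacing the "ş" with a plain "s" brings the same message back to a single GSM 7-bit segment, which is the cost difference discussed in the rest of this article.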
The original GSM protocol was developed in Western Europe, so it includes most Western European characters, plus those capital letters of the Greek alphabet that have no look-alike among the Latin capitals, in order to facilitate Greek SMS support. You can view a table of characters in the GSM character set at the following link: http://www.nowsms.com/long-sms-text-messages-and-the-160-character-limit
A recent posting on the NowSMS Technical Forums discusses developments in the 3GPP specifications that add national language support to the SMS standard and may eventually overcome these limitations. To follow or join the discussion, visit http://www.nowsms.com/discus/messages/1/69650.html. The start of this discussion is highlighted below.
I have a question about sending SMS messages using Turkish national characters. Specifically, these characters are a problem:
Ğ, ğ, Ş, ş, İ, ı, ç (upper case Ç is ok?)
I can send messages that contain these characters ok, but NowSMS encodes the messages with Unicode. This means if I send a message longer than 70 characters, it costs me to send two or more messages … instead of normal 160 character limit.
But in Turkey I have heard that there is a way to send these national characters without forcing the whole message to use Unicode encoding. I do not know how it works, but I have heard that this feature is called a locking shift table. Instead of the standard GSM 7-bit character table, mobile phones in Turkey must support a locking shift table that replaces the GSM characters with national characters for Turkey.
Can NowSMS support this SMS locking shift table?
Messages that contain characters outside of the GSM character set require Unicode encoding and are limited to 70 characters per message instead of the expected 160, which is indeed frustrating for many languages.
The shift tables that you mention are a relatively new development. There is a concept of a locking shift table that replaces the GSM 7-bit character set, and a single shift table which provides additional characters.
You are correct that these shift tables can replace and extend the default GSM 7-bit character set table so that more national characters can fit into a single SMS.
We’ve received a number of inquiries from handset testing labs about them, but I’m not sure that they are used in production systems. (Perhaps they are in active use in Turkey as I can see that there was national legislation there that prompted the 3GPP to develop a solution.)
In addition to Turkish, there are shift tables defined for the Spanish and Portuguese languages. These were all added in 3GPP Release 8 (3GPP TS 23.038 and 23.040), which only began to be published in 2008.
The Spanish shift table adds ç, Á, Í, Ó, Ú, á, í, ó, and ú.
The Portuguese shift tables add support for the following national language characters: Á À Â Ã ª á à â ã É Ê é ê Í í Ó Ô Õ º ó ô õ Ú Ü ú ü ` ç ∞
3GPP Release 9 adds 10 shift tables for languages of the Indian subcontinent: Bengali, Gujarati, Hindi, Kannada, Malayalam, Oriya, Punjabi, Tamil, Telugu, and Urdu.
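As a rough sketch of what a shift table costs in message space: the sender signals the table with a small user data header, and that header eats into the 160 septets of a single message. The element identifiers (0x24 for single shift, 0x25 for locking shift) and the language codes below reflect our reading of 3GPP TS 23.040 and TS 23.038 and should be checked against the specifications. The function names are hypothetical, and this is not a description of how NowSMS builds its messages.

```python
# National language identifiers (assumed from 3GPP TS 23.038).
LANGUAGE_IDS = {"turkish": 1, "spanish": 2, "portuguese": 3}

# Information element identifiers (assumed from 3GPP TS 23.040).
IEI_SINGLE_SHIFT = 0x24
IEI_LOCKING_SHIFT = 0x25


def shift_table_udh(language, locking=False, single=False):
    """Build a user data header selecting national language shift tables."""
    lang = LANGUAGE_IDS[language]
    elements = b""
    if single:
        # identifier, data length (1 octet), language code
        elements += bytes([IEI_SINGLE_SHIFT, 0x01, lang])
    if locking:
        elements += bytes([IEI_LOCKING_SHIFT, 0x01, lang])
    if not elements:
        return b""
    # The first octet is the length of the header that follows it.
    return bytes([len(elements)]) + elements


def septets_available(udh):
    """Septets left for text in one 7-bit message after the header."""
    header_septets = (len(udh) * 8 + 6) // 7  # round up to whole septets
    return 160 - header_septets


udh = shift_table_udh("turkish", single=True)
print(udh.hex(), septets_available(udh))
```

Under these assumptions, a Turkish message using only the single shift table has 155 septets available, and each character taken from that table costs two septets (escape plus character). That is still far more room than the 70 characters allowed by Unicode encoding.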
Update: Shift table support is now available in NowSMS. Additional information and discussion is available at the following link: http://www.nowsms.com/discus/messages/1/70000.html.