Bug: Encoding

Discussion of the version of HanDBase that runs on the iPhone and iPod touch devices. This includes the synchronization conduits as well.

Postby vlada » Mon Mar 18, 2013 9:42 am

Dear all, I have read that the database format doesn't support Unicode, though it may work with some European languages. I have a problem with the Czech language.

There is a problem with some accented characters in Czech (characters such as ěščřžýáíéúůĚŠČŘŽÝÁÍÉÚŮ). The behaviour is not consistent, and these situations occur:
- Char is stored and displayed without problem (for example: í é á)
- Char is stored but is displayed like something else (for example: Ř)
- The whole text (field) containing a character is cleared and lost on save (for example: Č č) !!!!

Is it possible to make the behaviour consistent? Losing the text is the worst of these behaviours.

Possible solutions:
- support Unicode (the best option, but probably not easy or even possible)
- support another 8-bit encoding that can hold the characters of a particular language, for example ISO/IEC 8859-2 for Czech (probably not possible either)
- add an option to convert all characters to ASCII (e.g. é, ě = e; č = c; Ť = T; Ř = R ...) (I think this is the easiest)
- convert only unsupported characters to their ASCII equivalents
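
To illustrate the last option, here is a minimal Python sketch, assuming the storage encoding is Windows-1252 (Python codec name cp1252). The function name to_storable is made up for this example and is not part of HanDBase. Characters the encoding can hold are left untouched; only the others are reduced to their base letter.

```python
import unicodedata


def to_storable(text: str, encoding: str = "cp1252") -> str:
    """Replace only the characters that `encoding` cannot represent.

    Hypothetical helper for illustration; the target encoding is an assumption.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)          # representable: keep as-is
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        # Decompose (e.g. "č" -> "c" + combining caron) and drop the marks.
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        try:
            base.encode(encoding)
            out.append(base)
        except UnicodeEncodeError:
            out.append("?")              # no usable base letter
    return "".join(out)


print(to_storable("Český"))    # "ý" is kept because Windows-1252 can store it
print(to_storable("Řeřicha"))
```

With this approach nothing is lost silently: a character either survives unchanged, loses its accent, or becomes a visible "?".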

Kind regards V.
vlada
 
Posts: 6
Joined: Fri Mar 15, 2013 11:17 am

Re: Bug: Encoding

Postby dhaupert » Mon Mar 18, 2013 9:48 am

Hi there,

The character set that we use is part of the format of the database itself and it's Windows Latin 1, also known as Windows-1252.

Here's info on it and the characters included:

http://en.wikipedia.org/wiki/Windows-1252
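
As a rough way to see which of the Czech characters from the first post fit into Windows-1252, the following Python sketch tries to encode each one with Python's cp1252 codec. The function name split_by_cp1252 is invented for this example, and the check says nothing about how the app itself behaves with the characters that do not fit.

```python
CZECH = "ěščřžýáíéúůĚŠČŘŽÝÁÍÉÚŮ"


def split_by_cp1252(chars: str) -> tuple[str, str]:
    """Return (representable, not representable) in Windows-1252."""
    supported, unsupported = [], []
    for ch in chars:
        try:
            ch.encode("cp1252")
            supported.append(ch)
        except UnicodeEncodeError:
            unsupported.append(ch)
    return "".join(supported), "".join(unsupported)


ok, missing = split_by_cp1252(CZECH)
print("in Windows-1252:    ", ok)
print("not in Windows-1252:", missing)
```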

Changing to a different encoding would kill the cross-compatibility with devices that don't support it (e.g., Palm), and thus the long-term solution is a new database format that supports Unicode and thus all languages. I hope to do this in the future, but there are some big items on the list that must be tackled first, including the Mac Desktop improvements, Forms on Android, and a cloud sync solution. I'm hoping that after those get done I can start on this major update!
dhaupert
 
Posts: 4111
Joined: Tue May 26, 2009 11:51 am

Re: Bug: Encoding

Postby vlada » Mon Mar 18, 2013 10:45 am

Hi, thank you for the explanation. What about adding a conversion to ASCII for unsupported characters?
It is annoying to lose the whole text because you missed one unsupported character, usually inserted by the iPhone spelling correction :(

It should be quite a simple function that converts accented characters to their ASCII representation. If there is no appropriate function, you could put two strings into the settings (one Unicode, one ASCII) where the user provides a conversion table like:
ěščřžýáíéĚŠČŘŽÝÁÍÉ
escrzyaieESCRZYAIE
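
The two-string idea maps directly onto a translation table. Here is a small Python sketch of it; the names ACCENTED, PLAIN and strip_accents are only for illustration, and in the app the two strings would come from the user's settings.

```python
# The two user-supplied strings from the post; position i maps to position i.
ACCENTED = "ěščřžýáíéĚŠČŘŽÝÁÍÉ"
PLAIN = "escrzyaieESCRZYAIE"

# str.maketrans raises ValueError if the two strings differ in length.
TABLE = str.maketrans(ACCENTED, PLAIN)


def strip_accents(text: str) -> str:
    """Replace every character listed in ACCENTED; leave the rest alone."""
    return text.translate(TABLE)


print(strip_accents("Čeština"))
print(strip_accents("Řízek"))
```

Characters missing from the table (ů or ť, for instance) pass through unchanged, so the table has to list every character the user wants converted.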
vlada
 
Posts: 6
Joined: Fri Mar 15, 2013 11:17 am

Re: Bug: Encoding

Postby dhaupert » Tue Mar 19, 2013 10:25 am

Hi there,

I thought that losing the accent on a character could have some disastrous effects on the text, and I didn't really want to take a chance on that. I do believe that the long-term solution is to switch to Unicode and thus do things right the first time.
dhaupert
 
Posts: 4111
Joined: Tue May 26, 2009 11:51 am

Re: Bug: Encoding

Postby vlada » Thu Mar 21, 2013 2:12 am

Hi, I suppose that losing the accent on a character is preferable to losing the whole text!
Of course, dropping the accent would be optional behaviour: a particular (non-English) user can decide whether they prefer to lose the accent.
Because accented text is not supported for most languages, the user must write the text without accents anyway!
So the disastrous effect on the text is already present with manual input (there is no other way), and automatic conversion would help the user enter the text and minimize the chance that the whole text is lost because of one accented character!

The long-term solution of doing it right (probably Unicode) is a great idea, but what should we do in the years before the Unicode version is done?
vlada
 
Posts: 6
Joined: Fri Mar 15, 2013 11:17 am

Re: Bug: Encoding

Postby dhaupert » Thu Mar 21, 2013 6:44 am

Hi there,

The reason I'm saying that I'd rather do the long-term solution is that I'm not available to work on this now anyway. I have several big projects/updates/add-ons that I am working on this year, so I wouldn't be working on this until a later time. The change you're proposing is not a trivial one: there are hundreds of places in the code where text is encoded and decoded between Unicode (the native iOS encoding used in text boxes on the device) and our own encoding for storage. Your change also adds a lot of processing time to these operations, which will affect the speed of the program for all of the people who are already able to use it in their language. So there are a bunch of reasons why this type of change is unlikely at this time.

We have always been up front about the encoding limitations. It's even in the description of the app on the App Store, because it's a limitation of the program and not something that is going to change very soon. I'm sorry for the inconvenience!
dhaupert
 
Posts: 4111
Joined: Tue May 26, 2009 11:51 am

