I would like to reopen a question related to the following:
(Czech) character set support in gvim 7.3 on Windows 7
Basically, in that post I noticed that some Czech characters were being displayed as black squares. So I posted the question and noticed that the problem seemed to go away by changing the font. I thought that solved the problem because the characters in the file I was using displayed correctly.
However, I have noticed the following: while some Czech characters display correctly by changing the font from the Gvim menu, others do not display correctly:
For instance when I paste the character Ů (Latin capital letter u with ring above) or ů (Latin small letter u with ring above), no font displays the resulting character correctly. For instance, the Fixedsys font displays a black square and a small u, respectively, while Lucida Console displays a capital U and a small U, respectively. I have tried all fonts available from the gvim drop-down menu, and none seem to work for this particular case.
The problem does not end here. The input method for unicode characters produces the wrong characters:
CTRL-V u0160 should produce the Czech character (Š) but the backquote (`) is inserted instead. CTRL-V u016e should produce the Czech character (Ů) but the letter n is inserted instead. And the list goes on.
As if that were not enough, there is a list of alternative input method key combinations at the following site (which is a list of digraphs): http://code.google.com/p/vim/source/browse/runtime/doc/digraph.txt
but despite having the latest version of gvim, when I type ":digraphs", this list does not show up. Only the old list from gvim 7.3 shows up, which does not include these.
For instance CTRL-K U0 and CTRL-K u0 both produce the character zero instead of the following:
Ů U0 016E 0366 LATIN CAPITAL LETTER U WITH RING ABOVE
ů u0 016F 0367 LATIN SMALL LETTER U WITH RING ABOVE
To summarize, despite gvim 7.4 being recently released, none of the distributed fonts are compatible with the Czech language, inserting unicode via CTRL-V seems to produce the wrong characters, and digraph support is incomplete.
Thank you for your answers.
Answer
The problem is that the Latin-2 (ISO-8859-2) and Windows-1250 (used by Windows) encodings differ in some characters:
ž, š, ť, Ž, Š, Ť
All the differences are summarized on Wikipedia (or in its Czech version).
If you set encoding=cp1250, then it'll be OK.
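To make the difference concrete, here is a small Python sketch (not Vim itself, just Python's built-in codecs) that prints the byte each of those characters gets in the two encodings. The helper names are made up for illustration.

```python
# Characters listed above as differing between the two code pages.
DIFFERING = "žšťŽŠŤ"


def byte_in(ch, encoding):
    """Return the single byte value of ch in an 8-bit encoding."""
    return ch.encode(encoding)[0]


def compare(chars):
    """Map each character to its (ISO-8859-2, Windows-1250) byte values."""
    return {ch: (byte_in(ch, "iso-8859-2"), byte_in(ch, "cp1250")) for ch in chars}


if __name__ == "__main__":
    # ů and Ů are added for contrast with the characters that differ.
    for ch, (latin2, cp1250) in compare(DIFFERING + "ůŮ").items():
        marker = "differs" if latin2 != cp1250 else "same"
        print(f"{ch}: ISO-8859-2 0x{latin2:02X}, cp1250 0x{cp1250:02X} ({marker})")
```

A file saved in one encoding and read with the other will therefore show wrong glyphs for exactly these letters, while most other Czech letters look fine.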
I don't want to prolong comments so I'm adding that here.
The problem is that a standard code page uses only 1 byte per character (256 values, hex 100), so there are separate ISO standards for different languages. If you have set encoding to iso-8859-2 and try to add the Unicode character Š (hex 160), then gvim wraps around to the character at hex 60. You have to use the ISO-8859-2 codes instead, where Š is hex A9. Other codes are listed here: http://cs.wikipedia.org/wiki/ISO_8859-2
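The wrap-around can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming gvim keeps just the low byte of the code point when the encoding is 8-bit. That assumption is consistent with the symptoms in the question, but I have not verified it against Vim's source. The function name is made up.

```python
def wrapped_char(code_point):
    """Keep only the low byte of a code point (assumed 8-bit truncation)."""
    return chr(code_point & 0xFF)


if __name__ == "__main__":
    for cp in (0x160, 0x16E):
        print(f"U+{cp:04X} ({chr(cp)}) wraps to 0x{cp & 0xFF:02X} ({wrapped_char(cp)})")
```

So u0160 ends up as the backquote and u016e as the letter n, which matches what the question reports.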
UTF-8, on the other hand, uses more than one byte per character where needed (two bytes for Š, for example) and covers all(?) letters and signs simultaneously. So if you use set encoding=utf-8 and then enter U0160 or U5927, you'll get Š and 大, respectively.
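For the curious, the same two code points can be looked at in Python to see how many bytes UTF-8 actually spends on each. This is just an illustration with a made-up helper name and has nothing to do with Vim's internals.

```python
def utf8_bytes(code_point):
    """Encode a single Unicode code point as UTF-8."""
    return chr(code_point).encode("utf-8")


if __name__ == "__main__":
    for cp in (0x0160, 0x5927):
        encoded = utf8_bytes(cp)
        print(f"U+{cp:04X} {chr(cp)}: {len(encoded)} bytes, {encoded.hex()}")
```

Š takes two bytes and 大 takes three, so the length varies with the character, but both fit into the same encoding without any code-page switching.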
Fixedsys contains ů and Ů, OR there is a difference in font versions between Windows language editions (I use the Czech version), but I doubt that. You can use the Windows utility Charmap.exe: there you can select the desired font and check which characters it supports, and even see their Unicode code points.
I briefly tried some of the default fonts in GVim, and some of them seem to support Chinese (e.g. MS Mincho), but I don't know which signs are important.
GVim seems to support only monospace fonts, so if you go searching for another font, be aware of that. :)