Correct encoding of CMSimple content

Displaying a website correctly requires a consistent configuration, because many elements contribute.

Whenever you see text in a browser, a symphony of settings and configurations is played:

  • server
  • browser
  • CMSimple configuration
  • templates
  • language files
  • the online editor
  • and maybe other factors as well …

As a website visitor you have one problem: you must configure your browser to detect the encoding of the current web page automatically, instead of using one default encoding for all websites of the world.

As a website producer, you unfortunately have far more problems, but knowledge about the settings and configuration involved will help you to fix them.

The following information is targeted particularly at CMSimple(_XH) installations in which meta_codepage is configured to UTF-8, which is the default since CMSimple_XH 1.2 and mandatory from CMSimple_XH 1.5 onwards.

Adjust your browser correctly

Whenever you see question marks or other unexpected characters instead of special characters, for example the German umlauts and ß (üäöÜÄÖß),

check this:

Your browser should detect the character encoding of the website automatically and not apply a fixed encoding itself.

In Firefox, for example, you can set this via View / Character Encoding / Automatically. Other browsers offer similar functions.
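
The symptom can be reproduced outside the browser. The following Python sketch is only an illustration (the variable names are made up for this example). It shows what happens when UTF-8 bytes are read as ISO-8859-1, and the reverse case.

```python
# Sketch: reproduce the typical symptoms of a wrong encoding guess.
text = "üäö"

# The page is delivered as UTF-8: each of these characters takes two bytes.
utf8_bytes = text.encode("utf-8")

# A browser fixed to ISO-8859-1 reads every single byte as one character.
misread = utf8_bytes.decode("iso-8859-1")

# The opposite mistake: ISO-8859-1 bytes are not valid UTF-8 sequences here,
# so they are replaced by U+FFFD, which browsers often show as a question mark.
latin1_bytes = text.encode("iso-8859-1")
replaced = latin1_bytes.decode("utf-8", errors="replace")

print(len(text), len(utf8_bytes))
print(misread)
print(replaced)
```

If a page shows pairs such as "Ã¼" where "ü" was expected, UTF-8 content is probably being read as a single-byte encoding. Question marks or replacement symbols point to the opposite mismatch.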

Find out if the editor is the problem

As a CMSimple user and administrator, you have to check far more things than a website visitor.

First you must identify whether the editor is the cause of the broken character encoding or not.

To do this,

  1. disable JavaScript in your browser
  2. open one page in edit mode. You will get only a textarea to enter text, no editor at all
  3. enter some text with special characters
  4. submit and check the page in view mode.

If the special characters are still wrong, the editor is definitely not the cause of the problem.

Check the server configuration

This information relates to the Apache web server, as I unfortunately have no knowledge of other servers.

Check the HTTP_ACCEPT_CHARSET value on your Apache server:

Write the following PHP script, name it info.php for example, upload it to your server and call it in the browser.

Do not forget to delete this file after the check, for security reasons.

<?php
phpinfo();
?>

1. If you find the following information in the output, the server is configured well (for our purposes):

HTTP_ACCEPT_CHARSET UTF-8,*

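Configured well means that UTF-8 is accepted, possibly alongside other encodings. Note that PHP takes this value from the Accept-Charset header of the request, so it also depends on the browser you use for the check.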

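2. If you do not see UTF-8 there,

  • you can ask your hosting provider to add UTF-8 to your configuration
  • you can place a .htaccess file in the root of your web space or in the folder where you installed CMSimple

with the following content (AddCharset expects the charset followed by at least one file extension; the extensions listed here are an example):

AddCharset UTF-8 .htm .html .css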

If that doesn't work, you might try to force the charset of CMSimple with the following directive in .htaccess:

<FilesMatch "^index\.php$">
AddDefaultCharset UTF-8
</FilesMatch>

3. If you see that the server accepts UTF-8, but not as the first character set, for example like this:

HTTP_ACCEPT_CHARSET iso-8859-1,UTF-8,*

then some browsers will not use UTF-8, but iso-8859-1 instead.

In that case, ask your hoster to change the order of the character sets, or use a separate .htaccess file for your CMSimple directory (see above).
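
Independently of the phpinfo() output, it can help to check which charset the server actually declares in the Content-Type header of its response, since that is what AddDefaultCharset influences. The following Python sketch is only an illustration. The function names are made up for this example, and the URL passed to declared_charset() would be your own site.

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not header_value:
        return None
    msg = Message()
    msg["Content-Type"] = header_value
    charset = msg.get_param("charset")
    # get_param may return a tuple for RFC 2231 encoded values; ignore those.
    return charset.lower() if isinstance(charset, str) else None


def declared_charset(url):
    """Fetch url and return the charset the server declares for it."""
    with urlopen(url) as response:
        return charset_from_content_type(response.headers.get("Content-Type", ""))


if __name__ == "__main__":
    # Offline demonstration with two typical header values.
    print(charset_from_content_type("text/html; charset=UTF-8"))
    print(charset_from_content_type("text/html"))
```

If declared_charset() returns None for your start page, the server sends no charset at all and the browser falls back to the meta information in the page or to its own guess.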

CMSimple settings

The CMSimple settings and the active template must be synchronized:

  • log in to CMSimple
  • go to Settings / edit language / META and set the meta information like this:
meta_codepage: UTF-8
  • check that your active template adds no other codepage information
  • go to Settings / edit template and check that the template contains this string:
<?php echo head() ;?>
  • but no other codepage meta information; a statement like the following one is absolutely forbidden:
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />

because CMSimple adds the correct information to the output itself.
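
The template check can also be done on a downloaded copy of template.htm. The following Python sketch is only an illustration. The function name check_template and the two regular expressions are assumptions made for this example and not part of CMSimple.

```python
import re

# Matches <meta ...> tags that mention a charset, both in the
# http-equiv form and in the short <meta charset="..."> form.
META_CHARSET = re.compile(r"<meta\b[^>]*\bcharset\s*=[^>]*>", re.IGNORECASE)

# Matches the head() call, tolerating different spacing around the semicolon.
HEAD_CALL = re.compile(r"<\?php\s+echo\s+head\(\)\s*;?\s*\?>")


def check_template(source):
    """Return (has_head_call, hard_coded_meta_tags) for template source text."""
    has_head_call = HEAD_CALL.search(source) is not None
    return has_head_call, META_CHARSET.findall(source)


sample = """<head>
<?php echo head() ;?>
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />
</head>"""

ok, offenders = check_template(sample)
print("head() call found:", ok)
for tag in offenders:
    print("remove this tag:", tag)
```

A template is in good shape for this check when the first value is True and the list of hard-coded tags is empty.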

Further template-related actions

Collision of the file format

There might be a collision between the file format of your template and the related CSS stylesheet when they are saved in ANSI rather than in UTF-8 format.

So it is suggested that you

  • download both files, template.htm and stylesheet.css,
  • open them in a UTF-8-capable editor,
  • convert them with “Format / convert to UTF-8 without BOM” (without BOM is important!) and save them,
  • re-upload them to the template directory on your server.

Remember: you need an editor, not a word processor, for this step!

If you are a Windows user, I suggest Notepad++, which is a really good editor.
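
If many files have to be converted, the same step can be scripted. The following Python sketch is only an illustration. It assumes that “ANSI” means the Windows code page cp1252, which is typical for Western European systems but not guaranteed, and the function names are made up for this example. Work on backup copies.

```python
import codecs


def to_utf8_without_bom(raw, fallback="cp1252"):
    """Return the bytes in raw re-encoded as UTF-8 without a byte order mark.

    fallback is the assumed legacy encoding; adjust it for other locales.
    Raises UnicodeDecodeError if the bytes fit neither UTF-8 nor fallback.
    """
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]   # strip the BOM
    try:
        text = raw.decode("utf-8")         # already valid UTF-8?
    except UnicodeDecodeError:
        text = raw.decode(fallback)        # otherwise treat as legacy encoding
    return text.encode("utf-8")


def convert_file(path, fallback="cp1252"):
    """Convert one file in place."""
    with open(path, "rb") as handle:
        raw = handle.read()
    with open(path, "wb") as handle:
        handle.write(to_utf8_without_bom(raw, fallback))
```

A call such as convert_file("template.htm") would rewrite the downloaded template in place. Files that are already UTF-8 without BOM come out unchanged.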

One more step

If you wish, you can add the following directive as the very first line of the CSS file stylesheet.css in the active template directory: @charset "utf-8";

CMSimple Files: language file, content, plugin files etc.

It is helpful to convert other files to UTF-8 format as well.

To do so, download the files from your CMSimple installation, open them in the UTF-8 editor (Notepad++ for example), save them in UTF-8 without BOM format and re-upload them to your CMSimple directory on the server.

These files are essential:

- content/content.htm
- otherlanguagedirectory/content/content.htm
- the language-files in the folder cmsimple/languages
- check your plugins for relevant files, as they should also be saved in UTF-8 format:
- - language files
- - stylesheets
- - templates
- - include files
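
To find out which files of a downloaded installation still need conversion, a small scan can help. The following Python sketch is only an illustration. The list of file extensions and the function names are assumptions made for this example.

```python
import codecs
import os

# File types that typically contain text shown to visitors (an assumption).
CHECKED_EXTENSIONS = (".htm", ".html", ".php", ".css", ".js", ".xml", ".txt")


def encoding_problem(raw):
    """Return a short problem description for raw bytes, or None if fine."""
    if raw.startswith(codecs.BOM_UTF8):
        return "has BOM"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not UTF-8"
    return None


def scan(root):
    """Yield (path, problem) for every checked file under root."""
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.lower().endswith(CHECKED_EXTENSIONS):
                continue
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                problem = encoding_problem(handle.read())
            if problem:
                yield path, problem


if __name__ == "__main__":
    # Scans the current directory; run it inside the downloaded copy.
    for path, problem in scan("."):
        print(problem, path)
```

Pure ASCII files pass this check, because ASCII is a subset of UTF-8, so only files with a BOM or with bytes from a legacy encoding are reported.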

FCKeditor configuration

The FCKeditor, as we distribute it in the FCKeditor4CMSimple package, is UTF-8 by default. Older versions of FCKeditor do not fully support UTF-8, but you should not use those old versions anyway.

The FCKEditor-Website at http://docs.fckeditor.net/FCKeditor_2.x/Developers_Guide/Localization says: “An important thing is to save the files in the UTF-8 encoded text format. Otherwise, some strange characters could appear instead of any special characters used by different languages, like accented letters, symbols, etc.”

The editor uses some configuration files which you might edit. If so, check that they are all saved in UTF-8 format. These files and their locations are defined in the editor configuration:

FCKConfig.StylesXmlPath = '../fckstyles.xml' ;
FCKConfig.TemplatesXmlPath = '/mytemplates.xml' ;

There are also language files for the editor in the directory “editor/lang”, which you can check for the correct format.

If you use editor plugins with the FCKeditor, check whether these plugins contain language files etc. that might be stored in the wrong format.

FCKeditor4CMSimple keeps the relevant configuration files in the folder “custom_configurations”:

  • custom_fck_editorarea.css
  • custom_fckstyles.xml
  • custom_fcktemplates.xml
  • fckconfig_cmsimple.js

Check whether they are stored in the correct file format.

Additional information and weblinks

There was a suggestion to add a codepage directive to the forms you use with CMSimple, so that user input is stored in UTF-8 format in all cases. To achieve this, write the form definitions in your pages and articles like this:

<form action="...." method="...." accept-charset="UTF-8">

There is a lot of information about UTF-8 on the web; a very interesting source is this:

Can I use UTF-8 on the Web?

editors/correct_encoding_of_cmsimple_content.txt · Last modified: 2018/04/05 14:48 (external edit)