phplist

ID: 0001644
Project: phplist application
Category: HTML Email Support
View Status: public
Date Submitted: 30-08-04 04:27
Last Update: 17-05-11 15:48
Reporter: kiang
Priority: high
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Platform:
OS:
OS Version:
Product Version: 2.8.11
Target Version: 4.0.x
Fixed in Version: 2.11.6
Summary: 0001644: Subject error with UTF-8 encode in Traditional Chinese
Description: I'm trying to use phplist in a UTF-8 environment with Traditional Chinese. Everything seems OK except the subject: some characters in the subject are turned into meaningless ones. Could anyone tell me which of the scripts I should look at to fix this problem? I'm still looking into the code.
Additional Information: Version: 2.8.11
I've changed the encoding of the language file and the config for both text and HTML emails to UTF-8.
Tags: No tags attached.
Attached Files: utf8_fix_for_svn_r1703.diff (3,378 bytes) 21-01-10 10:13

- Relationships
related to 0003139 (resolved, michiel): Hebrew support
related to 0003721 (closed): phplist 2.10.x
related to 0004079 (resolved, michiel): corrupted russian message subject and from fields when editing
related to 0011562 (resolved, michiel): Random Character Encoding Bug in SHIFT-JIS Japanese emails body & Subject
related to 0011585 (resolved, michiel): Custom Placeholders / Attributes with special characters in HTML-area
related to 0005528 (resolved, michiel): Overwriten config value
parent of 0013382 (resolved, michiel): Encoding problems
parent of 0013291 (resolved, support): HTML Email Support and character entity encoding
parent of 0015536 (resolved, michiel): Wrong encoding for text version through PhpMailer
has duplicate 0015245 (resolved, user4402): Message footer does not display special characters, like é ó ö ü etc.
has duplicate 0014238 (resolved, user4402): Wrong encoding of pages
has duplicate 0015250 (resolved, user4402): RSS feeds encoded in ISO-8859-1 do not display correctly in UTF-8 encoded messages
has duplicate 0015258 (resolved, user4402): email body being sended with UTF-8 encoding
has duplicate 0009309 (resolved, user4402): Special characters (ä, ö, é, ç, ã etc.) do not display correctly with UTF-8 charset selected
has duplicate 0015159 (resolved, user4402): can dispaly chinese properly, and I also upload simplfied chinese , pls update it. thanks.
related to 0008134 (resolved, michiel): Send Message - After "Save Changes" - Hebrew Subject broken
related to 0015241 (resolved, user4391): Subject will empty when we edit the message
related to 0015324 (resolved, michiel): Subject and From turn to Gibberish when saved not in English
related to 0015362 (resolved, michiel): overall handling of charsets
related to 0015407 (resolved, michiel): pagetop seems not to be included
related to 0015298 (resolved, michiel): userdata substitution in URL not working for UTF databases

-  Notes
(0002132)
michiel (manager)
04-10-04 13:36

As I wouldn't know how to solve that, do you have any tips of how to make that work?
(0005995)
kiang (reporter)
07-08-05 20:12

I found the problem.

On lines 869 and 875 of 'lists/admin/send_core.php', the script uses the htmlentities function to process characters. In my environment it works better if the function is called in the following form:

htmlentities($subject, ENT_QUOTES, $_SESSION['adminlanguage']['charset'])

I don't know if this causes problems in other environments. I could always solve the same problem on other pages. :)
(0010629)
user1177
13-02-06 16:53

Did this get resolved in 2.10?
(0019579)
michiel (manager)
04-10-06 20:21

Instead of using the admin language from the session, I've hardcoded UTF-8, because someone may well have the interface in English but want to send Chinese. Let's see if that sorts it.
(0043080)
nordblad (reporter)
18-03-08 20:49

htmlentities($subject,ENT_QUOTES,'UTF-8')

gives me problems with Swedish special characters (åäö). The message subject just disappears when I click "Save Changes". I guess my input is in ISO-8859-1, because

htmlentities($subject,ENT_QUOTES,'ISO-8859-1')

fixes it. So does

htmlspecialchars($subject,ENT_QUOTES),

which looks even nicer to me. Wouldn't that work for Chinese too?
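
A minimal sketch of the behaviour described in this note, assuming PHP 5.4 or later, where htmlentities() returns an empty string when the input is not valid in the named charset (and neither ENT_IGNORE nor ENT_SUBSTITUTE is set). The byte strings stand in for a subject typed into a form; $latin1Subject and $utf8Subject are illustrative names, not phplist variables.

```php
<?php
// "åäö" as submitted by a form page served as ISO-8859-1 (one byte per letter).
$latin1Subject = "\xE5\xE4\xF6";
// Two Chinese characters as UTF-8 bytes, written as escapes so the
// encoding of this source file does not matter.
$utf8Subject = "\xE4\xB8\xAD\xE6\x96\x87";

// Declared charset does not match the bytes: the whole string is rejected.
$mismatch = htmlentities($latin1Subject, ENT_QUOTES, 'UTF-8');

// Declared charset matches the bytes: named entities are produced.
$match = htmlentities($latin1Subject, ENT_QUOTES, 'ISO-8859-1');

// htmlspecialchars() only rewrites & < > " ' and leaves other characters
// alone, so it also works for Chinese when the charset matches the input.
$chinese = htmlspecialchars($utf8Subject . ' "test"', ENT_QUOTES, 'UTF-8');

var_dump($mismatch === '');
var_dump($match);
var_dump($chinese === $utf8Subject . ' &quot;test&quot;');
```

Passing the charset explicitly matters because the default for the third argument has changed across PHP versions (ISO-8859-1 before 5.4, UTF-8 in 5.4 and 5.5, the default_charset ini setting from 5.6).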
(0050558)
h2b2 (manager)
17-03-09 22:06

A somewhat similar issue involving PHP 5.2.5 was discussed in the PHP bug tracker: http://bugs.php.net/bug.php?id=43549

This discussion seems to indicate that if you set the htmlentities charset to UTF-8, you'll need to make sure that the charset of the HTML page containing the form is also set to UTF-8.

Currently the 'send a message' page produced by phplist is set to:
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

Instead of:
   <meta http-equiv="content-type" content="text/html; charset=utf-8" />
(0050565)
h2b2 (manager)
20-03-09 06:22

On my system, entering special characters in the subject field did not make the subject disappear completely. Only the special characters disappeared, while normal characters were still displayed, e.g.: "This is a Tést" would display as "This is a Tst".

So I ran a quick test, which confirms my previous note: you need to set the content-type of the HTML page holding the input form fields to UTF-8 if you want htmlentities($subject,ENT_QUOTES,'UTF-8') to work correctly. If <meta http-equiv="content-type" content="text/html; charset=utf-8" /> is used, the subject "This is a Tést" is displayed in full in the received text and HTML messages.

To do so I had to change $strCharSet in english.inc from ISO-8859-1 to utf-8. Other language files than english.inc would probably need recoding all special characters to UTF-8.
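
The mismatch between page charset and escaping charset can be made concrete with a short sketch. It assumes the iconv extension is available and uses preg_match('//u', ...) as a UTF-8 validity check; is_valid_utf8 and form_input_to_utf8 are hypothetical helpers, not phplist functions.

```php
<?php
// Returns true when $bytes is a well-formed UTF-8 string.
function is_valid_utf8($bytes)
{
    // The /u modifier makes PCRE reject subjects that are not valid UTF-8.
    return preg_match('//u', $bytes) === 1;
}

// Converts text submitted by a form into UTF-8, given the charset the
// form page was served with.
function form_input_to_utf8($bytes, $pageCharset)
{
    if (strcasecmp($pageCharset, 'UTF-8') === 0) {
        return $bytes;
    }
    return iconv($pageCharset, 'UTF-8', $bytes);
}

$fromLatin1Page = "This is a T\xE9st";      // "é" as one ISO-8859-1 byte
$fromUtf8Page   = "This is a T\xC3\xA9st";  // "é" as two UTF-8 bytes

var_dump(is_valid_utf8($fromLatin1Page));
var_dump(is_valid_utf8($fromUtf8Page));
var_dump(form_input_to_utf8($fromLatin1Page, 'ISO-8859-1') === $fromUtf8Page);
```

The same visible subject arrives as different bytes depending on the page charset, which is why an escaping call that names UTF-8 only behaves when the form page is served as UTF-8 as well.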

My test system is configured as follows.
Configuration page:
 - Charset for HTML messages: UTF-8
 - Charset for Text messages: UTF-8

config.php:
 - $language_module = "english.inc"; # with this change in english.inc: $strCharSet = 'utf-8';
 - define("HTMLEMAIL_ENCODING","quoted-printable");
 - define("TEXTEMAIL_ENCODING",'7bit');

Server info:
phplist 2.10.9
Linux/Apache
PHP 5.2.3
MySQL 4.1.12 - with *database encoding* set to: utf8_unicode_ci


Some remarks:
- I expected the charset for the backend's html pages to be defined by the settings in languages.php, e.g.:
    "en" => array("English ","iso-8859-1","iso-8859-1, windows-1252 "),
This is not the case, and I'm not sure what exactly the languages.php charset settings are used for.
- I wonder whether it is a good idea to hardcode UTF-8 anywhere in the code. It would seem more flexible to have the charset configurable, e.g. through the charset defined on the configuration page.
- Inclusion of UTF-8 encoded .inc language files should perhaps be considered for future phplist releases, along with the existing iso-* encoded files.
- Installation procedures (and documentation) could perhaps include giving the user a choice of charsets to use for database encoding.
(0050577)
h2b2 (manager)
22-03-09 20:50

I found an interesting article which identifies different aspects that come into play in a PHP/MySQL/UTF-8 application. These are the principal ones:
- the database (individual tables + any text columns) should be set to UTF-8
- the PHP server should send a header telling the browser to expect UTF-8, e.g.:
     header('Content-Type: text/html; charset=utf-8' );
- the HTML page's Content-Type should be set to UTF-8, i.e.:
     <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
- and, the PHP-MySQL connection should be set to UTF-8, since it will otherwise default to latin1

The article also provides a useful solution (based on SET NAMES) and some code examples.
For more info, please see: http://www.adviesenzo.nl/examples/php_mysql_charset_fix/
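
The header, meta tag and connection points from the list above can be sketched as three small string builders. This is only an illustration under assumptions: the function names are hypothetical, and the charset names passed in ('UTF-8' for HTTP/HTML, 'utf8' for MySQL) are examples. The database itself still has to be created or converted with a matching character set.

```php
<?php
// HTTP header line telling the browser which charset the page uses.
function charset_http_header($charset)
{
    return 'Content-Type: text/html; charset=' . strtolower($charset);
}

// Matching <meta> tag for the page's <head>.
function charset_meta_tag($charset)
{
    return '<meta http-equiv="Content-Type" content="text/html; charset='
        . htmlspecialchars(strtolower($charset), ENT_QUOTES, 'UTF-8') . '" />';
}

// SQL statement that switches the client connection to a MySQL charset.
function charset_set_names_sql($mysqlCharset)
{
    // Only allow plain identifiers, since the value is placed into SQL text.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $mysqlCharset)) {
        throw new InvalidArgumentException('Unexpected charset name');
    }
    return "SET NAMES '" . $mysqlCharset . "'";
}

echo charset_http_header('UTF-8'), "\n";
echo charset_meta_tag('UTF-8'), "\n";
echo charset_set_names_sql('utf8'), "\n";
```

In a real page the first string would be passed to header() before any output is sent, and the last one would be run on the database connection right after connecting.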
(0050579)
h2b2 (manager)
23-03-09 01:06

Additional server related info from my test system, collected by using the following query in phpMyAdmin: SHOW VARIABLES LIKE 'character_set_%'

character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_results utf8
character_set_server latin1
character_set_system utf8
(0050580)
michiel (manager)
23-03-09 02:32

great, that's very helpful research. thanks for that.
(0050581)
h2b2 (manager)
23-03-09 04:02

Glad to help.

Just a few additional comments regarding hardcoding the charset. Currently UTF-8 has been hardcoded in the following files (v2.10.9):

processbounces.php
  $message = html_entity_decode($message,ENT_QUOTES,'UTF-8');
  
processqueue.php
  $line = html_entity_decode($line,ENT_QUOTES,'UTF-8');
      
sendemaillib.php
  $text = html_entity_decode ( $text , ENT_QUOTES , 'UTF-8' );
        
send_core.php
  value="'.htmlentities($subject,ENT_QUOTES,'UTF-8').'"
  value="'.htmlentities($from,ENT_QUOTES,'UTF-8').'"
  value="'.htmlentities($forwardsubject,ENT_QUOTES,'UTF-8').'"
    
class.phplistmailer.php
  $this->Body = html_entity_decode($text ,ENT_QUOTES, 'UTF-8' ); #$text;
  $this->AltBody .= html_entity_decode($text ,ENT_QUOTES, 'UTF-8' );#$text;
  $this->Body .= html_entity_decode($text ,ENT_QUOTES, 'UTF-8' );#$text;
                
So, the problem will probably not only occur in the subject line, but most likely also in the From: (name) line, and the forward subject line. (Haven't checked the forum for reports on this yet).

It seems to me that instead of hardcoding UTF-8, it might be an improvement if all hardcoded instances of 'UTF-8' were replaced by something like $GLOBALS['strCharSet']
In that case, it is the .inc language file's encoding that will determine the charset for the whole phplist system (frontend, backend, and backend input fields) except for the Charset settings for HTML and Text messages on the configuration page, and except for the encoding for the database and the database connection.

While this would be an improvement, it is not yet an ideal situation, considering that things may still go wrong if the database encoding and/or database connection isn't compatible with the charset, or if the user forgets to change the charset used for message encoding (configuration page) for instance.

I guess the best way to solve this, would be to have a phplist installation script give the user a choice from a number of charsets. The installation script should then make sure the whole system, including the database/db connection, is made ready for use under the chosen charset.
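
A minimal sketch of the suggestion to replace the hardcoded 'UTF-8' with $GLOBALS['strCharSet']. The wrapper names (phplist_charset, phplist_escape, phplist_unescape) are hypothetical and do not exist in phplist; the fallback to UTF-8 when the global is unset is also an assumption.

```php
<?php
// Charset taken from the language file's $strCharSet, with a fallback.
function phplist_charset()
{
    $charset = isset($GLOBALS['strCharSet']) ? $GLOBALS['strCharSet'] : '';
    return $charset !== '' ? $charset : 'UTF-8';
}

// Escape a value for use inside an HTML attribute such as value="...".
function phplist_escape($value)
{
    return htmlentities($value, ENT_QUOTES, phplist_charset());
}

// Reverse of phplist_escape(), for text that goes into the outgoing mail.
function phplist_unescape($value)
{
    return html_entity_decode($value, ENT_QUOTES, phplist_charset());
}

// With an ISO-8859-1 language file, the single byte 0xE9 is "é".
$GLOBALS['strCharSet'] = 'ISO-8859-1';
$escapedLatin1 = phplist_escape("T\xE9st");

// With a UTF-8 language file, the entity decodes to the two-byte form.
$GLOBALS['strCharSet'] = 'utf-8';
$decodedUtf8 = phplist_unescape('T&eacute;st');

var_dump($escapedLatin1);
var_dump($decodedUtf8 === "T\xC3\xA9st");
```

Routing every call through one helper means the charset is decided in a single place, though, as the note says, the database and connection encoding still have to agree with it.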
(0050612)
h2b2 (manager)
02-04-09 22:49

-
Probably related to:
http://mantis.phplist.com/view.php?id=5017
http://mantis.phplist.com/view.php?id=9309
http://mantis.phplist.com/view.php?id=13382
http://mantis.phplist.com/view.php?id=13291
http://mantis.phplist.com/view.php?id=14238
http://mantis.phplist.com/view.php?id=15241
http://mantis.phplist.com/view.php?id=15245

Possibly related to:
http://mantis.phplist.com/view.php?id=15250


Also found a report on the forum which seems to confirm the issue also occurs when the name in the From: line contains special characters.
See: http://forums.phplist.com/viewtopic.php?p=61323#61323
(0050620)
h2b2 (manager)
09-04-09 08:09

-
A useful suggestion from the forum:

==== Start Quote ====

to support encoding properly you should add these lines:

mysql_query("SET CHARACTER_SET_CLIENT=utf8");
mysql_query("SET CHARACTER_SET_RESULTS=utf8");
mysql_query("SET CHARACTER_SET_CONNECTION=utf8");


to mysql.inc

==== End Quote ====
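
As a hedged sketch, the three statements from the quote can be applied through whatever query function the installation uses. The mysql_* functions were removed in PHP 7, so the example takes the query function as a callable instead of calling mysql_query directly. apply_connection_charset is a hypothetical helper, and the recording closure exists only so the example runs without a database.

```php
<?php
// Sends the three charset statements through the supplied query callable.
function apply_connection_charset($runQuery, $charset = 'utf8')
{
    // Only allow plain identifiers, since the value is placed into SQL text.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException('Unexpected charset name');
    }
    $variables = array(
        'CHARACTER_SET_CLIENT',
        'CHARACTER_SET_RESULTS',
        'CHARACTER_SET_CONNECTION',
    );
    foreach ($variables as $variable) {
        call_user_func($runQuery, 'SET ' . $variable . '=' . $charset);
    }
}

// Stand-in for a real connection: record the SQL instead of executing it.
$sent = array();
apply_connection_charset(function ($sql) use (&$sent) {
    $sent[] = $sql;
});
print_r($sent);
```

With mysqli, the callable could be something like function ($sql) use ($db) { return mysqli_query($db, $sql); }, where $db is an already open connection. MySQL documents SET NAMES as equivalent to setting these same three session variables, so a single SET NAMES statement is an alternative.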
(0050837)
adrian15 (reporter)
21-01-10 10:17

A similar problem appears in revision 1703 from the svn.
I am trying to write in Spanish but I think it applies to other languages also.
I attach a patch (I think it is not a definitive patch but a workaround) to solve this problem.

In my opinion, most of the problems I have might come from the fact that, for whatever reason, the pagetop page is not included in any of the admin pages.

But I am not quite sure, because I am not an expert on these UTF-8 issues.

adrian15
(0051065)
haipo (reporter)
01-08-10 20:34

Hello,
After carefully reading all the explanations and trying everything described here,
I still have the same problem described above.
When I write the subject in Hebrew, strange signs appear and the text is corrupted.
The same happens with the "From" field.
I have chosen the language text "hebrew-utf8".
I set Charset for HTML messages = UTF-8 and Charset for Text messages = UTF-8.
And still no use.
Please advise,
Yaron, Haipo.co.il
(0051115)
h2b2 (manager)
06-10-10 04:11

For more in-depth info on MySQL encoding pitfalls and solutions, see: http://mysql.rjweb.org/doc.php/charcoll
(0051116)
h2b2 (manager)
06-10-10 04:52

@haipo: since v2.10.11, you will also need to set the charset of admin interface pages to UTF-8.
See this forum post for more info: http://forums.phplist.com/viewtopic.php?p=80089#p80089

