<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Reasons for using UTF-8</title>
	<atom:link href="http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/feed/" rel="self" type="application/rss+xml" />
	<link>http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/</link>
	<description>repeat 4[fd 100 rt 90]</description>
	<lastBuildDate>Mon, 30 Jan 2012 21:18:07 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: sole</title>
		<link>http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47115</link>
		<dc:creator>sole</dc:creator>
		<pubDate>Wed, 19 Dec 2007 15:03:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47115</guid>
		<description>I understand (and I agree with it): sometimes it&#039;s not easy to switch to UTF-8. For example, I have to do some work for a certain website which currently has a mix of Latin and UTF-8 content, and it looks like it will take quite a bit of work!

But once that&#039;s sorted out &quot;you just forget about that&quot;.

Unfortunately I don&#039;t have any experience with programming Windows controls in Unicode or UTF-8... The last time I did any Windows GUI work was with MFC, and I can hardly remember anything! Sorry :-/</description>
		<content:encoded><![CDATA[<p>I understand (and I agree with it): sometimes it&#8217;s not easy to switch to UTF-8. For example, I have to do some work for a certain website which currently has a mix of Latin and UTF-8 content, and it looks like it will take quite a bit of work!</p>
<p>But once that&#8217;s sorted out &#8220;you just forget about that&#8221;.</p>
<p>Unfortunately I don&#8217;t have any experience with programming Windows controls in Unicode or UTF-8&#8230; The last time I did any Windows GUI work was with MFC, and I can hardly remember anything! Sorry :-/</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Norbert</title>
		<link>http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47114</link>
		<dc:creator>Norbert</dc:creator>
		<pubDate>Wed, 19 Dec 2007 13:40:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47114</guid>
		<description>Sole, I agree with you on many points. In fact, our team is thinking about converting our Windows application from Latin-1 to UTF-8 instead of going all the way to Unicode. The reason is that not all places in our code (several hundred thousand lines) are written cleanly enough that just flipping a switch in the project settings will bring us to Unicode. It seems to me a lot less painful to stay with an 8-bit-based encoding like UTF-8 and nevertheless enjoy all the advantages of Unicode.

That, however, means talking to all Windows controls in Unicode (via the ...W calls) while the main application code remains 8-bit based. Has anybody attempted this approach before us, or would you deem it more troublesome in the end than just fixing all the &quot;unclean&quot; places in the code, e.g. where literals are used without T(&quot; .... &quot;)?</description>
		<content:encoded><![CDATA[<p>Sole, I agree with you on many points. In fact, our team is thinking about converting our Windows application from Latin-1 to UTF-8 instead of going all the way to Unicode. The reason is that not all places in our code (several hundred thousand lines) are written cleanly enough that just flipping a switch in the project settings will bring us to Unicode. It seems to me a lot less painful to stay with an 8-bit-based encoding like UTF-8 and nevertheless enjoy all the advantages of Unicode.</p>
<p>That, however, means talking to all Windows controls in Unicode (via the &#8230;W calls) while the main application code remains 8-bit based. Has anybody attempted this approach before us, or would you deem it more troublesome in the end than just fixing all the &#8220;unclean&#8221; places in the code, e.g. where literals are used without T(&quot; &#8230;. &quot;)?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: sole</title>
		<link>http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47018</link>
		<dc:creator>sole</dc:creator>
		<pubDate>Thu, 13 Dec 2007 22:16:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47018</guid>
		<description>Very good point actually. I unconsciously had it in mind but it failed to materialise when writing this.

And it&#039;s definitely right. I&#039;ve had virtually zero problems when working with Mac and Linux together, whereas with Windows there&#039;s always the problem with encodings and line feeds (although line feeds do not show up as weird characters in the middle of the page).</description>
		<content:encoded><![CDATA[<p>Very good point actually. I unconsciously had it in mind but it failed to materialise when writing this.</p>
<p>And it&#8217;s definitely right. I&#8217;ve had virtually zero problems when working with Mac and Linux together, whereas with Windows there&#8217;s always the problem with encodings and line feeds (although line feeds do not show up as weird characters in the middle of the page).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Isaac Z. Schlueter</title>
		<link>http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47017</link>
		<dc:creator>Isaac Z. Schlueter</dc:creator>
		<pubDate>Thu, 13 Dec 2007 21:57:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-47017</guid>
		<description>Sole,

Another reason to use UTF-8 in all code and databases is that it greatly facilitates compatibility between team members.   Enforcing it requires a bit of buy-in from the team, but it&#039;s worth it.

On my current project at Yahoo, some of us use Macs, and others use Windows.  A few developers write their code on RHEL or FreeBSD, either in Vim or Eclipse.  The servers run either FreeBSD or RHEL.

Once you set up your editor and databases to use UTF-8, it&#039;s immediately apparent when someone is saving in ISO-8859-1.  The bad characters stick out like a sore thumb. The site looks wrong if they&#039;re in the markup.  Our policy is to stop everything, immediately change the bad characters, check the file back into CVS, and then look at the CVS logs and figure out who&#039;s not using UTF-8.  It&#039;s very effective, because no one wants to be &quot;that guy.&quot;

The default encoding on most *nix systems (including Mac) is UTF-8, but the default in Windows for Western locales is Windows-1252 (a superset of ISO-8859-1), and some *nix programs try to be &quot;convenient&quot; by silently supporting the Windows encoding.  Unless we are all vigilant, it will cause problems.

@Remco

&lt;blockquote&gt;people in Asia aren&#039;t very happy with it - western characters take 1 or 2 bytes, but many Asian ones take 3, 4 or even 5.&lt;/blockquote&gt;

Forgive the insensitivity, but they need to get over it.  At least for the foreseeable future, UTF-8 is the most widely supported Unicode encoding worldwide.  There are editors for every Asian language that can save in UTF-8.  As long as a request is all in one language, the extra bytes are mostly taken care of by serving gzip-encoded pages, anyhow.  (If you&#039;re not telling your web server to gzip textual files, why not? It&#039;s not 1990. Browsers actually support gzip these days!)

Hardware increases in power and capacity with unbelievable speed, and software is necessarily fraught with irreducible complexity. So, whenever possible, it seems generally best to make the hardware do some extra work (storing and encoding 3 or 4 bytes per character instead of 2) if doing so will make the software simpler and easier to understand (by eliminating the character encoding layer).</description>
		<content:encoded><![CDATA[<p>Sole,</p>
<p>Another reason to use UTF-8 in all code and databases is that it greatly facilitates compatibility between team members.   Enforcing it requires a bit of buy-in from the team, but it&#8217;s worth it.</p>
<p>On my current project at Yahoo, some of us use Macs, and others use Windows.  A few developers write their code on RHEL or FreeBSD, either in Vim or Eclipse.  The servers run either FreeBSD or RHEL.</p>
<p>Once you set up your editor and databases to use UTF-8, it&#8217;s immediately apparent when someone is saving in ISO-8859-1.  The bad characters stick out like a sore thumb. The site looks wrong if they&#8217;re in the markup.  Our policy is to stop everything, immediately change the bad characters, check the file back into CVS, and then look at the CVS logs and figure out who&#8217;s not using UTF-8.  It&#8217;s very effective, because no one wants to be &#8220;that guy.&#8221;</p>
<p>The default encoding on most *nix systems (including Mac) is UTF-8, but the default in Windows for Western locales is Windows-1252 (a superset of ISO-8859-1), and some *nix programs try to be &#8220;convenient&#8221; by silently supporting the Windows encoding.  Unless we are all vigilant, it will cause problems.</p>
<p>@Remco</p>
<blockquote><p>people in Asia aren&#8217;t very happy with it &#8211; western characters take 1 or 2 bytes, but many Asian ones take 3, 4 or even 5.</p></blockquote>
<p>Forgive the insensitivity, but they need to get over it.  At least for the foreseeable future, UTF-8 is the most widely supported Unicode encoding worldwide.  There are editors for every Asian language that can save in UTF-8.  As long as a request is all in one language, the extra bytes are mostly taken care of by serving gzip-encoded pages, anyhow.  (If you&#8217;re not telling your web server to gzip textual files, why not? It&#8217;s not 1990. Browsers actually support gzip these days!)</p>
<p>Hardware increases in power and capacity with unbelievable speed, and software is necessarily fraught with irreducible complexity. So, whenever possible, it seems generally best to make the hardware do some extra work (storing and encoding 3 or 4 bytes per character instead of 2) if doing so will make the software simpler and easier to understand (by eliminating the character encoding layer).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Cid R Andrade</title>
		<link>http://soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-46973</link>
		<dc:creator>Cid R Andrade</dc:creator>
		<pubDate>Tue, 11 Dec 2007 00:10:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.soledadpenades.com/2007/12/03/reasons-for-using-utf-8/#comment-46973</guid>
		<description>Posted about it in my blog</description>
		<content:encoded><![CDATA[<p>Posted about it in my blog</p>
]]></content:encoded>
	</item>
</channel>
</rss>

