Convert Accented Characters to Non-Accented in PHP

In the wild, even English texts often contain both accented and non-accented spellings of certain words, such as brand names.

You then face two options: work with both versions, or strip the accents. The latter reduces the dimensionality of the problem, since you no longer have to support every existing spelling variant. But how do you get there?

PHP, together with the iconv extension, provides a simple solution. Suppose you accept the brand name as a GET parameter called brand-name. Then, to get the non-accented version, all you have to do is:

setlocale(LC_ALL, "en_US.utf8");
$brand_name = iconv("utf-8", "us-ascii//TRANSLIT", $_GET['brand-name']);

That simple.
The first call, setlocale(), makes sure the correct locale is used for string handling (the default is often “C”, under which iconv may replace accented characters with “?” instead of transliterating them).
The second call, iconv(), converts the string from UTF-8 to ASCII, transliterating characters where needed.
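
Both calls can fail quietly: setlocale() returns false when the requested locale is not installed, and iconv() returns false on invalid input. The sketch below wraps the two calls in a helper that checks both results. The function name to_ascii() and its null-on-failure convention are illustrative assumptions, not part of PHP or the original snippet, and the exact transliteration output depends on the iconv implementation of the system.

```php
<?php
// Hypothetical helper (the name and return convention are assumptions).
// Returns the transliterated ASCII string, or null if conversion failed.
function to_ascii(string $text, string $locale = 'en_US.utf8'): ?string
{
    // Passing '0' only reads the current LC_CTYPE setting without changing it.
    $previous = setlocale(LC_CTYPE, '0');

    // setlocale() returns false when the requested locale is unavailable.
    $switched = setlocale(LC_CTYPE, $locale) !== false;

    // iconv() returns false (and raises a notice) on invalid input;
    // the notice is suppressed here because the result is checked below.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);

    // Restore the previous locale so the rest of the request is unaffected.
    if ($switched && $previous !== false) {
        setlocale(LC_CTYPE, $previous);
    }

    return $ascii === false ? null : $ascii;
}
```

It could then be called as to_ascii($_GET['brand-name'] ?? ''), treating a null result as "could not convert". Switching only LC_CTYPE rather than LC_ALL is a design choice of this sketch: it leaves number and date formatting alone.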

Note: On *nix systems, you can check for supported locales using `locale -a`.
Note: Instead of hard-coding the locale, you could also use the values of HTTP_ACCEPT_LANGUAGE and HTTP_ACCEPT_CHARSET, sent by the browser, to tailor the input locale.
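
As a rough illustration of the second note, the sketch below picks a locale from an Accept-Language header value. The function name pick_locale() and the mapping from language tags to locale names are assumptions made for this example. The locale names must match what is actually installed on the server, and the parsing is simplified rather than a full implementation of the HTTP specification.

```php
<?php
// Hypothetical helper: chooses a system locale from an Accept-Language value.
// $available maps lowercase language tags to locale names assumed to be installed.
function pick_locale(string $header, array $available, string $default): string
{
    $ranked = [];
    foreach (explode(',', $header) as $index => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue;
        }
        // The quality value defaults to 1.0 when no q= parameter is given.
        $q = 1.0;
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $ranked[] = [$tag, $q, $index];
    }

    // Sort by quality, highest first; ties keep the order of the header.
    usort($ranked, fn($a, $b) => [$b[1], $a[2]] <=> [$a[1], $b[2]]);

    foreach ($ranked as [$tag]) {
        // Try the full tag first (e.g. "de-at"), then its primary language ("de").
        if (isset($available[$tag])) {
            return $available[$tag];
        }
        $primary = explode('-', $tag)[0];
        if (isset($available[$primary])) {
            return $available[$primary];
        }
    }
    return $default;
}
```

The result could be passed straight to setlocale(), for example with $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '' as the header value and "en_US.utf8" as the default.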
