
Key: CRM-1439
Type: Bug
Status: Closed
Resolution: Fixed/Completed
Priority: Minor
Assignee: Sameer Mare
Reporter: Curtis Delaney
Votes: 0
Watchers: 1
CiviCRM

Google geocode error when processing accented characters

Created: 05/Dec/06 09:41 AM   Updated: 08/Dec/08 12:19 PM
Component/s: Internationalisation, Technical infrastructure
Affects Version/s: 1.6
Fix Version/s: 1.6

Time Tracking:
Not Specified


Description
When trying to geocode an address containing French accented characters, an error is generated:

" warning: simplexml_load_string() [function.simplexml-load-string]:
Entity: line 1: parser error : Input is not proper UTF-8 "


The error occurs in Google.php, line 107:

$xml = simplexml_load_string( $string );

This can be fixed with a small function that removes accents.

A function to do this could be:
 
function unaccent($text) {
  static $search, $replace;
  if (!$search) {
    $search = $replace = array();
    // Get the HTML entities table into an array
    $trans = get_html_translation_table(HTML_ENTITIES);
    // Go through the entity mappings one-by-one
    foreach ($trans as $literal => $entity) {
      // Make sure we don't process any other characters
      // such as fractions, quotes etc:
      if (ord($literal) >= 192) {
        // Get the accented form of the letter
        $search[] = $literal;
        // Get e.g. 'E' from the string '&Eacute'
        $replace[] = $entity[1];
      }
    }
  }
  return str_replace($search, $replace, $text);
}

and line 107 would then become:

$xml = simplexml_load_string( unaccent($string) );

Comments
Piotr Szotkowski added a comment - 07/Dec/06 04:54 AM
Hm, that's very strange; we're using UTF-8 throughout the whole of CiviCRM - why would your instance send anything other than UTF-8 to the geocode service?

What is the encoding of your CMS/user framework? Are you using Drupal or Joomla?

Curtis Delaney added a comment - 07/Dec/06 06:49 AM
I've looked into this further, and the problem actually lies in the data that Google returns. If the response contains an address with an accented character, the error occurs.

For reference, our setup is as follows:

We are using CiviCRM 7822, Drupal 4.7, PHP 5.1.6, and MySQL 5.0.22.

Curtis Delaney added a comment - 07/Dec/06 06:55 AM
Here is a sample Google response that causes the problem. Note the accent in Montréal.

<kml>
<Response>
<name>?7012 1er ave QC H2A3H7</name>
<Status>
<code>200</code>
<request>geocode</request>
</Status>

<Placemark>
<address>7012 1E AV, Montréal, QC, Canada</address>
<AddressDetails Accuracy="8">
<Country>
<CountryNameCode>CA</CountryNameCode>
<AdministrativeArea>
<AdministrativeAreaName>QC</AdministrativeAreaName>
<Locality>
<LocalityName>Montréal</LocalityName>
<Thoroughfare>
<ThoroughfareName>7012 1E AV</ThoroughfareName>
</Thoroughfare>
</Locality>
</AdministrativeArea>
</Country>
</AddressDetails>
<Point>
<coordinates>-73.598195,45.554040,0</coordinates>
</Point>
</Placemark>

</Response>
</kml>
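
For illustration only, here is a minimal sketch of how one might confirm that a response like the one above is not valid UTF-8. It assumes the accented character arrives as a single ISO-8859-1 byte, which is consistent with the parser error but is not shown in the thread itself. The helper name isValidUtf8 is hypothetical and not part of CiviCRM.

```php
<?php
// Hypothetical helper (not from CiviCRM): reports whether a byte string is valid UTF-8.
// With the /u modifier PCRE validates the subject first, so preg_match()
// returns 1 only when the bytes form valid UTF-8.
function isValidUtf8($bytes) {
    return preg_match('//u', $bytes) === 1;
}

// Assumption: "Montréal" sent as ISO-8859-1, where é is the single byte 0xE9.
$latin1 = "<LocalityName>Montr\xE9al</LocalityName>";
// The same text as UTF-8, where é is the two-byte sequence 0xC3 0xA9.
$utf8 = "<LocalityName>Montr\xC3\xA9al</LocalityName>";

var_dump(isValidUtf8($latin1)); // bool(false)
var_dump(isValidUtf8($utf8));   // bool(true)
```

A check like this on the string returned by getResponseBody() would show whether the bytes or the parser are at fault before any conversion is attempted.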

Piotr Szotkowski added a comment - 11/Dec/06 06:17 AM
Assuming that you have the iconv support compiled in, can you check whether adding the line

$string = iconv('ISO-8859-1', 'UTF-8', $string);

between the getResponseBody() call and the simplexml_load_string() call in CRM/Utils/Geocode/Google.php fixes this for you?

Curtis Delaney added a comment - 11/Dec/06 07:23 AM
Works great; I never thought of that. Will this be added to the next revision?

Piotr Szotkowski added a comment - 11/Dec/06 07:52 AM
Thanks for your tests. As soon as I figure out a nice way to do this without depending on having the iconv support compiled in, I'll push it to 1.6.
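
One possible shape for such a conversion is sketched below. This is not the change committed later in the thread. It assumes that any response which is not already valid UTF-8 is ISO-8859-1, and the function names latin1BytesToUtf8 and ensureUtf8 are hypothetical. It uses iconv when the extension is present and otherwise falls back to a byte-by-byte conversion in plain PHP.

```php
<?php
// Hypothetical pure-PHP fallback (not from CiviCRM): converts ISO-8859-1 bytes to UTF-8.
// Every ISO-8859-1 byte >= 0x80 maps to the code point of the same value,
// which UTF-8 encodes as exactly two bytes.
function latin1BytesToUtf8($bytes) {
    $out = '';
    $length = strlen($bytes);
    for ($i = 0; $i < $length; $i++) {
        $code = ord($bytes[$i]);
        if ($code < 0x80) {
            $out .= $bytes[$i];
        } else {
            $out .= chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
        }
    }
    return $out;
}

// Hypothetical wrapper: returns $bytes as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1.
function ensureUtf8($bytes) {
    // Leave valid UTF-8 untouched so it is not encoded twice.
    if (preg_match('//u', $bytes) === 1) {
        return $bytes;
    }
    // Prefer iconv when the extension is compiled in.
    if (function_exists('iconv')) {
        $converted = iconv('ISO-8859-1', 'UTF-8', $bytes);
        if ($converted !== false) {
            return $converted;
        }
    }
    return latin1BytesToUtf8($bytes);
}

$string = "<address>7012 1E AV, Montr\xE9al, QC, Canada</address>";
$string = ensureUtf8($string);
var_dump($string === "<address>7012 1E AV, Montr\xC3\xA9al, QC, Canada</address>"); // bool(true)
```

In Google.php, a call like this would sit between the getResponseBody() call and the simplexml_load_string() call, in the same place as the iconv line suggested earlier.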

Piotr Szotkowski added a comment - 18/Dec/06 04:40 AM
Fixed in r7993.

Curtis, can you please check whether replacing CRM/Utils/Geocode/Google.php with the file from http://svn.civicrm.org/branches/v1.6/CRM/Utils/Geocode/Google.php works for you?

Piotr Szotkowski added a comment - 21/Jun/07 03:38 AM
Marking the issue as unverified for 1.8.

Piotr Szotkowski added a comment - 21/Jun/07 03:58 AM
Assigning to Sameer for 1.8 verification.