For some input strings, the regular expression used to check whether a string is valid UTF-8 causes PHP to segfault.
This has been reproduced on several PHP versions including current (5.2.5) on OSX and Linux. The segfault does not occur on PHP/win32 5.2.5.
To reproduce, configure CiviCRM as follows:
- Enable country menu
- Disable state/province menu
- Enable Google as mapping provider
Then save the following contact details as an address:
- Street Address - PO Box 30000
- Supplemental 1 - Adderly Tce
- City - Christchurch
- Postcode - 8147
Google Maps returns a "multiple addresses found" XML response, and the regex in CRM_Utils_String::IsUtf8() crashes on it. This appears to be a bug in PHP, but there are other ways to check for valid UTF-8 that do not trigger the crash, so I think we should use one of them until PHP fixes the bug.
The practical result is that for some addresses, it is impossible for a user to change the address without disabling the mapping provider. They will see a blank screen on every attempt to save.
More notes and testing, including a sample file that lets you try a few different methods of checking UTF-8 validity, are at http://forum.civicrm.org/index.php/topic,1112.0.html
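As an illustration of a regex-free alternative, here is a minimal sketch that walks the string byte by byte, so PCRE is never involved. The function name is_utf8_bytewise() is hypothetical; it is not the existing CiviCRM code and not necessarily one of the methods in the forum sample file. Where the mbstring extension is installed, mb_check_encoding($str, 'UTF-8') is another option.

```php
<?php
// Hypothetical helper (not CiviCRM's actual implementation): validates UTF-8
// by inspecting bytes directly instead of running a regular expression.
function is_utf8_bytewise($str)
{
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $c = ord($str[$i]);
        if ($c < 0x80) {
            continue; // plain ASCII byte
        }

        // $n = number of continuation bytes; $min/$max bound the FIRST
        // continuation byte, which rules out overlong forms, surrogates
        // and code points above U+10FFFF.
        $min = 0x80;
        $max = 0xBF;
        if ($c >= 0xC2 && $c <= 0xDF) {
            $n = 1;
        } elseif ($c == 0xE0) {
            $n = 2;
            $min = 0xA0; // reject overlong 3-byte forms
        } elseif ($c >= 0xE1 && $c <= 0xEF) {
            $n = 2;
            if ($c == 0xED) {
                $max = 0x9F; // reject UTF-16 surrogates
            }
        } elseif ($c == 0xF0) {
            $n = 3;
            $min = 0x90; // reject overlong 4-byte forms
        } elseif ($c >= 0xF1 && $c <= 0xF3) {
            $n = 3;
        } elseif ($c == 0xF4) {
            $n = 3;
            $max = 0x8F; // nothing above U+10FFFF
        } else {
            return false; // stray continuation byte or invalid lead byte
        }

        if ($i + $n >= $len) {
            return false; // sequence is truncated
        }
        $b = ord($str[$i + 1]);
        if ($b < $min || $b > $max) {
            return false;
        }
        for ($k = 2; $k <= $n; $k++) {
            $b = ord($str[$i + $k]);
            if ($b < 0x80 || $b > 0xBF) {
                return false;
            }
        }
        $i += $n; // skip the continuation bytes just checked
    }
    return true;
}
```

Because this is a simple loop with no recursion or backtracking, it should handle long geocoder responses without depending on PCRE at all. It has not been tested against the Google Maps response described above.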
Tested on 1.9.11960 and 1.8.stable.11165.
Regex segfault verified on Linux PHP 4.4.0, 4.4.3, 4.4.7, 5.2.0, 5.2.3, 5.2.5 and OSX PHP 5.2.3 and 5.2.4.
Regex does not segfault on Win32 with PHP 5.2.5.