I am seeing the following bugs when importing non-ASCII data into CiviCRM:
1- In French, if the first letter of a first or last name is accented (e.g. Émilie), that letter gets stripped. An accent anywhere else in the value is imported without problems.
2- Data in Cyrillic (Bulgarian) is dropped completely.
The file (attached) is in UTF-8, written in Vim. First line contains headers.
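For anyone who wants to reproduce this without the attachment, here is a minimal sketch that builds a comparable file and checks that it is plain UTF-8 with no byte-order mark. The column names and sample rows are assumptions made up for illustration; they are not the contents of the attached file.

```python
import csv
import io

# Hypothetical sample rows; NOT the contents of the attached file.
ROWS = [
    ["first_name", "last_name"],   # header line, as in the attached file
    ["Émilie", "Tremblay"],        # accented first letter (case 1)
    ["Иван", "Петров"],            # Cyrillic data (case 2)
]


def build_csv_bytes(rows):
    """Serialize rows as CSV and encode the result as UTF-8 without a BOM."""
    buf = io.StringIO(newline="")
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue().encode("utf-8")


def is_utf8_without_bom(data):
    """Return True if data decodes as UTF-8 and does not start with a BOM."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return not data.startswith(b"\xef\xbb\xbf")


data = build_csv_bytes(ROWS)
```

Writing `data` to disk in binary mode (for example to a hypothetical `contacts.csv`) gives a small file that can be fed to the import screen to see whether either problem shows up.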
Both bugs are reproduced in the Drupal sandbox, as well as with CiviCRM 1.9 and 2.0.
Thank you for any hints,
Mathieu Lutfy (bgm on #civicrm)