string - PHP: remove "question marks" (�) from incorrectly encoded text


I'm extracting text from a web link with file_get_contents() and have no influence on the text. Some parts of the source code I get back from the link are malformed, something like:

 /$%§&fdsgfkgfd � fdsfdsfs � � -->  <h1>m�lll</h1>  <h1>m�lll</h1>  <h1>m�lll</h1>  <h1>m�lll</h1>  <h1>m�lll</h1>  <h1>m�lll</h1> 

or

 <<<!-- � födns 

My PHP file is not meant to "be" an HTML file; it is just a string I'm dealing with.

I searched the internet, but it is difficult to search for this icon.

I want to remove these characters because they are not necessary. How can I remove them?

PS: I'm not looking at the text through a browser; I var_dump() it in the console.

Solution:

I use this function first to cast the string to a UTF-8 string:

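    function convtoutf8($str)
    {
        // mb_detect_encoding() returns names such as "UTF-8" (upper case) or false.
        $encoding = mb_detect_encoding($str, 'UTF-8, ISO-8859-1, GBK', true);
        if ($encoding === false || $encoding === 'UTF-8') {
            return $str;
        }
        // Convert from the detected encoding, not from a hard-coded one.
        return iconv($encoding, 'UTF-8', $str);
    }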

You can discard characters that are not supported by the target encoding with iconv():

 $converted = iconv($input_encoding, $output_encoding . '//IGNORE', $original);

There are 2 drawbacks:

  1. You need to know the input encoding, and
  2. as you can read in a user comment in the manual, iconv() has a bug: '//IGNORE' does not work with recent versions of the iconv library. The suggested workaround (here for UTF-8):

    ini_set('mbstring.substitute_character', 'none');
    $text = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
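The workaround above can be sketched as a small helper. This is only an illustration under a few assumptions: PHP 7 or later with the mbstring extension, and the function names `strip_invalid_utf8` and `strip_replacement_chars` are made up for this example. Note that a � that is already stored in the fetched text as the real character U+FFFD (bytes EF BF BD) is valid UTF-8, so the conversion leaves it alone; the second helper removes it explicitly.

```php
<?php
// Hypothetical helper (not from the original post): drop byte sequences that
// are not valid UTF-8 by converting UTF-8 to UTF-8 with substitution disabled.
function strip_invalid_utf8(string $text): string
{
    $old = mb_substitute_character();   // remember the current setting
    mb_substitute_character('none');    // drop invalid bytes instead of emitting "?"
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($old);      // restore the previous setting
    return $clean;
}

// Hypothetical helper: remove U+FFFD characters that are already in the text.
function strip_replacement_chars(string $text): string
{
    return str_replace("\xEF\xBF\xBD", '', $text);
}

$broken = "<h1>m\xFClll</h1>";          // stray ISO-8859-1 byte for "ü"
$clean  = strip_invalid_utf8($broken);

var_dump(mb_check_encoding($broken, 'UTF-8'));
var_dump(mb_check_encoding($clean, 'UTF-8'));
```

Saving and restoring the substitute character keeps the change local to the helper, which is gentler than setting it globally with ini_set().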

However, it is better to attempt to detect the input encoding and then convert the input to the output encoding. This leads to:

    function recode($input, $output_encoding)
    {
        $input_encoding = mb_detect_encoding($input);

        if ($input_encoding === false) {
            // Detection failed: treat the input as already being in the output
            // encoding and drop every byte sequence that is invalid there.
            $old_substitute = mb_substitute_character();
            mb_substitute_character('none');
            $converted = mb_convert_encoding($input, $output_encoding, $output_encoding);
            mb_substitute_character($old_substitute);
        } else {
            // Detection succeeded: convert only if the encodings differ.
            $converted = ($output_encoding !== $input_encoding)
                ? iconv($input_encoding, $output_encoding, $input)
                : $input;
        }

        return $converted;
    }
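Encoding detection is a guess and can pick the wrong encoding for short or ambiguous input. If you have a reasonable idea of what the source page uses, a variation on the idea above is to validate first and convert only when validation fails. The following is a minimal sketch, not the method from the answer: the name `to_utf8_or_fallback` and the ISO-8859-1 default are assumptions chosen for illustration, and it expects PHP 7 or later with mbstring.

```php
<?php
// Hypothetical helper: keep input that is already valid UTF-8, otherwise
// assume it is in $fallback (an assumption, here ISO-8859-1) and convert it.
function to_utf8_or_fallback(string $input, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;                  // nothing to do
    }
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

$latin1 = "f\xF6dns";                   // "födns" encoded as ISO-8859-1
$utf8   = to_utf8_or_fallback($latin1);

var_dump(bin2hex($utf8));               // inspect the resulting bytes
```

Unlike stripping, this keeps the characters readable, but only if the assumed fallback encoding is actually the one the page was written in.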
