Pspell / Aspell and Linux

Install the packages (including the Czech dictionary):

apt-get install libpspell-dev php5-pspell aspell-cs

Use it in PHP:

$pspell_link = pspell_new("cs");
$file = file_get_contents('document.txt');

// Split the text into words on whitespace and commas; skip empty pieces.
$words = preg_split("/[\s,]+/", $file, -1, PREG_SPLIT_NO_EMPTY);
// Other ways of splitting that were tried:
//$words = explode(" ", $file);
//$words = preg_split("/[a-zA-Z0-9]+\ [a-zA-Z0-9]+/", $file);
//$words = preg_split('@[\W]+@', $file, -1, PREG_SPLIT_NO_EMPTY);
//preg_match_all("/([A-Z]\. [a-z]+)/", $file, $words, PREG_PATTERN_ORDER);

foreach ($words as $word) {
	// Print every word the dictionary does not know, followed by its suggestions.
	if (!pspell_check($pspell_link, $word)) {
		echo $word . '<br />';
		$suggestions = pspell_suggest($pspell_link, $word);
		foreach ($suggestions as $suggestion) {
			// Convert the suggestion to UTF-8 for output; adjust the source
			// encoding if your dictionary uses a different one.
			$temp = htmlspecialchars(iconv("Windows-1250", "UTF-8", $suggestion));
			echo '"' . $word . '" - "' . $temp . '" - "' . $suggestion . '" - ' . similar_text($word, $temp) . '<br />';
		}
	}
}
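
The commented-out lines above try several ways of splitting the text. As one more option, here is a minimal sketch that treats every run of non-letter characters as a separator, so punctuation and digits never end up attached to a word. The helper name split_words is hypothetical (it is not part of Pspell), and the sketch assumes the input text is valid UTF-8.

```php
<?php
// Hypothetical helper, not part of the original post or of Pspell:
// split UTF-8 text into words, using any run of non-letters as a separator.
function split_words(string $text): array
{
    // \p{L} = any Unicode letter, \p{M} = combining marks; the /u flag makes
    // PCRE read pattern and subject as UTF-8. PREG_SPLIT_NO_EMPTY drops the
    // empty pieces produced by leading or trailing separators.
    $words = preg_split('/[^\p{L}\p{M}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure (for example invalid UTF-8).
    return $words === false ? [] : $words;
}

print_r(split_words("Příliš žluťoučký kůň, úpěl (ďábelské) ódy."));
```

The resulting array can be passed to the foreach loop in place of $words. If the dictionary does not work in UTF-8, each word would still need converting with iconv before it is handed to pspell_check.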

Or build your own dictionary (e.g. a Czech word list; the download link is at the bottom of this page: http://ucnk.ff.cuni.cz/srovnani10.php). Use PHP to extract only the words from the downloaded file:

$pspell_config = pspell_config_create("cs");
// Words added below are saved to this personal word list.
pspell_config_personal($pspell_config, "custom.txt");
$pspell_link = pspell_new_config($pspell_config);
$file = file_get_contents('syn2010_word.txt');
$words = preg_split("/[\s,]+/", $file);
foreach ($words as $word) {
	// Strip digits so that only the word itself remains.
	$temp = preg_replace("/[0-9]/", "", $word);
	if (strlen($temp) > 0) {
		echo $temp . '<br />';
		pspell_add_to_personal($pspell_link, $temp);
	}
}
// Write the collected words to custom.txt.
pspell_save_wordlist($pspell_link);
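
If the source file repeats words, it may help to clean and de-duplicate the list before adding anything to the personal word list. The sketch below does the same digit stripping as the loop above, but in a standalone function. The name clean_word_list is hypothetical, and the sketch assumes UTF-8 input; convert the file first if it uses another encoding.

```php
<?php
// Hypothetical helper, not part of the original post or of Pspell:
// turn raw text into a list of unique words with all digits removed.
function clean_word_list(string $text): array
{
    // Same separators as above (whitespace and commas), read as UTF-8.
    $tokens = preg_split('/[\s,]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($tokens === false) {
        return []; // preg_split() failed, e.g. on invalid UTF-8
    }

    $words = [];
    foreach ($tokens as $token) {
        // Remove digits; tokens that were only digits become empty.
        $word = preg_replace('/[0-9]/', '', $token);
        if ($word !== '' && $word !== null) {
            $words[] = $word;
        }
    }

    // Keep the first occurrence of each word and reindex from 0.
    return array_values(array_unique($words));
}

print_r(clean_word_list("12 slovo\n7 dům\n3 slovo 2010"));
```

Each entry of the returned array could then be passed to pspell_add_to_personal as in the loop above.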

Import the dictionary into Aspell:

aspell --lang=cs create master /usr/lib/aspell/cs_newdictionary.rws < custom.txt

Create the configuration file:

nano /usr/lib/aspell/cs_newdictionary.multi

# cs_newdictionary.multi
add cs_newdictionary.rws

Use the custom dictionary in PHP:

$pspell_link = pspell_new_personal("/usr/lib/aspell/cs_newdictionary", "cs");
$file = file_get_contents('inputfile.txt');
$words = preg_split("/[\s,]+/", $file);

foreach ($words as $word) {
	if (!pspell_check($pspell_link, $word)) {
		echo $word . '<br />';
		$suggestions = pspell_suggest($pspell_link, $word);
		foreach ($suggestions as $suggestion) {
			$temp = htmlspecialchars(iconv("Windows-1250", "UTF-8", $suggestion));
			echo '"' . $word . '" - "' . $temp . '" - "' . $suggestion . '" - ' . similar_text($word , $temp) . '<br />';
		}
	}
}
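
The loops above print the similar_text score next to every suggestion. A possible next step is to sort the suggestions by that score and keep only the best few. The following is a sketch under stated assumptions: rank_suggestions and its $limit parameter are hypothetical names, and similar_text compares bytes, so the scores for words with multi-byte UTF-8 characters are only approximate.

```php
<?php
// Hypothetical helper, not part of the original post or of Pspell:
// order suggestions by their similarity to the misspelled word.
function rank_suggestions(string $word, array $suggestions, int $limit = 5): array
{
    $scored = [];
    foreach ($suggestions as $suggestion) {
        // The third argument receives the similarity as a percentage.
        similar_text($word, $suggestion, $percent);
        $scored[] = ['text' => $suggestion, 'score' => $percent];
    }

    // Highest percentage first.
    usort($scored, fn($a, $b) => $b['score'] <=> $a['score']);

    // Return only the suggestion strings, at most $limit of them.
    return array_column(array_slice($scored, 0, $limit), 'text');
}

print_r(rank_suggestions("cat", ["dog", "cat", "cart"]));
```

In the loop above, the array returned by pspell_suggest would be passed as $suggestions, after the same iconv conversion if the output should be UTF-8.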
