FreeLing 3.1
Class alternatives suggests words that are orthographically or phonetically similar to the input word.
#include <alternatives.h>


Public Member Functions | |
| alternatives (const std::wstring &) | |
| Constructor. | |
| ~alternatives () | |
| Destructor. | |
| void | get_similar_words (const std::wstring &, std::list< std::pair< std::wstring, int > > &) const |
| direct access to results of underlying automata | |
| void | analyze (sentence &) const |
| spell check each word in sentence | |
Private Member Functions | |
| void | filter_candidate (const std::wstring &, const std::wstring &, int distance, std::map< std::wstring, int > &) const |
| filter given candidate and decide if it is a valid alternative. | |
| void | filter_alternatives (const std::list< std::pair< std::wstring, int > > &, word &) const |
| Adds to the word's analysis data the new words that are possible correct spellings of the original word. | |
Private Attributes | |
| foma_FSM * | sed |
| FSM for orthographic edit distance. | |
| std::multimap< std::wstring, std::wstring > | orthography |
| Records which word(s) each phonetic form came from (only used for phonetic distances). | |
| phonetics * | ph |
| The class that translates a word into phonetic sounds. | |
| int | DistanceThreshold |
| Maximum distance to consider an entry as an alternative. | |
| int | MaxSizeDiff |
| Maximum length difference to consider a word as a possible correction. | |
| freeling::regexp | CheckKnownTags |
| Tags of known words to be checked. | |
| bool | CheckUnknown |
| whether unknown words should be checked | |
| int | DistanceType |
Static Private Attributes | |
| static const int | ORTHOGRAPHIC = 1 |
| type of distance used | |
| static const int | PHONETIC = 2 |
Class alternatives suggests words that are orthographically or phonetically similar to the input word.
Results may be used for spell checking.
| freeling::alternatives::alternatives | ( | const std::wstring & | altsFile | ) |
Constructor.
Create an alternatives module, loading the dictionary and options.
Create phonetic transcriptor
References freeling::util::absolute(), freeling::config_file::add_section(), CheckKnownTags, CheckUnknown, freeling::config_file::close(), DistanceThreshold, DistanceType, ERROR_CRASH, freeling::config_file::get_content_line(), freeling::config_file::get_section(), freeling::phonetics::get_sound(), freeling::util::lowercase(), MaxSizeDiff, freeling::util::new_tempfile_name(), freeling::config_file::open(), freeling::util::open_utf8_file(), ORTHOGRAPHIC, orthography, ph, PHONETIC, sed, freeling::foma_FSM::set_cutoff_threshold(), TRACE, WARNING, wstring2int, and wstring2string.
| void freeling::alternatives::analyze | ( | sentence & | se | ) | const [virtual] |
spell check each word in sentence
Navigates the sentence adding alternative words (possible correct spelling data)
Implements freeling::processor.
References CheckKnownTags, CheckUnknown, DistanceType, filter_alternatives(), freeling::foma_FSM::get_similar_words(), freeling::phonetics::get_sound(), int2wstring, ORTHOGRAPHIC, ph, PHONETIC, freeling::regexp::search(), sed, and TRACE.
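The selection step implied by CheckUnknown and CheckKnownTags can be pictured with a small standalone sketch. It does not use FreeLing: `should_check` is a hypothetical helper, `std::wregex` stands in for `freeling::regexp`, and the rule that a known word is checked when any of its tags matches is an assumption, not something this page states.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of FreeLing): decides whether a word should
// be spell checked. `tags` holds the PoS tags of the word's analyses and is
// empty when the word is unknown to the dictionary.
bool should_check(const std::vector<std::wstring> &tags,
                  bool check_unknown,
                  const std::wregex &check_known_tags) {
  // Unknown word: checked only if the CheckUnknown-like flag is set.
  if (tags.empty()) return check_unknown;
  // Known word: checked if at least one tag matches the expression
  // (assumed rule for this sketch).
  for (const std::wstring &tag : tags) {
    if (std::regex_search(tag, check_known_tags)) return true;
  }
  return false;
}
```

Under these assumptions a word with no analyses depends only on the flag, while a word with analyses depends only on the tag expression.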
| void freeling::alternatives::filter_alternatives | ( | const std::list< std::pair< std::wstring, int > > & | , |
| word & | |||
| ) | const [private] |
Adds to the word's analysis data the new words that are possible correct spellings of the original word.
Adds the new words that are valid alternatives.
References freeling::word::alternatives_begin(), freeling::word::alternatives_end(), freeling::word::clear_alternatives(), DistanceThreshold, DistanceType, filter_candidate(), freeling::word::get_alternatives(), freeling::word::get_lc_form(), ORTHOGRAPHIC, orthography, PHONETIC, and TRACE.
Referenced by analyze().
| void freeling::alternatives::filter_candidate | ( | const std::wstring & | , |
| const std::wstring & | , | ||
| int | distance, | ||
| std::map< std::wstring, int > & | |||
| ) | const [private] |
filter given candidate and decide if it is a valid alternative.
References int2wstring, MaxSizeDiff, and TRACE.
Referenced by filter_alternatives().
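As a rough illustration of this kind of filter, the sketch below rejects candidates whose length differs too much from the original form and collects the rest in a map. It is not FreeLing's code: `filter_candidate_sketch` and its parameter names are hypothetical, `max_size_diff` plays the role of MaxSizeDiff, and keeping the smallest distance seen per candidate is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for a candidate filter (not FreeLing's implementation).
// A candidate is accepted when its length differs from the original form by
// at most `max_size_diff` characters.
void filter_candidate_sketch(const std::wstring &form,
                             const std::wstring &candidate,
                             int distance,
                             int max_size_diff,
                             std::map<std::wstring, int> &accepted) {
  assert(max_size_diff >= 0);
  long diff = static_cast<long>(form.size()) -
              static_cast<long>(candidate.size());
  if (std::labs(diff) > max_size_diff) return;  // lengths differ too much

  // Assumed policy: if the candidate was already accepted, keep the
  // smaller of the two distances.
  auto it = accepted.find(candidate);
  if (it == accepted.end()) {
    accepted.emplace(candidate, distance);
  } else if (distance < it->second) {
    it->second = distance;
  }
}
```

Using a map keyed by the candidate string means a word reached through several paths appears only once in the result.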
| void freeling::alternatives::get_similar_words | ( | const std::wstring & | , |
| std::list< std::pair< std::wstring, int > > & | |||
| ) | const |
direct access to results of underlying automata
Provide direct access to the results of the underlying automata, in case the caller only wants the list of strings.
References DistanceType, freeling::foma_FSM::get_similar_words(), freeling::phonetics::get_sound(), MaxSizeDiff, ORTHOGRAPHIC, orthography, ph, PHONETIC, sed, and TRACE.
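To make the shape of the result concrete, here is a minimal standalone sketch that fills a list of (word, distance) pairs like the one this method returns. It deliberately replaces the foma automaton with a brute-force Levenshtein scan over a small in-memory dictionary; `edit_distance`, `similar_words_sketch`, and the unit edit costs are assumptions for illustration and are not FreeLing's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Classic dynamic-programming Levenshtein distance with unit costs.
int edit_distance(const std::wstring &a, const std::wstring &b) {
  std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = static_cast<int>(i);
    for (std::size_t j = 1; j <= b.size(); ++j) {
      int subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Hypothetical helper: fills `results` with every dictionary entry whose
// distance to `form` is at most `threshold`, paired with that distance.
void similar_words_sketch(const std::wstring &form,
                          const std::vector<std::wstring> &dictionary,
                          int threshold,
                          std::list<std::pair<std::wstring, int>> &results) {
  assert(threshold >= 0);
  results.clear();
  for (const std::wstring &entry : dictionary) {
    int d = edit_distance(form, entry);
    if (d <= threshold) results.emplace_back(entry, d);
  }
}
```

A linear scan like this costs one distance computation per dictionary entry, which is the kind of work a finite-state approach is meant to avoid; the sketch only shows the input and output contract.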
freeling::regexp freeling::alternatives::CheckKnownTags [private] |
Tags of known words to be checked.
Referenced by alternatives(), and analyze().
bool freeling::alternatives::CheckUnknown [private] |
whether unknown words should be checked
Referenced by alternatives(), and analyze().
int freeling::alternatives::DistanceThreshold [private] |
Maximum distance to consider an entry as an alternative.
Referenced by alternatives(), and filter_alternatives().
int freeling::alternatives::DistanceType [private] |
Referenced by alternatives(), analyze(), filter_alternatives(), and get_similar_words().
int freeling::alternatives::MaxSizeDiff [private] |
Maximum length difference to consider a word as a possible correction.
Referenced by alternatives(), filter_candidate(), and get_similar_words().
const int freeling::alternatives::ORTHOGRAPHIC = 1 [static, private] |
type of distance used
Referenced by alternatives(), analyze(), filter_alternatives(), and get_similar_words().
std::multimap<std::wstring,std::wstring> freeling::alternatives::orthography [private] |
Records which word(s) each phonetic form came from (only used for phonetic distances).
Referenced by alternatives(), filter_alternatives(), and get_similar_words().
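The role of such a multimap can be sketched in isolation: several spellings may share one phonetic form, so a match found on the phonetic form has to be mapped back to every spelling that produced it. The names `sound_index`, `remember`, and `spellings_for` are hypothetical, and the phonetic strings in any usage are made up rather than produced by FreeLing's phonetics class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical reverse index: phonetic form -> original spelling(s).
using sound_index = std::multimap<std::wstring, std::wstring>;

// Record that `word` is transcribed as `sound`.
void remember(sound_index &orthography,
              const std::wstring &sound,
              const std::wstring &word) {
  assert(!sound.empty());  // an empty transcription would not identify a word
  orthography.emplace(sound, word);
}

// Return every spelling recorded for the given phonetic form.
std::vector<std::wstring> spellings_for(const sound_index &orthography,
                                        const std::wstring &sound) {
  std::vector<std::wstring> out;
  auto range = orthography.equal_range(sound);
  for (auto it = range.first; it != range.second; ++it) {
    out.push_back(it->second);
  }
  return out;
}
```

A multimap fits here because homophones share a key, and `equal_range` returns all of them in one lookup.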
phonetics* freeling::alternatives::ph [private] |
The class that translates a word into phonetic sounds.
Referenced by alternatives(), analyze(), get_similar_words(), and ~alternatives().
const int freeling::alternatives::PHONETIC = 2 [static, private] |
Referenced by alternatives(), analyze(), filter_alternatives(), and get_similar_words().
foma_FSM* freeling::alternatives::sed [private] |
FSM for orthographic edit distance.
Referenced by alternatives(), analyze(), get_similar_words(), and ~alternatives().