FreeLing 3.1
The class dictionary implements dictionary search and suffix analysis for word forms.
#include <dictionary.h>


Public Member Functions | |
| dictionary (const std::wstring &, const std::wstring &, bool, const std::wstring &, bool invDic=false, bool retok=true) | |
| Constructor. | |
| ~dictionary () | |
| Destructor. | |
| void | add_analysis (const std::wstring &, const analysis &) |
| add analysis to dictionary entry (create entry if not there) | |
| void | remove_entry (const std::wstring &) |
| remove entry from dictionary | |
| void | search_form (const std::wstring &, std::list< analysis > &) const |
| Get dictionary entry for a given form, add to given list. | |
| bool | annotate_word (word &, std::list< word > &, bool override=false) const |
| Fills the analysis list of a word, checking for suffixes and contractions. | |
| void | annotate_word (word &) const |
| Fills the analysis list of a word, checking for suffixes and contractions. | |
| std::list< std::wstring > | get_forms (const std::wstring &, const std::wstring &) const |
| Get possible forms for a lemma+pos. | |
| void | analyze (sentence &) const |
| analyze given sentence | |
Private Member Functions | |
| bool | check_contracted (const std::wstring &, std::wstring, std::wstring, std::list< word > &) const |
| check whether the word is a contraction, and if so, fill the list with the contracted words | |
| std::list< std::wstring > | tag_combinations (std::list< std::wstring >::const_iterator, std::list< std::wstring >::const_iterator) const |
| Generate valid tag combinations for an ambiguous contraction. | |
| bool | parse_dict_entry (const std::wstring &, std::list< std::pair< std::wstring, std::list< std::wstring > > > &) const |
| parse data string into a list of (lemma, list of tags) pairs | |
| std::wstring | compact_data (const std::list< std::pair< std::wstring, std::list< std::wstring > > > &) const |
| compact data in format lemma1 pos1a|pos1b|pos1c lemma2 pos2a|pos2b to save memory | |
| bool | less (const std::wstring &, const std::wstring &, const std::map< std::wstring, std::wstring > &) const |
| compare two strings (lemmas or PoS) using given list of preferences | |
| void | sort_list (std::list< std::wstring > &, const std::map< std::wstring, std::wstring > &) const |
| sort given list using given preferences | |
Private Attributes | |
| bool | AffixAnalysis |
| configuration options | |
| bool | InverseDict |
| bool | RetokenizeContractions |
| affixes * | suf |
| suffix analyzer | |
| database * | morfodb |
| key-value file or hash | |
| database * | inverdb |
| std::map< std::wstring, std::wstring > | lemma_prefs |
| std::map< std::wstring, std::wstring > | pos_prefs |
The class dictionary implements dictionary search and suffix analysis for word forms.
| freeling::dictionary::dictionary | ( | const std::wstring & | Lang, |
| const std::wstring & | dicFile, | ||
| bool | activateAff, | ||
| const std::wstring & | sufFile, | ||
| bool | invDic = false, |
||
| bool | retok = true |
||
| ) |
Constructor.
Create a dictionary module, open database.
References freeling::config_file::add_section(), freeling::config_file::close(), DB_MAP, DB_PREFTREE, ERROR_CRASH, freeling::config_file::get_content_line(), freeling::config_file::get_section(), freeling::config_file::open(), and TRACE.
| freeling::dictionary::~dictionary | ( | ) |
Destructor.
Destroy dictionary module, close database.
| void freeling::dictionary::add_analysis | ( | const std::wstring & | form, |
| const analysis & | newan | ||
| ) |
add analysis to dictionary entry (create entry if not there)
References freeling::analysis::get_lemma(), freeling::analysis::get_tag(), freeling::LEMMA_DIVIDER, list2wstring, freeling::TAG_DIVIDER, and wstring2list.
| void freeling::dictionary::analyze | ( | sentence & | se | ) | const [virtual] |
analyze given sentence
Dictionary search and affix analysis for all words in a sentence, using given options.
Implements freeling::processor.
References int2wstring, freeling::sentence::rebuild_word_index(), TRACE, and TRACE_SENTENCE.
Referenced by freeling::maco::analyze().
| bool freeling::dictionary::annotate_word | ( | word & | , |
| std::list< word > & | , | ||
| bool | override = false |
||
| ) | const |
Fills the analysis list of a word, checking for suffixes and contractions.
Returns true iff the form is a contraction; in that case, the contraction components are returned in the given list.
| void freeling::dictionary::annotate_word | ( | word & | w | ) | const |
Fills the analysis list of a word, checking for suffixes and contractions.
Search the form in the dictionary and add the analyses found to the given word.
This overload never retokenizes contractions and does not return a component list. It is just a convenience equivalent to "annotate_word(w,dummy,true)".
| bool freeling::dictionary::check_contracted | ( | const std::wstring & | , |
| std::wstring | , | ||
| std::wstring | , | ||
| std::list< word > & | |||
| ) | const [private] |
check whether the word is a contraction, and if so, fill the list with the contracted words
Check whether the given word is a contraction, if so, obtain composing words (and store them into lw).
References freeling::word::add_analysis(), ERROR_CRASH, freeling::word::get_n_analysis(), list2wstring, TRACE, and wstring2list.
| wstring freeling::dictionary::compact_data | ( | const std::list< std::pair< std::wstring, std::list< std::wstring > > > & | ) | const [private] |
compact data in format lemma1 pos1a|pos1b|pos1c lemma2 pos2a|pos2b to save memory
References freeling::LEMMA_DIVIDER, list2wstring, and freeling::TAG_DIVIDER.
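The compacted format described above can be illustrated with a minimal standalone sketch. It does not use FreeLing itself: compact_entry, kLemmaDivider and kTagDivider are hypothetical names, and the divider values (a space and "|") are assumptions read off the format string in the brief description, not the documented values of freeling::LEMMA_DIVIDER and freeling::TAG_DIVIDER.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-ins for freeling::LEMMA_DIVIDER and freeling::TAG_DIVIDER.
// Their real values are not given on this page; these are assumptions.
const std::wstring kLemmaDivider = L" ";
const std::wstring kTagDivider = L"|";

using lemma_tags = std::pair<std::wstring, std::list<std::wstring>>;

// Join (lemma, tags) pairs into one string: "lemma1 tagA|tagB lemma2 tagC".
std::wstring compact_entry(const std::list<lemma_tags> &entry) {
  std::wstring out;
  for (const auto &lt : entry) {
    // An empty lemma would make the compacted string ambiguous.
    assert(!lt.first.empty());
    if (!out.empty()) out += kLemmaDivider;
    out += lt.first + kLemmaDivider;
    bool first = true;
    for (const auto &tag : lt.second) {
      if (!first) out += kTagDivider;
      out += tag;
      first = false;
    }
  }
  return out;
}
```

Under these assumptions, an entry with lemma "casa" (tag NCFS000) and lemma "casar" (tags VMIP3S0 and VMM02S0) would be stored as the single string "casa NCFS000 casar VMIP3S0|VMM02S0". The example tags are illustrative only.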
| list< wstring > freeling::dictionary::get_forms | ( | const std::wstring & | , |
| const std::wstring & | |||
| ) | const |
Get possible forms for a lemma+pos.
References WARNING, and wstring2list.
| bool freeling::dictionary::less | ( | const std::wstring & | , |
| const std::wstring & | , | ||
| const std::map< std::wstring, std::wstring > & | |||
| ) | const [private] |
compare two strings (lemmas or PoS) using given list of preferences
| bool freeling::dictionary::parse_dict_entry | ( | const std::wstring & | , |
| std::list< std::pair< std::wstring, std::list< std::wstring > > > & | |||
| ) | const [private] |
parse data string into a list of (lemma, list of tags) pairs
References wstring2list.
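As a rough illustration of the parsing direction, the sketch below splits a compacted string back into (lemma, tags) pairs. It is a standalone approximation, not the FreeLing implementation: parse_entry is a hypothetical name, the space and "|" separators are assumed from the format given for compact_data, and the meaning of the boolean result (false when a lemma has no tag field) is an assumption, since this page does not document what the real return value signals.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>
#include <utility>

using lemma_tags = std::pair<std::wstring, std::list<std::wstring>>;

// Split "lemma1 tagA|tagB lemma2 tagC" into (lemma, tags) pairs appended to out.
// Assumption: returns false if a lemma is not followed by a tag field.
bool parse_entry(const std::wstring &data, std::list<lemma_tags> &out) {
  std::wistringstream fields(data);
  std::wstring lemma, tags;
  while (fields >> lemma) {
    if (!(fields >> tags)) return false;  // dangling lemma without tags
    std::list<std::wstring> tag_list;
    std::wistringstream tag_stream(tags);
    std::wstring tag;
    // Tags inside one field are separated by '|' (assumed divider).
    while (std::getline(tag_stream, tag, L'|')) tag_list.push_back(tag);
    out.emplace_back(lemma, tag_list);
  }
  return true;
}
```

Keeping the result as a list of pairs rather than a std::map preserves the order in which lemmas appear in the entry, which matches the parameter type in the signature above.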
| void freeling::dictionary::remove_entry | ( | const std::wstring & | form | ) |
remove entry from dictionary
References list2wstring, and wstring2list.
| void freeling::dictionary::search_form | ( | const std::wstring & | s, |
| std::list< analysis > & | la | ||
| ) | const |
Get dictionary entry for a given form, add to given list.
Search the form in the dictionary, according to given options, and add the analyses found to the given list.
References freeling::analysis::init(), int2wstring, freeling::LEMMA_DIVIDER, list2wstring, freeling::TAG_DIVIDER, TRACE, and wstring2list.
Referenced by freeling::affixes::CheckRetokenizable(), and freeling::affixes::SearchRootsList().
| void freeling::dictionary::sort_list | ( | std::list< std::wstring > & | , |
| const std::map< std::wstring, std::wstring > & | |||
| ) | const [private] |
sort given list using given preferences
Bubble sort given list (of lemmas or tags) using given preferences.
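A preference-driven bubble sort of this kind might look like the sketch below. This page does not say how the preference map is interpreted, so the sketch assumes that an entry (a -> b) means "a is preferred over b". The names preferred and sort_by_prefs are hypothetical and stand in for the private less and sort_list members only loosely.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <string>
#include <utility>

using prefs_map = std::map<std::wstring, std::wstring>;

// Assumption: an entry (a -> b) in prefs means "a is preferred over b".
bool preferred(const std::wstring &a, const std::wstring &b,
               const prefs_map &prefs) {
  auto it = prefs.find(a);
  return it != prefs.end() && it->second == b;
}

// Bubble sort: swap adjacent items whenever the later one is preferred
// over the earlier one. The number of passes is bounded, so contradictory
// preferences (a -> b and b -> a) cannot cause an endless loop.
void sort_by_prefs(std::list<std::wstring> &items, const prefs_map &prefs) {
  if (items.size() < 2) return;
  for (std::size_t pass = 0; pass + 1 < items.size(); ++pass) {
    bool swapped = false;
    auto cur = items.begin();
    auto nxt = std::next(cur);
    for (; nxt != items.end(); ++cur, ++nxt) {
      if (preferred(*nxt, *cur, prefs)) {
        std::swap(*cur, *nxt);
        swapped = true;
      }
    }
    if (!swapped) break;  // already in preferred order
  }
}
```

Because only adjacent pairs named in the preference map are swapped, items that no preference mentions keep their relative order. That is one reason a simple bubble sort can be a reasonable fit for a relation that is not a total order.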
| list< wstring > freeling::dictionary::tag_combinations | ( | std::list< std::wstring >::const_iterator | , |
| std::list< std::wstring >::const_iterator | |||
| ) | const [private] |
Generate valid tag combinations for an ambiguous contraction.
References wstring2list.
bool freeling::dictionary::AffixAnalysis [private]
configuration options
database* freeling::dictionary::inverdb [private]
bool freeling::dictionary::InverseDict [private]
std::map<std::wstring,std::wstring> freeling::dictionary::lemma_prefs [private]
database* freeling::dictionary::morfodb [private]
key-value file or hash
std::map<std::wstring,std::wstring> freeling::dictionary::pos_prefs [private]
affixes* freeling::dictionary::suf [private]
suffix analyzer