Simplexify the use of word properties
Closed word properties (those with a finite number of values, e.g. 'pos') are the most affected.
Currently, 'pos' queries must be expressed with the codes specific to each language and tagger, so the user has to learn a tagset before being able to express a query. A general user may prefer to use simplified pos categories.
The goal is to make word properties easier to use in queries and easier to read in results.
This can be done by:
- UX design: e.g. the Hyperbase Windows 'pos'+'feat' selection UI, where a single form lists all possible values and sub-values (features) with full names instead of codes
- each language, tagger and language model must have a dedicated form and a mapping from form fields and values to codes
- a generic set of values could be designed for all languages (to some extent, e.g. noun, adjective, verb, adverb, pronoun, determiner, other)
- building new (simplified) word properties, either from word annotations or with a dedicated algorithm applied to the native properties
- building the simplexified:native mapping at the CQP level (e.g. with CQP macros?) so that CQL syntax can be used with simplexified properties
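
The last two options can be illustrated with a minimal Python sketch. It assumes an English corpus tagged with Penn Treebank codes stored in a 'pos' property; the mapping table is partial and purely illustrative, and the names PTB_TO_SIMPLE, simplify and to_cql are hypothetical, not part of any existing tool. The first function derives a simplified property value from a native one; the second expands a simplified category into a CQL token expression over the native codes.

```python
import re

# Hypothetical, partial mapping from Penn Treebank tags to the generic
# categories listed above. A real table would be needed for each
# language, tagger and language model.
PTB_TO_SIMPLE = {
    "NN": "noun", "NNS": "noun", "NNP": "noun", "NNPS": "noun",
    "JJ": "adjective", "JJR": "adjective", "JJS": "adjective",
    "VB": "verb", "VBD": "verb", "VBG": "verb",
    "VBN": "verb", "VBP": "verb", "VBZ": "verb",
    "RB": "adverb", "RBR": "adverb", "RBS": "adverb",
    "PRP": "pronoun", "PRP$": "pronoun", "WP": "pronoun",
    "DT": "determiner", "PDT": "determiner", "WDT": "determiner",
}


def simplify(native_tag, mapping=PTB_TO_SIMPLE):
    """Return the simplified category of a native tag ('other' if unmapped)."""
    return mapping.get(native_tag, "other")


def to_cql(simple, attribute="pos", mapping=PTB_TO_SIMPLE):
    """Expand a simplified category into a CQL token expression that
    matches any of the native tags mapped to it."""
    tags = sorted(t for t, s in mapping.items() if s == simple)
    if not tags:
        # 'other' has no explicit tags in this sketch, so it cannot be expanded.
        raise ValueError(f"no native tag mapped to {simple!r}")
    # Escape regex metacharacters such as the '$' in 'PRP$'.
    alternation = "|".join(re.escape(t) for t in tags)
    return f'[{attribute}="{alternation}"]'


if __name__ == "__main__":
    print(simplify("NNS"))   # noun
    print(to_cql("noun"))    # [pos="NN|NNP|NNPS|NNS"]
```

The same table serves both directions: simplify could be applied at import time to build a new simplified property, while to_cql could be applied at query time (or be used to generate CQP macros) so that the native property stays the only one stored in the corpus.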