RCP: 0.7.7, Manage corpus with holes
How should TXM manage a corpus with holes?
In CQP, a sub-corpus is an abstract result based on the root corpus
(a named query result).
CQP doesn’t seem to offer a way to create a sub-corpus with holes from the
root corpus. For example, creating a sub-corpus from which some tags have
been removed leads to some strange behaviors in TXM.
An index/lexicon will not return the tokens from the removed tags, as
expected.
But the global context remains the root corpus, which leads to:
- “wrong” contexts in Concordance (the tokens contained in the removed tags stay in the contexts)
- “wrong” edition (the tokens contained in the removed tags stay in the edition; is this problematic or not? At the least, a “diff” could highlight root corpus vs. sub-corpus)
- the structural properties information in the Internal view becomes wrong because the positions do not take into account the holes in the sub-corpus
- TODO: try to define other potential problems in TXM commands
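
The behaviors listed above can be reproduced on a toy model. The sketch below is not TXM or CQP code: it only assumes that a sub-corpus is stored as a list of inclusive (start, end) position ranges over the root corpus, and every name in it (ROOT, SUB, lexicon, context_from_root, context_from_sub, sub_offset) is hypothetical.

```python
# Hypothetical toy root corpus: one token per corpus position.
ROOT = ["the", "cat", "[note", "removed", "text]", "sat", "down"]

# Hypothetical sub-corpus: inclusive (start, end) position ranges over ROOT.
# Positions 2 to 4 (the removed tag) form the "hole".
SUB = [(0, 1), (5, 6)]


def sub_positions(ranges):
    """Root-corpus positions that belong to the sub-corpus, in order."""
    return [p for start, end in ranges for p in range(start, end + 1)]


def lexicon(tokens, ranges):
    """Frequency list restricted to the sub-corpus ranges (holes are skipped)."""
    counts = {}
    for p in sub_positions(ranges):
        counts[tokens[p]] = counts.get(tokens[p], 0) + 1
    return counts


def context_from_root(tokens, pos, width):
    """Context window read from the root corpus: tokens in holes leak in."""
    return tokens[max(0, pos - width):pos], tokens[pos + 1:pos + 1 + width]


def context_from_sub(tokens, ranges, pos, width):
    """Context window read only from sub-corpus positions: holes are skipped."""
    positions = sub_positions(ranges)
    i = positions.index(pos)
    left = [tokens[p] for p in positions[max(0, i - width):i]]
    right = [tokens[p] for p in positions[i + 1:i + 1 + width]]
    return left, right


def sub_offset(ranges, pos):
    """Offset of a root position inside the sub-corpus, or None if in a hole."""
    offset = 0
    for start, end in ranges:
        if start <= pos <= end:
            return offset + (pos - start)
        offset += end - start + 1
    return None


if __name__ == "__main__":
    print(lexicon(ROOT, SUB))                # removed tokens are absent
    print(context_from_root(ROOT, 5, 2))     # left context contains removed tokens
    print(context_from_sub(ROOT, SUB, 5, 2)) # left context skips the hole
    print(sub_offset(SUB, 5))                # "sat" is token 2 of the sub-corpus
```

In this model the lexicon is correct because it iterates over the ranges, while a concordance context read from the root corpus around "sat" picks up "removed" and "text]". The sub_offset helper shows the kind of position remapping that a view such as the Internal view would need if positions were to be reported relative to the sub-corpus rather than the root corpus.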
(from redmine: issue id 1296, created on 2015/04/02 by Sebastien Jacquot)
- Relations:
- relates #841 (closed)