URS, Word selection fails when text filename contains regex special chars
The URS Word selection fails when the +filename+ of a corpus text contains +regex+ special chars such as "(" or ".".
Diagnostic
The +regex+ special chars in the CQL generated from the Text id are not escaped, so the word position is not found.
Solution 1
Use the standard java.util.regex.Pattern.quote() method instead of our own implementation of regex chars escaping (addBackSlash()).
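A minimal sketch of what this could look like. The class, the helper names and the `_.text_id` attribute in the query are hypothetical and only illustrate the idea; they are not the actual URS code. Note that Pattern.quote() wraps the string in \Q...\E, so this assumes the regex engine behind CQL accepts that syntax, which should be checked before adopting this solution.

```java
import java.util.regex.Pattern;

public class QuoteTextId {

    // Hypothetical helper: builds a CQL-like constraint on the text id.
    // The attribute name "_.text_id" is an assumption made for this sketch.
    static String buildQuery(String textId) {
        return "[_.text_id=\"" + Pattern.quote(textId) + "\"]";
    }

    // Uses the raw id as a regex: special chars keep their regex meaning.
    static boolean matchesRaw(String textId) {
        return Pattern.matches(textId, textId);
    }

    // Uses the quoted id as a regex: every char is matched literally.
    static boolean matchesQuoted(String textId) {
        return Pattern.matches(Pattern.quote(textId), textId);
    }

    public static void main(String[] args) {
        String textId = "test(1)";
        // Raw: "(1)" is read as a group, so the pattern matches "test1", not the id.
        System.out.println(matchesRaw(textId));    // false
        // Quoted: the parentheses are literal and the id matches itself.
        System.out.println(matchesQuoted(textId)); // true
        System.out.println(buildQuery("test_0.8.4_"));
    }
}
```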
Solution 2
Use CQLquery.addBackSlash(textid)
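For illustration, a backslash-based escaping could be sketched as below. This is a hypothetical stand-in, not the actual implementation of the method named above: the class name, the `escapeRegexChars` name and the list of special chars are assumptions, and double quotes inside the id are not handled.

```java
public class EscapeTextId {

    // Chars treated as regex special chars in this sketch (assumed list).
    private static final String SPECIAL = "\\.[]{}()*+?^$|";

    // Prefixes each special char with a backslash so it is matched literally.
    static String escapeRegexChars(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints test_0\.8\.4_
        System.out.println(escapeRegexChars("test_0.8.4_"));
    }
}
```

Unlike Pattern.quote(), the result only relies on backslash escapes, so it does not depend on \Q...\E support in the query engine.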
Validation test
- Create a TXT corpus with a file named test_0.8.4_.txt
- Try annotating with URS