| Question asked
|
Step 1 of the tag filtering process (Syntactic Filtering) filters out very short tags (those of length 1) and converts tags containing special characters to their "base form" (presumably an encoding such as ASCII). This first filtering step will discard many tags in non-Western languages (those where a tag may consist of a single logograph, for example). Moreover, tags in non-Western languages will fail to verify against WordNet (and possibly Wikipedia, where by far the most articles are in English). Is the architecture presented in this paper flexible enough to deal with these challenges when faced with tags and data sources in a variety of languages, or would substantial changes be required due to a lack of suitable lexical and semantic data sources in those languages?
|