BusinessObjects LinguistX Platform SDK Features

Language Detection and Document Analysis
Automatic identification of language and character encoding; document analysis; paragraph and sentence identification; and identification and management of capitalization and case normalization.
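As a rough illustration of what automatic language identification involves, the sketch below scores a text against small stopword lists. It is not the LinguistX API: the `STOPWORDS` table and the `guess_language` function are hypothetical names invented for this example, and production systems typically rely on much richer statistical models.

```python
# Hypothetical, tiny stopword lists; real detectors use far larger models.
STOPWORDS = {
    "en": {"the", "and", "of", "is", "to", "in"},
    "de": {"der", "die", "und", "ist", "das", "nicht"},
    "nl": {"de", "het", "een", "en", "is", "niet"},
}

def guess_language(text):
    """Return the language whose stopwords occur most often, or None."""
    words = text.lower().split()
    scores = {
        lang: sum(1 for w in words if w in stop)
        for lang, stop in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stopword hits at all means there is no evidence for any language.
    return best if scores[best] > 0 else None
```

Calling `guess_language("Das Haus ist nicht groß und alt")` selects German here because four of its words appear in the German list and none in the others.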

Segmentation (Tokenization)
Segmentation (tokenization) and decompounding find words and particles, abbreviations, contractions, and punctuation, and split compound words in languages such as German and Dutch.
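To make the idea concrete, here is a minimal sketch of tokenization and dictionary-based decompounding. It does not use the LinguistX SDK: `TOKEN_RE`, `LEXICON`, `tokenize`, and `decompound` are hypothetical names, the lexicon is a toy, and abbreviations such as "Mr." are not treated specially.

```python
import re

# Hypothetical toy lexicon; a real system would use a full dictionary.
LEXICON = {"haus", "tür", "schlüssel", "wasser", "flasche"}

# A word (optionally with internal apostrophes) or one punctuation mark.
TOKEN_RE = re.compile(r"\w+(?:'\w+)*|[^\w\s]")

def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)

def _split(s, lexicon):
    """Return a list of lexicon parts covering s exactly, or None."""
    if not s:
        return []
    # Try the longest candidate head first, backtracking on failure.
    for end in range(len(s), 0, -1):
        head = s[:end]
        if head in lexicon:
            rest = _split(s[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None

def decompound(word, lexicon=LEXICON):
    """Split a compound into known parts; return [word] if no split exists."""
    parts = _split(word.lower(), lexicon)
    return parts if parts else [word]
```

With this toy lexicon, `decompound("Haustürschlüssel")` yields the three parts `haus`, `tür`, and `schlüssel`, while a word with no full dictionary split is returned unchanged.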

Stemming
Stemming identifies the true stem (base form) of each surface-form token, normalizing words to their most basic form for more efficient indexing and better recall.
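The effect on indexing and recall can be sketched as follows. This is an illustrative toy, not the SDK's stemmer: the `IRREGULAR` table, the `SUFFIX_RULES` list, and the `stem` and `build_index` functions are all assumptions made for the example, and a few suffix rules are no substitute for true morphological analysis.

```python
from collections import defaultdict

# Hypothetical exception table for irregular forms.
IRREGULAR = {"went": "go", "mice": "mouse"}

# Hypothetical suffix rules, tried in order: (suffix, replacement).
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]

def stem(token):
    """Map a surface form to a base form using the toy tables above."""
    t = token.lower()
    if t in IRREGULAR:
        return IRREGULAR[t]
    for suffix, replacement in SUFFIX_RULES:
        # Require at least three remaining characters to avoid over-stripping.
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[: -len(suffix)] + replacement
    return t

def build_index(docs):
    """Build an inverted index keyed by stem rather than surface form."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[stem(token)].add(doc_id)
    return index
```

Because "indexing" and "indexed" both map to `index`, a query for either form retrieves both documents from one index entry, which is the recall benefit described above.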

Part-of-Speech Tagging
Part-of-speech tagging identifies the grammatical category and sub-class of each word and supports extraction of noun phrases such as "fourth quarter earnings" and "adverse effects".
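A minimal sketch of how tags can drive noun phrase extraction is shown below. It is not the LinguistX tagger: the `TAGS` lookup table and the `tag` and `noun_phrases` functions are hypothetical, and a real tagger disambiguates words in context rather than by table lookup.

```python
# Hypothetical tag lexicon covering only the example sentence.
TAGS = {
    "fourth": "ADJ", "quarter": "NOUN", "earnings": "NOUN",
    "showed": "VERB", "no": "DET", "adverse": "ADJ", "effects": "NOUN",
}

def tag(tokens):
    """Attach a part-of-speech tag to each token ("X" if unknown)."""
    return [(t, TAGS.get(t.lower(), "X")) for t in tokens]

def noun_phrases(tagged):
    """Collect runs of adjectives and nouns that end in a noun."""
    phrases, current = [], []
    # A sentinel entry flushes the final run.
    for word, pos in tagged + [("", "END")]:
        if pos in ("ADJ", "NOUN"):
            current.append((word, pos))
            continue
        # Drop trailing adjectives so the phrase ends in a noun.
        while current and current[-1][1] != "NOUN":
            current.pop()
        if len(current) >= 2:
            phrases.append(" ".join(w for w, _ in current))
        current = []
    return phrases
```

Applied to the tokens of "Fourth quarter earnings showed no adverse effects", this chunker returns the two multi-word phrases "Fourth quarter earnings" and "adverse effects".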
