We have applied the same algorithmic approach to unstructured datasets to extract language representations that can augment those of the dataset(s) you upload. That way, if your dataset is not representative enough of the language used in your use case, it can be improved automatically.
At the moment, we can process Wikipedia in all of its 300 languages to automatically augment extracted Intents. No setup is involved: Wikipedia is pre-processed and its language representations are saved in a database, from which the algorithm chooses the best augmentations for its initial Intents.
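
To make the selection step more concrete, here is a minimal sketch of how augmentations might be picked for an Intent from a pre-built store. It is an illustration under stated assumptions, not the actual algorithm: the stored language representations are stood in for by plain phrases in a Python list, the similarity measure is simple token overlap (Jaccard), and the names `choose_augmentations`, `top_k`, and `min_score` are hypothetical.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a phrase and split it into a set of whitespace-separated tokens."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity: size of intersection over size of union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def choose_augmentations(
    intent_examples: List[str],
    candidates: List[str],
    top_k: int = 2,        # hypothetical parameter: how many augmentations to keep
    min_score: float = 0.2,  # hypothetical parameter: minimum similarity to qualify
) -> List[str]:
    """Return the candidate phrases most similar to the Intent's own examples.

    `candidates` stands in for the pre-processed store (an assumption);
    a real system would query a database of language representations.
    """
    intent_tokens = [tokenize(example) for example in intent_examples]
    existing = {example.lower() for example in intent_examples}

    scored = []
    for phrase in candidates:
        if phrase.lower() in existing:
            continue  # already part of the Intent, so it adds nothing
        tokens = tokenize(phrase)
        # Score a candidate by its best match against any existing example.
        score = max((jaccard(tokens, t) for t in intent_tokens), default=0.0)
        if score >= min_score:
            scored.append((score, phrase))

    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:top_k]]


if __name__ == "__main__":
    intent = ["book a flight", "reserve a plane ticket"]
    store = [
        "book a train",
        "book a flight to paris",
        "weather forecast today",
        "book a flight",
    ]
    print(choose_augmentations(intent, store))
```

In this sketch, unrelated phrases fall below `min_score` and are never added, while phrases already present in the Intent are skipped. A production system would most likely compare learned representations rather than raw tokens.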