Current unsupervised approaches (like LDA) require a lot of human review and pre-definition of the number of classes. In addition, a good incremental approach has not been achieved. Therefore, there is a need in the art for a methodology that is able to extract “intents” from various groups of texts, over a wide range of data points, with no need to predefine classes, no need to pre-label data, and no need to post-label data.
Such “unsupervised” intent extraction lends itself to analyzing the corpus at the speed of a computer processor, rather than at the speed of a human supervisor. Doing so according to a predetermined algorithm, rather than the whims of a human supervisor, eliminates the bias that a human supervisor would otherwise introduce into the extracted intents.
Moreover, such unsupervised intent extraction lends itself to use with a dynamic corpus, such that, as more data is collected and the corpus expands, new intents can be extracted and/or existing intents can be further specified.