Create a project starting from a dataset

Click on Create Project > Start from a Dataset to get started. The supported formats are:

  • CSV files
  • Plain text files, separated by line breaks

If you run into an issue when creating a project, make sure that your documents are encoded in UTF-8.
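
If you are not sure whether a file is UTF-8, a quick local check before uploading can help. The following is a minimal Python sketch, not part of the product: the function names are hypothetical, and the fallback encoding (`latin-1`) is only an assumption about how the file was originally saved, so adjust it to match your export tool.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src: str, dst: str, source_encoding: str = "latin-1") -> None:
    """Write a UTF-8 copy of src to dst.

    source_encoding is an assumption about the original file; it is only
    used when the file is not already valid UTF-8.
    """
    raw = Path(src).read_bytes()
    if is_valid_utf8(raw):
        # Already UTF-8: copy the bytes unchanged.
        Path(dst).write_bytes(raw)
        return
    text = raw.decode(source_encoding)
    # Write bytes directly so line endings are preserved as-is.
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `convert_to_utf8("reviews.csv", "reviews_utf8.csv")` would produce a copy that you can upload instead of the original file.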

Using dates in your project

In order to take full advantage of the Dashboard section, you will need to upload a CSV file with a valid date column.

We support the following date formats:

  • ISO-8601 - Example: 2019-11-20
  • MM/DD/YYYY - Example: 11/20/2019
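
To catch date problems before uploading, you could validate the date column locally. The sketch below is an illustration only: the function names and the column name passed in are hypothetical, and for ISO-8601 it checks only the calendar-date form shown in the example above (YYYY-MM-DD).

```python
import csv
from datetime import datetime
from typing import List, Optional, Tuple

# The two formats listed above, expressed as strptime patterns.
SUPPORTED_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_supported_date(value: str) -> Optional[datetime]:
    """Return a datetime if value matches a supported format, else None."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    return None


def find_invalid_dates(csv_path: str, date_column: str) -> List[Tuple[int, str]]:
    """List (data row number, raw value) for rows whose date does not parse.

    Data rows are numbered from 1, not counting the header row.
    """
    invalid = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row_no, row in enumerate(csv.DictReader(f), start=1):
            value = row.get(date_column) or ""
            if parse_supported_date(value) is None:
                invalid.append((row_no, value))
    return invalid
```

A value such as 20/11/2019 (day first) would be reported as invalid by this check, because it matches neither of the formats listed above.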

Upload new datasets to a project

Once a project is created, you can upload new datasets to take advantage of the project's classifier. This means that the new data will be tagged (if your project has at least one tag) and classified with intents. In the Projects list, click on Select an Action > Upload new dataset.

For example, you can start by creating a project to analyze customer support reviews with a dataset from the previous quarter. Once the project is ready, you can upload the reviews from the current quarter and have them tagged and classified to get insights such as what changed and which topics increased.

Datasets FAQ

What's the recommended file size or number of tickets to upload?

We recommend uploading at least 10,000 documents or 2,000 calls (longer texts) to build a reliable workflow.

Is there a maximum document size to upload?

Our Enterprise product doesn't have a size limit, although if it is self-managed, it will be limited by the hardware it runs on. Most projects should produce a reliable classifier with fewer than 250k documents.

Can I upload short texts? What’s the minimum size of each unit of text?

The technology works well with short texts (tweets, chat conversations, etc.). However, the dataset needs to provide enough context, so we recommend that comments have at least 5 words on average. Other short interactions will be identified as noise.
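
To get a rough idea of whether a dataset meets the 5-word guideline, you could measure it locally before uploading. This is a minimal sketch with hypothetical function names. It assumes words are separated by whitespace, which may differ from how the platform counts them.

```python
from typing import Iterable, List


def average_word_count(comments: Iterable[str]) -> float:
    """Average number of whitespace-separated words, ignoring empty comments."""
    counts = [len(c.split()) for c in comments if c.strip()]
    return sum(counts) / len(counts) if counts else 0.0


def short_comments(comments: Iterable[str], min_words: int = 5) -> List[str]:
    """Return the comments that have fewer than min_words words."""
    return [c for c in comments if len(c.split()) < min_words]
```

If the average is well below 5, or a large share of comments falls under the threshold, the dataset may not provide enough context.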

Can I upload long texts or documents? What’s the maximum size of each unit of text?

The technology works well with long texts. However, there needs to be a representative number of texts to extract contexts.

How long does it take to generate a classifier to review?

It depends on the size of the file and the vocabulary. It usually takes between 5 and 20 minutes for 5k to 20k comments. A file with 200k texts takes about an hour.

Can I upload voice data?

Voice data cannot be uploaded to the platform. You first need to run a speech-to-text process. Microsoft, Google, IBM, and Amazon all offer speech-to-text services that can be easily accessed. We also work with partners in case you want a more tailored approach.
