Automation Action – Tokenize | ThinkAutomation

Automation Action: Tokenize

Tokenizes any text and assigns the resulting tokens (words) to a variable as a comma-separated list.

Enter the text or HTML to tokenize. HTML is converted to plain text before it is tokenized.
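The HTML-to-text step can be pictured with a small Python sketch. This is an illustration only, not ThinkAutomation's converter: the class `_TextExtractor` and the function `html_to_text` are hypothetical names, and skipping script and style content is an assumption about what a reasonable converter would do.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> content (assumption)."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities such as &amp;
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes and collapse runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML fragment as one plain-text string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()
```

Because text nodes are joined with spaces, a word split by inline markup (for example `wor<b>ld</b>`) comes out as two words in this sketch.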

Options:

  • Remove Common Words : Removes common words (and, the, a etc.) from the tokens list.
  • Remove Email Addresses & URLs : Removes any email addresses and URLs from the tokens list.
  • Include Numeric Tokens : Includes tokens containing numbers and dates in the tokens list.
  • Normalize : Normalizes common contractions (e.g. ‘what’s’ to ‘what is’) and common abbreviations (e.g. hi to hello, nov to november, ur to your, bday to birthday, 2day to today, plz to please, thx to thanks etc.).
  • Stem Words : Reduces words to their root form (English only). For example, the words ‘ask’, ‘asking’ and ‘asked’ would all stem to ‘ask’.
  • Unique : Removes duplicates from the tokens list.
  • Include Count : Appends the frequency to each token (requires Unique).
  • Sort By : Sorts the tokens list by None, Frequency or Word (requires Unique).
  • Top : Returns only the top x words when the list is sorted (requires Unique).
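To make the interaction between these options concrete, here is a minimal Python sketch of a tokenizer with a similar set of switches. It is not the product's implementation: the function `tokenize`, its parameter names, the tiny `COMMON_WORDS` list and the `word:count` output format are all assumptions made for illustration, and the Normalize, Stem Words and email/URL options are left out.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); the real list is not documented here.
COMMON_WORDS = {"a", "an", "and", "the", "of", "to", "is", "in", "it"}


def tokenize(text, remove_common=True, include_numeric=False,
             unique=False, include_count=False, sort_by=None, top=None):
    """Return the tokens of `text` as a comma-separated string."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if remove_common:
        tokens = [t for t in tokens if t not in COMMON_WORDS]
    if not include_numeric:
        tokens = [t for t in tokens if not any(c.isdigit() for c in t)]
    if unique:
        counts = Counter(tokens)
        words = list(counts)  # distinct tokens in first-seen order
        if sort_by == "frequency":
            words.sort(key=lambda w: -counts[w])  # stable: ties keep first-seen order
        elif sort_by == "word":
            words.sort()
        if top is not None and sort_by is not None:
            words = words[:top]
        if include_count:
            # "word:count" is a hypothetical format chosen for this sketch.
            words = [f"{w}:{counts[w]}" for w in words]
        tokens = words
    return ",".join(tokens)


print(tokenize("The cat and the dog chased the cat in 2024",
               unique=True, include_count=True, sort_by="frequency", top=2))
# prints: cat:2,dog:1
```

The sketch applies count, sort and top only inside the `unique` branch, mirroring the way those options depend on Unique being enabled.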

The tokens can be assigned to a variable. Tokens are returned as a comma-separated string.

Sentence Tokenization

Enable the Tokenize Into Sentences option to return a list of sentences (one per line) instead of tokens. The text is tokenized first, and the top x sentences are then returned in keyword density order. This can be useful for shortening large text in AI automations and with the Embedded Vector Database action.
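One way to picture keyword density ordering is the Python sketch below. The scoring rule is an assumption, since the exact formula is not documented here: each sentence is scored by the average document-wide frequency of its non-common words. The function name `top_sentences` and the small `COMMON_WORDS` list are likewise hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption).
COMMON_WORDS = {"a", "an", "and", "the", "of", "to", "is", "in", "it"}


def _keywords(sentence):
    """Lower-cased words of a sentence with common words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in COMMON_WORDS]


def top_sentences(text, top=3):
    """Return the `top` highest-scoring sentences, one per line."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in _keywords(s))

    def density(sentence):
        words = _keywords(sentence)
        # Average frequency of the sentence's keywords across the whole text.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(sentences, key=density, reverse=True)
    return "\n".join(ranked[:top])


print(top_sentences("Cats chase mice. Dogs bark loudly. Cats like cats.", top=2))
# prints:
# Cats like cats.
# Cats chase mice.
```

Averaging rather than summing keeps long sentences from winning on length alone, which suits the stated goal of shortening large text.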

This is one of over 180 actions included with ThinkAutomation. The ThinkAutomation business process automation (BPA) solution is designed to automate on-premises and cloud-based business processes that are triggered by incoming messages. Automate messages received by email, database updates, webhooks, web forms, web chat, SMS messages, Twitter, Teams messages, documents, local files and other message sources. Create any number of workflow automations using the drag-and-drop low-code designer. Simple fixed pricing with unlimited message processing reduces overall costs compared to hosted automation solutions.

You can also extend ThinkAutomation by creating your own custom automation actions using the built-in designer and C#/VB.net code editor.
