Universal Sentence Encoder lite
The Universal Sentence Encoder (Cer et al., 2018) (USE) is a model that encodes text into 512-dimensional embeddings. These embeddings can then be used as inputs to natural language processing tasks such as sentiment classification and textual similarity analysis.
This module is a TensorFlow.js GraphModel converted from the USE lite (module on TFHub), a lightweight version of the original. The lite model is based on the Transformer (Vaswani et al., 2017) architecture and uses an 8k word piece vocabulary.
In this demo we embed six sentences with the USE, and render their self-similarity scores in a matrix (redder means more similar):
The matrix shows that USE embeddings can be used to cluster sentences by similarity.
The sentences (taken from the TensorFlow Hub USE lite colab):
- I like my phone.
- Your cellphone looks great.
- How old are you?
- What is your age?
- An apple a day, keeps the doctors away.
- Eating strawberries is healthy.
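The self-similarity matrix described above can be sketched in plain JavaScript. This is a minimal illustration, not the demo's actual code: it assumes cosine similarity as the score, and the names `cosineSimilarity`, `selfSimilarityMatrix`, and `toyEmbeddings` are hypothetical. The toy 3-dimensional vectors stand in for real 512-dimensional USE embeddings.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Assumption: the demo's exact scoring function is not shown here;
// cosine similarity is one common choice for comparing embeddings.
// Note: an all-zero vector would produce NaN (division by zero).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Builds the n x n matrix whose entry [i][j] is the similarity
// between embedding i and embedding j.
function selfSimilarityMatrix(embeddings) {
  return embeddings.map((a) => embeddings.map((b) => cosineSimilarity(a, b)));
}

// Toy vectors (hypothetical data, not real USE output).
const toyEmbeddings = [
  [1, 0, 0],
  [0.9, 0.1, 0],
  [0, 0, 1],
];
const matrix = selfSimilarityMatrix(toyEmbeddings);
```

Entries near 1 correspond to the "redder" cells of the rendered matrix. With real embeddings, each row of `toyEmbeddings` would be replaced by one sentence's 512-dimensional vector.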
Using yarn:
$ yarn add @tensorflow/tfjs @tensorflow-models/universal-sentence-encoder
Using npm:
$ npm install @tensorflow/tfjs @tensorflow-models/universal-sentence-encoder
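To import from npm, require both packages installed above, binding the encoder to `use`: `require('@tensorflow/tfjs'); const use = require('@tensorflow-models/universal-sentence-encoder');`. Alternatively, load the same two packages with standalone script tags.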
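Then load the model with `use.load()`, which returns a promise, and pass an array of sentences to the model's `embed` method. The result is a 2D tensor with one 512-dimensional row per sentence.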
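A minimal sketch of loading the model and embedding sentences might look like the following. The helper name `embedSentences` is hypothetical, and the module object is passed in as a parameter (rather than required inside the function) purely so the sketch stays self-contained. It relies on `use.load()`, `model.embed()`, and the standard TensorFlow.js tensor methods `array()` and `dispose()`.

```javascript
// `use` is assumed to be the module object obtained from
// require('@tensorflow-models/universal-sentence-encoder'),
// with @tensorflow/tfjs already loaded.
async function embedSentences(use, sentences) {
  // Load the model; the promise resolves once the model is ready.
  const model = await use.load();
  // `embeddings` is a 2D tensor with one 512-dimensional row per sentence.
  const embeddings = await model.embed(sentences);
  try {
    // Copy the tensor values into a plain nested array.
    return await embeddings.array();
  } finally {
    // Release the tensor's memory once the values have been copied out.
    embeddings.dispose();
  }
}
```

With the packages installed, this could be called as `embedSentences(use, ['Hello.', 'How are you?'])`, which would resolve to an array of two 512-element rows.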
To use the Tokenizer separately:
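As a hedged sketch, the tokenizer can be loaded once with `use.loadTokenizer()` and its `encode` method applied to each string, giving an array of integer token ids per input. The helper name `tokenizeAll` is hypothetical, and `use` is again assumed to be the imported module object.

```javascript
// `use` is assumed to be the module object obtained from
// require('@tensorflow-models/universal-sentence-encoder').
async function tokenizeAll(use, texts) {
  // Load the tokenizer once and reuse it for every input string.
  const tokenizer = await use.loadTokenizer();
  // `encode` maps one string to an array of integer token ids.
  return texts.map((text) => tokenizer.encode(text));
}
```

For example, `tokenizeAll(use, ['Hello, how are you?'])` would resolve to a one-element array holding that sentence's token ids from the 8k word piece vocabulary.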