Guide to Hugging Face AutoModelFor* Classes and Tokenizers
Understanding SentenceTransformer vs. AutoTokenizer + AutoModel

A tokenizer such as AutoTokenizer only converts text into tokens (a numerical representation of the text). On its own, this does not produce sentence embeddings: the token IDs still have to be passed through a model, and the resulting per-token hidden states have to be combined into a single vector.

SentenceTransformer() handles tokenization and the embedding computation automatically. It also applies pooling (typically mean pooling) to the hidden states, resulting in a final sentence embedding that can be used directly for various NLP tasks.

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    sentences = ["I love machine learning", "I am expert in AI"]
    embeddings = model.encode(sentences)

[…]
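To make the pooling step concrete, here is a minimal sketch of mean pooling in NumPy. It is an illustration, not the library's internal code: the helper name mean_pool and the toy arrays are assumptions made for this example. With AutoTokenizer + AutoModel, the hidden states would typically come from the model output's last_hidden_state and the mask from the tokenizer's attention_mask.

```python
import numpy as np


def mean_pool(hidden_states, attention_mask):
    """Average the token vectors of each sentence, ignoring padding positions.

    hidden_states:  array of shape (batch, seq_len, hidden_size)
    attention_mask: array of shape (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over the hidden dimension.
    mask = attention_mask[..., np.newaxis].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)      # (batch, hidden_size)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # (batch, 1), avoids division by zero
    return summed / counts


# Toy stand-ins (hypothetical values): 2 sentences, 3 token positions, hidden size 2.
hidden_states = np.array([
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],  # third position is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
])
attention_mask = np.array([
    [1, 1, 0],
    [1, 1, 1],
])

sentence_embeddings = mean_pool(hidden_states, attention_mask)
print(sentence_embeddings.shape)  # (2, 2): one vector per sentence
```

The mask matters because padded positions would otherwise drag the average toward meaningless values. In the first toy sentence, only the two real tokens contribute to the result.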