You are choosing embeddings for an NLP system that needs to represent text for semantic similarity, retrieval, and downstream modeling. Different embedding models can produce very different behavior depending on how they were trained and what they were optimized for.
What are the differences between various embedding models?