Bidirectional or Causal?
class BidirectionalMistralModel(MistralModel):
    config_class = BidirectionalMistralConfig

    def __init__(self, config: MistralConfig):
        super().__init__(config)
        for layer in self.layers:
            layer.self_attn.is_causal = False
        self._attn_implementation = "eager"
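However, MistralAttention (the class selected by "eager") doesn't use is_causal.
MistralFlashAttention2 uses layer.self_attn.is_causal.
MistralSdpaAttention doesn't use layer.self_attn.is_causal.
So does setting is_causal = False actually make the model bidirectional?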
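To make the question concrete: in an eager-style attention layer, directionality comes from the additive mask applied to the scores before the softmax, not from a flag on the module. The sketch below is a toy, pure-Python illustration under that assumption. `causal_mask`, `full_mask`, and `attention_weights` are hypothetical helpers, not transformers functions, and the sketch does not show which mask NV-Embed-v2 actually builds.

```python
import math

NEG_INF = float("-inf")

def causal_mask(n):
    """Additive mask: 0 where query i may see key j (j <= i), -inf elsewhere."""
    return [[0.0 if j <= i else NEG_INF for j in range(n)] for i in range(n)]

def full_mask(n):
    """Additive mask that lets every query see every key (bidirectional)."""
    return [[0.0] * n for _ in range(n)]

def attention_weights(scores, mask):
    """Row-wise softmax of (scores + mask), as an eager attention layer would do."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        masked = [s + m for s, m in zip(score_row, mask_row)]
        peak = max(masked)  # subtract the row maximum for numerical stability
        exps = [math.exp(x - peak) for x in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Uniform raw scores for a 3-token sequence (hypothetical values).
scores = [[0.0] * 3 for _ in range(3)]

causal = attention_weights(scores, causal_mask(3))
bidirectional = attention_weights(scores, full_mask(3))

print(causal[0])         # first token attends only to itself
print(bidirectional[0])  # first token attends to all three tokens
```

If the mask handed to the layers is the full one, the layer behaves bidirectionally regardless of whether an `is_causal` attribute is read.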
Hi @AlignLearner, please refer to this for where Sdpa attention uses is_causal: https://github.com/huggingface/transformers/blob/v4.37.2/src/transformers/models/mistral/modeling_mistral.py#L692
@nada5 However, in v4.44.2, Sdpa attention doesn't use is_causal:
https://github.com/huggingface/transformers/blob/v4.44.2/src/transformers/models/mistral/modeling_mistral.py#L475C9-L484C10
Following the required packages at https://huggingface.co/nvidia/NV-Embed-v2#2-required-packages:
in v4.42.4, Sdpa attention doesn't use is_causal either:
https://github.com/huggingface/transformers/blob/v4.42.4/src/transformers/models/mistral/modeling_mistral.py#L645C19-L655C1
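Since the answer seems to depend on the installed transformers version, an empirical check may be easier than reading the source: change only the last token and see whether the first position's output moves. The sketch below assumes a hypothetical `encode(tokens)` callable that returns one output per position. `toy_encoder` is a stand-in for a real model, and wiring this up to NV-Embed-v2 is not shown.

```python
def toy_encoder(tokens, causal):
    """Stand-in for a model: each output is the mean of the tokens it can see."""
    outputs = []
    for i in range(len(tokens)):
        visible = tokens[: i + 1] if causal else tokens
        outputs.append(sum(visible) / len(visible))
    return outputs

def looks_bidirectional(encode, tokens, tol=1e-6):
    """Return True if changing the last token changes the first position's output."""
    perturbed = list(tokens)
    perturbed[-1] = perturbed[-1] + 1.0  # change only the final token
    baseline = encode(tokens)[0]
    changed = encode(perturbed)[0]
    return abs(baseline - changed) > tol

tokens = [0.5, -1.0, 2.0]
causal_result = looks_bidirectional(lambda t: toy_encoder(t, causal=True), tokens)
bidirectional_result = looks_bidirectional(lambda t: toy_encoder(t, causal=False), tokens)
print(causal_result, bidirectional_result)
```

With a real model, `encode` would presumably wrap a forward pass in eval mode and return the last hidden states, with a token id swapped instead of a float perturbed. A causal model leaves the first position unchanged, while a bidirectional one does not.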