CodeRankEmbed-compressed
This is a compressed version of nomic-ai/CodeRankEmbed, produced by applying tensor factorization to its layers.
Compression Details
- Compression method: Tensor factorization using TLTorch
- Factorization type: cp (CP, i.e. CANDECOMP/PARAFAC decomposition)
- Ranks used: 4
- Number of factorized layers: 60
- Original model size: 136.73M parameters
- Compressed model size: 23.62M parameters
- Compression ratio: 5.79x (82.7% reduction)
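The figures above can be checked with a few lines of arithmetic. The sketch below recomputes the ratio and reduction from the two parameter counts, and also illustrates why a low-rank CP factorization is so much smaller than a dense weight. The helper names and the 768 x 3072 layer shape are hypothetical and chosen for illustration only. TLTorch typically reshapes a weight matrix into a higher-order tensor before factorizing it, so the actual per-layer counts in this model may differ.

```python
def compression_ratio(original_params, compressed_params):
    """Return (ratio, percent reduction) for two parameter counts."""
    ratio = original_params / compressed_params
    reduction_pct = (1 - compressed_params / original_params) * 100
    return ratio, reduction_pct


def cp_param_count(shape, rank):
    """Parameters of a rank-`rank` CP decomposition of a tensor of `shape`.

    One factor matrix of size (dim x rank) per mode, plus one weight
    per rank-one component.
    """
    return rank * sum(shape) + rank


# Parameter counts (in millions) taken from the list above.
ratio, reduction = compression_ratio(136.73, 23.62)
print(f"{ratio:.2f}x smaller, {reduction:.1f}% reduction")

# Hypothetical dense layer shape, used only to illustrate the saving.
dense_params = 768 * 3072
cp_params = cp_param_count((768, 3072), rank=4)
print(f"dense: {dense_params} parameters, rank-4 CP: {cp_params} parameters")
```

The overall ratio is lower than the per-layer saving in this toy example because only some layers are factorized and the remaining parameters are stored unchanged.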
Usage
To use this compressed model, install the required dependencies and load it with the custom loading script:
pip install torch tensorly tltorch sentence-transformers
Loading the model
import torch
import json
from sentence_transformers import SentenceTransformer
import tensorly as tl
from tltorch.factorized_layers import FactorizedLinear, FactorizedEmbedding
# Set TensorLy backend
tl.set_backend("pytorch")
# Load the model structure
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)
# Load factorization info
with open("factorization_info.json", "r") as f:
    factorized_info = json.load(f)
# Reconstruct factorized layers (see load_compressed_model.py for full implementation)
# ... reconstruction code ...
# Load compressed weights
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"], strict=False)
# Use the model
embeddings = model.encode(["def hello_world():\n print('Hello, World!')"])
Model Files
- pytorch_model.bin: Compressed model weights
- factorization_info.json: Metadata about factorized layers
- tokenizer.json, vocab.txt: Tokenizer files
- modules.json: SentenceTransformer modules configuration
Performance
The compressed model maintains good quality while being significantly smaller:
- Similar embedding quality (average cosine similarity > 0.9 with original)
- 5.79x fewer parameters (23.62M vs. 136.73M)
- Faster loading and inference on CPU
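One way to check the cosine-similarity claim is to encode the same inputs with the original and the compressed model and average the pairwise cosine similarity. The following is a minimal sketch in plain Python, not the evaluation script used for this model. The function names are hypothetical, and the toy vectors stand in for the rows returned by model.encode(...) on each model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_cosine_similarity(original, compressed):
    """Average cosine similarity over paired embeddings of the same inputs."""
    if len(original) != len(compressed):
        raise ValueError("both models must embed the same number of inputs")
    sims = [cosine_similarity(u, v) for u, v in zip(original, compressed)]
    return sum(sims) / len(sims)


# Toy vectors standing in for embeddings from the two models.
original = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
compressed = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]

score = mean_cosine_similarity(original, compressed)
print(f"mean cosine similarity: {score:.3f}")
```

With real embeddings, convert each row to a list (or substitute a vectorized implementation) and use a sample of code snippets and queries that is representative of the intended workload.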
Citation
If you use this compressed model, please cite the original CodeRankEmbed model:
@misc{nomic2024coderankembed,
title={CodeRankEmbed},
author={Nomic AI},
year={2024},
url={https://huggingface.co/nomic-ai/CodeRankEmbed}
}
License
This compressed model inherits the license from the original model. Please check the original model's license for usage terms.
Model tree for gtandon/CodeRankEmbed-compressed
- Base model: Snowflake/snowflake-arctic-embed-m-long
- Finetuned from it: nomic-ai/CodeRankEmbed (the model compressed here)