Matryoshka Representation Learning
Paper
•
2205.13147
•
Published
•
25
This is a sentence-transformers model finetuned from nreimers/albert-small-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: AlbertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'What should the report on income tax information include as per Article 48c?',
'7.\n\nMember States shall require subsidiary undertakings or branches not subject to the provisions of paragraphs 4 and 5 of this Article to publish and make accessible a report on income tax information where such subsidiary undertakings or branches serve no other objective than to circumvent the reporting requirements set out in this Chapter.\n\nArticle 48c\n\nContent of the report on income tax information\n\n1.\n\nThe report on income tax information required under Article 48b shall include information relating to all the activities of the standalone undertaking or ultimate parent undertaking, including those of all affiliated undertakings consolidated in the financial statements in respect of the relevant financial year.\n\n2.',
'(d)\n\nseal any business premises and books or records for the period of time of, and to the extent necessary for, the inspection.\n\n3.\n\nThe undertaking or association of undertakings shall submit to inspections ordered by decision of the Commission. The officials and other accompanying persons authorised by the Commission to conduct an inspection shall exercise their powers upon production of a Commission decision:\n\n(a)\n\nspecifying the subject matter and purpose of the inspection;\n\n(b)\n\ncontaining a statement that, pursuant to Article 16, a lack of cooperation allows the Commission to take a decision on the basis of the facts that are available to it;\n\n(c)',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
InformationRetrievalEvaluator| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.1148 |
| cosine_accuracy@3 | 0.213 |
| cosine_accuracy@5 | 0.2456 |
| cosine_accuracy@10 | 0.289 |
| cosine_precision@1 | 0.1148 |
| cosine_precision@3 | 0.071 |
| cosine_precision@5 | 0.0491 |
| cosine_precision@10 | 0.0289 |
| cosine_recall@1 | 0.1148 |
| cosine_recall@3 | 0.213 |
| cosine_recall@5 | 0.2456 |
| cosine_recall@10 | 0.289 |
| cosine_ndcg@10 | 0.1998 |
| cosine_mrr@10 | 0.1715 |
| cosine_map@100 | 0.1779 |
query_text and doc_text| query_text | doc_text | |
|---|---|---|
| type | string | string |
| details |
|
|
| query_text | doc_text |
|---|---|
What are the requirements for Member States regarding the establishment of AI regulatory sandboxes, including the timeline for operational readiness and the possibility of joint establishment with other Member States? |
1. Member States shall ensure that their competent authorities establish at least one AI regulatory sandbox at national level, which shall be operational by 2 August 2026. That sandbox may also be established jointly with the competent authorities of other Member States. The Commission may provide technical support, advice and tools for the establishment and operation of AI regulatory sandboxes. |
Member States must provide updates on their national energy and climate strategies, detailing the anticipated energy savings from 2021 to 2030. They are also obligated to report on the necessary energy savings and the policies intended to achieve these goals. If assessments reveal that a Member State's measures are inadequate to meet energy savings targets, the Commission may issue recommendations for improvement. Additionally, any shortfall in energy savings must be addressed in subsequent obligation periods. |
9. Member States shall apply and calculate the effect of the options chosen under paragraph 8 for the period referred to in paragraph 1, first subparagraph, points (a) and (b)(i), separately: |
What is the functional definition of a remote biometric identification system, and how does it operate in terms of identifying individuals without their active participation? |
(17) The notion of ‘remote biometric identification system’ referred to in this Regulation should be defined functionally, as an AI system intended for the identification of natural persons without their active involvement, typically at a distance, through the comparison of a person’s biometric data with the biometric data contained in a reference database, irrespectively of the particular technology, processes or types of biometric data used. Such remote biometric identification systems are typically used to perceive multiple persons or their behaviour simultaneously in order to facilitate significantly the identification of natural persons without their active involvement. This excludes AI systems intended to be used for biometric |
MatryoshkaLoss with these parameters:{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
eval_strategy: stepsper_device_train_batch_size: 10per_device_eval_batch_size: 10learning_rate: 2e-05num_train_epochs: 1warmup_ratio: 0.1fp16: Trueload_best_model_at_end: Trueoverwrite_output_dir: Falsedo_predict: Falseeval_strategy: stepsprediction_loss_only: Trueper_device_train_batch_size: 10per_device_eval_batch_size: 10per_gpu_train_batch_size: Noneper_gpu_eval_batch_size: Nonegradient_accumulation_steps: 1eval_accumulation_steps: Nonetorch_empty_cache_steps: Nonelearning_rate: 2e-05weight_decay: 0.0adam_beta1: 0.9adam_beta2: 0.999adam_epsilon: 1e-08max_grad_norm: 1.0num_train_epochs: 1max_steps: -1lr_scheduler_type: linearlr_scheduler_kwargs: {}warmup_ratio: 0.1warmup_steps: 0log_level: passivelog_level_replica: warninglog_on_each_node: Truelogging_nan_inf_filter: Truesave_safetensors: Truesave_on_each_node: Falsesave_only_model: Falserestore_callback_states_from_checkpoint: Falseno_cuda: Falseuse_cpu: Falseuse_mps_device: Falseseed: 42data_seed: Nonejit_mode_eval: Falseuse_ipex: Falsebf16: Falsefp16: Truefp16_opt_level: O1half_precision_backend: autobf16_full_eval: Falsefp16_full_eval: Falsetf32: Nonelocal_rank: 0ddp_backend: Nonetpu_num_cores: Nonetpu_metrics_debug: Falsedebug: []dataloader_drop_last: Falsedataloader_num_workers: 0dataloader_prefetch_factor: Nonepast_index: -1disable_tqdm: Falseremove_unused_columns: Truelabel_names: Noneload_best_model_at_end: Trueignore_data_skip: Falsefsdp: []fsdp_min_num_params: 0fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}fsdp_transformer_layer_cls_to_wrap: Noneaccelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}deepspeed: Nonelabel_smoothing_factor: 0.0optim: adamw_torchoptim_args: Noneadafactor: Falsegroup_by_length: Falselength_column_name: lengthddp_find_unused_parameters: Noneddp_bucket_cap_mb: Noneddp_broadcast_buffers: Falsedataloader_pin_memory: Truedataloader_persistent_workers: Falseskip_memory_metrics: Trueuse_legacy_prediction_loop: Falsepush_to_hub: Falseresume_from_checkpoint: Nonehub_model_id: Nonehub_strategy: every_savehub_private_repo: Nonehub_always_push: Falsegradient_checkpointing: Falsegradient_checkpointing_kwargs: Noneinclude_inputs_for_metrics: Falseinclude_for_metrics: []eval_do_concat_batches: Truefp16_backend: autopush_to_hub_model_id: Nonepush_to_hub_organization: Nonemp_parameters: auto_find_batch_size: Falsefull_determinism: Falsetorchdynamo: Noneray_scope: lastddp_timeout: 1800torch_compile: Falsetorch_compile_backend: Nonetorch_compile_mode: Nonedispatch_batches: Nonesplit_batches: Noneinclude_tokens_per_second: Falseinclude_num_input_tokens_seen: Falseneftune_noise_alpha: Noneoptim_target_modules: Nonebatch_eval_metrics: Falseeval_on_start: Falseuse_liger_kernel: Falseeval_use_gather_object: Falseaverage_tokens_across_devices: Falseprompts: Nonebatch_sampler: batch_samplermulti_dataset_batch_sampler: proportional| Epoch | Step | cosine_ndcg@10 |
|---|---|---|
| -1 | -1 | 0.1998 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
nreimers/albert-small-v2