YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

MAMUT Bert (Mathematical Structure Aware BERT)

Pretrained model based on bert-base-cased with further mathematical pre-training, introduced in MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training.

Model Details

Model Description

This model has been mathematically pretrained based on four tasks/datasets:

  • Mathematical Formulas (MF): Masked Language Modeling (MLM) task on math formulas written in LaTeX
  • Mathematical Texts (MT): MLM task on mathematical texts (i.e., texts containing LaTeX formulas). The masked tokens are more likely to be a one of the formula tokens or mathematical words (e.g., sum, one, ...)
  • Named Math Formulas (NMF): Next-Sentence-Prediction (NSP)-like task associating a name of a well known mathematical identity (e.g., Pythagorean Theorem) with a formula representation (and the task is to classify whether the formula matches the identity described by the name)
  • Math Formula Retrieval (MFR): NSP-like task associating two formulas (and the task is to decide whether both describe the same mathematical concept(identity))

Training Overview

Compared to bert-base-cased, 300 additional mathematical LaTeX tokens have been added before the mathematical pre-training.

Model Sources [optional]

Uses

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

[More Information Needed]

Training Procedure

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Environmental Impact

  • Hardware Type: 8xA100
  • Hours used: 48
  • Compute Region: Germany

Citation

BibTeX:

[More Information Needed]

Downloads last month
52
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support