microsoft/FlexCAD · Hugging Face

Model Details

Model Description

This model aims to achieve controllable Computer-Aided Design (CAD) generation across all CAD construction hierarchies, such as sketch-extrusion, extrusion, sketch, face, loop and curve. It takes an original CAD model along with the part the user intends to modify as input and generates multiple new CAD models with only the chosen part changed, where CAD models are represented as structured texts.

Developed by: Zhanwei Zhang, Shizhao Sun, Wenxiao Wang, Deng Cai and Jiang Bian
Model type: Large Language Models
Language(s): Python
License: MIT
Finetuned from model: Llama-3-8B

Model Sources

Repository: https://github.com/microsoft/FlexCAD
Paper: https://arxiv.org/pdf/2411.05823

Uses

Direct Intended Uses

Take an original CAD model along with the part the user intends to modify as input and generate multiple new CAD models with only the chosen part changed, where CAD models are represented as structured texts.

FlexCAD is being shared with the research community to facilitate reproduction of our results and foster further research in CAD generation.

FlexCAD is intended to be used by domain experts who are independently capable of evaluating the quality of outputs before acting on them.

Out-of-Scope Use

We do not recommend using FlexCAD in commercial or real-world applications without further testing and development. It is being released for research purposes.

Use in any manner that violates applicable laws and regulations.

Risks and Limitations

FlexCAD inherits any biases, errors or omissions produced by its base model. Developers are advised to choose an appropriate base LLM carefully, depending on the intended use case.

FlexCAD uses the Llama model. See https://huggingface.co/meta-llama/Meta-Llama-3-8B to understand the capabilities and limitations of this model.

As the model is fine-tuned on very specific data about CAD models, it is unlikely to generate information other than CAD models. However, if a user submits harmful content—e.g., a firearm CAD model—the model may inadvertently enhance that design. Users must therefore implement their own content-filtering and harm-mitigation safeguards to prevent such outputs.

CAD models generated by FlexCAD may not be technically accurate in all cases. Users are responsible for assessing the quality of generated content for each intended use case.

FlexCAD was developed for research and experimental purposes. It does not yet include Responsible AI (RAI) content-filtering or safety mechanisms, and it requires extensive testing and validation before any commercial or real-world deployment.

Recommendations

Please provide as input only the CAD model that you want to modify.

Users are responsible for sourcing their content legally and ethically. This could include securing appropriate copyrights and the anonymization of data prior to use in research.

How to Get Started with the Model

python sample.py --model_path <model_path> --num_samples <num_samples> --data_path <data_path> --mask_type <mask_type>

Here, mask_type can be one of unconditional, cad, sketch-extrusion(es), extrusion, sketch, face, loop, curve.
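
As a minimal sketch, the command above can be assembled from Python when running several mask types in a row. The flag names are taken from the command shown here; the helper name `build_sample_command`, the placeholder paths, and the sample count are hypothetical. The sketch-extrusion mask type is left out because its exact command-line spelling should be checked in the repository.

```python
import shlex

# Mask types from the list above whose spelling is unambiguous.
# The sketch-extrusion entry is omitted; check the repository for its exact value.
MASK_TYPES = ("unconditional", "cad", "extrusion", "sketch", "face", "loop", "curve")


def build_sample_command(model_path, data_path, mask_type, num_samples):
    """Return the argv list for one sample.py run (hypothetical helper)."""
    if mask_type not in MASK_TYPES:
        raise ValueError(f"unknown mask type: {mask_type!r}")
    return [
        "python", "sample.py",
        "--model_path", str(model_path),
        "--num_samples", str(num_samples),
        "--data_path", str(data_path),
        "--mask_type", mask_type,
    ]


# Placeholder paths; replace them with real locations before running.
for mask in ("sketch", "face", "loop"):
    cmd = build_sample_command("path/to/model", "path/to/data", mask, num_samples=4)
    print(shlex.join(cmd))
    # To actually launch a run: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.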

For more information, please visit our GitHub repo: https://github.com/microsoft/FlexCAD.

Training Details

Training Data

The training data is from an open-source dataset, DeepCAD: https://github.com/ChrisWu1997/DeepCAD?tab=readme-ov-file. We use its pre-processed version: https://github.com/samxuxiang/SkexGen.

Training Procedure

Preprocessing

CAD models are converted to structured text, where all categorical and numerical information is represented as textual tokens.

See A.1 in our paper (https://arxiv.org/pdf/2411.05823) for a detailed definition of the sequence format.
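
To make the idea of numerical information as textual tokens more concrete, the sketch below quantizes coordinates to integers and writes one loop of line segments as a flat string. It is purely illustrative: the token names (`line`, `<curve_end>`, `<loop_end>`), the 6-bit quantization, and the [-1, 1] coordinate range are hypothetical choices, not the format defined in A.1 of the paper.

```python
def quantize(value, low=-1.0, high=1.0, bits=6):
    """Map a float in [low, high] to an integer bin in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    clipped = min(max(value, low), high)  # out-of-range values are clamped
    return round((clipped - low) / (high - low) * levels)


def loop_to_text(points):
    """Serialize a closed polyline as text, one line curve per start point.

    The token layout is hypothetical and only illustrates the general idea.
    """
    tokens = []
    for x, y in points:
        tokens.append(f"line,{quantize(x)},{quantize(y)} <curve_end>")
    tokens.append("<loop_end>")
    return " ".join(tokens)


# A square centred at the origin, given by its four corners.
square = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
print(loop_to_text(square))
```

Because every coordinate becomes a small integer written as text, a language model can read and emit geometry with its ordinary tokenizer.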

Training Hyperparameters

LoRA rank and alpha: 8, 32
Optimizer: AdamW
Batch size: 32
Learning rate: 5e-4
Epochs: 30
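
The values above can be gathered into a small configuration object, sketched below. The class and field names are hypothetical. The `lora_scaling` property follows the standard LoRA convention of multiplying the low-rank update by alpha / rank; that this convention applies to the training code here is an assumption, not something stated in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Values copied from the list above; field names are illustrative."""
    base_model: str = "Llama-3-8B"
    lora_rank: int = 8
    lora_alpha: int = 32
    optimizer: str = "AdamW"
    batch_size: int = 32
    learning_rate: float = 5e-4
    epochs: int = 30

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA convention (assumed here): update scaled by alpha / rank.
        return self.lora_alpha / self.lora_rank


config = FineTuneConfig()
print(config.lora_scaling)
```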

Speeds, Sizes, Times

Llama-3-8B: 8B parameters

Evaluation

Testing Data

The testing data is from an open-source dataset, DeepCAD: https://github.com/ChrisWu1997/DeepCAD?tab=readme-ov-file. We use its pre-processed version: https://github.com/samxuxiang/SkexGen.

Metrics

Generation diversity and quality of the generated CAD models relative to the test set, measured by Coverage (COV), Minimum Matching Distance (MMD) and Jensen-Shannon Divergence (JSD) [1][2].
The percentage of predicted CAD sequences that can be successfully rendered into 3D models, denoted as Prediction Validity (PV).
The percentage of the generated CAD models that are labeled as realistic ones by human evaluators, denoted as Realism.
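
The sketch below shows how PV, COV and MMD could be computed on toy data. It assumes the definitions commonly used in the generative-modelling literature (COV as the fraction of reference items that are the nearest neighbour of at least one generated item, MMD as the mean distance from each reference item to its closest generated item); the exact definitions are in [1][2]. The `can_render` predicate and the scalar distance are hypothetical stand-ins for real CAD rendering and a real shape distance, and JSD is not covered.

```python
def prediction_validity(sequences, can_render):
    """Percentage of predicted sequences for which can_render returns True."""
    if not sequences:
        return 0.0
    valid = sum(1 for seq in sequences if can_render(seq))
    return 100.0 * valid / len(sequences)


def coverage(generated, reference, distance):
    """Fraction of reference items that are the nearest neighbour of some generated item."""
    matched = set()
    for g in generated:
        nearest = min(range(len(reference)), key=lambda i: distance(g, reference[i]))
        matched.add(nearest)
    return len(matched) / len(reference)


def minimum_matching_distance(generated, reference, distance):
    """Average, over reference items, of the distance to the closest generated item."""
    total = sum(min(distance(g, r) for g in generated) for r in reference)
    return total / len(reference)


def toy_distance(a, b):
    # Stand-in distance: scalars play the role of CAD models.
    return abs(a - b)


reference = [0.0, 1.0, 2.0, 3.0]
generated = [0.1, 0.2, 2.9]

print(coverage(generated, reference, toy_distance))
print(minimum_matching_distance(generated, reference, toy_distance))
# Empty strings stand in for sequences that fail to render.
print(prediction_validity(["ok", "", "ok", "ok"], can_render=bool))
```

Higher COV indicates more of the reference set is represented among the generated samples, while lower MMD indicates the generated samples lie closer to the reference set.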

Results

We use prior work, including SkexGen, HNC-CAD, and prompting GPT-4o, as baselines. FlexCAD outperforms these baselines on most metrics. Notably, it improves PV substantially: for sketch-level controllable generation, the PV values for GPT-4o, SkexGen, HNC-CAD, and FlexCAD are 62.3%, 68.7%, 72.6%, and 93.4% respectively, a gain of 20.8 percentage points over the strongest baseline. See Table 1 in our paper (https://arxiv.org/pdf/2411.05823) for the complete evaluation.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Citation

@InProceedings{zhang2024flexcad,
  title={FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models},
  author={Zhang, Zhanwei and Sun, Shizhao and Wang, Wenxiao and Cai, Deng and Bian, Jiang},
  booktitle={ICLR},
  year={2025}
}

Model Card Contact

We welcome feedback and collaboration from our audience. If you have suggestions, questions, or observe unexpected/offensive behavior in our technology, please contact Shizhao Sun at [email protected].

If the team receives reports of undesired behavior or identifies issues independently, we will update this repository with appropriate mitigations.