Synthesis Condition Predictor

This model predicts optimal temperature bins and atmosphere categories for inorganic material synthesis. It was trained on a dataset of text-mined synthesis procedures (Kononova et al.): https://www.nature.com/articles/s41597-019-0224-1

Models Included:

  • Temperature Bin Prediction (LightGBM)
  • Atmosphere Category Prediction (LightGBM)
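
The temperature model is a classifier over four bins rather than a regressor. As a minimal sketch of how a measured temperature relates to those classes, the helper below maps a value to a bin label. It assumes temperatures are in degrees Celsius and that the class names in the evaluation report (for example TempBin_2_(900_to_1100]) reflect the actual bin edges; `temperature_to_bin` is a hypothetical helper, not part of the repository.

```python
from bisect import bisect_left

# Assumption: edges are in degrees Celsius and are read off the class labels
# in the evaluation report. Each bin is a half-open interval (low, high].
BIN_EDGES = [1, 900, 1100, 1300, 3000]
BIN_LABELS = [
    "TempBin_1_(1_to_900]",
    "TempBin_2_(900_to_1100]",
    "TempBin_3_(1100_to_1300]",
    "TempBin_4_(1300_to_3000]",
]


def temperature_to_bin(temp_c: float) -> str:
    """Return the label of the (low, high] bin that contains temp_c."""
    if not (BIN_EDGES[0] < temp_c <= BIN_EDGES[-1]):
        raise ValueError(f"{temp_c} is outside the range covered by the bins")
    # bisect_left finds the first edge >= temp_c, so a value equal to an
    # upper edge (e.g. 900) stays in the lower bin, matching the ']' bracket.
    return BIN_LABELS[bisect_left(BIN_EDGES, temp_c) - 1]


if __name__ == "__main__":
    for temp in (650, 900, 920, 1350):
        print(temp, temperature_to_bin(temp))
```

A helper like this is mainly useful for comparing a predicted bin against a temperature you already know from the literature.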

Intended Use: To assist researchers in designing synthesis experiments by predicting key process parameters. Input a target material, precursors, and basic operational details to get predictions.

How to Use:

# Ensure the inference script and its dependencies are on PYTHONPATH.
from synthesis_predictor_hf_repo.src.inference import predict_synthesis_outcome, load_all_artifacts_once

# Or, if running from a cloned repo where 'src' is a subdirectory:
# from src.inference import predict_synthesis_outcome, load_all_artifacts_once

# Load the model artifacts once before making any predictions.
if not load_all_artifacts_once():
    print("Failed to load model artifacts.")
else:
    raw_input_example = {
        'target_formula_raw': "YBa2Cu3O7",
        'precursor_formulas_raw': ["Y2O3", "BaCO3", "CuO"],
        # Ordered list of synthesis steps; each step carries its type,
        # the original text, and any parsed conditions.
        'operations_simplified_list': [
            {'type': 'MixingOperation', 'string': 'Ball milling for 2h',
             'conditions': {'duration': [{'value': 2, 'unit': 'h'}]}},
            {'type': 'HeatingOperation', 'string': 'Calcined at 920C for 10h in air',
             'conditions': {'heating_temperature': [{'value': 920}], 'heating_time': [{'value': 10}], 'atmosphere': 'air'}},
            {'type': 'HeatingOperation', 'string': 'Sintered at 950C for 20h in O2',
             'conditions': {'heating_temperature': [{'value': 950}], 'heating_time': [{'value': 20}], 'atmosphere': 'Oxygen'}}
        ],
        'reactants_coeffs': [("Y2O3", 0.5), ("BaCO3", 2.0), ("CuO", 3.0)],  # Example, adjust as needed
        'products_coeffs': [("YBa2Cu3O7", 1.0)]  # Example
    }
    predictions = predict_synthesis_outcome(raw_input_example)
    print(predictions)

Limitations:

  • Test-set accuracy is about 68% for the temperature bin model and about 72% for the atmosphere category model.
  • Predictions are based on patterns in the training data and may not generalize to all chemical systems.
  • The feature engineering for process parameters in the inference script relies on the user providing an operations_simplified_list that can be parsed by the internal logic. The quality of these inputs directly affects prediction accuracy.
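
Because prediction quality depends on a well-formed operations_simplified_list, it can help to check the input before calling the model. The sketch below is one way to do that. The required keys ('type', 'string', 'conditions') and the shape of 'heating_temperature' are assumptions inferred from the usage example, not a description of the parser's actual rules, and `check_operations` is a hypothetical helper that is not part of the repository.

```python
def check_operations(operations):
    """Return a list of human-readable problems; an empty list means none were found."""
    if not isinstance(operations, list) or not operations:
        return ["operations_simplified_list must be a non-empty list"]

    problems = []
    for i, op in enumerate(operations):
        if not isinstance(op, dict):
            problems.append(f"operation {i}: expected a dict")
            continue
        # Assumption: every step carries these three keys, as in the usage example.
        for key in ("type", "string", "conditions"):
            if key not in op:
                problems.append(f"operation {i}: missing key '{key}'")

        conditions = op.get("conditions")
        if not isinstance(conditions, dict):
            conditions = {}

        # Assumption: a heating step is only informative if it has a numeric temperature.
        if op.get("type") == "HeatingOperation":
            temps = conditions.get("heating_temperature") or []
            has_value = any(
                isinstance(t, dict) and isinstance(t.get("value"), (int, float))
                for t in temps
            )
            if not has_value:
                problems.append(f"operation {i}: heating step has no numeric heating_temperature value")
    return problems


if __name__ == "__main__":
    steps = [
        {"type": "MixingOperation", "string": "Ball milling for 2h",
         "conditions": {"duration": [{"value": 2, "unit": "h"}]}},
        {"type": "HeatingOperation", "string": "Calcined in air", "conditions": {"atmosphere": "air"}},
    ]
    for problem in check_operations(steps):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a user fix every issue in a recipe in a single pass.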

Training Data: The model was trained on a dataset of text-mined inorganic synthesis procedures (Kononova et al.): https://www.nature.com/articles/s41597-019-0224-1

Evaluation Results: The models were evaluated on a hold-out test set.

1. Tuned Temperature Bin Prediction Model:

  • Overall Test Set Accuracy: 0.6821
  • Overall Test Set F1 Score (Weighted): 0.6785
  • Per-Class Performance (Test Set):
                                  precision    recall  f1-score   support
    
        TempBin_1_(1_to_900]       0.77      0.79      0.78       954
     TempBin_2_(900_to_1100]       0.62      0.53      0.57       743
    TempBin_3_(1100_to_1300]       0.58      0.58      0.58       768
    TempBin_4_(1300_to_3000]       0.72      0.80      0.76       715
    
                    accuracy                           0.68      3180
                   macro avg       0.67      0.68      0.67      3180
                weighted avg       0.68      0.68      0.68      3180
    

2. Tuned Atmosphere Category Prediction Model:

  • Overall Test Set Accuracy: 0.7193
  • Overall Test Set F1 Score (Weighted): 0.7174
  • Per-Class Performance (Test Set):
                              precision    recall  f1-score   support
    
                   Inert       0.59      0.38      0.46       139
        Other_Atm_Target       1.00      0.44      0.62         9
               Oxidizing       0.67      0.71      0.69      1552
                Reducing       0.70      0.47      0.56       100
    Unknown_Atm_Category       0.76      0.76      0.76      2098
    
                accuracy                           0.72      3898
               macro avg       0.74      0.55      0.62      3898
            weighted avg       0.72      0.72      0.72      3898
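
The gap between macro-average recall (0.55) and weighted-average recall (0.72) for the atmosphere model comes from class imbalance: the two largest classes dominate the weighted figure, while the small Inert, Reducing, and Other_Atm_Target classes pull the macro figure down. The arithmetic can be reproduced from the per-class rows above. The numbers are copied from the report; the function names are illustrative only.

```python
# (recall, support) for each atmosphere class, copied from the per-class report.
ATMOSPHERE_RECALL = {
    "Inert": (0.38, 139),
    "Other_Atm_Target": (0.44, 9),
    "Oxidizing": (0.71, 1552),
    "Reducing": (0.47, 100),
    "Unknown_Atm_Category": (0.76, 2098),
}


def macro_average(rows):
    """Unweighted mean: every class counts equally, however rare it is."""
    return sum(score for score, _ in rows.values()) / len(rows)


def weighted_average(rows):
    """Support-weighted mean: each class counts in proportion to its sample count."""
    total = sum(support for _, support in rows.values())
    return sum(score * support for score, support in rows.values()) / total


if __name__ == "__main__":
    print(f"macro recall:    {macro_average(ATMOSPHERE_RECALL):.2f}")
    print(f"weighted recall: {weighted_average(ATMOSPHERE_RECALL):.2f}")
```

Support-weighted recall is the same quantity as overall accuracy, which is why it matches the accuracy row. For rare atmospheres such as Reducing, the per-class recall is the more relevant number.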
    
