Shuu12121 committed on
Commit
632d07c
·
verified ·
1 Parent(s): d90aaa2

Update README.md

  1. README.md +22 -1
README.md CHANGED
@@ -39,7 +39,28 @@ datasets:
 
  This model is a SentenceTransformer fine-tuned from [`Shuu12121/CodeModernBERT-Owl🦉`](https://huggingface.co/Shuu12121/CodeModernBERT-Owl) on the [BigCloneBench](https://huggingface.co/datasets/google/code_x_glue_cc_clone_detection_big_clone_bench) dataset for **code clone detection**. It maps code snippets into a 768-dimensional dense vector space for semantic similarity tasks.
 
- ---
+
+
+ ## 🎯 Distinctive Performance and Stability
+
+ This model reaches roughly **0.99 accuracy and 0.95 to 0.96 F1** on code clone detection at every threshold tested below.
+ Notably, **changing the similarity threshold has little effect on classification performance**.
+ This suggests that the model **separates clones from non-clones cleanly**, so the similarity scores of the two classes overlap very little.
+
+ | Threshold | Accuracy | F1 Score |
+ |-----------|----------|----------|
+ | 0.50      | 0.9900   | 0.9633   |
+ | 0.85      | 0.9903   | 0.9641   |
+ | 0.90      | 0.9902   | 0.9637   |
+ | 0.95      | 0.9887   | 0.9579   |
+ | 0.98      | 0.9879   | 0.9540   |
+
+ - **High Stability**: Between thresholds of 0.85 and 0.98, accuracy stays between 0.9879 and 0.9903 and F1 between 0.9540 and 0.9641.
+ _(This suggests that most clone pairs score above roughly 0.9 in cosine similarity, since F1 only begins to dip at the 0.95 and 0.98 thresholds.)_
+
+ - **Reliable in Real-World Applications**: The similarity threshold can be adjusted for a different task or environment without a significant loss in performance, at least within the range tested above.
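A threshold sweep like the one in the table can be sketched as follows. This is a minimal illustration, not the evaluation script behind the reported numbers: the helper name `accuracy_and_f1` is hypothetical, and the `toy_scores` and `toy_labels` values are synthetic placeholders rather than outputs of this model. It assumes a pair is predicted to be a clone when its cosine similarity is at or above the threshold.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Predict 'clone' (1) when similarity >= threshold, then compute accuracy and F1."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Synthetic cosine similarities and ground-truth labels (1 = clone, 0 = non-clone).
# These are made-up values for illustration only, NOT results from this model.
toy_scores = [0.99, 0.97, 0.96, 0.93, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Sweep the same thresholds that appear in the table.
for threshold in (0.5, 0.85, 0.90, 0.95, 0.98):
    acc, f1 = accuracy_and_f1(toy_scores, toy_labels, threshold)
    print(f"threshold={threshold:.2f}  accuracy={acc:.4f}  f1={f1:.4f}")
```

In practice the scores would come from the cosine similarity between the embeddings of the two snippets in each pair. When clone and non-clone scores are well separated, as in the toy data, any threshold that falls in the gap between the two groups gives the same predictions, which is the behavior the table describes.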
+
+
 
  ## 📌 Model Overview