tehranixyz committed on
Commit
cdfcaec
·
verified ·
1 Parent(s): 7b47464

Update README.md

Files changed (1)
  1. README.md +50 -3
README.md CHANGED
@@ -1,3 +1,50 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language: en
+ license: apache-2.0
+ ---
+
+ # CodeRosetta
+ ## Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming ([📃Paper](https://arxiv.org/abs/2410.20527), [🔗Website](https://coderosetta.com/))
+
+
+ CodeRosetta is an encoder-decoder translation model for C++, CUDA, and Fortran code. \
+ This version is the base **C++-CUDA translation** model, without fine-tuning.
+
+ ### How to use
+
+ ```python
+ from transformers import AutoTokenizer, EncoderDecoderModel
+
+ # Load the CodeRosetta model and tokenizer
+ model = EncoderDecoderModel.from_pretrained('CodeRosetta/CodeRosetta_cpp2cuda_ft')
+ tokenizer = AutoTokenizer.from_pretrained('CodeRosetta/CodeRosetta_cpp2cuda_ft')
+
+ # Encode the input C++ code
+ input_cpp_code = "void add_100 ( int numElements , int * data ) { for ( int idx = 0 ; idx < numElements ; idx ++ ) { data [ idx ] += 100 ; } }"
+ input_ids = tokenizer.encode(input_cpp_code, return_tensors="pt")
+
+ # Select the target language through the decoder start token
+ start_token = "<CUDA>"  # If the input is CUDA code, change the start token to <CPP>
+ decoder_start_token_id = tokenizer.convert_tokens_to_ids(start_token)
+
+ # Generate the CUDA code
+ output = model.generate(
+     input_ids=input_ids,
+     decoder_start_token_id=decoder_start_token_id,
+     max_length=256,
+ )
+
+ # Decode and print the generated output
+ generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
+ print(generated_code)
+ ```
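
The snippet above hard-codes the C++ to CUDA direction. As a minimal sketch, assuming `model` and `tokenizer` have already been loaded as shown there, the start-token choice could be wrapped in a small helper. The names `START_TOKENS`, `start_token_for`, and `translate` are illustrative and not part of CodeRosetta or `transformers`; only the `<CUDA>` and `<CPP>` tokens come from the README.

```python
# Decoder start tokens named in the README, keyed by target language.
# This mapping is an illustrative assumption, not an official API.
START_TOKENS = {"cuda": "<CUDA>", "cpp": "<CPP>"}


def start_token_for(target_lang):
    """Return the decoder start token for the requested target language."""
    try:
        return START_TOKENS[target_lang.lower()]
    except KeyError:
        raise ValueError(f"unsupported target language: {target_lang!r}") from None


def translate(model, tokenizer, source_code, target_lang, max_length=256):
    """Translate source_code into target_lang using an already loaded model.

    `model` and `tokenizer` are expected to behave like the objects loaded
    in the README snippet (EncoderDecoderModel and its tokenizer).
    """
    input_ids = tokenizer.encode(source_code, return_tensors="pt")
    decoder_start_token_id = tokenizer.convert_tokens_to_ids(
        start_token_for(target_lang)
    )
    output = model.generate(
        input_ids=input_ids,
        decoder_start_token_id=decoder_start_token_id,
        max_length=max_length,
    )
    # Decode the first (and only) generated sequence.
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

For example, `translate(model, tokenizer, input_cpp_code, "cuda")` makes the same calls as the snippet above, while passing CUDA source with `"cpp"` selects the reverse direction.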
+
+ ### BibTeX
+
+ ```bibtex
+ @inproceedings{coderosetta:neurips:2024,
+   title = {CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming},
+   author = {TehraniJamsaz, Ali and Bhattacharjee, Arijit and Chen, Le and Ahmed, Nesreen K and Yazdanbakhsh, Amir and Jannesari, Ali},
+   booktitle = {NeurIPS},
+   year = {2024},
+ }
+ ```