mohit19906 committed on
Commit
5e77180
1 Parent(s): 9d3a246

mohit19906/mistral-7b-Ins-IntentAndEntity

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.6320
+ - Loss: 1.6079
 
  ## Model description
 
@@ -51,54 +51,54 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 4.3915 | 0.96 | 6 | 3.3366 |
- | 2.8224 | 1.92 | 12 | 2.2589 |
- | 1.9623 | 2.88 | 18 | 1.7017 |
- | 1.3485 | 4.0 | 25 | 1.4309 |
- | 1.3006 | 4.96 | 31 | 1.2167 |
- | 1.0941 | 5.92 | 37 | 1.1105 |
- | 0.9855 | 6.88 | 43 | 1.0638 |
- | 0.7878 | 8.0 | 50 | 1.0648 |
- | 0.8714 | 8.96 | 56 | 1.0702 |
- | 0.8352 | 9.92 | 62 | 1.0912 |
- | 0.7972 | 10.88 | 68 | 1.1189 |
- | 0.6574 | 12.0 | 75 | 1.1362 |
- | 0.7345 | 12.96 | 81 | 1.2220 |
- | 0.7001 | 13.92 | 87 | 1.2328 |
- | 0.676 | 14.88 | 93 | 1.2799 |
- | 0.5602 | 16.0 | 100 | 1.3639 |
- | 0.6319 | 16.96 | 106 | 1.3750 |
- | 0.6165 | 17.92 | 112 | 1.4120 |
- | 0.6074 | 18.88 | 118 | 1.4753 |
- | 0.5132 | 20.0 | 125 | 1.4273 |
- | 0.5899 | 20.96 | 131 | 1.5352 |
- | 0.5875 | 21.92 | 137 | 1.4517 |
- | 0.5836 | 22.88 | 143 | 1.5255 |
- | 0.496 | 24.0 | 150 | 1.5258 |
- | 0.5764 | 24.96 | 156 | 1.5619 |
- | 0.574 | 25.92 | 162 | 1.5049 |
- | 0.5713 | 26.88 | 168 | 1.5615 |
- | 0.4872 | 28.0 | 175 | 1.5760 |
- | 0.568 | 28.96 | 181 | 1.5624 |
- | 0.5646 | 29.92 | 187 | 1.5624 |
- | 0.5642 | 30.88 | 193 | 1.5766 |
- | 0.4829 | 32.0 | 200 | 1.5856 |
- | 0.5614 | 32.96 | 206 | 1.5721 |
- | 0.561 | 33.92 | 212 | 1.5661 |
- | 0.5596 | 34.88 | 218 | 1.5899 |
- | 0.4793 | 36.0 | 225 | 1.5967 |
- | 0.5583 | 36.96 | 231 | 1.5734 |
- | 0.5573 | 37.92 | 237 | 1.5972 |
- | 0.5571 | 38.88 | 243 | 1.6244 |
- | 0.4776 | 40.0 | 250 | 1.6172 |
- | 0.5552 | 40.96 | 256 | 1.6085 |
- | 0.5552 | 41.92 | 262 | 1.6138 |
- | 0.5546 | 42.88 | 268 | 1.6257 |
- | 0.4751 | 44.0 | 275 | 1.6269 |
- | 0.5535 | 44.96 | 281 | 1.6306 |
- | 0.5534 | 45.92 | 287 | 1.6320 |
- | 0.5533 | 46.88 | 293 | 1.6310 |
- | 0.4734 | 48.0 | 300 | 1.6320 |
+ | 4.3979 | 0.96 | 6 | 3.3561 |
+ | 2.837 | 1.92 | 12 | 2.2656 |
+ | 1.9777 | 2.88 | 18 | 1.7212 |
+ | 1.3641 | 4.0 | 25 | 1.4591 |
+ | 1.3384 | 4.96 | 31 | 1.2543 |
+ | 1.1314 | 5.92 | 37 | 1.1326 |
+ | 0.9904 | 6.88 | 43 | 1.0707 |
+ | 0.7908 | 8.0 | 50 | 1.0784 |
+ | 0.8779 | 8.96 | 56 | 1.0891 |
+ | 0.8415 | 9.92 | 62 | 1.1026 |
+ | 0.8044 | 10.88 | 68 | 1.1326 |
+ | 0.6611 | 12.0 | 75 | 1.1425 |
+ | 0.7385 | 12.96 | 81 | 1.2161 |
+ | 0.7071 | 13.92 | 87 | 1.2182 |
+ | 0.6841 | 14.88 | 93 | 1.2865 |
+ | 0.5671 | 16.0 | 100 | 1.3092 |
+ | 0.6442 | 16.96 | 106 | 1.3813 |
+ | 0.629 | 17.92 | 112 | 1.3295 |
+ | 0.6197 | 18.88 | 118 | 1.4387 |
+ | 0.522 | 20.0 | 125 | 1.3785 |
+ | 0.6013 | 20.96 | 131 | 1.4355 |
+ | 0.5928 | 21.92 | 137 | 1.4321 |
+ | 0.5901 | 22.88 | 143 | 1.4711 |
+ | 0.5015 | 24.0 | 150 | 1.4916 |
+ | 0.5817 | 24.96 | 156 | 1.5001 |
+ | 0.578 | 25.92 | 162 | 1.5077 |
+ | 0.5758 | 26.88 | 168 | 1.5173 |
+ | 0.4914 | 28.0 | 175 | 1.4935 |
+ | 0.5732 | 28.96 | 181 | 1.5161 |
+ | 0.5715 | 29.92 | 187 | 1.5131 |
+ | 0.5696 | 30.88 | 193 | 1.5400 |
+ | 0.4861 | 32.0 | 200 | 1.5338 |
+ | 0.5666 | 32.96 | 206 | 1.5474 |
+ | 0.5643 | 33.92 | 212 | 1.5519 |
+ | 0.5643 | 34.88 | 218 | 1.5710 |
+ | 0.4819 | 36.0 | 225 | 1.5723 |
+ | 0.5607 | 36.96 | 231 | 1.5749 |
+ | 0.5609 | 37.92 | 237 | 1.5677 |
+ | 0.5598 | 38.88 | 243 | 1.5853 |
+ | 0.4793 | 40.0 | 250 | 1.5951 |
+ | 0.5587 | 40.96 | 256 | 1.5850 |
+ | 0.5577 | 41.92 | 262 | 1.5904 |
+ | 0.5568 | 42.88 | 268 | 1.5913 |
+ | 0.477 | 44.0 | 275 | 1.5959 |
+ | 0.5553 | 44.96 | 281 | 1.6042 |
+ | 0.5556 | 45.92 | 287 | 1.6082 |
+ | 0.5549 | 46.88 | 293 | 1.6075 |
+ | 0.4749 | 48.0 | 300 | 1.6079 |
 
 
  ### Framework versions
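Since the logged training arguments set `load_best_model_at_end: True` with `metric_for_best_model: 'loss'`, the checkpoint with the lowest validation loss is the one retained at the end of training, not the last epoch. A minimal sketch of that selection over a subset of the updated table's (epoch, step, eval_loss) rows (the first eight plus the final one):

```python
# Subset of (epoch, step, validation_loss) rows from the updated table above.
rows = [
    (0.96, 6, 3.3561),
    (1.92, 12, 2.2656),
    (2.88, 18, 1.7212),
    (4.0, 25, 1.4591),
    (4.96, 31, 1.2543),
    (5.92, 37, 1.1326),
    (6.88, 43, 1.0707),
    (8.0, 50, 1.0784),
    (48.0, 300, 1.6079),
]

# Pick the row with the lowest validation loss, mirroring what
# load_best_model_at_end does with metric_for_best_model='loss'.
best = min(rows, key=lambda r: r[2])
print(best)  # (6.88, 43, 1.0707)
```

The minimum sits at epoch 6.88 (step 43); every later epoch in the full table has a higher validation loss, a typical overfitting curve for 50 epochs on a small dataset.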
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:67f91d6eb593160a6209fbde4830ef56d7d8927a819a5cbd0c2cebf6065c7877
+ oid sha256:8ecc456d994e136b15e47cea218d1b63aacdfaf866005c9bcfc86103550b0ec4
  size 8397056
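The diff above changes only a git-LFS pointer file, not the weights themselves: a pointer is three `key value` lines (`version`, `oid`, `size`), and the new `oid` is the SHA-256 of the updated adapter. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:8ecc456d994e136b15e47cea218d1b63aacdfaf866005c9bcfc86103550b0ec4
size 8397056"""

info = parse_lfs_pointer(pointer)
print(info["size"])     # 8397056
print(info["oid"][:7])  # sha256:
```

Note that the `size` (8397056 bytes) is unchanged: the LoRA adapter has the same shape before and after this commit, only its contents differ.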
runs/Apr13_09-40-43_05fce47c7cf1/events.out.tfevents.1713001249.05fce47c7cf1.34.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3dd0a817770d24538b7cbc6d64f5f1f065c7a6942d562b0d6dc944289d393470
+ size 28580
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:488615f52a799cda26534b15c0df583d5e6fdabad5b08f80e6c83296647fb5dc
+ oid sha256:f040ae256a4293fca212376da4399901e6ac5dd876fe1d4234b8c4b105016df0
  size 4920
wandb/debug-internal.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/debug.log CHANGED
@@ -1,38 +1,38 @@
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.16.5
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
- 2024-04-13 07:27:18,100 INFO MainThread:34 [wandb_init.py:_log_setup():527] Logging user logs to /kaggle/working/wandb/run-20240413_072718-14n232vp/logs/debug.log
- 2024-04-13 07:27:18,101 INFO MainThread:34 [wandb_init.py:_log_setup():528] Logging internal logs to /kaggle/working/wandb/run-20240413_072718-14n232vp/logs/debug-internal.log
- 2024-04-13 07:27:18,101 INFO MainThread:34 [wandb_init.py:_jupyter_setup():473] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x783e3778f0d0>
- 2024-04-13 07:27:18,101 INFO MainThread:34 [wandb_init.py:init():567] calling init triggers
- 2024-04-13 07:27:18,101 INFO MainThread:34 [wandb_init.py:init():574] wandb.init called with sweep_config: {}
  config: {}
- 2024-04-13 07:27:18,101 INFO MainThread:34 [wandb_init.py:init():617] starting backend
- 2024-04-13 07:27:18,101 INFO MainThread:34 [wandb_init.py:init():621] setting up manager
- 2024-04-13 07:27:18,103 INFO MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
- 2024-04-13 07:27:18,105 INFO MainThread:34 [wandb_init.py:init():629] backend started and connected
- 2024-04-13 07:27:18,118 INFO MainThread:34 [wandb_run.py:_label_probe_notebook():1299] probe notebook
- 2024-04-13 07:27:18,455 INFO MainThread:34 [wandb_init.py:init():721] updated telemetry
- 2024-04-13 07:27:18,468 INFO MainThread:34 [wandb_init.py:init():754] communicating run to backend with 90.0 second timeout
- 2024-04-13 07:27:18,602 INFO MainThread:34 [wandb_run.py:_on_init():2344] communicating current version
- 2024-04-13 07:27:18,685 INFO MainThread:34 [wandb_run.py:_on_init():2353] got version response upgrade_message: "wandb version 0.16.6 is available! To upgrade, please run:\n $ pip install wandb --upgrade"
 
- 2024-04-13 07:27:18,685 INFO MainThread:34 [wandb_init.py:init():805] starting run threads in backend
- 2024-04-13 07:27:34,783 INFO MainThread:34 [wandb_run.py:_console_start():2323] atexit reg
- 2024-04-13 07:27:34,783 INFO MainThread:34 [wandb_run.py:_redirect():2178] redirect: wrap_raw
- 2024-04-13 07:27:34,783 INFO MainThread:34 [wandb_run.py:_redirect():2243] Wrapping output streams.
- 2024-04-13 07:27:34,783 INFO MainThread:34 [wandb_run.py:_redirect():2268] Redirects installed.
- 2024-04-13 07:27:34,784 INFO MainThread:34 [wandb_init.py:init():848] run started, returning control to user process
- 2024-04-13 07:27:34,790 INFO MainThread:34 [wandb_run.py:_config_callback():1347] config_cb None None {'vocab_size': 32000, 'max_position_embeddings': 32768, 'hidden_size': 4096, 'intermediate_size': 14336, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'sliding_window': None, 'num_key_value_heads': 8, 'hidden_act': 'silu', 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 1000000.0, 'attention_dropout': 0.0, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['MistralForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 0, 'eos_token_id': 2, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', 'transformers_version': '4.39.3', 'model_type': 'mistral', 'pretraining_tp': 1, 'quantization_config': 
{'quant_method': 'QuantizationMethod.GPTQ', 'bits': 4, 'tokenizer': None, 'dataset': None, 'group_size': 128, 'damp_percent': 0.1, 'desc_act': True, 'sym': True, 'true_sequential': True, 'use_cuda_fp16': False, 'model_seqlen': None, 'block_name_to_quantize': None, 'module_name_preceding_first_block': None, 'batch_size': 1, 'pad_token_id': None, 'use_exllama': True, 'max_input_length': None, 'exllama_config': {'version': 'ExllamaVersion.ONE'}, 'cache_block_outputs': True, 'modules_in_block_to_quantize': None}, 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'evaluation_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 50, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Apr13_07-27-05_71c495f7d39a', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 
'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'fp16_backend': 'auto', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None}
- 2024-04-13 08:43:29,291 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
- 2024-04-13 08:43:29,291 INFO MainThread:34 [wandb_init.py:_pause_backend():438] pausing backend
- 2024-04-13 08:43:29,298 INFO MainThread:34 [wandb_init.py:_resume_backend():443] resuming backend
- 2024-04-13 08:43:29,299 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
- 2024-04-13 08:43:29,299 INFO MainThread:34 [wandb_init.py:_pause_backend():438] pausing backend
- 2024-04-13 08:43:29,305 INFO MainThread:34 [wandb_init.py:_resume_backend():443] resuming backend
 
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.16.5
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_init.py:_log_setup():527] Logging user logs to /kaggle/working/wandb/run-20240413_094054-v5dwwo6y/logs/debug.log
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_init.py:_log_setup():528] Logging internal logs to /kaggle/working/wandb/run-20240413_094054-v5dwwo6y/logs/debug-internal.log
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_init.py:_jupyter_setup():473] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x7fd7c7c3b580>
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():567] calling init triggers
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():574] wandb.init called with sweep_config: {}
  config: {}
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():617] starting backend
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():621] setting up manager
+ 2024-04-13 09:40:54,414 INFO MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2024-04-13 09:40:54,416 INFO MainThread:34 [wandb_init.py:init():629] backend started and connected
+ 2024-04-13 09:40:54,433 INFO MainThread:34 [wandb_run.py:_label_probe_notebook():1299] probe notebook
+ 2024-04-13 09:40:54,846 INFO MainThread:34 [wandb_init.py:init():721] updated telemetry
+ 2024-04-13 09:40:54,849 INFO MainThread:34 [wandb_init.py:init():754] communicating run to backend with 90.0 second timeout
+ 2024-04-13 09:40:55,012 INFO MainThread:34 [wandb_run.py:_on_init():2344] communicating current version
+ 2024-04-13 09:40:55,096 INFO MainThread:34 [wandb_run.py:_on_init():2353] got version response upgrade_message: "wandb version 0.16.6 is available! To upgrade, please run:\n $ pip install wandb --upgrade"
 
+ 2024-04-13 09:40:55,096 INFO MainThread:34 [wandb_init.py:init():805] starting run threads in backend
+ 2024-04-13 09:41:11,226 INFO MainThread:34 [wandb_run.py:_console_start():2323] atexit reg
+ 2024-04-13 09:41:11,227 INFO MainThread:34 [wandb_run.py:_redirect():2178] redirect: wrap_raw
+ 2024-04-13 09:41:11,227 INFO MainThread:34 [wandb_run.py:_redirect():2243] Wrapping output streams.
+ 2024-04-13 09:41:11,228 INFO MainThread:34 [wandb_run.py:_redirect():2268] Redirects installed.
+ 2024-04-13 09:41:11,229 INFO MainThread:34 [wandb_init.py:init():848] run started, returning control to user process
+ 2024-04-13 09:41:11,235 INFO MainThread:34 [wandb_run.py:_config_callback():1347] config_cb None None {'vocab_size': 32000, 'max_position_embeddings': 32768, 'hidden_size': 4096, 'intermediate_size': 14336, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'sliding_window': None, 'num_key_value_heads': 8, 'hidden_act': 'silu', 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 1000000.0, 'attention_dropout': 0.0, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['MistralForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 0, 'eos_token_id': 2, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', 'transformers_version': '4.39.3', 'model_type': 'mistral', 'pretraining_tp': 1, 'quantization_config': 
{'quant_method': 'QuantizationMethod.GPTQ', 'bits': 4, 'tokenizer': None, 'dataset': None, 'group_size': 128, 'damp_percent': 0.1, 'desc_act': True, 'sym': True, 'true_sequential': True, 'use_cuda_fp16': False, 'model_seqlen': None, 'block_name_to_quantize': None, 'module_name_preceding_first_block': None, 'batch_size': 1, 'pad_token_id': None, 'use_exllama': True, 'max_input_length': None, 'exllama_config': {'version': 'ExllamaVersion.ONE'}, 'cache_block_outputs': True, 'modules_in_block_to_quantize': None}, 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'evaluation_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 50, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Apr13_09-40-43_05fce47c7cf1', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 
'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'fp16_backend': 'auto', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None}
+ 2024-04-13 10:52:03,516 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-13 10:52:03,517 INFO MainThread:34 [wandb_init.py:_pause_backend():438] pausing backend
+ 2024-04-13 10:52:03,523 INFO MainThread:34 [wandb_init.py:_resume_backend():443] resuming backend
+ 2024-04-13 10:52:03,524 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-13 10:52:03,524 INFO MainThread:34 [wandb_init.py:_pause_backend():438] pausing backend
+ 2024-04-13 10:52:03,530 INFO MainThread:34 [wandb_init.py:_resume_backend():443] resuming backend
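The training arguments captured in the log above pin down the run's effective batch size, and combined with the loss table they give a rough dataset size. A quick sketch of that arithmetic (assuming a single GPU, as the log's `local_rank: 0` run suggests):

```python
# Values taken from the training_args logged above.
per_device_train_batch_size = 6
gradient_accumulation_steps = 4

# Effective batch size per optimizer step on one GPU.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 24

# The loss table records step 25 at epoch 4.0, i.e. 6.25 optimizer
# steps per epoch, implying roughly 150 training samples.
steps_per_epoch = 25 / 4.0
approx_samples = steps_per_epoch * effective_batch_size
print(approx_samples)  # 150.0
```

A ~150-sample dataset run for 50 epochs is consistent with the validation loss bottoming out around epoch 7 and climbing afterwards.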
wandb/run-20240413_094054-v5dwwo6y/files/conda-environment.yaml ADDED
File without changes
wandb/run-20240413_094054-v5dwwo6y/files/config.yaml ADDED
@@ -0,0 +1,707 @@
+ wandb_version: 1
+ 
+ _wandb:
+   desc: null
+   value:
+     python_version: 3.10.13
+     cli_version: 0.16.5
+     framework: huggingface
+     huggingface_version: 4.39.3
+     is_jupyter_run: true
+     is_kaggle_kernel: true
+     start_time: 1713001254.0
+     t:
+       1:
+       - 1
+       - 2
+       - 3
+       - 5
+       - 11
+       - 12
+       - 49
+       - 51
+       - 53
+       - 55
+       - 71
+       - 98
+       - 99
+       - 105
+       2:
+       - 1
+       - 2
+       - 3
+       - 5
+       - 11
+       - 12
+       - 49
+       - 51
+       - 53
+       - 55
+       - 71
+       - 98
+       - 99
+       - 105
+       3:
+       - 7
+       - 23
+       4: 3.10.13
+       5: 0.16.5
+       6: 4.39.3
+       8:
+       - 1
+       - 2
+       - 5
+       9:
+         1: transformers_trainer
+       13: linux-x86_64
+       m:
+       - 1: train/global_step
+         6:
+         - 3
+       - 1: train/loss
+         5: 1
+         6:
+         - 1
+       - 1: train/grad_norm
+         5: 1
+         6:
+         - 1
+       - 1: train/learning_rate
+         5: 1
+         6:
+         - 1
+       - 1: train/epoch
+         5: 1
+         6:
+         - 1
+       - 1: eval/loss
+         5: 1
+         6:
+         - 1
+       - 1: eval/runtime
+         5: 1
+         6:
+         - 1
+       - 1: eval/samples_per_second
+         5: 1
+         6:
+         - 1
+       - 1: eval/steps_per_second
+         5: 1
+         6:
+         - 1
+ vocab_size:
+   desc: null
+   value: 32000
+ max_position_embeddings:
+   desc: null
+   value: 32768
+ hidden_size:
+   desc: null
+   value: 4096
+ intermediate_size:
+   desc: null
+   value: 14336
+ num_hidden_layers:
+   desc: null
+   value: 32
+ num_attention_heads:
+   desc: null
+   value: 32
+ sliding_window:
+   desc: null
+   value: null
+ num_key_value_heads:
+   desc: null
+   value: 8
+ hidden_act:
+   desc: null
+   value: silu
+ initializer_range:
+   desc: null
+   value: 0.02
+ rms_norm_eps:
+   desc: null
+   value: 1.0e-05
+ use_cache:
+   desc: null
+   value: false
+ rope_theta:
+   desc: null
+   value: 1000000.0
+ attention_dropout:
+   desc: null
+   value: 0.0
+ return_dict:
+   desc: null
+   value: true
+ output_hidden_states:
+   desc: null
+   value: false
+ output_attentions:
+   desc: null
+   value: false
+ torchscript:
+   desc: null
+   value: false
+ torch_dtype:
+   desc: null
+   value: bfloat16
+ use_bfloat16:
+   desc: null
+   value: false
+ tf_legacy_loss:
+   desc: null
+   value: false
+ pruned_heads:
+   desc: null
+   value: {}
+ tie_word_embeddings:
+   desc: null
+   value: false
+ chunk_size_feed_forward:
+   desc: null
+   value: 0
+ is_encoder_decoder:
+   desc: null
+   value: false
+ is_decoder:
+   desc: null
+   value: false
+ cross_attention_hidden_size:
+   desc: null
+   value: null
+ add_cross_attention:
+   desc: null
+   value: false
+ tie_encoder_decoder:
+   desc: null
+   value: false
+ max_length:
+   desc: null
+   value: 20
+ min_length:
+   desc: null
+   value: 0
+ do_sample:
+   desc: null
+   value: false
+ early_stopping:
+   desc: null
+   value: false
+ num_beams:
+   desc: null
+   value: 1
+ num_beam_groups:
+   desc: null
+   value: 1
+ diversity_penalty:
+   desc: null
+   value: 0.0
+ temperature:
+   desc: null
+   value: 1.0
+ top_k:
+   desc: null
+   value: 50
+ top_p:
+   desc: null
+   value: 1.0
+ typical_p:
+   desc: null
+   value: 1.0
+ repetition_penalty:
+   desc: null
+   value: 1.0
+ length_penalty:
+   desc: null
+   value: 1.0
+ no_repeat_ngram_size:
+   desc: null
+   value: 0
+ encoder_no_repeat_ngram_size:
+   desc: null
+   value: 0
+ bad_words_ids:
+   desc: null
+   value: null
+ num_return_sequences:
+   desc: null
+   value: 1
+ output_scores:
+   desc: null
+   value: false
+ return_dict_in_generate:
+   desc: null
+   value: false
+ forced_bos_token_id:
+   desc: null
+   value: null
+ forced_eos_token_id:
+   desc: null
+   value: null
+ remove_invalid_values:
+   desc: null
+   value: false
+ exponential_decay_length_penalty:
+   desc: null
+   value: null
+ suppress_tokens:
+   desc: null
+   value: null
+ begin_suppress_tokens:
+   desc: null
+   value: null
+ architectures:
+   desc: null
+   value:
+   - MistralForCausalLM
+ finetuning_task:
+   desc: null
+   value: null
+ id2label:
+   desc: null
+   value:
+     '0': LABEL_0
+     '1': LABEL_1
+ label2id:
+   desc: null
+   value:
+     LABEL_0: 0
+     LABEL_1: 1
+ tokenizer_class:
+   desc: null
+   value: null
+ prefix:
+   desc: null
+   value: null
+ bos_token_id:
+   desc: null
+   value: 1
+ pad_token_id:
+   desc: null
+   value: 0
+ eos_token_id:
+   desc: null
+   value: 2
+ sep_token_id:
+   desc: null
+   value: null
+ decoder_start_token_id:
+   desc: null
+   value: null
+ task_specific_params:
+   desc: null
+   value: null
+ problem_type:
+   desc: null
+   value: null
+ _name_or_path:
+   desc: null
+   value: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
+ transformers_version:
+   desc: null
+   value: 4.39.3
+ model_type:
+   desc: null
+   value: mistral
+ pretraining_tp:
+   desc: null
+   value: 1
+ quantization_config:
+   desc: null
+   value:
+     quant_method: QuantizationMethod.GPTQ
+     bits: 4
+     tokenizer: null
+     dataset: null
+     group_size: 128
+     damp_percent: 0.1
+     desc_act: true
+     sym: true
+     true_sequential: true
+     use_cuda_fp16: false
+     model_seqlen: null
+     block_name_to_quantize: null
+     module_name_preceding_first_block: null
+     batch_size: 1
+     pad_token_id: null
+     use_exllama: true
+     max_input_length: null
+     exllama_config:
+       version: ExllamaVersion.ONE
+     cache_block_outputs: true
+     modules_in_block_to_quantize: null
+ output_dir:
+   desc: null
+   value: /kaggle/working/
+ overwrite_output_dir:
+   desc: null
+   value: false
+ do_train:
+ desc: null
343
+ value: false
344
+ do_eval:
345
+ desc: null
346
+ value: true
347
+ do_predict:
348
+ desc: null
349
+ value: false
350
+ evaluation_strategy:
351
+ desc: null
352
+ value: epoch
353
+ prediction_loss_only:
354
+ desc: null
355
+ value: false
356
+ per_device_train_batch_size:
357
+ desc: null
358
+ value: 6
359
+ per_device_eval_batch_size:
360
+ desc: null
361
+ value: 6
362
+ per_gpu_train_batch_size:
363
+ desc: null
364
+ value: null
365
+ per_gpu_eval_batch_size:
366
+ desc: null
367
+ value: null
368
+ gradient_accumulation_steps:
369
+ desc: null
370
+ value: 4
371
+ eval_accumulation_steps:
372
+ desc: null
373
+ value: null
374
+ eval_delay:
375
+ desc: null
376
+ value: 0
377
+ learning_rate:
378
+ desc: null
379
+ value: 0.0002
380
+ weight_decay:
381
+ desc: null
382
+ value: 0.01
383
+ adam_beta1:
384
+ desc: null
385
+ value: 0.9
386
+ adam_beta2:
387
+ desc: null
388
+ value: 0.999
389
+ adam_epsilon:
390
+ desc: null
391
+ value: 1.0e-08
392
+ max_grad_norm:
393
+ desc: null
394
+ value: 1.0
395
+ num_train_epochs:
396
+ desc: null
397
+ value: 50
398
+ max_steps:
399
+ desc: null
400
+ value: -1
401
+ lr_scheduler_type:
402
+ desc: null
403
+ value: linear
404
+ lr_scheduler_kwargs:
405
+ desc: null
406
+ value: {}
407
+ warmup_ratio:
408
+ desc: null
409
+ value: 0.0
410
+ warmup_steps:
411
+ desc: null
412
+ value: 2
413
+ log_level:
414
+ desc: null
415
+ value: passive
416
+ log_level_replica:
417
+ desc: null
418
+ value: warning
419
+ log_on_each_node:
420
+ desc: null
421
+ value: true
422
+ logging_dir:
423
+ desc: null
424
+ value: /kaggle/working/runs/Apr13_09-40-43_05fce47c7cf1
425
+ logging_strategy:
426
+ desc: null
427
+ value: epoch
428
+ logging_first_step:
429
+ desc: null
430
+ value: false
431
+ logging_steps:
432
+ desc: null
433
+ value: 500
434
+ logging_nan_inf_filter:
435
+ desc: null
436
+ value: true
437
+ save_strategy:
438
+ desc: null
439
+ value: epoch
440
+ save_steps:
441
+ desc: null
442
+ value: 500
443
+ save_total_limit:
444
+ desc: null
445
+ value: null
446
+ save_safetensors:
447
+ desc: null
448
+ value: true
449
+ save_on_each_node:
450
+ desc: null
451
+ value: false
452
+ save_only_model:
453
+ desc: null
454
+ value: false
455
+ no_cuda:
456
+ desc: null
457
+ value: false
458
+ use_cpu:
459
+ desc: null
460
+ value: false
461
+ use_mps_device:
462
+ desc: null
463
+ value: false
464
+ seed:
465
+ desc: null
466
+ value: 42
467
+ data_seed:
468
+ desc: null
469
+ value: null
470
+ jit_mode_eval:
471
+ desc: null
472
+ value: false
473
+ use_ipex:
474
+ desc: null
475
+ value: false
476
+ bf16:
477
+ desc: null
478
+ value: false
479
+ fp16:
480
+ desc: null
481
+ value: true
482
+ fp16_opt_level:
483
+ desc: null
484
+ value: O1
485
+ half_precision_backend:
486
+ desc: null
487
+ value: auto
488
+ bf16_full_eval:
489
+ desc: null
490
+ value: false
491
+ fp16_full_eval:
492
+ desc: null
493
+ value: false
494
+ tf32:
495
+ desc: null
496
+ value: null
497
+ local_rank:
498
+ desc: null
499
+ value: 0
500
+ ddp_backend:
501
+ desc: null
502
+ value: null
503
+ tpu_num_cores:
504
+ desc: null
505
+ value: null
506
+ tpu_metrics_debug:
507
+ desc: null
508
+ value: false
509
+ debug:
510
+ desc: null
511
+ value: []
512
+ dataloader_drop_last:
513
+ desc: null
514
+ value: false
515
+ eval_steps:
516
+ desc: null
517
+ value: null
518
+ dataloader_num_workers:
519
+ desc: null
520
+ value: 0
521
+ dataloader_prefetch_factor:
522
+ desc: null
523
+ value: null
524
+ past_index:
525
+ desc: null
526
+ value: -1
527
+ run_name:
528
+ desc: null
529
+ value: /kaggle/working/
530
+ disable_tqdm:
531
+ desc: null
532
+ value: false
533
+ remove_unused_columns:
534
+ desc: null
535
+ value: true
536
+ label_names:
537
+ desc: null
538
+ value: null
539
+ load_best_model_at_end:
540
+ desc: null
541
+ value: true
542
+ metric_for_best_model:
543
+ desc: null
544
+ value: loss
545
+ greater_is_better:
546
+ desc: null
547
+ value: false
548
+ ignore_data_skip:
549
+ desc: null
550
+ value: false
551
+ fsdp:
552
+ desc: null
553
+ value: []
554
+ fsdp_min_num_params:
555
+ desc: null
556
+ value: 0
557
+ fsdp_config:
558
+ desc: null
559
+ value:
560
+ min_num_params: 0
561
+ xla: false
562
+ xla_fsdp_v2: false
563
+ xla_fsdp_grad_ckpt: false
564
+ fsdp_transformer_layer_cls_to_wrap:
565
+ desc: null
566
+ value: null
567
+ accelerator_config:
568
+ desc: null
569
+ value:
570
+ split_batches: false
571
+ dispatch_batches: null
572
+ even_batches: true
573
+ use_seedable_sampler: true
574
+ deepspeed:
575
+ desc: null
576
+ value: null
577
+ label_smoothing_factor:
578
+ desc: null
579
+ value: 0.0
580
+ optim:
581
+ desc: null
582
+ value: paged_adamw_8bit
583
+ optim_args:
584
+ desc: null
585
+ value: null
586
+ adafactor:
587
+ desc: null
588
+ value: false
589
+ group_by_length:
590
+ desc: null
591
+ value: false
592
+ length_column_name:
593
+ desc: null
594
+ value: length
595
+ report_to:
596
+ desc: null
597
+ value:
598
+ - tensorboard
599
+ - wandb
600
+ ddp_find_unused_parameters:
601
+ desc: null
602
+ value: null
603
+ ddp_bucket_cap_mb:
604
+ desc: null
605
+ value: null
606
+ ddp_broadcast_buffers:
607
+ desc: null
608
+ value: null
609
+ dataloader_pin_memory:
610
+ desc: null
611
+ value: true
612
+ dataloader_persistent_workers:
613
+ desc: null
614
+ value: false
615
+ skip_memory_metrics:
616
+ desc: null
617
+ value: true
618
+ use_legacy_prediction_loop:
619
+ desc: null
620
+ value: false
621
+ push_to_hub:
622
+ desc: null
623
+ value: false
624
+ resume_from_checkpoint:
625
+ desc: null
626
+ value: null
627
+ hub_model_id:
628
+ desc: null
629
+ value: null
630
+ hub_strategy:
631
+ desc: null
632
+ value: every_save
633
+ hub_token:
634
+ desc: null
635
+ value: <HUB_TOKEN>
636
+ hub_private_repo:
637
+ desc: null
638
+ value: false
639
+ hub_always_push:
640
+ desc: null
641
+ value: false
642
+ gradient_checkpointing:
643
+ desc: null
644
+ value: false
645
+ gradient_checkpointing_kwargs:
646
+ desc: null
647
+ value: null
648
+ include_inputs_for_metrics:
649
+ desc: null
650
+ value: false
651
+ fp16_backend:
652
+ desc: null
653
+ value: auto
654
+ push_to_hub_model_id:
655
+ desc: null
656
+ value: null
657
+ push_to_hub_organization:
658
+ desc: null
659
+ value: null
660
+ push_to_hub_token:
661
+ desc: null
662
+ value: <PUSH_TO_HUB_TOKEN>
663
+ mp_parameters:
664
+ desc: null
665
+ value: ''
666
+ auto_find_batch_size:
667
+ desc: null
668
+ value: false
669
+ full_determinism:
670
+ desc: null
671
+ value: false
672
+ torchdynamo:
673
+ desc: null
674
+ value: null
675
+ ray_scope:
676
+ desc: null
677
+ value: last
678
+ ddp_timeout:
679
+ desc: null
680
+ value: 1800
681
+ torch_compile:
682
+ desc: null
683
+ value: false
684
+ torch_compile_backend:
685
+ desc: null
686
+ value: null
687
+ torch_compile_mode:
688
+ desc: null
689
+ value: null
690
+ dispatch_batches:
691
+ desc: null
692
+ value: null
693
+ split_batches:
694
+ desc: null
695
+ value: null
696
+ include_tokens_per_second:
697
+ desc: null
698
+ value: false
699
+ include_num_input_tokens_seen:
700
+ desc: null
701
+ value: false
702
+ neftune_noise_alpha:
703
+ desc: null
704
+ value: null
705
+ optim_target_modules:
706
+ desc: null
707
+ value: null
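
The step counts in the README's training-log table (about 6 optimizer steps per epoch, 25 steps by epoch 4.0) follow from the batch settings in this config. A small sanity check, assuming a training set of roughly 150 examples — this count is inferred from the step table, not stated anywhere in the config:

```python
import math

# Values taken from the wandb config above.
per_device_train_batch_size = 6
gradient_accumulation_steps = 4

# One optimizer step consumes this many examples (single GPU).
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps  # 24

# Assumption: ~150 training examples, inferred from 25 steps at epoch 4.0
# in the README's training-log table (25 steps / 4 epochs * 24 ≈ 150).
num_train_examples = 150

steps_per_epoch = num_train_examples / effective_batch_size  # 6.25
print(effective_batch_size)         # 24
print(math.floor(steps_per_epoch))  # 6 full optimizer steps per epoch
```

The fractional 6.25 steps per epoch also explains why logged epochs land at 0.96, 1.92, 2.88 rather than whole numbers.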
wandb/run-20240413_094054-v5dwwo6y/files/output.log ADDED
@@ -0,0 +1,95 @@
+ /opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
+ warnings.warn(
wandb/run-20240413_094054-v5dwwo6y/files/requirements.txt ADDED
@@ -0,0 +1,867 @@
+ Babel==2.14.0
+ Boruta==0.3
+ Brotli==1.0.9
+ CVXcanon==0.1.2
+ Cartopy==0.22.0
+ Cython==3.0.8
+ Deprecated==1.2.14
+ Farama-Notifications==0.0.4
+ Flask==3.0.2
+ Geohash==1.0
+ GitPython==3.1.41
+ ImageHash==4.3.1
+ Janome==0.5.0
+ Jinja2==3.1.2
+ LunarCalendar==0.0.9
+ Mako==1.3.2
+ Markdown==3.5.2
+ MarkupSafe==2.1.3
+ MarkupSafe==2.1.5
+ Pillow==9.5.0
+ PuLP==2.8.0
+ PyArabic==0.6.15
+ PyJWT==2.8.0
+ PyMeeus==0.5.12
+ PySocks==1.7.1
+ PyUpSet==0.1.1.post7
+ PyWavelets==1.5.0
+ PyYAML==6.0.1
+ Pygments==2.17.2
+ Pympler==1.0.1
+ QtPy==2.4.1
+ Rtree==1.2.0
+ SQLAlchemy==2.0.25
+ SecretStorage==3.3.3
+ Send2Trash==1.8.2
+ Shapely==1.8.5.post1
+ Shimmy==1.3.0
+ SimpleITK==2.3.1
+ TPOT==0.12.1
+ Theano-PyMC==1.1.2
+ Theano==1.0.5
+ Wand==0.6.13
+ Werkzeug==3.0.2
+ absl-py==1.4.0
+ accelerate==0.28.0
+ access==1.1.9
+ affine==2.4.0
+ aiobotocore==2.12.2
+ aiofiles==22.1.0
+ aiohttp-cors==0.7.0
+ aiohttp==3.9.1
+ aioitertools==0.11.0
+ aiorwlock==1.3.0
+ aiosignal==1.3.1
+ aiosqlite==0.19.0
+ albumentations==1.4.0
+ alembic==1.13.1
+ altair==5.3.0
+ annotated-types==0.6.0
+ annoy==1.17.3
+ anyio==4.2.0
+ apache-beam==2.46.0
+ aplus==0.11.0
+ appdirs==1.4.4
+ archspec==0.2.3
+ argon2-cffi-bindings==21.2.0
+ argon2-cffi==23.1.0
+ array-record==0.5.0
+ arrow==1.3.0
+ arviz==0.17.1
+ astroid==3.1.0
+ astropy-iers-data==0.2024.4.1.0.33.14
+ astropy==6.0.1
+ asttokens==2.4.1
+ astunparse==1.6.3
+ async-lru==2.0.4
+ async-timeout==4.0.3
+ attrs==23.2.0
+ audioread==3.0.1
+ auto_gptq==0.7.1
+ autopep8==2.0.4
+ backoff==2.2.1
+ bayesian-optimization==1.4.3
+ beatrix_jupyterlab==2023.128.151533
+ beautifulsoup4==4.12.2
+ bitsandbytes==0.43.1
+ blake3==0.2.1
+ bleach==6.1.0
+ blessed==1.20.0
+ blinker==1.7.0
+ blis==0.7.10
+ blosc2==2.6.0
+ bokeh==3.3.4
+ boltons==23.1.1
+ boto3==1.26.100
+ botocore==1.34.51
+ bq_helper==0.4.1
+ bqplot==0.12.43
+ branca==0.7.1
+ brewer2mpl==1.4.1
+ brotlipy==0.7.0
+ cached-property==1.5.2
+ cachetools==4.2.4
+ cachetools==5.3.2
+ catalogue==2.0.10
+ catalyst==22.4
+ catboost==1.2.3
+ category-encoders==2.6.3
+ certifi==2024.2.2
+ cesium==0.12.1
+ cffi==1.16.0
+ charset-normalizer==3.3.2
+ chex==0.1.86
+ cleverhans==4.0.0
+ click-plugins==1.1.1
+ click==8.1.7
+ cligj==0.7.2
+ cloud-tpu-client==0.10
+ cloud-tpu-profiler==2.4.0
+ cloudpathlib==0.16.0
+ cloudpickle==2.2.1
+ cloudpickle==3.0.0
+ cmdstanpy==1.2.2
+ colorama==0.4.6
+ colorcet==3.1.0
+ coloredlogs==15.0.1
+ colorful==0.5.6
+ colorlog==6.8.2
+ colorlover==0.3.0
+ comm==0.2.1
+ conda-libmamba-solver==23.7.0
+ conda-package-handling==2.2.0
+ conda==23.7.4
+ conda_package_streaming==0.9.0
+ confection==0.1.4
+ contextily==1.6.0
+ contourpy==1.2.0
+ convertdate==2.4.0
+ crcmod==1.7
+ cryptography==41.0.7
+ cuda-python==12.4.0
+ cudf==23.8.0
+ cufflinks==0.17.3
+ cuml==23.8.0
+ cupy==13.0.0
+ cycler==0.12.1
+ cymem==2.0.8
+ cytoolz==0.12.3
+ daal4py==2024.2.0
+ daal==2024.2.0
+ dacite==1.8.1
+ dask-cuda==23.8.0
+ dask-cudf==23.8.0
+ dask-expr==1.0.9
+ dask==2024.4.0
+ dataclasses-json==0.6.4
+ dataproc_jupyter_plugin==0.1.66
+ datasets==2.18.0
+ datashader==0.16.0
+ datatile==1.0.3
+ db-dtypes==1.2.0
+ deap==1.4.1
+ debugpy==1.8.0
+ decorator==5.1.1
+ deepdiff==6.7.1
+ defusedxml==0.7.1
+ deprecation==2.1.0
+ descartes==1.1.0
+ dill==0.3.8
+ dipy==1.9.0
+ distlib==0.3.8
+ distributed==2023.7.1
+ distro==1.9.0
+ dm-tree==0.1.8
+ docker-pycreds==0.4.0
+ docker==7.0.0
+ docopt==0.6.2
+ docstring-parser==0.15
+ docstring-to-markdown==0.15
+ docutils==0.20.1
+ earthengine-api==0.1.395
+ easydict==1.13
+ easyocr==1.7.1
+ ecos==2.0.13
+ eli5==0.13.0
+ emoji==2.11.0
+ en-core-web-lg==3.7.1
+ en-core-web-sm==3.7.1
+ entrypoints==0.4
+ ephem==4.1.5
+ esda==2.5.1
+ essentia==2.1b6.dev1110
+ et-xmlfile==1.1.0
+ etils==1.6.0
+ exceptiongroup==1.2.0
+ executing==2.0.1
+ explainable-ai-sdk==1.3.3
+ fastai==2.7.14
+ fastapi==0.108.0
+ fastavro==1.9.3
+ fastcore==1.5.29
+ fastdownload==0.0.7
+ fasteners==0.19
+ fastjsonschema==2.19.1
+ fastprogress==1.0.3
+ fastrlock==0.8.2
+ fasttext==0.9.2
+ feather-format==0.4.1
+ featuretools==1.30.0
+ filelock==3.13.1
+ fiona==1.9.6
+ fitter==1.7.0
+ flake8==7.0.0
+ flashtext==2.7
+ flatbuffers==23.5.26
+ flax==0.8.2
+ folium==0.16.0
+ fonttools==4.47.0
+ fonttools==4.50.0
+ fqdn==1.5.1
+ frozendict==2.4.1
+ frozenlist==1.4.1
+ fsspec==2024.2.0
+ fsspec==2024.3.1
+ funcy==2.0
+ fury==0.10.0
+ future==1.0.0
+ fuzzywuzzy==0.18.0
+ gast==0.5.4
+ gatspy==0.3
+ gcsfs==2024.2.0
+ gekko==1.1.1
+ gensim==4.3.2
+ geographiclib==2.0
+ geojson==3.1.0
+ geopandas==0.14.3
+ geoplot==0.5.1
+ geopy==2.4.1
+ geoviews==1.11.1
+ ggplot==0.11.5
+ giddy==2.3.5
+ gitdb==4.0.11
+ google-ai-generativelanguage==0.4.0
+ google-api-core==2.11.1
+ google-api-core==2.18.0
+ google-api-python-client==2.125.0
+ google-apitools==0.5.31
+ google-auth-httplib2==0.2.0
+ google-auth-oauthlib==1.2.0
+ google-auth==2.26.1
+ google-cloud-aiplatform==0.6.0a1
+ google-cloud-artifact-registry==1.10.0
+ google-cloud-automl==1.0.1
+ google-cloud-bigquery==2.34.4
+ google-cloud-bigtable==1.7.3
+ google-cloud-core==2.4.1
+ google-cloud-datastore==2.19.0
+ google-cloud-dlp==3.14.0
+ google-cloud-jupyter-config==0.0.5
+ google-cloud-language==2.13.3
+ google-cloud-monitoring==2.18.0
+ google-cloud-pubsub==2.19.0
+ google-cloud-pubsublite==1.9.0
+ google-cloud-recommendations-ai==0.7.1
+ google-cloud-resource-manager==1.11.0
+ google-cloud-spanner==3.40.1
+ google-cloud-storage==1.44.0
+ google-cloud-translate==3.12.1
+ google-cloud-videointelligence==2.13.3
+ google-cloud-vision==2.8.0
+ google-crc32c==1.5.0
+ google-generativeai==0.4.1
+ google-pasta==0.2.0
+ google-resumable-media==2.7.0
+ googleapis-common-protos==1.62.0
+ gplearn==0.4.2
+ gpustat==1.0.0
+ gpxpy==1.6.2
+ graphviz==0.20.3
+ greenlet==3.0.3
+ grpc-google-iam-v1==0.12.7
+ grpcio-status==1.48.1
+ grpcio-status==1.48.2
+ grpcio==1.51.1
+ grpcio==1.60.0
+ gviz-api==1.10.0
+ gym-notices==0.0.8
+ gym==0.26.2
+ gymnasium==0.29.0
+ h11==0.14.0
+ h2o==3.46.0.1
+ h5netcdf==1.3.0
+ h5py==3.10.0
+ haversine==2.8.1
+ hdfs==2.7.3
+ hep-ml==0.7.2
+ hijri-converter==2.3.1
+ hmmlearn==0.3.2
+ holidays==0.24
+ holoviews==1.18.3
+ hpsklearn==0.1.0
+ html5lib==1.1
+ htmlmin==0.1.12
+ httpcore==1.0.5
+ httplib2==0.21.0
+ httptools==0.6.1
+ httpx==0.27.0
+ huggingface-hub==0.22.2
+ humanfriendly==10.0
+ hunspell==0.5.5
+ hydra-slayer==0.5.0
+ hyperopt==0.2.7
+ hypertools==0.8.0
+ idna==3.6
+ igraph==0.11.4
+ imagecodecs==2024.1.1
+ imageio==2.33.1
+ imbalanced-learn==0.12.2
+ imgaug==0.4.0
+ importlib-metadata==6.11.0
+ importlib-metadata==7.0.1
+ importlib-resources==6.1.1
+ inequality==1.0.1
+ iniconfig==2.0.0
+ ipydatawidgets==4.3.5
+ ipykernel==6.28.0
+ ipyleaflet==0.18.2
+ ipympl==0.7.0
+ ipython-genutils==0.2.0
+ ipython-genutils==0.2.0
+ ipython-sql==0.5.0
+ ipython==8.20.0
+ ipyvolume==0.6.3
+ ipyvue==1.10.2
+ ipyvuetify==1.9.3
+ ipywebrtc==0.6.0
+ ipywidgets==7.7.1
+ isoduration==20.11.0
+ isort==5.13.2
+ isoweek==1.3.3
+ itsdangerous==2.1.2
+ jaraco.classes==3.3.0
+ jax-jumpy==1.0.0
+ jax==0.4.23
+ jaxlib==0.4.23.dev20240116
+ jedi==0.19.1
+ jeepney==0.8.0
+ jieba==0.42.1
+ jmespath==1.0.1
+ joblib==1.3.2
+ json5==0.9.14
+ jsonpatch==1.33
+ jsonpointer==2.4
+ jsonschema-specifications==2023.12.1
+ jsonschema==4.20.0
+ jupyter-console==6.6.3
+ jupyter-events==0.9.0
+ jupyter-http-over-ws==0.0.8
+ jupyter-lsp==1.5.1
+ jupyter-server-mathjax==0.2.6
+ jupyter-ydoc==0.2.5
+ jupyter_client==7.4.9
+ jupyter_client==8.6.0
+ jupyter_core==5.7.1
+ jupyter_server==2.13.0
+ jupyter_server_fileid==0.9.1
+ jupyter_server_proxy==4.1.0
+ jupyter_server_terminals==0.5.1
+ jupyter_server_ydoc==0.8.0
+ jupyterlab-lsp==5.1.0
+ jupyterlab-widgets==3.0.9
+ jupyterlab==4.1.5
+ jupyterlab_git==0.44.0
+ jupyterlab_pygments==0.3.0
+ jupyterlab_server==2.25.2
+ jupytext==1.16.0
+ kaggle-environments==1.14.3
+ kaggle==1.6.8
+ kagglehub==0.2.2
+ keras-cv==0.8.2
+ keras-nlp==0.8.2
+ keras-tuner==1.4.6
+ keras==3.1.1
+ kernels-mixer==0.0.7
+ keyring==24.3.0
+ keyrings.google-artifactregistry-auth==1.1.2
+ kfp-pipeline-spec==0.2.2
+ kfp-server-api==2.0.5
+ kfp==2.5.0
+ kiwisolver==1.4.5
+ kmapper==2.0.1
+ kmodes==0.12.2
+ korean-lunar-calendar==0.3.1
+ kornia==0.7.2
+ kornia_rs==0.1.3
+ kt-legacy==1.0.5
+ kubernetes==26.1.0
+ langcodes==3.3.0
+ langid==1.1.6
+ lazy_loader==0.3
+ learntools==0.3.4
+ leven==1.0.4
+ libclang==16.0.6
+ libmambapy==1.5.0
+ libpysal==4.9.2
+ librosa==0.10.1
+ lightgbm==4.2.0
+ lightning-utilities==0.11.2
+ lime==0.2.0.1
+ line-profiler==4.1.2
+ linkify-it-py==2.0.3
+ llvmlite==0.41.1
+ llvmlite==0.42.0
+ lml==0.1.0
+ locket==1.0.0
+ loguru==0.7.2
+ lxml==5.2.1
+ lz4==4.3.3
+ mamba==1.5.0
+ mapclassify==2.6.1
+ markdown-it-py==3.0.0
+ marshmallow==3.21.1
+ matplotlib-inline==0.1.6
+ matplotlib-venn==0.11.10
+ matplotlib==3.7.5
+ matplotlib==3.8.3
+ mccabe==0.7.0
+ mdit-py-plugins==0.4.0
+ mdurl==0.1.2
+ memory-profiler==0.61.0
+ menuinst==2.0.1
+ mercantile==1.2.1
+ mgwr==2.2.1
+ missingno==0.5.2
+ mistune==0.8.4
+ mizani==0.11.1
+ ml-dtypes==0.2.0
+ mlcrate==0.2.0
+ mlens==0.2.3
+ mlxtend==0.23.1
+ mne==1.6.1
+ mnist==0.2.2
+ momepy==0.7.0
+ more-itertools==10.2.0
+ mpld3==0.5.10
+ mpmath==1.3.0
+ msgpack==1.0.7
+ multidict==6.0.4
+ multimethod==1.10
+ multipledispatch==1.0.0
+ multiprocess==0.70.16
+ munkres==1.1.4
+ murmurhash==1.0.10
+ mypy-extensions==1.0.0
+ namex==0.0.7
+ nb-conda-kernels==2.3.1
+ nb_conda==2.2.1
+ nbclassic==1.0.0
+ nbclient==0.5.13
+ nbconvert==6.4.5
+ nbdime==3.2.0
+ nbformat==5.9.2
+ ndindex==1.8
+ nest-asyncio==1.5.8
+ networkx==3.2.1
+ nibabel==5.2.1
+ nilearn==0.10.3
+ ninja==1.11.1.1
+ nltk==3.2.4
+ nose==1.3.7
+ notebook==6.5.4
+ notebook==6.5.6
+ notebook_executor==0.2
+ notebook_shim==0.2.3
+ numba==0.58.1
+ numba==0.59.1
+ numexpr==2.10.0
+ numpy==1.26.4
+ nvidia-ml-py==11.495.46
+ nvtx==0.2.10
+ oauth2client==4.1.3
+ oauthlib==3.2.2
+ objsize==0.6.1
+ odfpy==1.4.1
+ olefile==0.47
+ onnx==1.16.0
+ opencensus-context==0.1.3
+ opencensus==0.11.4
+ opencv-contrib-python==4.9.0.80
+ opencv-python-headless==4.9.0.80
+ opencv-python==4.9.0.80
+ openpyxl==3.1.2
+ openslide-python==1.3.1
+ opentelemetry-api==1.22.0
+ opentelemetry-exporter-otlp-proto-common==1.22.0
+ opentelemetry-exporter-otlp-proto-grpc==1.22.0
+ opentelemetry-exporter-otlp-proto-http==1.22.0
+ opentelemetry-exporter-otlp==1.22.0
+ opentelemetry-proto==1.22.0
+ opentelemetry-sdk==1.22.0
+ opentelemetry-semantic-conventions==0.43b0
+ opt-einsum==3.3.0
+ optax==0.2.2
+ optimum==1.18.1
+ optree==0.11.0
+ optuna==3.6.1
+ orbax-checkpoint==0.5.7
+ ordered-set==4.1.0
+ orjson==3.9.10
+ ortools==9.4.1874
+ osmnx==1.9.2
+ overrides==7.4.0
+ packaging==21.3
+ pandas-datareader==0.10.0
+ pandas-profiling==3.6.6
+ pandas-summary==0.2.0
+ pandas==2.1.4
+ pandas==2.2.1
+ pandasql==0.7.3
+ pandocfilters==1.5.0
+ panel==1.3.8
+ papermill==2.5.0
+ param==2.1.0
+ parso==0.8.3
+ partd==1.4.1
+ path.py==12.5.0
+ path==16.10.0
+ pathos==0.3.2
+ pathy==0.10.3
+ patsy==0.5.6
+ pdf2image==1.17.0
+ peft==0.10.0
+ pettingzoo==1.24.0
+ pexpect==4.8.0
+ pexpect==4.9.0
+ phik==0.12.4
+ pickleshare==0.7.5
+ pillow==10.3.0
+ pip==23.3.2
+ pkgutil_resolve_name==1.3.10
+ platformdirs==4.2.0
+ plotly-express==0.4.1
+ plotly==5.18.0
+ plotnine==0.13.4
+ pluggy==1.4.0
+ pointpats==2.4.0
+ polars==0.20.18
+ polyglot==16.7.4
+ pooch==1.8.1
+ pox==0.3.4
+ ppca==0.0.4
+ ppft==1.7.6.8
+ preprocessing==0.1.13
+ preshed==3.0.9
+ prettytable==3.9.0
+ progressbar2==4.4.2
+ prometheus-client==0.19.0
+ promise==2.3
+ prompt-toolkit==3.0.42
+ prompt-toolkit==3.0.43
+ prophet==1.1.1
+ proto-plus==1.23.0
+ protobuf==3.20.3
+ protobuf==4.21.12
+ psutil==5.9.3
+ psutil==5.9.7
+ ptyprocess==0.7.0
+ pudb==2024.1
+ pure-eval==0.2.2
+ py-cpuinfo==9.0.0
+ py-spy==0.3.14
+ py4j==0.10.9.7
+ pyLDAvis==3.4.1
+ pyOpenSSL==23.3.0
+ pyaml==23.12.0
+ pyarrow-hotfix==0.6
+ pyarrow==15.0.2
+ pyasn1-modules==0.3.0
+ pyasn1==0.5.1
+ pybind11==2.12.0
+ pyclipper==1.3.0.post5
+ pycodestyle==2.11.1
+ pycosat==0.6.6
+ pycparser==2.21
+ pycryptodome==3.20.0
+ pyct==0.5.0
+ pycuda==2024.1
+ pydantic==2.5.3
+ pydantic==2.6.4
+ pydantic_core==2.14.6
+ pydantic_core==2.16.3
+ pydegensac==0.1.2
+ pydicom==2.4.4
+ pydocstyle==6.3.0
+ pydot==1.4.2
+ pydub==0.25.1
+ pyemd==1.0.0
+ pyerfa==2.0.1.1
+ pyexcel-io==0.6.6
+ pyexcel-ods==0.6.0
+ pyflakes==3.2.0
+ pygltflib==1.16.2
+ pykalman==0.9.7
+ pylibraft==23.8.0
+ pylint==3.1.0
+ pymc3==3.11.4
+ pymongo==3.13.0
+ pynndescent==0.5.12
+ pynvml==11.4.1
+ pynvrtc==9.2
+ pyparsing==3.1.1
+ pyparsing==3.1.2
+ pypdf==4.1.0
+ pyproj==3.6.1
+ pysal==24.1
+ pyshp==2.3.1
+ pytesseract==0.3.10
+ pytest==8.1.1
+ python-bidi==0.4.2
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.0
+ python-json-logger==2.0.7
+ python-louvain==0.16
+ python-lsp-jsonrpc==1.1.2
+ python-lsp-server==1.11.0
+ python-slugify==8.0.4
+ python-utils==3.8.2
+ pythreejs==2.4.2
+ pytoolconfig==1.3.1
+ pytools==2024.1.1
+ pytorch-ignite==0.5.0.post2
+ pytorch-lightning==2.2.1
+ pytz==2023.3.post1
+ pytz==2024.1
+ pyu2f==0.1.5
+ pyviz_comms==3.0.2
+ pyzmq==24.0.1
+ pyzmq==25.1.2
+ qgrid==1.3.1
+ qtconsole==5.5.1
+ quantecon==0.7.2
+ qudida==0.0.4
+ raft-dask==23.8.0
+ rasterio==1.3.9
+ rasterstats==0.19.0
+ ray-cpp==2.9.0
+ ray==2.9.0
+ referencing==0.32.1
+ regex==2023.12.25
+ requests-oauthlib==1.3.1
+ requests-toolbelt==0.10.1
+ requests==2.31.0
+ retrying==1.3.3
+ retrying==1.3.4
+ rfc3339-validator==0.1.4
+ rfc3986-validator==0.1.1
+ rgf-python==3.12.0
+ rich-click==1.7.4
+ rich==13.7.0
+ rich==13.7.1
+ rmm==23.8.0
+ rope==1.13.0
+ rouge==1.0.1
+ rpds-py==0.16.2
+ rsa==4.9
+ ruamel-yaml-conda==0.15.100
+ ruamel.yaml.clib==0.2.7
+ ruamel.yaml==0.17.40
+ s2sphere==0.2.5
+ s3fs==2024.2.0
+ s3transfer==0.6.2
+ safetensors==0.4.2
+ scattertext==0.1.19
+ scikit-image==0.22.0
+ scikit-learn-intelex==2024.2.0
+ scikit-learn==1.2.2
+ scikit-multilearn==0.2.0
+ scikit-optimize==0.10.1
+ scikit-plot==0.3.7
+ scikit-surprise==1.1.3
+ scipy==1.11.4
+ scipy==1.12.0
+ seaborn==0.12.2
+ segment_anything==1.0
+ segregation==2.5
+ semver==3.0.2
+ sentencepiece==0.2.0
+ sentry-sdk==1.44.1
+ setproctitle==1.3.3
+ setuptools-git==1.2
+ setuptools-scm==8.0.4
+ setuptools==69.0.3
+ shap==0.44.1
+ shapely==2.0.3
+ shellingham==1.5.4
+ simpervisor==1.0.0
+ simplejson==3.19.2
+ six==1.16.0
+ sklearn-pandas==2.2.0
+ slicer==0.0.7
+ smart-open==6.4.0
+ smmap==5.0.1
+ sniffio==1.3.0
+ snowballstemmer==2.2.0
+ snuggs==1.4.7
+ sortedcontainers==2.4.0
+ soundfile==0.12.1
+ soupsieve==2.5
+ soxr==0.3.7
+ spacy-legacy==3.0.12
+ spacy-loggers==1.0.5
+ spacy==3.7.2
+ spaghetti==1.7.5.post1
+ spectral==0.23.1
+ spglm==1.1.0
+ sphinx-rtd-theme==0.2.4
+ spint==1.0.7
+ splot==1.1.5.post1
+ spopt==0.6.0
+ spreg==1.4.2
+ spvcm==0.3.0
+ sqlparse==0.4.4
+ squarify==0.4.3
+ srsly==2.4.8
+ stable-baselines3==2.1.0
+ stack-data==0.6.2
+ stack-data==0.6.3
+ stanio==0.5.0
+ starlette==0.32.0.post1
+ statsmodels==0.14.1
+ stemming==1.0.1
+ stop-words==2018.7.23
+ stopit==1.1.2
+ stumpy==1.12.0
+ sympy==1.12
+ tables==3.9.2
+ tabulate==0.9.0
+ tangled-up-in-unicode==0.2.0
+ tbb==2021.12.0
+ tblib==3.0.0
+ tenacity==8.2.3
+ tensorboard-data-server==0.7.2
+ tensorboard-plugin-profile==2.15.0
+ tensorboard==2.15.1
+ tensorboardX==2.6.2.2
+ tensorflow-cloud==0.1.16
+ tensorflow-datasets==4.9.4
+ tensorflow-decision-forests==1.8.1
+ tensorflow-estimator==2.15.0
+ tensorflow-hub==0.16.1
+ tensorflow-io-gcs-filesystem==0.35.0
+ tensorflow-io==0.35.0
+ tensorflow-metadata==0.14.0
+ tensorflow-probability==0.23.0
+ tensorflow-serving-api==2.14.1
+ tensorflow-text==2.15.0
+ tensorflow-transform==0.14.0
+ tensorflow==2.15.0
+ tensorstore==0.1.56
+ termcolor==2.4.0
+ terminado==0.18.0
+ testpath==0.6.0
+ text-unidecode==1.3
+ textblob==0.18.0.post0
+ texttable==1.7.0
+ tf_keras==2.15.1
+ tfp-nightly==0.24.0.dev0
+ thinc==8.2.2
+ threadpoolctl==3.2.0
+ tifffile==2023.12.9
+ timm==0.9.16
+ tinycss2==1.2.1
+ tobler==0.11.2
+ tokenizers==0.15.2
+ toml==0.10.2
+ tomli==2.0.1
+ tomlkit==0.12.4
+ toolz==0.12.1
+ torch==2.1.2
+ torchaudio==2.1.2
+ torchdata==0.7.1
+ torchinfo==1.8.0
+ torchmetrics==1.3.2
+ torchtext==0.16.2
+ torchvision==0.16.2
+ tornado==6.3.3
+ tqdm==4.66.1
+ traceml==1.0.8
+ traitlets==5.9.0
+ traittypes==0.2.1
+ transformers==4.39.3
+ treelite-runtime==3.2.0
+ treelite==3.2.0
+ truststore==0.8.0
+ trx-python==0.2.9
+ tsfresh==0.20.2
+ typeguard==4.1.5
+ typer==0.9.0
+ typer==0.9.4
+ types-python-dateutil==2.8.19.20240106
+ typing-inspect==0.9.0
+ typing-utils==0.1.0
+ typing_extensions==4.9.0
+ tzdata==2023.4
+ uc-micro-py==1.0.3
+ ucx-py==0.33.0
+ ujson==5.9.0
+ umap-learn==0.5.5
+ unicodedata2==15.1.0
+ update-checker==0.18.0
+ uri-template==1.3.0
+ uritemplate==3.0.1
+ urllib3==1.26.18
+ urllib3==2.1.0
+ urwid==2.6.10
+ urwid_readline==0.14
+ uvicorn==0.25.0
+ uvloop==0.19.0
+ vaex-astro==0.9.3
+ vaex-core==4.17.1
+ vaex-hdf5==0.14.1
+ vaex-jupyter==0.8.2
+ vaex-ml==0.18.3
+ vaex-server==0.9.0
+ vaex-viz==0.5.4
+ vaex==4.17.0
+ vec_noise==1.1.4
+ vecstack==0.4.0
+ virtualenv==20.21.0
+ visions==0.7.5
+ vowpalwabbit==9.9.0
+ vtk==9.3.0
+ wandb==0.16.5
+ wasabi==1.1.2
+ watchfiles==0.21.0
+ wavio==0.0.8
+ wcwidth==0.2.13
+ weasel==0.3.4
+ webcolors==1.13
+ webencodings==0.5.1
+ websocket-client==1.7.0
+ websockets==12.0
+ wfdb==4.1.2
+ whatthepatch==1.0.5
+ wheel==0.42.0
+ widgetsnbextension==3.6.6
+ witwidget==1.8.1
+ woodwork==0.29.0
+ wordcloud==1.9.3
+ wordsegment==1.3.1
+ wrapt==1.14.1
+ xarray-einstats==0.7.0
+ xarray==2024.3.0
+ xgboost==2.0.3
+ xvfbwrapper==0.2.9
+ xxhash==3.4.1
+ xyzservices==2023.10.1
+ y-py==0.6.2
+ yapf==0.40.2
+ yarl==1.9.3
+ yarl==1.9.4
+ ydata-profiling==4.6.4
+ yellowbrick==1.5
+ ypy-websocket==0.8.4
+ zict==3.0.0
+ zipp==3.17.0
+ zstandard==0.22.0
wandb/run-20240413_094054-v5dwwo6y/files/wandb-metadata.json ADDED
@@ -0,0 +1,66 @@
+ {
+ "os": "Linux-5.15.133+-x86_64-with-glibc2.31",
+ "python": "3.10.13",
+ "heartbeatAt": "2024-04-13T09:40:55.135621",
+ "startedAt": "2024-04-13T09:40:54.408231",
+ "docker": null,
+ "cuda": null,
+ "args": [],
+ "state": "running",
+ "program": "kaggle.ipynb",
+ "codePathLocal": null,
+ "root": "/kaggle/working",
+ "host": "05fce47c7cf1",
+ "username": "root",
+ "executable": "/opt/conda/bin/python3.10",
+ "cpu_count": 2,
+ "cpu_count_logical": 4,
+ "cpu_freq": {
+ "current": 2000.142,
+ "min": 0.0,
+ "max": 0.0
+ },
+ "cpu_freq_per_core": [
+ {
+ "current": 2000.142,
+ "min": 0.0,
+ "max": 0.0
+ },
+ {
+ "current": 2000.142,
+ "min": 0.0,
+ "max": 0.0
+ },
+ {
+ "current": 2000.142,
+ "min": 0.0,
+ "max": 0.0
+ },
+ {
+ "current": 2000.142,
+ "min": 0.0,
+ "max": 0.0
+ }
+ ],
+ "disk": {
+ "/": {
+ "total": 8062.387607574463,
+ "used": 5567.563323974609
+ }
+ },
+ "gpu": "Tesla T4",
+ "gpu_count": 2,
+ "gpu_devices": [
+ {
+ "name": "Tesla T4",
+ "memory_total": 16106127360
+ },
+ {
+ "name": "Tesla T4",
+ "memory_total": 16106127360
+ }
+ ],
+ "memory": {
+ "total": 31.357559204101562
+ }
+ }
wandb/run-20240413_094054-v5dwwo6y/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {"train/loss": 0.4749, "train/grad_norm": 1.0544670820236206, "train/learning_rate": 0.0, "train/epoch": 48.0, "train/global_step": 300, "_timestamp": 1713005523.5123792, "_runtime": 4269.095329284668, "_step": 96, "eval/loss": 1.6078672409057617, "eval/runtime": 9.2111, "eval/samples_per_second": 4.125, "eval/steps_per_second": 0.76, "train_runtime": 4274.3723, "train_samples_per_second": 1.743, "train_steps_per_second": 0.07, "total_flos": 450724738867200.0, "train_loss": 0.79365203221639}
wandb/run-20240413_094054-v5dwwo6y/logs/debug-internal.log ADDED
The diff for this file is too large to render. See raw diff
 
wandb/run-20240413_094054-v5dwwo6y/logs/debug.log ADDED
@@ -0,0 +1,38 @@
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.16.5
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
+ 2024-04-13 09:40:54,410 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_init.py:_log_setup():527] Logging user logs to /kaggle/working/wandb/run-20240413_094054-v5dwwo6y/logs/debug.log
+ 2024-04-13 09:40:54,411 INFO MainThread:34 [wandb_init.py:_log_setup():528] Logging internal logs to /kaggle/working/wandb/run-20240413_094054-v5dwwo6y/logs/debug-internal.log
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:_jupyter_setup():473] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x7fd7c7c3b580>
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():567] calling init triggers
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():574] wandb.init called with sweep_config: {}
+ config: {}
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():617] starting backend
+ 2024-04-13 09:40:54,412 INFO MainThread:34 [wandb_init.py:init():621] setting up manager
+ 2024-04-13 09:40:54,414 INFO MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2024-04-13 09:40:54,416 INFO MainThread:34 [wandb_init.py:init():629] backend started and connected
+ 2024-04-13 09:40:54,433 INFO MainThread:34 [wandb_run.py:_label_probe_notebook():1299] probe notebook
+ 2024-04-13 09:40:54,846 INFO MainThread:34 [wandb_init.py:init():721] updated telemetry
+ 2024-04-13 09:40:54,849 INFO MainThread:34 [wandb_init.py:init():754] communicating run to backend with 90.0 second timeout
+ 2024-04-13 09:40:55,012 INFO MainThread:34 [wandb_run.py:_on_init():2344] communicating current version
+ 2024-04-13 09:40:55,096 INFO MainThread:34 [wandb_run.py:_on_init():2353] got version response upgrade_message: "wandb version 0.16.6 is available! To upgrade, please run:\n $ pip install wandb --upgrade"
+
+ 2024-04-13 09:40:55,096 INFO MainThread:34 [wandb_init.py:init():805] starting run threads in backend
+ 2024-04-13 09:41:11,226 INFO MainThread:34 [wandb_run.py:_console_start():2323] atexit reg
+ 2024-04-13 09:41:11,227 INFO MainThread:34 [wandb_run.py:_redirect():2178] redirect: wrap_raw
+ 2024-04-13 09:41:11,227 INFO MainThread:34 [wandb_run.py:_redirect():2243] Wrapping output streams.
+ 2024-04-13 09:41:11,228 INFO MainThread:34 [wandb_run.py:_redirect():2268] Redirects installed.
+ 2024-04-13 09:41:11,229 INFO MainThread:34 [wandb_init.py:init():848] run started, returning control to user process
+ 2024-04-13 09:41:11,235 INFO MainThread:34 [wandb_run.py:_config_callback():1347] config_cb None None {'vocab_size': 32000, 'max_position_embeddings': 32768, 'hidden_size': 4096, 'intermediate_size': 14336, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'sliding_window': None, 'num_key_value_heads': 8, 'hidden_act': 'silu', 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 1000000.0, 'attention_dropout': 0.0, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['MistralForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 0, 'eos_token_id': 2, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', 'transformers_version': '4.39.3', 'model_type': 'mistral', 'pretraining_tp': 1, 'quantization_config': 
{'quant_method': 'QuantizationMethod.GPTQ', 'bits': 4, 'tokenizer': None, 'dataset': None, 'group_size': 128, 'damp_percent': 0.1, 'desc_act': True, 'sym': True, 'true_sequential': True, 'use_cuda_fp16': False, 'model_seqlen': None, 'block_name_to_quantize': None, 'module_name_preceding_first_block': None, 'batch_size': 1, 'pad_token_id': None, 'use_exllama': True, 'max_input_length': None, 'exllama_config': {'version': 'ExllamaVersion.ONE'}, 'cache_block_outputs': True, 'modules_in_block_to_quantize': None}, 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'evaluation_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 50, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Apr13_09-40-43_05fce47c7cf1', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 
'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'fp16_backend': 'auto', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None}
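The config dump above records the run's key hyperparameters. As a minimal sketch (not code from this repo), the values it reports can be summarized in plain Python; the dict keys below mirror the `transformers.TrainingArguments` field names logged by wandb:

```python
# Key training hyperparameters as recorded in the wandb config dump above.
# This is an illustrative summary, not the actual training script.
training_config = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 6,
    "gradient_accumulation_steps": 4,
    "num_train_epochs": 50,
    "optim": "paged_adamw_8bit",
    "lr_scheduler_type": "linear",
    "warmup_steps": 2,
    "fp16": True,
    "evaluation_strategy": "epoch",
    "save_strategy": "epoch",
    "load_best_model_at_end": True,
    "metric_for_best_model": "loss",
}

# Effective batch size = per-device batch size x gradient accumulation steps
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 24
```

Note that with 6 samples per device and 4 accumulation steps, each optimizer update sees an effective batch of 24 samples, which is why the epoch/step table in the README advances by roughly 6 steps per epoch.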
+ 2024-04-13 10:52:03,516 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-13 10:52:03,517 INFO MainThread:34 [wandb_init.py:_pause_backend():438] pausing backend
+ 2024-04-13 10:52:03,523 INFO MainThread:34 [wandb_init.py:_resume_backend():443] resuming backend
+ 2024-04-13 10:52:03,524 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-13 10:52:03,524 INFO MainThread:34 [wandb_init.py:_pause_backend():438] pausing backend
+ 2024-04-13 10:52:03,530 INFO MainThread:34 [wandb_init.py:_resume_backend():443] resuming backend
wandb/run-20240413_094054-v5dwwo6y/run-v5dwwo6y.wandb ADDED
Binary file (179 kB). View file