2025-02-14 04:26:13,117 - training_args.py:2100 - _setup_devices - INFO - PyTorch: setting up devices 2025-02-14 04:26:13,835 - configuration_utils.py:731 - _get_config_dict - INFO - loading configuration file ./checkpoints/longvu_llama3_2/config.json 2025-02-14 04:26:13,838 - configuration_utils.py:800 - from_dict - INFO - Model config CambrianConfig { "_name_or_path": "/tmp/iopath_cache/manifold_cache/tree/users/shenx/finetune/09281004-cambrian_llama3_2_t576_ov", "architectures": [ "CambrianLlamaForCausalLM" ], "attention_bias": false, "attention_dropout": 0.0, "bos_token_id": 128000, "connect_layer": 2, "connector_depth": 3, "connector_only": true, "dino_threshold": 0.83, "drop_threshold": 0.8, "eos_token_id": [ 128001, 128008, 128009 ], "frame_pos": false, "freeze_mm_mlp_adapter": false, "hidden_act": "silu", "hidden_size": 3072, "highres": true, "highres_connect": false, "image_aspect_ratio": "pad", "image_position": 91, "image_token_len": 144, "initializer_range": 0.02, "intermediate_size": 8192, "is_image_newline": true, "is_st_sampler": false, "lowres_token": 8, "max_position_embeddings": 131072, "mlp_bias": false, "mm_patch_merge_type": "flat", "mm_projector_lr": null, "mm_projector_type": "sva", "mm_use_im_patch_token": false, "mm_use_im_start_end": false, "mm_vision_sampler_lr": null, "mm_vision_select_feature": "patch", "mm_vision_select_layer": -2, "mm_vision_tower_aux_list": [ "siglip/CLIP-ViT-SO400M-14-384", "facebook/dinov2-giant-res378" ], "mm_vision_tower_aux_token_len_list": [ 576, 576 ], "mm_vision_tower_lr": null, "model_type": "cambrian_llama", "num_attention_heads": 24, "num_hidden_layers": 28, "num_key_value_heads": 8, "num_of_vision_sampler_layers": 10, "num_query_group": 1, "pretraining_tp": 1, "query_num_list": [ 144 ], "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 32.0, "high_freq_factor": 4.0, "low_freq_factor": 1.0, "original_max_position_embeddings": 8192, "rope_type": "llama3" }, "rope_theta": 500000.0, "spmd_debug": null, "spmd_fsdp_sharding": null, "spmd_mesh": null, "start_of_vision_sampler_layers": 0, "stride_of_vision_sampler_layers": 3, "tie_word_embeddings": false, "tokenizer_model_max_length": 8192, "tokenizer_padding_side": "right", "torch_dtype": "float32", "transformers_version": "4.43.1", "tune_mm_mlp_adapter": false, "unfreeze_mm_vision_tower": false, "use_cache": false, "use_mm_proj": true, "vision_hidden_size": 1024, "vision_tower_aux_token_len_list": [ 576, 576 ], "vocab_size": 128256 } 2025-02-14 04:26:13,838 - modeling_utils.py:3618 - from_pretrained - INFO - loading weights file ./checkpoints/longvu_llama3_2/pytorch_model.bin 2025-02-14 04:26:13,900 - configuration_utils.py:1038 - from_dict - INFO - Generate config GenerationConfig { "bos_token_id": 128000, "eos_token_id": [ 128001, 128008, 128009 ], "use_cache": false } 2025-02-14 04:26:14,475 - configuration_utils.py:733 - _get_config_dict - INFO - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/config.json 2025-02-14 04:26:14,479 - configuration_utils.py:800 - from_dict - INFO - Model config Dinov2Config { "apply_layernorm": true, "architectures": [ "Dinov2Model" ], "attention_probs_dropout_prob": 0.0, "drop_path_rate": 0.0, "hidden_act": "gelu", "hidden_dropout_prob": 0.0, "hidden_size": 1536, "image_size": 518, "initializer_range": 0.02, "layer_norm_eps": 1e-06, "layerscale_value": 1.0, "mlp_ratio": 4, "model_type": "dinov2", "num_attention_heads": 24, "num_channels": 3, "num_hidden_layers": 40, "out_features": [ "stage40" ], "out_indices": [ 40 ], "patch_size": 14, "qkv_bias": true, "reshape_hidden_states": true, "stage_names": [ "stem", "stage1", "stage2", "stage3", "stage4", "stage5", "stage6", "stage7", "stage8", "stage9", "stage10", "stage11", "stage12", "stage13", "stage14", "stage15", "stage16", "stage17", "stage18", "stage19", "stage20", "stage21", "stage22", "stage23", "stage24", "stage25", "stage26", "stage27", "stage28", "stage29", "stage30", "stage31", "stage32", "stage33", "stage34", "stage35", "stage36", "stage37", "stage38", "stage39", "stage40" ], "torch_dtype": "float32", "transformers_version": "4.43.1", "use_swiglu_ffn": true } 2025-02-14 04:26:15,836 - modeling_utils.py:4450 - _load_pretrained_model - INFO - All model checkpoint weights were used when initializing CambrianLlamaForCausalLM. 2025-02-14 04:26:15,836 - modeling_utils.py:4458 - _load_pretrained_model - INFO - All the weights of CambrianLlamaForCausalLM were initialized from the model checkpoint at ./checkpoints/longvu_llama3_2. If your task is similar to the task the model of the checkpoint was trained on, you can already use CambrianLlamaForCausalLM for predictions without further training. 2025-02-14 04:26:15,841 - configuration_utils.py:991 - from_pretrained - INFO - loading configuration file ./checkpoints/longvu_llama3_2/generation_config.json 2025-02-14 04:26:15,842 - configuration_utils.py:1038 - from_dict - INFO - Generate config GenerationConfig { "bos_token_id": 128000, "do_sample": true, "eos_token_id": [ 128001, 128008, 128009 ], "temperature": 0.6, "top_p": 0.9 } 2025-02-14 04:26:16,077 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file tokenizer.json 2025-02-14 04:26:16,078 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file added_tokens.json 2025-02-14 04:26:16,078 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file special_tokens_map.json 2025-02-14 04:26:16,078 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file tokenizer_config.json 2025-02-14 04:26:16,481 - tokenization_utils_base.py:2533 - _from_pretrained - INFO - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 2025-02-14 04:26:16,843 - configuration_utils.py:733 - _get_config_dict - INFO - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--google--siglip-so400m-patch14-384/snapshots/9fdffc58afc957d1a03a25b10dba0329ab15c2a3/config.json 2025-02-14 04:26:16,844 - configuration_utils.py:800 - from_dict - INFO - Model config SiglipVisionConfig { "attention_dropout": 0.0, "hidden_act": "gelu_pytorch_tanh", "hidden_size": 1152, "image_size": 384, "intermediate_size": 4304, "layer_norm_eps": 1e-06, "model_type": "siglip_vision_model", "num_attention_heads": 16, "num_channels": 3, "num_hidden_layers": 27, "patch_size": 14, "transformers_version": "4.43.1" } 2025-02-14 04:26:16,845 - modeling_utils.py:3621 - from_pretrained - INFO - loading weights file model.safetensors from cache at /root/.cache/huggingface/hub/models--google--siglip-so400m-patch14-384/snapshots/9fdffc58afc957d1a03a25b10dba0329ab15c2a3/model.safetensors 2025-02-14 04:26:17,112 - modeling_utils.py:4440 - _load_pretrained_model - INFO - Some weights of the model checkpoint at google/siglip-so400m-patch14-384 were not used when initializing SiglipVisionModel: ['logit_bias', 'logit_scale', 'text_model.embeddings.position_embedding.weight', 'text_model.embeddings.token_embedding.weight', 'text_model.encoder.layers.0.layer_norm1.bias', 'text_model.encoder.layers.0.layer_norm1.weight', 'text_model.encoder.layers.0.layer_norm2.bias', 'text_model.encoder.layers.0.layer_norm2.weight', 'text_model.encoder.layers.0.mlp.fc1.bias', 'text_model.encoder.layers.0.mlp.fc1.weight', 'text_model.encoder.layers.0.mlp.fc2.bias', 'text_model.encoder.layers.0.mlp.fc2.weight', 'text_model.encoder.layers.0.self_attn.k_proj.bias', 'text_model.encoder.layers.0.self_attn.k_proj.weight', 'text_model.encoder.layers.0.self_attn.out_proj.bias', 'text_model.encoder.layers.0.self_attn.out_proj.weight', 'text_model.encoder.layers.0.self_attn.q_proj.bias', 'text_model.encoder.layers.0.self_attn.q_proj.weight', 'text_model.encoder.layers.0.self_attn.v_proj.bias', 'text_model.encoder.layers.0.self_attn.v_proj.weight', 'text_model.encoder.layers.1.layer_norm1.bias', 'text_model.encoder.layers.1.layer_norm1.weight', 'text_model.encoder.layers.1.layer_norm2.bias', 'text_model.encoder.layers.1.layer_norm2.weight', 'text_model.encoder.layers.1.mlp.fc1.bias', 'text_model.encoder.layers.1.mlp.fc1.weight', 'text_model.encoder.layers.1.mlp.fc2.bias', 'text_model.encoder.layers.1.mlp.fc2.weight', 'text_model.encoder.layers.1.self_attn.k_proj.bias', 'text_model.encoder.layers.1.self_attn.k_proj.weight', 'text_model.encoder.layers.1.self_attn.out_proj.bias', 'text_model.encoder.layers.1.self_attn.out_proj.weight', 'text_model.encoder.layers.1.self_attn.q_proj.bias', 'text_model.encoder.layers.1.self_attn.q_proj.weight', 'text_model.encoder.layers.1.self_attn.v_proj.bias', 'text_model.encoder.layers.1.self_attn.v_proj.weight', 'text_model.encoder.layers.10.layer_norm1.bias', 'text_model.encoder.layers.10.layer_norm1.weight', 'text_model.encoder.layers.10.layer_norm2.bias', 'text_model.encoder.layers.10.layer_norm2.weight', 'text_model.encoder.layers.10.mlp.fc1.bias', 'text_model.encoder.layers.10.mlp.fc1.weight', 'text_model.encoder.layers.10.mlp.fc2.bias', 'text_model.encoder.layers.10.mlp.fc2.weight', 'text_model.encoder.layers.10.self_attn.k_proj.bias', 'text_model.encoder.layers.10.self_attn.k_proj.weight', 'text_model.encoder.layers.10.self_attn.out_proj.bias', 'text_model.encoder.layers.10.self_attn.out_proj.weight', 'text_model.encoder.layers.10.self_attn.q_proj.bias', 'text_model.encoder.layers.10.self_attn.q_proj.weight', 'text_model.encoder.layers.10.self_attn.v_proj.bias', 'text_model.encoder.layers.10.self_attn.v_proj.weight', 'text_model.encoder.layers.11.layer_norm1.bias', 'text_model.encoder.layers.11.layer_norm1.weight', 'text_model.encoder.layers.11.layer_norm2.bias', 'text_model.encoder.layers.11.layer_norm2.weight', 'text_model.encoder.layers.11.mlp.fc1.bias', 'text_model.encoder.layers.11.mlp.fc1.weight', 'text_model.encoder.layers.11.mlp.fc2.bias', 'text_model.encoder.layers.11.mlp.fc2.weight', 'text_model.encoder.layers.11.self_attn.k_proj.bias', 'text_model.encoder.layers.11.self_attn.k_proj.weight', 'text_model.encoder.layers.11.self_attn.out_proj.bias', 'text_model.encoder.layers.11.self_attn.out_proj.weight', 'text_model.encoder.layers.11.self_attn.q_proj.bias', 'text_model.encoder.layers.11.self_attn.q_proj.weight', 'text_model.encoder.layers.11.self_attn.v_proj.bias', 'text_model.encoder.layers.11.self_attn.v_proj.weight', 'text_model.encoder.layers.12.layer_norm1.bias', 'text_model.encoder.layers.12.layer_norm1.weight', 'text_model.encoder.layers.12.layer_norm2.bias', 'text_model.encoder.layers.12.layer_norm2.weight', 'text_model.encoder.layers.12.mlp.fc1.bias', 'text_model.encoder.layers.12.mlp.fc1.weight', 'text_model.encoder.layers.12.mlp.fc2.bias', 'text_model.encoder.layers.12.mlp.fc2.weight', 'text_model.encoder.layers.12.self_attn.k_proj.bias', 'text_model.encoder.layers.12.self_attn.k_proj.weight', 'text_model.encoder.layers.12.self_attn.out_proj.bias', 'text_model.encoder.layers.12.self_attn.out_proj.weight', 'text_model.encoder.layers.12.self_attn.q_proj.bias', 'text_model.encoder.layers.12.self_attn.q_proj.weight', 'text_model.encoder.layers.12.self_attn.v_proj.bias', 'text_model.encoder.layers.12.self_attn.v_proj.weight', 'text_model.encoder.layers.13.layer_norm1.bias', 'text_model.encoder.layers.13.layer_norm1.weight', 'text_model.encoder.layers.13.layer_norm2.bias', 'text_model.encoder.layers.13.layer_norm2.weight', 'text_model.encoder.layers.13.mlp.fc1.bias', 'text_model.encoder.layers.13.mlp.fc1.weight', 'text_model.encoder.layers.13.mlp.fc2.bias', 'text_model.encoder.layers.13.mlp.fc2.weight', 'text_model.encoder.layers.13.self_attn.k_proj.bias', 'text_model.encoder.layers.13.self_attn.k_proj.weight', 'text_model.encoder.layers.13.self_attn.out_proj.bias', 'text_model.encoder.layers.13.self_attn.out_proj.weight', 'text_model.encoder.layers.13.self_attn.q_proj.bias', 'text_model.encoder.layers.13.self_attn.q_proj.weight', 'text_model.encoder.layers.13.self_attn.v_proj.bias', 'text_model.encoder.layers.13.self_attn.v_proj.weight', 'text_model.encoder.layers.14.layer_norm1.bias', 'text_model.encoder.layers.14.layer_norm1.weight', 'text_model.encoder.layers.14.layer_norm2.bias', 'text_model.encoder.layers.14.layer_norm2.weight', 'text_model.encoder.layers.14.mlp.fc1.bias', 'text_model.encoder.layers.14.mlp.fc1.weight', 'text_model.encoder.layers.14.mlp.fc2.bias', 'text_model.encoder.layers.14.mlp.fc2.weight', 'text_model.encoder.layers.14.self_attn.k_proj.bias', 'text_model.encoder.layers.14.self_attn.k_proj.weight', 'text_model.encoder.layers.14.self_attn.out_proj.bias', 'text_model.encoder.layers.14.self_attn.out_proj.weight', 'text_model.encoder.layers.14.self_attn.q_proj.bias', 'text_model.encoder.layers.14.self_attn.q_proj.weight', 'text_model.encoder.layers.14.self_attn.v_proj.bias', 'text_model.encoder.layers.14.self_attn.v_proj.weight', 'text_model.encoder.layers.15.layer_norm1.bias', 'text_model.encoder.layers.15.layer_norm1.weight', 'text_model.encoder.layers.15.layer_norm2.bias', 'text_model.encoder.layers.15.layer_norm2.weight', 'text_model.encoder.layers.15.mlp.fc1.bias', 'text_model.encoder.layers.15.mlp.fc1.weight', 'text_model.encoder.layers.15.mlp.fc2.bias', 'text_model.encoder.layers.15.mlp.fc2.weight', 'text_model.encoder.layers.15.self_attn.k_proj.bias', 'text_model.encoder.layers.15.self_attn.k_proj.weight', 'text_model.encoder.layers.15.self_attn.out_proj.bias', 'text_model.encoder.layers.15.self_attn.out_proj.weight', 'text_model.encoder.layers.15.self_attn.q_proj.bias', 'text_model.encoder.layers.15.self_attn.q_proj.weight', 'text_model.encoder.layers.15.self_attn.v_proj.bias', 'text_model.encoder.layers.15.self_attn.v_proj.weight', 'text_model.encoder.layers.16.layer_norm1.bias', 'text_model.encoder.layers.16.layer_norm1.weight', 'text_model.encoder.layers.16.layer_norm2.bias', 'text_model.encoder.layers.16.layer_norm2.weight', 'text_model.encoder.layers.16.mlp.fc1.bias', 'text_model.encoder.layers.16.mlp.fc1.weight', 'text_model.encoder.layers.16.mlp.fc2.bias', 'text_model.encoder.layers.16.mlp.fc2.weight', 'text_model.encoder.layers.16.self_attn.k_proj.bias', 'text_model.encoder.layers.16.self_attn.k_proj.weight', 'text_model.encoder.layers.16.self_attn.out_proj.bias', 'text_model.encoder.layers.16.self_attn.out_proj.weight', 'text_model.encoder.layers.16.self_attn.q_proj.bias', 'text_model.encoder.layers.16.self_attn.q_proj.weight', 'text_model.encoder.layers.16.self_attn.v_proj.bias', 'text_model.encoder.layers.16.self_attn.v_proj.weight', 'text_model.encoder.layers.17.layer_norm1.bias', 'text_model.encoder.layers.17.layer_norm1.weight', 'text_model.encoder.layers.17.layer_norm2.bias', 'text_model.encoder.layers.17.layer_norm2.weight', 'text_model.encoder.layers.17.mlp.fc1.bias', 'text_model.encoder.layers.17.mlp.fc1.weight', 'text_model.encoder.layers.17.mlp.fc2.bias', 'text_model.encoder.layers.17.mlp.fc2.weight', 'text_model.encoder.layers.17.self_attn.k_proj.bias', 'text_model.encoder.layers.17.self_attn.k_proj.weight', 'text_model.encoder.layers.17.self_attn.out_proj.bias', 'text_model.encoder.layers.17.self_attn.out_proj.weight', 'text_model.encoder.layers.17.self_attn.q_proj.bias', 'text_model.encoder.layers.17.self_attn.q_proj.weight', 'text_model.encoder.layers.17.self_attn.v_proj.bias', 'text_model.encoder.layers.17.self_attn.v_proj.weight', 'text_model.encoder.layers.18.layer_norm1.bias', 'text_model.encoder.layers.18.layer_norm1.weight', 'text_model.encoder.layers.18.layer_norm2.bias', 'text_model.encoder.layers.18.layer_norm2.weight', 'text_model.encoder.layers.18.mlp.fc1.bias', 'text_model.encoder.layers.18.mlp.fc1.weight', 'text_model.encoder.layers.18.mlp.fc2.bias', 'text_model.encoder.layers.18.mlp.fc2.weight', 'text_model.encoder.layers.18.self_attn.k_proj.bias', 'text_model.encoder.layers.18.self_attn.k_proj.weight', 'text_model.encoder.layers.18.self_attn.out_proj.bias', 'text_model.encoder.layers.18.self_attn.out_proj.weight', 'text_model.encoder.layers.18.self_attn.q_proj.bias', 'text_model.encoder.layers.18.self_attn.q_proj.weight', 'text_model.encoder.layers.18.self_attn.v_proj.bias', 'text_model.encoder.layers.18.self_attn.v_proj.weight', 'text_model.encoder.layers.19.layer_norm1.bias', 'text_model.encoder.layers.19.layer_norm1.weight', 'text_model.encoder.layers.19.layer_norm2.bias', 'text_model.encoder.layers.19.layer_norm2.weight', 'text_model.encoder.layers.19.mlp.fc1.bias', 'text_model.encoder.layers.19.mlp.fc1.weight', 'text_model.encoder.layers.19.mlp.fc2.bias', 'text_model.encoder.layers.19.mlp.fc2.weight', 'text_model.encoder.layers.19.self_attn.k_proj.bias', 'text_model.encoder.layers.19.self_attn.k_proj.weight', 'text_model.encoder.layers.19.self_attn.out_proj.bias', 'text_model.encoder.layers.19.self_attn.out_proj.weight', 'text_model.encoder.layers.19.self_attn.q_proj.bias', 'text_model.encoder.layers.19.self_attn.q_proj.weight', 'text_model.encoder.layers.19.self_attn.v_proj.bias', 'text_model.encoder.layers.19.self_attn.v_proj.weight', 'text_model.encoder.layers.2.layer_norm1.bias', 'text_model.encoder.layers.2.layer_norm1.weight', 'text_model.encoder.layers.2.layer_norm2.bias', 'text_model.encoder.layers.2.layer_norm2.weight', 'text_model.encoder.layers.2.mlp.fc1.bias', 'text_model.encoder.layers.2.mlp.fc1.weight', 'text_model.encoder.layers.2.mlp.fc2.bias', 'text_model.encoder.layers.2.mlp.fc2.weight', 'text_model.encoder.layers.2.self_attn.k_proj.bias', 'text_model.encoder.layers.2.self_attn.k_proj.weight', 'text_model.encoder.layers.2.self_attn.out_proj.bias', 'text_model.encoder.layers.2.self_attn.out_proj.weight', 'text_model.encoder.layers.2.self_attn.q_proj.bias', 'text_model.encoder.layers.2.self_attn.q_proj.weight', 'text_model.encoder.layers.2.self_attn.v_proj.bias', 'text_model.encoder.layers.2.self_attn.v_proj.weight', 'text_model.encoder.layers.20.layer_norm1.bias', 'text_model.encoder.layers.20.layer_norm1.weight', 'text_model.encoder.layers.20.layer_norm2.bias', 'text_model.encoder.layers.20.layer_norm2.weight', 'text_model.encoder.layers.20.mlp.fc1.bias', 'text_model.encoder.layers.20.mlp.fc1.weight', 'text_model.encoder.layers.20.mlp.fc2.bias', 'text_model.encoder.layers.20.mlp.fc2.weight', 'text_model.encoder.layers.20.self_attn.k_proj.bias', 'text_model.encoder.layers.20.self_attn.k_proj.weight', 'text_model.encoder.layers.20.self_attn.out_proj.bias', 'text_model.encoder.layers.20.self_attn.out_proj.weight', 'text_model.encoder.layers.20.self_attn.q_proj.bias', 'text_model.encoder.layers.20.self_attn.q_proj.weight', 'text_model.encoder.layers.20.self_attn.v_proj.bias', 'text_model.encoder.layers.20.self_attn.v_proj.weight', 'text_model.encoder.layers.21.layer_norm1.bias', 'text_model.encoder.layers.21.layer_norm1.weight', 'text_model.encoder.layers.21.layer_norm2.bias', 'text_model.encoder.layers.21.layer_norm2.weight', 'text_model.encoder.layers.21.mlp.fc1.bias', 'text_model.encoder.layers.21.mlp.fc1.weight', 'text_model.encoder.layers.21.mlp.fc2.bias', 'text_model.encoder.layers.21.mlp.fc2.weight', 'text_model.encoder.layers.21.self_attn.k_proj.bias', 'text_model.encoder.layers.21.self_attn.k_proj.weight', 'text_model.encoder.layers.21.self_attn.out_proj.bias', 'text_model.encoder.layers.21.self_attn.out_proj.weight', 'text_model.encoder.layers.21.self_attn.q_proj.bias', 'text_model.encoder.layers.21.self_attn.q_proj.weight', 'text_model.encoder.layers.21.self_attn.v_proj.bias', 'text_model.encoder.layers.21.self_attn.v_proj.weight', 'text_model.encoder.layers.22.layer_norm1.bias', 'text_model.encoder.layers.22.layer_norm1.weight', 'text_model.encoder.layers.22.layer_norm2.bias', 'text_model.encoder.layers.22.layer_norm2.weight', 'text_model.encoder.layers.22.mlp.fc1.bias', 'text_model.encoder.layers.22.mlp.fc1.weight', 'text_model.encoder.layers.22.mlp.fc2.bias', 'text_model.encoder.layers.22.mlp.fc2.weight', 'text_model.encoder.layers.22.self_attn.k_proj.bias', 'text_model.encoder.layers.22.self_attn.k_proj.weight', 'text_model.encoder.layers.22.self_attn.out_proj.bias', 'text_model.encoder.layers.22.self_attn.out_proj.weight', 'text_model.encoder.layers.22.self_attn.q_proj.bias', 'text_model.encoder.layers.22.self_attn.q_proj.weight', 'text_model.encoder.layers.22.self_attn.v_proj.bias', 'text_model.encoder.layers.22.self_attn.v_proj.weight', 'text_model.encoder.layers.23.layer_norm1.bias', 'text_model.encoder.layers.23.layer_norm1.weight', 'text_model.encoder.layers.23.layer_norm2.bias', 'text_model.encoder.layers.23.layer_norm2.weight', 'text_model.encoder.layers.23.mlp.fc1.bias', 'text_model.encoder.layers.23.mlp.fc1.weight', 'text_model.encoder.layers.23.mlp.fc2.bias', 'text_model.encoder.layers.23.mlp.fc2.weight', 'text_model.encoder.layers.23.self_attn.k_proj.bias', 'text_model.encoder.layers.23.self_attn.k_proj.weight', 'text_model.encoder.layers.23.self_attn.out_proj.bias', 'text_model.encoder.layers.23.self_attn.out_proj.weight', 'text_model.encoder.layers.23.self_attn.q_proj.bias', 'text_model.encoder.layers.23.self_attn.q_proj.weight', 'text_model.encoder.layers.23.self_attn.v_proj.bias', 'text_model.encoder.layers.23.self_attn.v_proj.weight', 'text_model.encoder.layers.24.layer_norm1.bias', 'text_model.encoder.layers.24.layer_norm1.weight', 'text_model.encoder.layers.24.layer_norm2.bias', 'text_model.encoder.layers.24.layer_norm2.weight', 'text_model.encoder.layers.24.mlp.fc1.bias', 'text_model.encoder.layers.24.mlp.fc1.weight', 'text_model.encoder.layers.24.mlp.fc2.bias', 'text_model.encoder.layers.24.mlp.fc2.weight', 'text_model.encoder.layers.24.self_attn.k_proj.bias', 'text_model.encoder.layers.24.self_attn.k_proj.weight', 'text_model.encoder.layers.24.self_attn.out_proj.bias', 'text_model.encoder.layers.24.self_attn.out_proj.weight', 'text_model.encoder.layers.24.self_attn.q_proj.bias', 'text_model.encoder.layers.24.self_attn.q_proj.weight', 'text_model.encoder.layers.24.self_attn.v_proj.bias', 'text_model.encoder.layers.24.self_attn.v_proj.weight', 'text_model.encoder.layers.25.layer_norm1.bias', 'text_model.encoder.layers.25.layer_norm1.weight', 'text_model.encoder.layers.25.layer_norm2.bias', 'text_model.encoder.layers.25.layer_norm2.weight', 'text_model.encoder.layers.25.mlp.fc1.bias', 'text_model.encoder.layers.25.mlp.fc1.weight', 'text_model.encoder.layers.25.mlp.fc2.bias', 'text_model.encoder.layers.25.mlp.fc2.weight', 'text_model.encoder.layers.25.self_attn.k_proj.bias', 'text_model.encoder.layers.25.self_attn.k_proj.weight', 'text_model.encoder.layers.25.self_attn.out_proj.bias', 'text_model.encoder.layers.25.self_attn.out_proj.weight', 'text_model.encoder.layers.25.self_attn.q_proj.bias', 'text_model.encoder.layers.25.self_attn.q_proj.weight', 'text_model.encoder.layers.25.self_attn.v_proj.bias', 'text_model.encoder.layers.25.self_attn.v_proj.weight', 'text_model.encoder.layers.26.layer_norm1.bias', 'text_model.encoder.layers.26.layer_norm1.weight', 'text_model.encoder.layers.26.layer_norm2.bias', 'text_model.encoder.layers.26.layer_norm2.weight', 'text_model.encoder.layers.26.mlp.fc1.bias', 'text_model.encoder.layers.26.mlp.fc1.weight', 'text_model.encoder.layers.26.mlp.fc2.bias', 'text_model.encoder.layers.26.mlp.fc2.weight', 'text_model.encoder.layers.26.self_attn.k_proj.bias', 'text_model.encoder.layers.26.self_attn.k_proj.weight', 'text_model.encoder.layers.26.self_attn.out_proj.bias', 'text_model.encoder.layers.26.self_attn.out_proj.weight', 'text_model.encoder.layers.26.self_attn.q_proj.bias', 'text_model.encoder.layers.26.self_attn.q_proj.weight', 'text_model.encoder.layers.26.self_attn.v_proj.bias', 'text_model.encoder.layers.26.self_attn.v_proj.weight', 'text_model.encoder.layers.3.layer_norm1.bias', 'text_model.encoder.layers.3.layer_norm1.weight', 'text_model.encoder.layers.3.layer_norm2.bias', 'text_model.encoder.layers.3.layer_norm2.weight', 'text_model.encoder.layers.3.mlp.fc1.bias', 'text_model.encoder.layers.3.mlp.fc1.weight', 'text_model.encoder.layers.3.mlp.fc2.bias', 'text_model.encoder.layers.3.mlp.fc2.weight', 'text_model.encoder.layers.3.self_attn.k_proj.bias', 'text_model.encoder.layers.3.self_attn.k_proj.weight', 'text_model.encoder.layers.3.self_attn.out_proj.bias', 'text_model.encoder.layers.3.self_attn.out_proj.weight', 'text_model.encoder.layers.3.self_attn.q_proj.bias', 'text_model.encoder.layers.3.self_attn.q_proj.weight', 'text_model.encoder.layers.3.self_attn.v_proj.bias', 'text_model.encoder.layers.3.self_attn.v_proj.weight', 'text_model.encoder.layers.4.layer_norm1.bias', 'text_model.encoder.layers.4.layer_norm1.weight', 'text_model.encoder.layers.4.layer_norm2.bias', 'text_model.encoder.layers.4.layer_norm2.weight', 'text_model.encoder.layers.4.mlp.fc1.bias', 'text_model.encoder.layers.4.mlp.fc1.weight', 'text_model.encoder.layers.4.mlp.fc2.bias', 'text_model.encoder.layers.4.mlp.fc2.weight', 'text_model.encoder.layers.4.self_attn.k_proj.bias', 'text_model.encoder.layers.4.self_attn.k_proj.weight', 'text_model.encoder.layers.4.self_attn.out_proj.bias', 'text_model.encoder.layers.4.self_attn.out_proj.weight', 'text_model.encoder.layers.4.self_attn.q_proj.bias', 'text_model.encoder.layers.4.self_attn.q_proj.weight', 'text_model.encoder.layers.4.self_attn.v_proj.bias', 'text_model.encoder.layers.4.self_attn.v_proj.weight', 'text_model.encoder.layers.5.layer_norm1.bias', 'text_model.encoder.layers.5.layer_norm1.weight', 'text_model.encoder.layers.5.layer_norm2.bias', 'text_model.encoder.layers.5.layer_norm2.weight', 'text_model.encoder.layers.5.mlp.fc1.bias', 'text_model.encoder.layers.5.mlp.fc1.weight', 'text_model.encoder.layers.5.mlp.fc2.bias', 'text_model.encoder.layers.5.mlp.fc2.weight', 'text_model.encoder.layers.5.self_attn.k_proj.bias', 'text_model.encoder.layers.5.self_attn.k_proj.weight', 'text_model.encoder.layers.5.self_attn.out_proj.bias', 'text_model.encoder.layers.5.self_attn.out_proj.weight', 'text_model.encoder.layers.5.self_attn.q_proj.bias', 'text_model.encoder.layers.5.self_attn.q_proj.weight', 'text_model.encoder.layers.5.self_attn.v_proj.bias', 'text_model.encoder.layers.5.self_attn.v_proj.weight', 'text_model.encoder.layers.6.layer_norm1.bias', 'text_model.encoder.layers.6.layer_norm1.weight', 'text_model.encoder.layers.6.layer_norm2.bias', 'text_model.encoder.layers.6.layer_norm2.weight', 'text_model.encoder.layers.6.mlp.fc1.bias', 'text_model.encoder.layers.6.mlp.fc1.weight', 'text_model.encoder.layers.6.mlp.fc2.bias', 'text_model.encoder.layers.6.mlp.fc2.weight', 'text_model.encoder.layers.6.self_attn.k_proj.bias', 'text_model.encoder.layers.6.self_attn.k_proj.weight', 'text_model.encoder.layers.6.self_attn.out_proj.bias', 'text_model.encoder.layers.6.self_attn.out_proj.weight', 'text_model.encoder.layers.6.self_attn.q_proj.bias', 'text_model.encoder.layers.6.self_attn.q_proj.weight', 'text_model.encoder.layers.6.self_attn.v_proj.bias', 'text_model.encoder.layers.6.self_attn.v_proj.weight', 'text_model.encoder.layers.7.layer_norm1.bias', 'text_model.encoder.layers.7.layer_norm1.weight', 'text_model.encoder.layers.7.layer_norm2.bias', 'text_model.encoder.layers.7.layer_norm2.weight', 'text_model.encoder.layers.7.mlp.fc1.bias', 'text_model.encoder.layers.7.mlp.fc1.weight', 'text_model.encoder.layers.7.mlp.fc2.bias', 'text_model.encoder.layers.7.mlp.fc2.weight', 'text_model.encoder.layers.7.self_attn.k_proj.bias', 'text_model.encoder.layers.7.self_attn.k_proj.weight', 'text_model.encoder.layers.7.self_attn.out_proj.bias', 'text_model.encoder.layers.7.self_attn.out_proj.weight', 'text_model.encoder.layers.7.self_attn.q_proj.bias', 'text_model.encoder.layers.7.self_attn.q_proj.weight', 'text_model.encoder.layers.7.self_attn.v_proj.bias', 'text_model.encoder.layers.7.self_attn.v_proj.weight', 'text_model.encoder.layers.8.layer_norm1.bias', 'text_model.encoder.layers.8.layer_norm1.weight', 'text_model.encoder.layers.8.layer_norm2.bias', 'text_model.encoder.layers.8.layer_norm2.weight', 'text_model.encoder.layers.8.mlp.fc1.bias', 'text_model.encoder.layers.8.mlp.fc1.weight', 'text_model.encoder.layers.8.mlp.fc2.bias', 'text_model.encoder.layers.8.mlp.fc2.weight', 'text_model.encoder.layers.8.self_attn.k_proj.bias', 'text_model.encoder.layers.8.self_attn.k_proj.weight', 'text_model.encoder.layers.8.self_attn.out_proj.bias', 'text_model.encoder.layers.8.self_attn.out_proj.weight', 'text_model.encoder.layers.8.self_attn.q_proj.bias', 'text_model.encoder.layers.8.self_attn.q_proj.weight', 'text_model.encoder.layers.8.self_attn.v_proj.bias', 'text_model.encoder.layers.8.self_attn.v_proj.weight', 'text_model.encoder.layers.9.layer_norm1.bias', 'text_model.encoder.layers.9.layer_norm1.weight', 'text_model.encoder.layers.9.layer_norm2.bias', 'text_model.encoder.layers.9.layer_norm2.weight', 'text_model.encoder.layers.9.mlp.fc1.bias', 'text_model.encoder.layers.9.mlp.fc1.weight', 'text_model.encoder.layers.9.mlp.fc2.bias', 'text_model.encoder.layers.9.mlp.fc2.weight', 'text_model.encoder.layers.9.self_attn.k_proj.bias', 'text_model.encoder.layers.9.self_attn.k_proj.weight', 'text_model.encoder.layers.9.self_attn.out_proj.bias', 'text_model.encoder.layers.9.self_attn.out_proj.weight', 'text_model.encoder.layers.9.self_attn.q_proj.bias', 'text_model.encoder.layers.9.self_attn.q_proj.weight', 'text_model.encoder.layers.9.self_attn.v_proj.bias', 'text_model.encoder.layers.9.self_attn.v_proj.weight', 'text_model.final_layer_norm.bias', 'text_model.final_layer_norm.weight', 'text_model.head.bias', 'text_model.head.weight'] - This IS expected if you are initializing SiglipVisionModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing SiglipVisionModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). 2025-02-14 04:26:17,114 - modeling_utils.py:4458 - _load_pretrained_model - INFO - All the weights of SiglipVisionModel were initialized from the model checkpoint at google/siglip-so400m-patch14-384. If your task is similar to the task the model of the checkpoint was trained on, you can already use SiglipVisionModel for predictions without further training. 2025-02-14 04:26:17,305 - image_processing_base.py:375 - get_image_processor_dict - INFO - loading configuration file preprocessor_config.json from cache at /root/.cache/huggingface/hub/models--google--siglip-so400m-patch14-384/snapshots/9fdffc58afc957d1a03a25b10dba0329ab15c2a3/preprocessor_config.json 2025-02-14 04:26:17,306 - image_processing_base.py:429 - from_dict - INFO - Image processor SiglipImageProcessor { "do_convert_rgb": null, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.5, 0.5, 0.5 ], "image_processor_type": "SiglipImageProcessor", "image_std": [ 0.5, 0.5, 0.5 ], "processor_class": "SiglipProcessor", "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "height": 384, "width": 384 } } 2025-02-14 04:26:17,690 - configuration_utils.py:733 - _get_config_dict - INFO - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/config.json 2025-02-14 04:26:17,692 - configuration_utils.py:800 - from_dict - INFO - Model config Dinov2Config { "apply_layernorm": true, "architectures": [ "Dinov2Model" ], "attention_probs_dropout_prob": 0.0, "drop_path_rate": 0.0, "hidden_act": "gelu", "hidden_dropout_prob": 0.0, "hidden_size": 1536, "image_size": 518, "initializer_range": 0.02, "layer_norm_eps": 1e-06, "layerscale_value": 1.0, "mlp_ratio": 4, "model_type": "dinov2", "num_attention_heads": 24, "num_channels": 3, "num_hidden_layers": 40, "out_features": [ "stage40" ], "out_indices": [ 40 ], "patch_size": 14, "qkv_bias": true, "reshape_hidden_states": true, "stage_names": [ "stem", "stage1", "stage2", "stage3", "stage4", "stage5", "stage6", "stage7", "stage8", "stage9", "stage10", "stage11", "stage12", "stage13", "stage14", "stage15", "stage16", "stage17", "stage18", "stage19", "stage20", "stage21", "stage22", "stage23", "stage24", "stage25", "stage26", "stage27", "stage28", "stage29", "stage30", "stage31", "stage32", "stage33", "stage34", "stage35", "stage36", "stage37", "stage38", "stage39", "stage40" ], "torch_dtype": "float32", "transformers_version": "4.43.1", "use_swiglu_ffn": true } 2025-02-14 04:26:17,692 - modeling_utils.py:3621 - from_pretrained - INFO - loading weights file model.safetensors from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/model.safetensors 2025-02-14 04:26:18,296 - modeling_utils.py:4450 - _load_pretrained_model - INFO - All model checkpoint weights were used when initializing Dinov2Model. 2025-02-14 04:26:18,296 - modeling_utils.py:4458 - _load_pretrained_model - INFO - All the weights of Dinov2Model were initialized from the model checkpoint at facebook/dinov2-giant. If your task is similar to the task the model of the checkpoint was trained on, you can already use Dinov2Model for predictions without further training. 2025-02-14 04:26:18,485 - image_processing_base.py:375 - get_image_processor_dict - INFO - loading configuration file preprocessor_config.json from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/preprocessor_config.json 2025-02-14 04:26:18,488 - image_processing_base.py:429 - from_dict - INFO - Image processor BitImageProcessor { "crop_size": { "height": 378, "width": 378 }, "do_center_crop": true, "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.485, 0.456, 0.406 ], "image_processor_type": "BitImageProcessor", "image_std": [ 0.229, 0.224, 0.225 ], "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "shortest_edge": 378 } } 2025-02-14 04:26:19,431 - finetune_llama.py:1239 - train - INFO - Total params: 3264865280 2025-02-14 04:26:19,431 - finetune_llama.py:1240 - train - INFO - Trainable params: 12589056 2025-02-14 04:26:19,431 - finetune_llama.py:1241 - train - INFO - LM head params: 394002432 2025-02-14 04:26:22,010 - trainer_callback.py:423 - add_callback - WARNING - You are adding a to the callbacks of this Trainer, but there is already one. The currentlist of callbacks is :DefaultFlowCallback TensorBoardCallback 2025-02-14 04:26:22,010 - trainer.py:648 - __init__ - INFO - Using auto half precision backend 2025-02-14 04:26:22,505 - trainer.py:2134 - _inner_training_loop - INFO - ***** Running training ***** 2025-02-14 04:26:22,506 - trainer.py:2135 - _inner_training_loop - INFO - Num examples = 554 2025-02-14 04:26:22,506 - trainer.py:2136 - _inner_training_loop - INFO - Num Epochs = 2 2025-02-14 04:26:22,506 - trainer.py:2137 - _inner_training_loop - INFO - Instantaneous batch size per device = 1 2025-02-14 04:26:22,506 - trainer.py:2140 - _inner_training_loop - INFO - Total train batch size (w. parallel, distributed & accumulation) = 1 2025-02-14 04:26:22,506 - trainer.py:2141 - _inner_training_loop - INFO - Gradient Accumulation steps = 1 2025-02-14 04:26:22,506 - trainer.py:2142 - _inner_training_loop - INFO - Total optimization steps = 1,108 2025-02-14 04:26:22,508 - trainer.py:2143 - _inner_training_loop - INFO - Number of trainable parameters = 406,591,488 2025-02-14 04:26:48,198 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:26:48,198 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:26:48,221 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:26:48,225 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:26:48,225 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 213, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:26:48,226 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:26:48,226 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 213, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:26:51,578 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:26:51,578 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:26:51,578 - resource_logging.py:150 - __exit__ - DEBUG - Time: 3.35 seconds 2025-02-14 04:26:51,578 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:51,578 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 12760.04 MB 2025-02-14 04:26:51,578 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13547.39 MB 2025-02-14 04:26:51,578 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 787.35 MB 2025-02-14 04:26:51,578 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 13220.45 MB 2025-02-14 04:26:51,578 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 14554.23 MB 2025-02-14 04:26:51,578 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1333.79 MB 2025-02-14 04:26:51,578 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22492.26 MB 2025-02-14 04:26:51,618 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:26:51,618 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:26:51,618 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.04 seconds 2025-02-14 04:26:51,618 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:51,618 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13547.32 MB 2025-02-14 04:26:51,618 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13890.88 MB 2025-02-14 04:26:51,618 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 343.55 MB 2025-02-14 04:26:51,618 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 14554.23 MB 2025-02-14 04:26:51,618 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17439.92 MB 2025-02-14 04:26:51,618 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2885.68 MB 2025-02-14 04:26:51,618 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16515.39 MB 2025-02-14 04:26:52,731 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:26:52,731 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:26:52,731 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.11 seconds 2025-02-14 04:26:52,731 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,731 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13890.88 MB 2025-02-14 04:26:52,731 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14169.57 MB 2025-02-14 04:26:52,731 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 278.69 MB 2025-02-14 04:26:52,731 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17439.92 MB 2025-02-14 04:26:52,731 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15323.89 MB 2025-02-14 04:26:52,731 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -2116.03 MB 2025-02-14 04:26:52,731 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18146.50 MB 2025-02-14 04:26:52,740 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:26:52,740 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:26:52,740 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:26:52,740 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,740 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14169.57 MB 2025-02-14 04:26:52,740 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15161.33 MB 2025-02-14 04:26:52,740 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 991.76 MB 2025-02-14 04:26:52,740 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15323.89 MB 2025-02-14 04:26:52,740 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 16320.04 MB 2025-02-14 04:26:52,740 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 996.15 MB 2025-02-14 04:26:52,740 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15905.49 MB 2025-02-14 04:26:52,858 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:26:52,858 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:26:52,858 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.12 seconds 2025-02-14 04:26:52,858 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,858 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15161.33 MB 2025-02-14 04:26:52,858 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16338.34 MB 2025-02-14 04:26:52,858 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1177.01 MB 2025-02-14 04:26:52,858 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 16320.04 MB 2025-02-14 04:26:52,858 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20176.70 MB 2025-02-14 04:26:52,858 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3856.66 MB 2025-02-14 04:26:52,858 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19249.05 MB 2025-02-14 04:26:52,859 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:26:52,859 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:26:52,859 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.13 seconds 2025-02-14 04:26:52,859 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,859 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14169.57 MB 2025-02-14 04:26:52,859 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16338.34 MB 2025-02-14 04:26:52,859 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2168.77 MB 2025-02-14 04:26:52,859 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15323.89 MB 2025-02-14 04:26:52,859 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20176.70 MB 2025-02-14 04:26:52,859 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4852.81 MB 2025-02-14 04:26:52,859 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19249.05 MB 2025-02-14 04:26:52,944 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:26:52,944 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:26:52,944 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds 2025-02-14 04:26:52,944 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,944 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17143.45 MB 2025-02-14 04:26:52,944 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17546.12 MB 2025-02-14 04:26:52,944 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 402.68 MB 2025-02-14 04:26:52,944 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20176.70 MB 2025-02-14 04:26:52,944 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20396.90 MB 2025-02-14 04:26:52,944 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 220.20 MB 2025-02-14 04:26:52,944 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17919.91 MB 2025-02-14 04:26:52,961 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:26:52,961 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:26:52,961 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:26:52,961 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,961 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17762.90 MB 2025-02-14 04:26:52,961 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17967.68 MB 2025-02-14 04:26:52,961 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 204.79 MB 2025-02-14 04:26:52,961 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20396.90 MB 2025-02-14 04:26:52,961 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20401.09 MB 2025-02-14 04:26:52,961 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4.19 MB 2025-02-14 04:26:52,961 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18025.36 MB 2025-02-14 04:26:52,963 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:26:52,963 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:26:52,963 - resource_logging.py:150 - __exit__ - DEBUG - Time: 4.73 seconds 2025-02-14 04:26:52,963 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:52,963 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 12017.34 MB 2025-02-14 04:26:52,963 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18168.75 MB 2025-02-14 04:26:52,963 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 6151.41 MB 2025-02-14 04:26:52,963 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 12475.96 MB 2025-02-14 04:26:52,963 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20401.09 MB 2025-02-14 04:26:52,963 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 7925.14 MB 2025-02-14 04:26:52,963 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18168.75 MB 2025-02-14 04:26:52,987 - logging.py:328 - warning_once - WARNING - The attention layers in this model are transitioning from computing the RoPE embeddings internally through `position_ids` (2D tensor with the indexes of the tokens), to using externally computed `position_embeddings` (Tuple of tensors, containing cos and sin). In v4.45 `position_ids` will be removed and `position_embeddings` will be mandatory. 2025-02-14 04:26:53,250 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:26:53,250 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:26:53,250 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.29 seconds 2025-02-14 04:26:53,250 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:53,250 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13144.06 MB 2025-02-14 04:26:53,250 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16158.88 MB 2025-02-14 04:26:53,250 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3014.82 MB 2025-02-14 04:26:53,250 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20401.09 MB 2025-02-14 04:26:53,250 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20401.09 MB 2025-02-14 04:26:53,251 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:26:53,251 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16460.25 MB 2025-02-14 04:26:53,268 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8162, cut from 8164 2025-02-14 04:26:53,271 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:26:53,279 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:26:53,279 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:26:53,279 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds 2025-02-14 04:26:53,279 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:26:53,279 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16158.88 MB 2025-02-14 04:26:53,279 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24597.90 MB 2025-02-14 04:26:53,279 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8439.02 MB 2025-02-14 04:26:53,279 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20401.09 MB 2025-02-14 04:26:53,279 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 30891.05 MB 2025-02-14 04:26:53,280 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10489.95 MB 2025-02-14 04:26:53,280 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24597.90 MB 2025-02-14 04:26:53,434 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7954] 2025-02-14 04:26:53,435 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:26:53,435 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:26:53,436 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:26:53,436 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:26:53,441 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:26:53,442 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:26:53,442 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:26:53,442 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:27:58,105 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:27:58,105 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:27:58,112 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:27:58,120 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:27:58,120 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 290, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:27:58,122 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:27:58,122 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 290, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:28:02,597 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:28:02,597 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:28:02,597 - resource_logging.py:150 - __exit__ - DEBUG - Time: 4.47 seconds 2025-02-14 04:28:02,597 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:02,597 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14989.47 MB 2025-02-14 04:28:02,597 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16015.77 MB 2025-02-14 04:28:02,597 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1026.29 MB 2025-02-14 04:28:02,597 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 43476.06 MB 2025-02-14 04:28:02,597 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18679.33 MB 2025-02-14 04:28:02,597 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -24796.73 MB 2025-02-14 04:28:02,597 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24914.64 MB 2025-02-14 04:28:02,616 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:28:02,616 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:28:02,616 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:28:02,616 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:02,616 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16015.77 MB 2025-02-14 04:28:02,616 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16400.57 MB 2025-02-14 04:28:02,616 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 384.80 MB 2025-02-14 04:28:02,616 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18679.33 MB 2025-02-14 04:28:02,616 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 22051.55 MB 2025-02-14 04:28:02,616 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3372.22 MB 2025-02-14 04:28:02,616 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19871.46 MB 2025-02-14 04:28:03,967 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:28:03,967 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:28:03,967 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.35 seconds 2025-02-14 04:28:03,967 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:03,967 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16400.57 MB 2025-02-14 04:28:03,967 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16764.20 MB 2025-02-14 04:28:03,967 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 363.63 MB 2025-02-14 04:28:03,967 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22051.55 MB 2025-02-14 04:28:03,967 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19581.11 MB 2025-02-14 04:28:03,967 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -2470.45 MB 2025-02-14 04:28:03,967 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20741.91 MB 2025-02-14 04:28:03,977 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:28:03,977 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:28:03,977 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:28:03,977 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:03,977 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16764.20 MB 2025-02-14 04:28:03,977 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18058.23 MB 2025-02-14 04:28:03,978 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1294.03 MB 2025-02-14 04:28:03,978 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19581.11 MB 2025-02-14 04:28:03,978 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20877.15 MB 2025-02-14 04:28:03,978 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1296.04 MB 2025-02-14 04:28:03,978 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19029.17 MB 2025-02-14 04:28:04,121 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:28:04,121 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:28:04,121 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.14 seconds 2025-02-14 04:28:04,121 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,121 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18058.23 MB 2025-02-14 04:28:04,121 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19593.92 MB 2025-02-14 04:28:04,121 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1535.69 MB 2025-02-14 04:28:04,121 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20877.15 MB 2025-02-14 04:28:04,121 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25092.42 MB 2025-02-14 04:28:04,121 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4215.28 MB 2025-02-14 04:28:04,121 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23391.73 MB 2025-02-14 04:28:04,122 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:28:04,122 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:28:04,122 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.15 seconds 2025-02-14 04:28:04,122 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,122 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16764.20 MB 2025-02-14 04:28:04,122 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19593.92 MB 2025-02-14 04:28:04,122 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2829.73 MB 2025-02-14 04:28:04,122 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19581.11 MB 2025-02-14 04:28:04,122 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25092.42 MB 2025-02-14 04:28:04,122 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 5511.32 MB 2025-02-14 04:28:04,122 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23391.73 MB 2025-02-14 04:28:04,239 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:28:04,239 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:28:04,239 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.11 seconds 2025-02-14 04:28:04,239 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,239 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20644.40 MB 2025-02-14 04:28:04,239 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21169.79 MB 2025-02-14 04:28:04,239 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 525.40 MB 2025-02-14 04:28:04,239 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 25092.42 MB 2025-02-14 04:28:04,239 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25377.64 MB 2025-02-14 04:28:04,239 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 285.21 MB 2025-02-14 04:28:04,239 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21654.63 MB 2025-02-14 04:28:04,255 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:28:04,255 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:28:04,255 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:28:04,255 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,255 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21452.63 MB 2025-02-14 04:28:04,255 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21658.67 MB 2025-02-14 04:28:04,255 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 206.04 MB 2025-02-14 04:28:04,255 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 25377.64 MB 2025-02-14 04:28:04,255 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25381.83 MB 2025-02-14 04:28:04,255 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4.19 MB 2025-02-14 04:28:04,255 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21754.19 MB 2025-02-14 04:28:04,256 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:28:04,256 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:28:04,256 - resource_logging.py:150 - __exit__ - DEBUG - Time: 6.13 seconds 2025-02-14 04:28:04,256 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,256 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13979.09 MB 2025-02-14 04:28:04,256 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21859.74 MB 2025-02-14 04:28:04,256 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 7880.65 MB 2025-02-14 04:28:04,256 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 43476.06 MB 2025-02-14 04:28:04,256 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25381.83 MB 2025-02-14 04:28:04,256 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -18094.23 MB 2025-02-14 04:28:04,256 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21859.74 MB 2025-02-14 04:28:04,522 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:28:04,522 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:28:04,522 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.26 seconds 2025-02-14 04:28:04,522 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,522 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21859.74 MB 2025-02-14 04:28:04,522 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24873.78 MB 2025-02-14 04:28:04,522 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3014.03 MB 2025-02-14 04:28:04,522 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 25381.83 MB 2025-02-14 04:28:04,522 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26321.35 MB 2025-02-14 04:28:04,522 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 939.52 MB 2025-02-14 04:28:04,522 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25175.41 MB 2025-02-14 04:28:04,540 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8162, cut from 8164 2025-02-14 04:28:04,541 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:28:04,547 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:28:04,547 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:28:04,547 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:28:04,547 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:28:04,547 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18388.32 MB 2025-02-14 04:28:04,547 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26827.35 MB 2025-02-14 04:28:04,547 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8439.02 MB 2025-02-14 04:28:04,547 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26321.35 MB 2025-02-14 04:28:04,547 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 36811.31 MB 2025-02-14 04:28:04,547 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10489.95 MB 2025-02-14 04:28:04,547 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26827.35 MB 2025-02-14 04:28:04,710 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7954] 2025-02-14 04:28:04,711 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:28:04,711 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:28:04,712 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:28:04,712 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:28:04,717 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:28:04,718 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:28:04,718 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:28:04,718 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:29:15,051 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:15,051 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:29:15,057 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:29:15,061 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:15,061 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1627, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:29:15,062 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:15,062 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1627, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:29:39,991 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:29:39,992 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:29:39,992 - resource_logging.py:150 - __exit__ - DEBUG - Time: 24.92 seconds 2025-02-14 04:29:39,992 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:39,992 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24305.90 MB 2025-02-14 04:29:39,992 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30064.68 MB 2025-02-14 04:29:39,992 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 5758.78 MB 2025-02-14 04:29:39,992 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 49396.32 MB 2025-02-14 04:29:39,992 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 39447.43 MB 2025-02-14 04:29:39,992 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -9948.89 MB 2025-02-14 04:29:39,992 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 38987.41 MB 2025-02-14 04:29:40,072 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:29:40,072 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:29:40,072 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds 2025-02-14 04:29:40,072 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:40,072 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 30064.68 MB 2025-02-14 04:29:40,072 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24236.12 MB 2025-02-14 04:29:40,072 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -5828.57 MB 2025-02-14 04:29:40,072 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 39447.43 MB 2025-02-14 04:29:40,072 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 48897.20 MB 2025-02-14 04:29:40,072 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 9449.77 MB 2025-02-14 04:29:40,072 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 40945.26 MB 2025-02-14 04:29:41,971 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:29:41,971 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:29:41,971 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.90 seconds 2025-02-14 04:29:41,971 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:41,971 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24236.12 MB 2025-02-14 04:29:41,971 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24766.96 MB 2025-02-14 04:29:41,971 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:29:41,971 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 48897.20 MB 2025-02-14 04:29:41,971 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 30909.92 MB 2025-02-14 04:29:41,971 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -17987.27 MB 2025-02-14 04:29:41,971 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28746.29 MB 2025-02-14 04:29:41,987 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:29:41,987 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:29:41,987 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:29:41,987 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:41,987 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24766.96 MB 2025-02-14 04:29:41,987 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26656.49 MB 2025-02-14 04:29:41,987 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:29:41,987 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30909.92 MB 2025-02-14 04:29:41,987 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 31853.64 MB 2025-02-14 04:29:41,987 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 943.72 MB 2025-02-14 04:29:41,987 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28073.92 MB 2025-02-14 04:29:42,195 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:29:42,195 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:29:42,196 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.21 seconds 2025-02-14 04:29:42,196 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,196 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 26656.49 MB 2025-02-14 04:29:42,196 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 28898.35 MB 2025-02-14 04:29:42,196 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:29:42,196 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31853.64 MB 2025-02-14 04:29:42,196 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37515.95 MB 2025-02-14 04:29:42,196 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 5662.31 MB 2025-02-14 04:29:42,196 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 34442.63 MB 2025-02-14 04:29:42,196 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:29:42,196 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:29:42,196 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:29:42,196 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,196 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24766.96 MB 2025-02-14 04:29:42,196 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 28898.35 MB 2025-02-14 04:29:42,196 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:29:42,196 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30909.92 MB 2025-02-14 04:29:42,196 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37515.95 MB 2025-02-14 04:29:42,196 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6606.03 MB 2025-02-14 04:29:42,196 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 34442.63 MB 2025-02-14 04:29:42,359 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:29:42,359 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:29:42,359 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:29:42,359 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,359 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 30431.89 MB 2025-02-14 04:29:42,359 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31198.89 MB 2025-02-14 04:29:42,359 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:29:42,359 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37515.95 MB 2025-02-14 04:29:42,359 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37931.19 MB 2025-02-14 04:29:42,359 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 415.24 MB 2025-02-14 04:29:42,359 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31906.68 MB 2025-02-14 04:29:42,378 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:29:42,378 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:29:42,378 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:29:42,378 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,378 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 31611.78 MB 2025-02-14 04:29:42,378 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31843.21 MB 2025-02-14 04:29:42,378 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 231.43 MB 2025-02-14 04:29:42,378 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37931.19 MB 2025-02-14 04:29:42,378 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37931.19 MB 2025-02-14 04:29:42,378 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:29:42,378 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32050.89 MB 2025-02-14 04:29:42,379 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:29:42,379 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:29:42,379 - resource_logging.py:150 - __exit__ - DEBUG - Time: 27.31 seconds 2025-02-14 04:29:42,379 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,379 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18637.30 MB 2025-02-14 04:29:42,379 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32044.28 MB 2025-02-14 04:29:42,379 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 13406.98 MB 2025-02-14 04:29:42,379 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 49396.32 MB 2025-02-14 04:29:42,379 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37931.19 MB 2025-02-14 04:29:42,379 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -11465.13 MB 2025-02-14 04:29:42,379 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32050.89 MB 2025-02-14 04:29:42,649 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:29:42,649 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:29:42,649 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:29:42,649 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,649 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 32044.28 MB 2025-02-14 04:29:42,649 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 23641.69 MB 2025-02-14 04:29:42,649 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -8402.59 MB 2025-02-14 04:29:42,649 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37931.19 MB 2025-02-14 04:29:42,649 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37931.19 MB 2025-02-14 04:29:42,649 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:29:42,649 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 34555.95 MB 2025-02-14 04:29:42,667 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8162, cut from 8164 2025-02-14 04:29:42,667 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:29:42,673 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:29:42,673 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:29:42,673 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:29:42,673 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:29:42,673 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 23641.69 MB 2025-02-14 04:29:42,673 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32080.38 MB 2025-02-14 04:29:42,673 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8438.69 MB 2025-02-14 04:29:42,673 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37931.19 MB 2025-02-14 04:29:42,673 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 42127.59 MB 2025-02-14 04:29:42,673 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4196.40 MB 2025-02-14 04:29:42,673 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32080.38 MB 2025-02-14 04:29:42,832 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7954] 2025-02-14 04:29:42,833 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:42,833 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:29:42,834 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:42,834 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:29:42,839 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:29:42,840 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:42,840 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:29:42,840 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:29:56,518 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:56,518 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:29:56,525 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:29:56,531 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:56,531 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1754, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:29:56,533 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:29:56,533 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1754, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:30:23,910 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:30:23,910 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:30:23,910 - resource_logging.py:150 - __exit__ - DEBUG - Time: 27.37 seconds 2025-02-14 04:30:23,910 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:23,910 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25190.86 MB 2025-02-14 04:30:23,910 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31398.43 MB 2025-02-14 04:30:23,910 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 6207.57 MB 2025-02-14 04:30:23,910 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 50516.20 MB 2025-02-14 04:30:23,910 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 39892.03 MB 2025-02-14 04:30:23,910 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -10624.17 MB 2025-02-14 04:30:23,910 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 40325.35 MB 2025-02-14 04:30:24,015 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:30:24,015 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:30:24,015 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.10 seconds 2025-02-14 04:30:24,015 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:24,015 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 31398.43 MB 2025-02-14 04:30:24,015 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24896.35 MB 2025-02-14 04:30:24,015 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -6502.08 MB 2025-02-14 04:30:24,015 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 39892.03 MB 2025-02-14 04:30:24,015 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 59068.38 MB 2025-02-14 04:30:24,015 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 19176.36 MB 2025-02-14 04:30:24,015 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 49844.37 MB 2025-02-14 04:30:25,941 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:30:25,941 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:30:25,941 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.92 seconds 2025-02-14 04:30:25,941 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:25,941 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24896.35 MB 2025-02-14 04:30:25,942 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25427.19 MB 2025-02-14 04:30:25,942 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:30:25,942 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 59068.38 MB 2025-02-14 04:30:25,942 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 30905.73 MB 2025-02-14 04:30:25,942 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -28162.65 MB 2025-02-14 04:30:25,942 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29406.52 MB 2025-02-14 04:30:25,957 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:30:25,957 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:30:25,957 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:30:25,957 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:25,957 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25427.19 MB 2025-02-14 04:30:25,957 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27316.72 MB 2025-02-14 04:30:25,957 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:30:25,957 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30905.73 MB 2025-02-14 04:30:25,957 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 31849.45 MB 2025-02-14 04:30:25,958 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 943.72 MB 2025-02-14 04:30:25,958 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28734.15 MB 2025-02-14 04:30:26,162 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:30:26,162 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:30:26,162 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.20 seconds 2025-02-14 04:30:26,162 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,162 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27316.72 MB 2025-02-14 04:30:26,162 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29558.58 MB 2025-02-14 04:30:26,162 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:30:26,162 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31849.45 MB 2025-02-14 04:30:26,162 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37983.62 MB 2025-02-14 04:30:26,162 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:30:26,162 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35102.86 MB 2025-02-14 04:30:26,163 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:30:26,163 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:30:26,163 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:30:26,163 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,163 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25427.19 MB 2025-02-14 04:30:26,163 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29558.58 MB 2025-02-14 04:30:26,163 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:30:26,163 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30905.73 MB 2025-02-14 04:30:26,163 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37983.62 MB 2025-02-14 04:30:26,163 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 7077.89 MB 2025-02-14 04:30:26,163 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35102.86 MB 2025-02-14 04:30:26,322 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:30:26,322 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:30:26,322 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.15 seconds 2025-02-14 04:30:26,322 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,322 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 31092.12 MB 2025-02-14 04:30:26,322 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31859.12 MB 2025-02-14 04:30:26,322 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:30:26,322 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37983.62 MB 2025-02-14 04:30:26,323 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38400.95 MB 2025-02-14 04:30:26,323 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 417.33 MB 2025-02-14 04:30:26,323 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32566.91 MB 2025-02-14 04:30:26,341 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:30:26,341 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:30:26,341 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:30:26,341 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,341 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 32272.01 MB 2025-02-14 04:30:26,341 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32500.57 MB 2025-02-14 04:30:26,341 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.56 MB 2025-02-14 04:30:26,341 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38400.95 MB 2025-02-14 04:30:26,341 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38400.95 MB 2025-02-14 04:30:26,341 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:30:26,341 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32740.12 MB 2025-02-14 04:30:26,342 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:30:26,342 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:30:26,342 - resource_logging.py:150 - __exit__ - DEBUG - Time: 29.81 seconds 2025-02-14 04:30:26,342 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,342 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19079.78 MB 2025-02-14 04:30:26,342 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32700.88 MB 2025-02-14 04:30:26,342 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 13621.10 MB 2025-02-14 04:30:26,342 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 50516.20 MB 2025-02-14 04:30:26,342 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38400.95 MB 2025-02-14 04:30:26,342 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -12115.25 MB 2025-02-14 04:30:26,342 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32740.12 MB 2025-02-14 04:30:26,610 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:30:26,611 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:30:26,611 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:30:26,611 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,611 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 32700.88 MB 2025-02-14 04:30:26,611 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24072.36 MB 2025-02-14 04:30:26,611 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -8628.52 MB 2025-02-14 04:30:26,611 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38400.95 MB 2025-02-14 04:30:26,611 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38400.95 MB 2025-02-14 04:30:26,611 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:30:26,611 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35203.03 MB 2025-02-14 04:30:26,629 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8131, cut from 8133 2025-02-14 04:30:26,629 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:30:26,635 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:30:26,635 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:30:26,635 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:30:26,635 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:26,635 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24072.36 MB 2025-02-14 04:30:26,635 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32480.10 MB 2025-02-14 04:30:26,635 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8407.74 MB 2025-02-14 04:30:26,635 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38400.95 MB 2025-02-14 04:30:26,635 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 46760.20 MB 2025-02-14 04:30:26,635 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 8359.25 MB 2025-02-14 04:30:26,635 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32480.10 MB 2025-02-14 04:30:26,793 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7923] 2025-02-14 04:30:26,794 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:26,794 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:30:26,795 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:26,795 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:30:26,800 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:30:26,801 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:26,802 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:30:26,802 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:30:38,743 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:38,743 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:30:38,749 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:30:38,753 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:38,753 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 321, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:30:38,754 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:38,754 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 321, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:30:43,773 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:30:43,773 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:30:43,773 - resource_logging.py:150 - __exit__ - DEBUG - Time: 5.01 seconds 2025-02-14 04:30:43,773 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:43,773 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15205.49 MB 2025-02-14 04:30:43,773 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16341.49 MB 2025-02-14 04:30:43,773 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1136.00 MB 2025-02-14 04:30:43,773 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 55119.45 MB 2025-02-14 04:30:43,773 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 22045.26 MB 2025-02-14 04:30:43,773 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -33074.18 MB 2025-02-14 04:30:43,773 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25356.33 MB 2025-02-14 04:30:43,806 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:30:43,806 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:30:43,806 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds 2025-02-14 04:30:43,806 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:43,806 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16341.49 MB 2025-02-14 04:30:43,806 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16660.05 MB 2025-02-14 04:30:43,806 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 318.56 MB 2025-02-14 04:30:43,806 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22045.26 MB 2025-02-14 04:30:43,806 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23064.48 MB 2025-02-14 04:30:43,807 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1019.22 MB 2025-02-14 04:30:43,807 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20390.29 MB 2025-02-14 04:30:45,187 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:30:45,187 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:30:45,187 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.38 seconds 2025-02-14 04:30:45,187 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,188 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16660.05 MB 2025-02-14 04:30:45,188 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17042.26 MB 2025-02-14 04:30:45,188 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 382.21 MB 2025-02-14 04:30:45,188 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23064.48 MB 2025-02-14 04:30:45,188 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23064.48 MB 2025-02-14 04:30:45,188 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:30:45,188 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21000.36 MB 2025-02-14 04:30:45,198 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:30:45,198 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:30:45,198 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:30:45,198 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,198 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17042.26 MB 2025-02-14 04:30:45,198 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18403.31 MB 2025-02-14 04:30:45,198 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1361.05 MB 2025-02-14 04:30:45,198 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23064.48 MB 2025-02-14 04:30:45,198 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23064.48 MB 2025-02-14 04:30:45,198 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:30:45,198 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19423.86 MB 2025-02-14 04:30:45,347 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:30:45,347 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:30:45,347 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.15 seconds 2025-02-14 04:30:45,347 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,347 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18403.31 MB 2025-02-14 04:30:45,347 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20017.46 MB 2025-02-14 04:30:45,347 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1614.15 MB 2025-02-14 04:30:45,347 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23064.48 MB 2025-02-14 04:30:45,347 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26122.13 MB 2025-02-14 04:30:45,347 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3057.65 MB 2025-02-14 04:30:45,347 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24009.33 MB 2025-02-14 04:30:45,348 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:30:45,348 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:30:45,348 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:30:45,348 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,348 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17042.26 MB 2025-02-14 04:30:45,348 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20017.46 MB 2025-02-14 04:30:45,348 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2975.21 MB 2025-02-14 04:30:45,348 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23064.48 MB 2025-02-14 04:30:45,348 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26122.13 MB 2025-02-14 04:30:45,348 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3057.65 MB 2025-02-14 04:30:45,348 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24009.33 MB 2025-02-14 04:30:45,464 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:30:45,464 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:30:45,464 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.11 seconds 2025-02-14 04:30:45,464 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,464 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21121.61 MB 2025-02-14 04:30:45,464 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21673.85 MB 2025-02-14 04:30:45,464 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 552.24 MB 2025-02-14 04:30:45,464 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26122.13 MB 2025-02-14 04:30:45,464 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26422.02 MB 2025-02-14 04:30:45,464 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 299.89 MB 2025-02-14 04:30:45,464 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22183.46 MB 2025-02-14 04:30:45,478 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:30:45,478 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:30:45,478 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:30:45,478 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,478 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21971.14 MB 2025-02-14 04:30:45,478 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22199.94 MB 2025-02-14 04:30:45,478 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.80 MB 2025-02-14 04:30:45,478 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26422.02 MB 2025-02-14 04:30:45,478 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26422.02 MB 2025-02-14 04:30:45,478 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:30:45,478 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22285.05 MB 2025-02-14 04:30:45,479 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:30:45,479 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:30:45,479 - resource_logging.py:150 - __exit__ - DEBUG - Time: 6.72 seconds 2025-02-14 04:30:45,479 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,479 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14087.10 MB 2025-02-14 04:30:45,479 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22400.67 MB 2025-02-14 04:30:45,479 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8313.57 MB 2025-02-14 04:30:45,479 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 55119.45 MB 2025-02-14 04:30:45,480 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26422.02 MB 2025-02-14 04:30:45,480 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -28697.43 MB 2025-02-14 04:30:45,480 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22400.67 MB 2025-02-14 04:30:45,749 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:30:45,749 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:30:45,749 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:30:45,749 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,749 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22400.67 MB 2025-02-14 04:30:45,749 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25409.54 MB 2025-02-14 04:30:45,749 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3008.87 MB 2025-02-14 04:30:45,749 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26422.02 MB 2025-02-14 04:30:45,749 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26824.67 MB 2025-02-14 04:30:45,749 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 402.65 MB 2025-02-14 04:30:45,749 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25710.88 MB 2025-02-14 04:30:45,767 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8148, cut from 8150 2025-02-14 04:30:45,767 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 1,'] 2025-02-14 04:30:45,773 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:30:45,773 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:30:45,773 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:30:45,773 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:45,773 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18557.59 MB 2025-02-14 04:30:45,773 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26982.54 MB 2025-02-14 04:30:45,773 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8424.95 MB 2025-02-14 04:30:45,773 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26824.67 MB 2025-02-14 04:30:45,773 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37295.75 MB 2025-02-14 04:30:45,774 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10471.08 MB 2025-02-14 04:30:45,774 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26982.54 MB 2025-02-14 04:30:45,928 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7940] 2025-02-14 04:30:45,930 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:45,930 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:30:45,931 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:45,931 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:30:45,935 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:30:45,936 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:45,936 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:30:45,937 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 1,'] 2025-02-14 04:30:55,797 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:55,798 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:30:55,805 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:30:55,812 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:55,812 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 191, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:30:55,814 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:30:55,814 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 191, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:30:58,869 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:30:58,869 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:30:58,869 - resource_logging.py:150 - __exit__ - DEBUG - Time: 3.05 seconds 2025-02-14 04:30:58,869 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:58,869 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14299.63 MB 2025-02-14 04:30:58,869 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14975.56 MB 2025-02-14 04:30:58,869 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 675.94 MB 2025-02-14 04:30:58,869 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 45671.78 MB 2025-02-14 04:30:58,869 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17926.46 MB 2025-02-14 04:30:58,869 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -27745.32 MB 2025-02-14 04:30:58,869 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23853.72 MB 2025-02-14 04:30:58,888 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:30:58,889 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:30:58,889 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:30:58,889 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:58,889 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14975.56 MB 2025-02-14 04:30:58,889 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15303.05 MB 2025-02-14 04:30:58,889 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 327.49 MB 2025-02-14 04:30:58,889 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17926.46 MB 2025-02-14 04:30:58,889 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18603.84 MB 2025-02-14 04:30:58,889 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 677.38 MB 2025-02-14 04:30:58,889 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17711.51 MB 2025-02-14 04:30:59,891 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:30:59,891 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:30:59,891 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.00 seconds 2025-02-14 04:30:59,891 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:59,891 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15303.05 MB 2025-02-14 04:30:59,891 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15556.53 MB 2025-02-14 04:30:59,891 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 253.48 MB 2025-02-14 04:30:59,891 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18603.84 MB 2025-02-14 04:30:59,891 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18329.11 MB 2025-02-14 04:30:59,891 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -274.73 MB 2025-02-14 04:30:59,891 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19495.69 MB 2025-02-14 04:30:59,903 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:30:59,903 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:30:59,904 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:30:59,904 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:30:59,904 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15556.47 MB 2025-02-14 04:30:59,904 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16458.76 MB 2025-02-14 04:30:59,904 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 902.30 MB 2025-02-14 04:30:59,904 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18329.11 MB 2025-02-14 04:30:59,904 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18780.00 MB 2025-02-14 04:30:59,904 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 450.89 MB 2025-02-14 04:30:59,904 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17135.59 MB 2025-02-14 04:31:00,040 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:31:00,040 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:31:00,040 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.13 seconds 2025-02-14 04:31:00,040 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,040 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16458.76 MB 2025-02-14 04:31:00,040 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17529.54 MB 2025-02-14 04:31:00,040 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1070.78 MB 2025-02-14 04:31:00,040 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18780.00 MB 2025-02-14 04:31:00,040 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21711.81 MB 2025-02-14 04:31:00,040 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2931.82 MB 2025-02-14 04:31:00,040 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20177.95 MB 2025-02-14 04:31:00,042 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:31:00,042 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:31:00,042 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.15 seconds 2025-02-14 04:31:00,042 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,042 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15556.47 MB 2025-02-14 04:31:00,042 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17529.54 MB 2025-02-14 04:31:00,042 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1973.08 MB 2025-02-14 04:31:00,042 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18329.11 MB 2025-02-14 04:31:00,042 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21711.81 MB 2025-02-14 04:31:00,042 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3382.71 MB 2025-02-14 04:31:00,042 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20177.95 MB 2025-02-14 04:31:00,172 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:31:00,172 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:31:00,172 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.12 seconds 2025-02-14 04:31:00,172 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,172 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18261.81 MB 2025-02-14 04:31:00,172 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18628.05 MB 2025-02-14 04:31:00,172 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 366.24 MB 2025-02-14 04:31:00,172 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21711.81 MB 2025-02-14 04:31:00,172 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21908.95 MB 2025-02-14 04:31:00,172 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 197.13 MB 2025-02-14 04:31:00,172 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18972.06 MB 2025-02-14 04:31:00,189 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:31:00,189 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:31:00,189 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:31:00,189 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,189 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18825.21 MB 2025-02-14 04:31:00,189 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19050.29 MB 2025-02-14 04:31:00,189 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 225.07 MB 2025-02-14 04:31:00,189 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21908.95 MB 2025-02-14 04:31:00,189 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21908.95 MB 2025-02-14 04:31:00,189 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:31:00,189 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19086.72 MB 2025-02-14 04:31:00,191 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:31:00,191 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:31:00,191 - resource_logging.py:150 - __exit__ - DEBUG - Time: 4.37 seconds 2025-02-14 04:31:00,191 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,191 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13634.17 MB 2025-02-14 04:31:00,191 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19251.29 MB 2025-02-14 04:31:00,191 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 5617.12 MB 2025-02-14 04:31:00,191 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 45671.78 MB 2025-02-14 04:31:00,191 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21908.95 MB 2025-02-14 04:31:00,191 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -23762.83 MB 2025-02-14 04:31:00,192 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19251.29 MB 2025-02-14 04:31:00,483 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:31:00,483 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:31:00,483 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.29 seconds 2025-02-14 04:31:00,483 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,483 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19251.29 MB 2025-02-14 04:31:00,483 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17650.90 MB 2025-02-14 04:31:00,483 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -1600.39 MB 2025-02-14 04:31:00,483 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21908.95 MB 2025-02-14 04:31:00,483 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21908.95 MB 2025-02-14 04:31:00,483 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:31:00,483 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19251.30 MB 2025-02-14 04:31:00,503 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8159, cut from 8161 2025-02-14 04:31:00,503 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:31:00,510 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:31:00,511 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:31:00,511 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:31:00,511 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:31:00,511 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17650.90 MB 2025-02-14 04:31:00,511 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26086.49 MB 2025-02-14 04:31:00,511 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8435.59 MB 2025-02-14 04:31:00,511 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21908.95 MB 2025-02-14 04:31:00,511 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 32394.71 MB 2025-02-14 04:31:00,511 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10485.76 MB 2025-02-14 04:31:00,511 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26086.49 MB 2025-02-14 04:31:00,762 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7951] 2025-02-14 04:31:00,764 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:31:00,764 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:31:00,766 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:31:00,766 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:31:00,773 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:31:00,775 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:31:00,776 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:31:00,776 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:32:06,187 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:06,187 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:32:06,192 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:32:06,196 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:06,196 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 170, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:32:06,197 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:06,197 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 170, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:32:08,806 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:32:08,806 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:32:08,806 - resource_logging.py:150 - __exit__ - DEBUG - Time: 2.60 seconds 2025-02-14 04:32:08,806 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:08,806 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14153.77 MB 2025-02-14 04:32:08,806 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14755.39 MB 2025-02-14 04:32:08,806 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 601.62 MB 2025-02-14 04:32:08,806 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 40783.31 MB 2025-02-14 04:32:08,806 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18100.52 MB 2025-02-14 04:32:08,806 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -22682.80 MB 2025-02-14 04:32:08,806 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23625.95 MB 2025-02-14 04:32:08,819 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:32:08,819 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:32:08,819 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:32:08,819 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:08,819 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14755.39 MB 2025-02-14 04:32:08,819 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14984.33 MB 2025-02-14 04:32:08,819 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.93 MB 2025-02-14 04:32:08,819 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18100.52 MB 2025-02-14 04:32:08,819 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18670.94 MB 2025-02-14 04:32:08,819 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 570.43 MB 2025-02-14 04:32:08,819 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17052.92 MB 2025-02-14 04:32:09,629 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:32:09,629 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:32:09,629 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.81 seconds 2025-02-14 04:32:09,629 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,629 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14984.33 MB 2025-02-14 04:32:09,629 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15197.99 MB 2025-02-14 04:32:09,629 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 213.66 MB 2025-02-14 04:32:09,629 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18670.94 MB 2025-02-14 04:32:09,629 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18331.21 MB 2025-02-14 04:32:09,629 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -339.74 MB 2025-02-14 04:32:09,629 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19155.80 MB 2025-02-14 04:32:09,637 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:32:09,637 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:32:09,637 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-14 04:32:09,637 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,637 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15197.92 MB 2025-02-14 04:32:09,637 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15958.28 MB 2025-02-14 04:32:09,637 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 760.35 MB 2025-02-14 04:32:09,637 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18331.21 MB 2025-02-14 04:32:09,637 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18331.21 MB 2025-02-14 04:32:09,637 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:32:09,637 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16528.80 MB 2025-02-14 04:32:09,723 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:32:09,723 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:32:09,723 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds 2025-02-14 04:32:09,723 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,723 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15958.28 MB 2025-02-14 04:32:09,723 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16860.66 MB 2025-02-14 04:32:09,723 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 902.39 MB 2025-02-14 04:32:09,723 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18331.21 MB 2025-02-14 04:32:09,723 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20239.61 MB 2025-02-14 04:32:09,723 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1908.41 MB 2025-02-14 04:32:09,723 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19092.20 MB 2025-02-14 04:32:09,723 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:32:09,723 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:32:09,723 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.09 seconds 2025-02-14 04:32:09,723 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,724 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15197.92 MB 2025-02-14 04:32:09,724 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16860.66 MB 2025-02-14 04:32:09,724 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1662.74 MB 2025-02-14 04:32:09,724 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18331.21 MB 2025-02-14 04:32:09,724 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20239.61 MB 2025-02-14 04:32:09,724 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1908.41 MB 2025-02-14 04:32:09,724 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19092.20 MB 2025-02-14 04:32:09,790 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:32:09,790 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:32:09,790 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds 2025-02-14 04:32:09,790 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,790 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17477.91 MB 2025-02-14 04:32:09,790 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17786.63 MB 2025-02-14 04:32:09,790 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 308.72 MB 2025-02-14 04:32:09,790 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20239.61 MB 2025-02-14 04:32:09,790 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20405.29 MB 2025-02-14 04:32:09,790 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 165.68 MB 2025-02-14 04:32:09,790 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18078.93 MB 2025-02-14 04:32:09,799 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:32:09,799 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:32:09,799 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:32:09,799 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,799 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17952.83 MB 2025-02-14 04:32:09,799 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18180.93 MB 2025-02-14 04:32:09,799 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.11 MB 2025-02-14 04:32:09,799 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20405.29 MB 2025-02-14 04:32:09,799 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20405.29 MB 2025-02-14 04:32:09,799 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:32:09,799 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18199.08 MB 2025-02-14 04:32:09,800 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:32:09,800 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:32:09,800 - resource_logging.py:150 - __exit__ - DEBUG - Time: 3.60 seconds 2025-02-14 04:32:09,800 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:09,800 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13561.48 MB 2025-02-14 04:32:09,800 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18381.88 MB 2025-02-14 04:32:09,800 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4820.40 MB 2025-02-14 04:32:09,800 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 40783.31 MB 2025-02-14 04:32:09,800 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20405.29 MB 2025-02-14 04:32:09,800 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -20378.03 MB 2025-02-14 04:32:09,800 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18381.88 MB 2025-02-14 04:32:10,067 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:32:10,067 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:32:10,067 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:32:10,067 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:10,067 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18381.88 MB 2025-02-14 04:32:10,067 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17436.30 MB 2025-02-14 04:32:10,067 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -945.59 MB 2025-02-14 04:32:10,067 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20405.29 MB 2025-02-14 04:32:10,067 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20405.29 MB 2025-02-14 04:32:10,067 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:32:10,067 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19185.13 MB 2025-02-14 04:32:10,094 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8157, cut from 8159 2025-02-14 04:32:10,095 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:32:10,105 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:32:10,105 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:32:10,105 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.04 seconds 2025-02-14 04:32:10,105 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:32:10,105 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17436.30 MB 2025-02-14 04:32:10,105 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25870.92 MB 2025-02-14 04:32:10,105 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8434.62 MB 2025-02-14 04:32:10,105 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20405.29 MB 2025-02-14 04:32:10,105 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 30886.85 MB 2025-02-14 04:32:10,105 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10481.57 MB 2025-02-14 04:32:10,105 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25870.92 MB 2025-02-14 04:32:10,263 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7949] 2025-02-14 04:32:10,264 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:10,264 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:32:10,265 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:10,265 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:32:10,270 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:32:10,271 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:10,271 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:32:10,271 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:32:53,348 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:53,349 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:32:53,354 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:32:53,358 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:53,358 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1377, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:32:53,359 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:32:53,359 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1377, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:33:14,677 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:33:14,677 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:33:14,677 - resource_logging.py:150 - __exit__ - DEBUG - Time: 21.31 seconds 2025-02-14 04:33:14,677 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:14,677 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22563.86 MB 2025-02-14 04:33:14,677 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27437.64 MB 2025-02-14 04:33:14,677 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4873.78 MB 2025-02-14 04:33:14,677 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 39271.27 MB 2025-02-14 04:33:14,677 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38551.95 MB 2025-02-14 04:33:14,677 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -719.32 MB 2025-02-14 04:33:14,677 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 36338.59 MB 2025-02-14 04:33:14,815 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:33:14,815 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:33:14,815 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.14 seconds 2025-02-14 04:33:14,815 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:14,815 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27437.64 MB 2025-02-14 04:33:14,815 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22936.44 MB 2025-02-14 04:33:14,815 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -4501.20 MB 2025-02-14 04:33:14,815 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38551.95 MB 2025-02-14 04:33:14,815 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 48175.78 MB 2025-02-14 04:33:14,815 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 9623.83 MB 2025-02-14 04:33:14,815 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 41911.52 MB 2025-02-14 04:33:16,720 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:33:16,720 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:33:16,720 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.90 seconds 2025-02-14 04:33:16,720 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:16,720 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22936.44 MB 2025-02-14 04:33:16,720 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 23467.29 MB 2025-02-14 04:33:16,720 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:33:16,720 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 48175.78 MB 2025-02-14 04:33:16,720 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 33678.16 MB 2025-02-14 04:33:16,720 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -14497.61 MB 2025-02-14 04:33:16,720 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27446.95 MB 2025-02-14 04:33:16,733 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:33:16,733 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:33:16,733 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:33:16,733 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:16,733 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 23467.29 MB 2025-02-14 04:33:16,733 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25356.82 MB 2025-02-14 04:33:16,733 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:33:16,733 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 33678.16 MB 2025-02-14 04:33:16,733 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 33678.16 MB 2025-02-14 04:33:16,733 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:33:16,733 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26774.25 MB 2025-02-14 04:33:16,943 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:33:16,943 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:33:16,943 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.21 seconds 2025-02-14 04:33:16,943 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:16,943 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25356.82 MB 2025-02-14 04:33:16,943 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27598.68 MB 2025-02-14 04:33:16,943 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:33:16,943 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 33678.16 MB 2025-02-14 04:33:16,943 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37453.04 MB 2025-02-14 04:33:16,943 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3774.87 MB 2025-02-14 04:33:16,943 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 33142.96 MB 2025-02-14 04:33:16,944 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:33:16,944 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:33:16,944 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:33:16,944 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:16,944 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 23467.29 MB 2025-02-14 04:33:16,944 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27598.68 MB 2025-02-14 04:33:16,944 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:33:16,944 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 33678.16 MB 2025-02-14 04:33:16,944 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37453.04 MB 2025-02-14 04:33:16,944 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3774.87 MB 2025-02-14 04:33:16,944 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 33142.96 MB 2025-02-14 04:33:17,217 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:33:17,217 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:33:17,217 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:33:17,217 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:17,217 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29132.22 MB 2025-02-14 04:33:17,217 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29899.22 MB 2025-02-14 04:33:17,217 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:33:17,217 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37453.04 MB 2025-02-14 04:33:17,217 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37868.27 MB 2025-02-14 04:33:17,217 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 415.24 MB 2025-02-14 04:33:17,217 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30607.01 MB 2025-02-14 04:33:17,235 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:33:17,235 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:33:17,235 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:33:17,235 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:17,235 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 30312.11 MB 2025-02-14 04:33:17,235 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30540.70 MB 2025-02-14 04:33:17,235 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.59 MB 2025-02-14 04:33:17,235 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37868.27 MB 2025-02-14 04:33:17,235 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37868.27 MB 2025-02-14 04:33:17,235 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:33:17,235 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30773.97 MB 2025-02-14 04:33:17,236 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:33:17,236 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:33:17,236 - resource_logging.py:150 - __exit__ - DEBUG - Time: 23.87 seconds 2025-02-14 04:33:17,236 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:17,236 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17766.28 MB 2025-02-14 04:33:17,236 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30740.59 MB 2025-02-14 04:33:17,236 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 12974.31 MB 2025-02-14 04:33:17,236 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 39271.27 MB 2025-02-14 04:33:17,236 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37868.27 MB 2025-02-14 04:33:17,236 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -1402.99 MB 2025-02-14 04:33:17,236 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30773.97 MB 2025-02-14 04:33:17,504 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:33:17,504 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:33:17,504 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:33:17,504 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:17,504 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 30740.59 MB 2025-02-14 04:33:17,504 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22752.39 MB 2025-02-14 04:33:17,504 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7988.20 MB 2025-02-14 04:33:17,504 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37868.27 MB 2025-02-14 04:33:17,504 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37868.27 MB 2025-02-14 04:33:17,504 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:33:17,505 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 33237.51 MB 2025-02-14 04:33:17,522 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8114, cut from 8116 2025-02-14 04:33:17,522 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:33:17,528 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:33:17,528 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:33:17,528 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:33:17,528 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:33:17,528 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22752.39 MB 2025-02-14 04:33:17,529 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31141.53 MB 2025-02-14 04:33:17,529 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8389.15 MB 2025-02-14 04:33:17,529 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37868.27 MB 2025-02-14 04:33:17,529 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 42039.51 MB 2025-02-14 04:33:17,529 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4171.24 MB 2025-02-14 04:33:17,529 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31141.53 MB 2025-02-14 04:33:17,684 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7906] 2025-02-14 04:33:17,686 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:33:17,686 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:33:17,687 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:33:17,687 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:33:17,691 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:33:17,692 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:33:17,692 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:33:17,692 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:34:30,108 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:30,108 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:34:30,113 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:34:30,117 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:30,118 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 806, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:34:30,119 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:30,119 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 806, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:34:42,586 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:34:42,586 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:34:42,586 - resource_logging.py:150 - __exit__ - DEBUG - Time: 12.46 seconds 2025-02-14 04:34:42,586 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:42,586 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18585.04 MB 2025-02-14 04:34:42,586 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21437.43 MB 2025-02-14 04:34:42,586 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2852.39 MB 2025-02-14 04:34:42,586 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 50381.98 MB 2025-02-14 04:34:42,586 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 28147.97 MB 2025-02-14 04:34:42,586 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -22234.01 MB 2025-02-14 04:34:42,586 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30322.15 MB 2025-02-14 04:34:42,641 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:34:42,642 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:34:42,642 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds 2025-02-14 04:34:42,642 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:42,642 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21437.43 MB 2025-02-14 04:34:42,642 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19967.99 MB 2025-02-14 04:34:42,642 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -1469.44 MB 2025-02-14 04:34:42,642 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 28147.97 MB 2025-02-14 04:34:42,642 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35129.39 MB 2025-02-14 04:34:42,642 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6981.42 MB 2025-02-14 04:34:42,642 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31247.72 MB 2025-02-14 04:34:44,576 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:34:44,576 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:34:44,576 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.93 seconds 2025-02-14 04:34:44,576 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:44,576 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19967.99 MB 2025-02-14 04:34:44,576 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20498.84 MB 2025-02-14 04:34:44,576 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:34:44,576 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 35129.39 MB 2025-02-14 04:34:44,576 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26709.33 MB 2025-02-14 04:34:44,576 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -8420.07 MB 2025-02-14 04:34:44,576 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24478.17 MB 2025-02-14 04:34:44,589 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:34:44,589 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:34:44,589 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:34:44,589 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:44,589 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20498.84 MB 2025-02-14 04:34:44,589 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22388.37 MB 2025-02-14 04:34:44,589 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:34:44,589 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26709.33 MB 2025-02-14 04:34:44,589 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26709.33 MB 2025-02-14 04:34:44,589 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:34:44,589 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23805.80 MB 2025-02-14 04:34:44,932 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:34:44,932 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:34:44,932 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.34 seconds 2025-02-14 04:34:44,932 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:44,932 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22388.37 MB 2025-02-14 04:34:44,932 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24630.23 MB 2025-02-14 04:34:44,932 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:34:44,932 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26709.33 MB 2025-02-14 04:34:44,932 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 32843.50 MB 2025-02-14 04:34:44,933 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:34:44,933 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30174.51 MB 2025-02-14 04:34:44,934 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:34:44,934 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:34:44,934 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.36 seconds 2025-02-14 04:34:44,934 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:44,934 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20498.84 MB 2025-02-14 04:34:44,934 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24630.23 MB 2025-02-14 04:34:44,934 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:34:44,934 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26709.33 MB 2025-02-14 04:34:44,934 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 32843.50 MB 2025-02-14 04:34:44,934 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:34:44,934 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30174.51 MB 2025-02-14 04:34:45,196 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:34:45,197 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:34:45,197 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.26 seconds 2025-02-14 04:34:45,197 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:45,197 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 26163.77 MB 2025-02-14 04:34:45,197 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26930.77 MB 2025-02-14 04:34:45,197 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:34:45,197 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 32843.50 MB 2025-02-14 04:34:45,197 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 33258.73 MB 2025-02-14 04:34:45,197 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 415.24 MB 2025-02-14 04:34:45,197 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27638.56 MB 2025-02-14 04:34:45,225 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:34:45,225 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:34:45,225 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:34:45,225 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:45,225 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27343.66 MB 2025-02-14 04:34:45,225 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27571.53 MB 2025-02-14 04:34:45,225 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 227.87 MB 2025-02-14 04:34:45,225 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 33258.73 MB 2025-02-14 04:34:45,226 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 33258.73 MB 2025-02-14 04:34:45,226 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:34:45,226 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27783.78 MB 2025-02-14 04:34:45,228 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:34:45,228 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:34:45,228 - resource_logging.py:150 - __exit__ - DEBUG - Time: 15.11 seconds 2025-02-14 04:34:45,228 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:45,228 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15776.88 MB 2025-02-14 04:34:45,228 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27771.71 MB 2025-02-14 04:34:45,228 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 11994.84 MB 2025-02-14 04:34:45,228 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 50381.98 MB 2025-02-14 04:34:45,228 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 33258.73 MB 2025-02-14 04:34:45,228 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -17123.25 MB 2025-02-14 04:34:45,228 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27783.78 MB 2025-02-14 04:34:45,522 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:34:45,522 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:34:45,522 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.29 seconds 2025-02-14 04:34:45,522 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:45,522 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27771.71 MB 2025-02-14 04:34:45,522 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20767.55 MB 2025-02-14 04:34:45,522 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7004.16 MB 2025-02-14 04:34:45,522 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 33258.73 MB 2025-02-14 04:34:45,522 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 33258.73 MB 2025-02-14 04:34:45,522 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:34:45,522 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30272.32 MB 2025-02-14 04:34:45,541 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8126, cut from 8128 2025-02-14 04:34:45,542 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 1 ('] 2025-02-14 04:34:45,549 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:34:45,549 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:34:45,549 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:34:45,549 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:34:45,549 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20767.55 MB 2025-02-14 04:34:45,549 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29169.08 MB 2025-02-14 04:34:45,549 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8401.53 MB 2025-02-14 04:34:45,549 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 33258.73 MB 2025-02-14 04:34:45,549 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 41613.79 MB 2025-02-14 04:34:45,549 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 8355.05 MB 2025-02-14 04:34:45,549 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29169.08 MB 2025-02-14 04:34:45,795 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7918] 2025-02-14 04:34:45,798 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:45,798 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:34:45,800 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:45,800 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:34:45,807 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:34:45,809 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:45,809 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:34:45,809 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 1 ('] 2025-02-14 04:34:55,755 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:55,756 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:34:55,763 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:34:55,770 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:55,770 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1742, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:34:55,772 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:34:55,772 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1742, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:35:23,236 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:35:23,236 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:35:23,237 - resource_logging.py:150 - __exit__ - DEBUG - Time: 27.45 seconds 2025-02-14 04:35:23,237 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:23,237 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25107.24 MB 2025-02-14 04:35:23,237 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31272.87 MB 2025-02-14 04:35:23,237 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 6165.63 MB 2025-02-14 04:35:23,237 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 49968.84 MB 2025-02-14 04:35:23,237 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 39785.07 MB 2025-02-14 04:35:23,237 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -10183.77 MB 2025-02-14 04:35:23,237 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 40241.73 MB 2025-02-14 04:35:23,330 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:35:23,331 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:35:23,331 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.09 seconds 2025-02-14 04:35:23,331 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:23,331 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 31272.87 MB 2025-02-14 04:35:23,331 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24833.97 MB 2025-02-14 04:35:23,331 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -6438.90 MB 2025-02-14 04:35:23,331 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 39785.07 MB 2025-02-14 04:35:23,331 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 55253.66 MB 2025-02-14 04:35:23,331 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 15468.59 MB 2025-02-14 04:35:23,331 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 47020.95 MB 2025-02-14 04:35:25,257 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:35:25,257 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:35:25,257 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.92 seconds 2025-02-14 04:35:25,257 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,257 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24833.97 MB 2025-02-14 04:35:25,257 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25364.81 MB 2025-02-14 04:35:25,257 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:35:25,257 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 55253.66 MB 2025-02-14 04:35:25,257 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 30857.49 MB 2025-02-14 04:35:25,257 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -24396.17 MB 2025-02-14 04:35:25,257 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29344.14 MB 2025-02-14 04:35:25,273 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:35:25,273 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:35:25,273 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:35:25,273 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,273 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25364.81 MB 2025-02-14 04:35:25,273 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27254.34 MB 2025-02-14 04:35:25,273 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:35:25,273 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30857.49 MB 2025-02-14 04:35:25,273 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 31801.21 MB 2025-02-14 04:35:25,273 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 943.72 MB 2025-02-14 04:35:25,273 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28671.77 MB 2025-02-14 04:35:25,479 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:35:25,479 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:35:25,479 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.20 seconds 2025-02-14 04:35:25,479 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,479 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27254.34 MB 2025-02-14 04:35:25,479 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29496.20 MB 2025-02-14 04:35:25,479 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:35:25,479 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31801.21 MB 2025-02-14 04:35:25,479 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37935.38 MB 2025-02-14 04:35:25,479 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:35:25,479 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35040.48 MB 2025-02-14 04:35:25,480 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:35:25,480 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:35:25,480 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:35:25,480 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,480 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25364.81 MB 2025-02-14 04:35:25,480 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29496.20 MB 2025-02-14 04:35:25,480 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:35:25,480 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30857.49 MB 2025-02-14 04:35:25,480 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37935.38 MB 2025-02-14 04:35:25,480 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 7077.89 MB 2025-02-14 04:35:25,480 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35040.48 MB 2025-02-14 04:35:25,670 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:35:25,671 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:35:25,671 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.18 seconds 2025-02-14 04:35:25,671 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,671 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 31029.74 MB 2025-02-14 04:35:25,671 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 31796.74 MB 2025-02-14 04:35:25,671 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:35:25,671 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 37935.38 MB 2025-02-14 04:35:25,671 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38350.62 MB 2025-02-14 04:35:25,671 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 415.24 MB 2025-02-14 04:35:25,671 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32504.53 MB 2025-02-14 04:35:25,698 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:35:25,699 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:35:25,699 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:35:25,699 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,699 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 32209.63 MB 2025-02-14 04:35:25,699 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32436.99 MB 2025-02-14 04:35:25,699 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 227.36 MB 2025-02-14 04:35:25,699 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38350.62 MB 2025-02-14 04:35:25,699 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38350.62 MB 2025-02-14 04:35:25,699 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:35:25,699 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32677.07 MB 2025-02-14 04:35:25,701 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:35:25,701 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:35:25,701 - resource_logging.py:150 - __exit__ - DEBUG - Time: 29.93 seconds 2025-02-14 04:35:25,701 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,701 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19037.97 MB 2025-02-14 04:35:25,701 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32637.21 MB 2025-02-14 04:35:25,701 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 13599.23 MB 2025-02-14 04:35:25,701 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 49968.84 MB 2025-02-14 04:35:25,701 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38350.62 MB 2025-02-14 04:35:25,701 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -11618.22 MB 2025-02-14 04:35:25,701 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32677.07 MB 2025-02-14 04:35:25,972 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:35:25,972 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:35:25,972 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:35:25,972 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:25,972 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 32637.21 MB 2025-02-14 04:35:25,972 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24029.03 MB 2025-02-14 04:35:25,972 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -8608.18 MB 2025-02-14 04:35:25,972 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38350.62 MB 2025-02-14 04:35:25,972 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 38350.62 MB 2025-02-14 04:35:25,972 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:35:25,972 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35138.12 MB 2025-02-14 04:35:25,997 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8127, cut from 8129 2025-02-14 04:35:25,997 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 1 ('] 2025-02-14 04:35:26,004 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:35:26,004 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:35:26,004 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds 2025-02-14 04:35:26,004 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:35:26,004 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24029.03 MB 2025-02-14 04:35:26,004 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 32432.59 MB 2025-02-14 04:35:26,004 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8403.56 MB 2025-02-14 04:35:26,004 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 38350.62 MB 2025-02-14 04:35:26,004 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 42528.15 MB 2025-02-14 04:35:26,004 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4177.53 MB 2025-02-14 04:35:26,004 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32432.59 MB 2025-02-14 04:35:26,171 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7919] 2025-02-14 04:35:26,172 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:35:26,172 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:35:26,173 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:35:26,173 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:35:26,178 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:35:26,179 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:35:26,179 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:35:26,179 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 1 ('] 2025-02-14 04:36:15,876 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:36:15,877 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:36:15,882 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:36:15,886 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:36:15,886 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 184, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:36:15,887 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:36:15,887 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 184, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:36:18,832 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:36:18,832 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:36:18,832 - resource_logging.py:150 - __exit__ - DEBUG - Time: 2.94 seconds 2025-02-14 04:36:18,832 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,832 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14250.85 MB 2025-02-14 04:36:18,832 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14902.01 MB 2025-02-14 04:36:18,832 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 651.17 MB 2025-02-14 04:36:18,832 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 50883.20 MB 2025-02-14 04:36:18,832 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17855.15 MB 2025-02-14 04:36:18,832 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -33028.05 MB 2025-02-14 04:36:18,832 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23723.03 MB 2025-02-14 04:36:18,843 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:36:18,843 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:36:18,843 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:36:18,843 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,843 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14902.01 MB 2025-02-14 04:36:18,843 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14009.54 MB 2025-02-14 04:36:18,843 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -892.47 MB 2025-02-14 04:36:18,843 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17855.15 MB 2025-02-14 04:36:18,843 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17855.15 MB 2025-02-14 04:36:18,843 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:18,844 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15098.94 MB 2025-02-14 04:36:18,923 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:36:18,923 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:36:18,923 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds 2025-02-14 04:36:18,923 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,923 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14009.54 MB 2025-02-14 04:36:18,923 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14025.47 MB 2025-02-14 04:36:18,923 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 15.93 MB 2025-02-14 04:36:18,923 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17855.15 MB 2025-02-14 04:36:18,923 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17855.15 MB 2025-02-14 04:36:18,923 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:18,923 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14775.43 MB 2025-02-14 04:36:18,929 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:36:18,929 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:36:18,929 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-14 04:36:18,930 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,930 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14025.40 MB 2025-02-14 04:36:18,930 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14082.86 MB 2025-02-14 04:36:18,930 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 57.46 MB 2025-02-14 04:36:18,930 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17855.15 MB 2025-02-14 04:36:18,930 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17855.15 MB 2025-02-14 04:36:18,930 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:18,930 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14126.05 MB 2025-02-14 04:36:18,947 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:36:18,947 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:36:18,947 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:36:18,947 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,947 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14082.86 MB 2025-02-14 04:36:18,947 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14150.18 MB 2025-02-14 04:36:18,947 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 67.32 MB 2025-02-14 04:36:18,947 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17855.15 MB 2025-02-14 04:36:18,947 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17855.15 MB 2025-02-14 04:36:18,947 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:18,947 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14318.14 MB 2025-02-14 04:36:18,948 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:36:18,948 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:36:18,948 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:36:18,948 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,949 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14025.40 MB 2025-02-14 04:36:18,949 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14150.18 MB 2025-02-14 04:36:18,949 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 124.78 MB 2025-02-14 04:36:18,949 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17855.15 MB 2025-02-14 04:36:18,949 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17855.15 MB 2025-02-14 04:36:18,949 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:18,949 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14318.14 MB 2025-02-14 04:36:18,962 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:36:18,962 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:36:18,962 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:36:18,962 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,962 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14196.84 MB 2025-02-14 04:36:18,962 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14220.89 MB 2025-02-14 04:36:18,962 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 24.05 MB 2025-02-14 04:36:18,962 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17855.15 MB 2025-02-14 04:36:18,962 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17861.44 MB 2025-02-14 04:36:18,962 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6.29 MB 2025-02-14 04:36:18,962 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14251.23 MB 2025-02-14 04:36:18,967 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:36:18,967 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:36:18,967 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-14 04:36:18,967 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,967 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14233.29 MB 2025-02-14 04:36:18,967 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14250.51 MB 2025-02-14 04:36:18,967 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 17.22 MB 2025-02-14 04:36:18,967 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17861.44 MB 2025-02-14 04:36:18,967 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17861.44 MB 2025-02-14 04:36:18,967 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:18,967 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14250.51 MB 2025-02-14 04:36:18,969 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:36:18,969 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:36:18,969 - resource_logging.py:150 - __exit__ - DEBUG - Time: 3.08 seconds 2025-02-14 04:36:18,969 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:18,969 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13609.78 MB 2025-02-14 04:36:18,969 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14281.12 MB 2025-02-14 04:36:18,969 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 671.34 MB 2025-02-14 04:36:18,969 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 50883.20 MB 2025-02-14 04:36:18,969 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17861.44 MB 2025-02-14 04:36:18,969 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -33021.76 MB 2025-02-14 04:36:18,969 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14281.12 MB 2025-02-14 04:36:19,045 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:36:19,045 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:36:19,045 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.07 seconds 2025-02-14 04:36:19,045 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:19,045 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14281.12 MB 2025-02-14 04:36:19,045 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14740.08 MB 2025-02-14 04:36:19,045 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 458.96 MB 2025-02-14 04:36:19,045 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17861.44 MB 2025-02-14 04:36:19,045 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17863.54 MB 2025-02-14 04:36:19,045 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2.10 MB 2025-02-14 04:36:19,045 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14785.97 MB 2025-02-14 04:36:19,050 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 1231, cut from 1233 2025-02-14 04:36:19,051 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 1 ('] 2025-02-14 04:36:19,053 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:36:19,053 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:36:19,053 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-14 04:36:19,053 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:36:19,053 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14144.28 MB 2025-02-14 04:36:19,053 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15428.84 MB 2025-02-14 04:36:19,054 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1284.57 MB 2025-02-14 04:36:19,054 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17863.54 MB 2025-02-14 04:36:19,054 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17863.54 MB 2025-02-14 04:36:19,054 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:36:19,054 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15428.84 MB 2025-02-14 04:36:19,089 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 1023] 2025-02-14 04:36:19,092 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:36:19,092 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:36:19,094 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:36:19,094 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:36:19,101 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:36:19,103 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:36:19,103 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:36:19,103 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 1 ('] 2025-02-14 04:37:04,659 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:37:04,660 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:37:04,665 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:37:04,669 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:37:04,669 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1217, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:37:04,670 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:37:04,670 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1217, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:37:23,467 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:37:23,467 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:37:23,467 - resource_logging.py:150 - __exit__ - DEBUG - Time: 18.79 seconds 2025-02-14 04:37:23,467 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:23,467 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21449.61 MB 2025-02-14 04:37:23,467 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25757.16 MB 2025-02-14 04:37:23,467 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4307.55 MB 2025-02-14 04:37:23,467 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26348.62 MB 2025-02-14 04:37:23,467 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 29704.06 MB 2025-02-14 04:37:23,467 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3355.44 MB 2025-02-14 04:37:23,467 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 34772.16 MB 2025-02-14 04:37:23,549 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:37:23,549 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:37:23,550 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds 2025-02-14 04:37:23,550 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:23,550 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 25757.16 MB 2025-02-14 04:37:23,550 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22104.98 MB 2025-02-14 04:37:23,550 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -3652.18 MB 2025-02-14 04:37:23,550 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 29704.06 MB 2025-02-14 04:37:23,550 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 46045.07 MB 2025-02-14 04:37:23,550 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 16341.01 MB 2025-02-14 04:37:23,550 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 38615.64 MB 2025-02-14 04:37:25,508 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:37:25,509 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:37:25,509 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.96 seconds 2025-02-14 04:37:25,509 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,509 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22104.98 MB 2025-02-14 04:37:25,509 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22635.82 MB 2025-02-14 04:37:25,509 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:37:25,509 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 46045.07 MB 2025-02-14 04:37:25,509 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 24658.31 MB 2025-02-14 04:37:25,509 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -21386.76 MB 2025-02-14 04:37:25,509 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26616.20 MB 2025-02-14 04:37:25,523 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:37:25,523 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:37:25,523 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:37:25,523 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,523 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22635.82 MB 2025-02-14 04:37:25,523 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24525.35 MB 2025-02-14 04:37:25,523 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.52 MB 2025-02-14 04:37:25,523 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 24658.31 MB 2025-02-14 04:37:25,523 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 27961.33 MB 2025-02-14 04:37:25,523 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3303.01 MB 2025-02-14 04:37:25,523 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25942.77 MB 2025-02-14 04:37:25,728 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:37:25,728 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:37:25,728 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.20 seconds 2025-02-14 04:37:25,728 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,728 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24525.35 MB 2025-02-14 04:37:25,728 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26767.20 MB 2025-02-14 04:37:25,728 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:37:25,728 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27961.33 MB 2025-02-14 04:37:25,728 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34567.36 MB 2025-02-14 04:37:25,728 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6606.03 MB 2025-02-14 04:37:25,728 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32311.48 MB 2025-02-14 04:37:25,729 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:37:25,729 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:37:25,729 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:37:25,729 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,729 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22635.82 MB 2025-02-14 04:37:25,729 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26767.20 MB 2025-02-14 04:37:25,729 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.38 MB 2025-02-14 04:37:25,729 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 24658.31 MB 2025-02-14 04:37:25,729 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34567.36 MB 2025-02-14 04:37:25,729 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 9909.04 MB 2025-02-14 04:37:25,729 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32311.48 MB 2025-02-14 04:37:25,891 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:37:25,891 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:37:25,891 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:37:25,891 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,891 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 28300.74 MB 2025-02-14 04:37:25,891 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29067.75 MB 2025-02-14 04:37:25,891 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:37:25,891 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34567.36 MB 2025-02-14 04:37:25,891 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34980.50 MB 2025-02-14 04:37:25,891 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 413.14 MB 2025-02-14 04:37:25,891 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29775.53 MB 2025-02-14 04:37:25,909 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:37:25,909 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:37:25,909 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:37:25,909 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,909 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29480.63 MB 2025-02-14 04:37:25,909 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29708.94 MB 2025-02-14 04:37:25,909 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.31 MB 2025-02-14 04:37:25,910 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34980.50 MB 2025-02-14 04:37:25,910 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34980.50 MB 2025-02-14 04:37:25,910 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:37:25,910 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29945.26 MB 2025-02-14 04:37:25,911 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:37:25,911 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:37:25,911 - resource_logging.py:150 - __exit__ - DEBUG - Time: 21.24 seconds 2025-02-14 04:37:25,911 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:25,911 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17209.16 MB 2025-02-14 04:37:25,911 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29909.45 MB 2025-02-14 04:37:25,911 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 12700.29 MB 2025-02-14 04:37:25,911 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22106.08 MB 2025-02-14 04:37:25,911 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34980.50 MB 2025-02-14 04:37:25,911 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 12874.42 MB 2025-02-14 04:37:25,911 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29945.26 MB 2025-02-14 04:37:26,178 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:37:26,178 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:37:26,178 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:37:26,178 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:26,178 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29909.45 MB 2025-02-14 04:37:26,178 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22204.78 MB 2025-02-14 04:37:26,178 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7704.68 MB 2025-02-14 04:37:26,178 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34980.50 MB 2025-02-14 04:37:26,178 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34980.50 MB 2025-02-14 04:37:26,178 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:37:26,178 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32414.05 MB 2025-02-14 04:37:26,196 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8139, cut from 8141 2025-02-14 04:37:26,197 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 ('] 2025-02-14 04:37:26,203 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:37:26,203 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:37:26,203 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:37:26,203 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:37:26,203 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22204.78 MB 2025-02-14 04:37:26,203 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30619.73 MB 2025-02-14 04:37:26,203 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8414.95 MB 2025-02-14 04:37:26,203 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34980.50 MB 2025-02-14 04:37:26,203 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 45441.09 MB 2025-02-14 04:37:26,203 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10460.59 MB 2025-02-14 04:37:26,203 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30619.73 MB 2025-02-14 04:37:26,360 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7931] 2025-02-14 04:37:26,361 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:37:26,361 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:37:26,362 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:37:26,362 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:37:26,367 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:37:26,368 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:37:26,368 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:37:26,368 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 ('] 2025-02-14 04:38:14,110 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:38:14,110 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:38:14,115 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:38:14,119 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:38:14,119 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1005, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:38:14,120 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:38:14,120 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1005, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:38:29,686 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:38:29,686 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:38:29,686 - resource_logging.py:150 - __exit__ - DEBUG - Time: 15.56 seconds 2025-02-14 04:38:29,686 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:29,686 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19971.71 MB 2025-02-14 04:38:29,686 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 23528.48 MB 2025-02-14 04:38:29,686 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3556.77 MB 2025-02-14 04:38:29,686 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 53808.73 MB 2025-02-14 04:38:29,686 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 28892.46 MB 2025-02-14 04:38:29,686 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -24916.26 MB 2025-02-14 04:38:29,686 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32387.48 MB 2025-02-14 04:38:29,749 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:38:29,749 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:38:29,749 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds 2025-02-14 04:38:29,749 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:29,749 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 23528.48 MB 2025-02-14 04:38:29,749 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21002.53 MB 2025-02-14 04:38:29,749 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -2525.94 MB 2025-02-14 04:38:29,749 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 28892.46 MB 2025-02-14 04:38:29,749 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 40783.31 MB 2025-02-14 04:38:29,749 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 11890.85 MB 2025-02-14 04:38:29,749 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 34858.28 MB 2025-02-14 04:38:31,712 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:38:31,712 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:38:31,712 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.96 seconds 2025-02-14 04:38:31,712 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:31,712 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21002.53 MB 2025-02-14 04:38:31,712 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21533.37 MB 2025-02-14 04:38:31,712 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:38:31,712 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 40783.31 MB 2025-02-14 04:38:31,713 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 24662.51 MB 2025-02-14 04:38:31,713 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -16120.81 MB 2025-02-14 04:38:31,713 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25512.71 MB 2025-02-14 04:38:31,728 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:38:31,728 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:38:31,728 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:38:31,728 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:31,728 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21533.37 MB 2025-02-14 04:38:31,728 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 23422.91 MB 2025-02-14 04:38:31,728 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:38:31,728 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 24662.51 MB 2025-02-14 04:38:31,728 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 27965.52 MB 2025-02-14 04:38:31,728 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3303.01 MB 2025-02-14 04:38:31,728 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24840.34 MB 2025-02-14 04:38:31,934 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:38:31,934 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:38:31,934 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.20 seconds 2025-02-14 04:38:31,934 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:31,934 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 23422.91 MB 2025-02-14 04:38:31,934 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25664.76 MB 2025-02-14 04:38:31,934 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:38:31,934 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27965.52 MB 2025-02-14 04:38:31,934 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34099.69 MB 2025-02-14 04:38:31,934 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:38:31,934 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31209.04 MB 2025-02-14 04:38:31,934 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:38:31,934 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:38:31,934 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:38:31,934 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:31,934 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21533.37 MB 2025-02-14 04:38:31,935 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25664.76 MB 2025-02-14 04:38:31,935 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:38:31,935 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 24662.51 MB 2025-02-14 04:38:31,935 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34099.69 MB 2025-02-14 04:38:31,935 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 9437.18 MB 2025-02-14 04:38:31,935 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31209.04 MB 2025-02-14 04:38:32,096 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:38:32,096 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:38:32,096 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:38:32,096 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:32,096 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27198.31 MB 2025-02-14 04:38:32,096 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27965.31 MB 2025-02-14 04:38:32,096 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:38:32,096 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34099.69 MB 2025-02-14 04:38:32,096 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34512.83 MB 2025-02-14 04:38:32,096 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 413.14 MB 2025-02-14 04:38:32,096 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28673.10 MB 2025-02-14 04:38:32,114 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:38:32,114 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:38:32,114 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:38:32,114 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:32,114 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 28378.20 MB 2025-02-14 04:38:32,114 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 28606.21 MB 2025-02-14 04:38:32,114 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 228.02 MB 2025-02-14 04:38:32,114 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34512.83 MB 2025-02-14 04:38:32,114 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34512.83 MB 2025-02-14 04:38:32,114 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:38:32,114 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28842.44 MB 2025-02-14 04:38:32,115 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:38:32,115 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:38:32,115 - resource_logging.py:150 - __exit__ - DEBUG - Time: 17.99 seconds 2025-02-14 04:38:32,115 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:32,115 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16470.21 MB 2025-02-14 04:38:32,115 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 28807.24 MB 2025-02-14 04:38:32,115 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 12337.03 MB 2025-02-14 04:38:32,115 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 53808.73 MB 2025-02-14 04:38:32,115 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34512.83 MB 2025-02-14 04:38:32,115 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -19295.90 MB 2025-02-14 04:38:32,115 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28842.44 MB 2025-02-14 04:38:32,382 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:38:32,382 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:38:32,382 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:38:32,382 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:32,382 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 28807.24 MB 2025-02-14 04:38:32,382 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21473.83 MB 2025-02-14 04:38:32,382 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7333.40 MB 2025-02-14 04:38:32,382 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34512.83 MB 2025-02-14 04:38:32,382 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34512.83 MB 2025-02-14 04:38:32,382 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:38:32,382 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31318.29 MB 2025-02-14 04:38:32,400 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8160, cut from 8162 2025-02-14 04:38:32,400 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:38:32,406 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:38:32,406 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:38:32,407 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:38:32,407 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:38:32,407 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21473.83 MB 2025-02-14 04:38:32,407 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29911.31 MB 2025-02-14 04:38:32,407 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8437.47 MB 2025-02-14 04:38:32,407 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34512.83 MB 2025-02-14 04:38:32,407 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 44998.59 MB 2025-02-14 04:38:32,407 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10485.76 MB 2025-02-14 04:38:32,407 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29911.31 MB 2025-02-14 04:38:32,562 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7952] 2025-02-14 04:38:32,564 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:38:32,564 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:38:32,565 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:38:32,565 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:38:32,569 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:38:32,570 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:38:32,570 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:38:32,570 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:39:25,341 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:25,341 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:39:25,350 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:39:25,357 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:25,357 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1098, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:39:25,359 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:25,359 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1098, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:39:42,366 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:39:42,366 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:39:42,366 - resource_logging.py:150 - __exit__ - DEBUG - Time: 17.00 seconds 2025-02-14 04:39:42,366 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:42,366 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20619.75 MB 2025-02-14 04:39:42,366 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24505.77 MB 2025-02-14 04:39:42,366 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3886.02 MB 2025-02-14 04:39:42,366 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 53387.20 MB 2025-02-14 04:39:42,366 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 29232.20 MB 2025-02-14 04:39:42,366 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -24155.00 MB 2025-02-14 04:39:42,366 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 33489.31 MB 2025-02-14 04:39:42,437 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:39:42,437 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:39:42,437 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.07 seconds 2025-02-14 04:39:42,437 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:42,437 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24505.77 MB 2025-02-14 04:39:42,437 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21486.01 MB 2025-02-14 04:39:42,437 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -3019.76 MB 2025-02-14 04:39:42,437 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 29232.20 MB 2025-02-14 04:39:42,437 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 43037.75 MB 2025-02-14 04:39:42,437 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 13805.55 MB 2025-02-14 04:39:42,437 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 36421.70 MB 2025-02-14 04:39:44,419 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:39:44,419 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:39:44,419 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.98 seconds 2025-02-14 04:39:44,419 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,419 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21486.01 MB 2025-02-14 04:39:44,419 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22016.85 MB 2025-02-14 04:39:44,420 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:39:44,420 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 43037.75 MB 2025-02-14 04:39:44,420 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 24672.99 MB 2025-02-14 04:39:44,420 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -18364.76 MB 2025-02-14 04:39:44,420 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25997.22 MB 2025-02-14 04:39:44,433 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:39:44,433 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:39:44,433 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:39:44,433 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,433 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22016.85 MB 2025-02-14 04:39:44,433 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 23906.39 MB 2025-02-14 04:39:44,433 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:39:44,433 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 24672.99 MB 2025-02-14 04:39:44,433 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 27976.01 MB 2025-02-14 04:39:44,433 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3303.01 MB 2025-02-14 04:39:44,433 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25323.81 MB 2025-02-14 04:39:44,642 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:39:44,642 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:39:44,642 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.21 seconds 2025-02-14 04:39:44,642 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,642 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 23906.39 MB 2025-02-14 04:39:44,642 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26148.24 MB 2025-02-14 04:39:44,642 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:39:44,642 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27976.01 MB 2025-02-14 04:39:44,642 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34110.18 MB 2025-02-14 04:39:44,642 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:39:44,642 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31692.52 MB 2025-02-14 04:39:44,643 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:39:44,643 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:39:44,643 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:39:44,643 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,643 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22016.85 MB 2025-02-14 04:39:44,643 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26148.24 MB 2025-02-14 04:39:44,643 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:39:44,643 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 24672.99 MB 2025-02-14 04:39:44,643 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34110.18 MB 2025-02-14 04:39:44,643 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 9437.18 MB 2025-02-14 04:39:44,643 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31692.52 MB 2025-02-14 04:39:44,804 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:39:44,804 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:39:44,804 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:39:44,804 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,804 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27681.78 MB 2025-02-14 04:39:44,804 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 28448.79 MB 2025-02-14 04:39:44,804 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:39:44,804 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34110.18 MB 2025-02-14 04:39:44,804 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34521.22 MB 2025-02-14 04:39:44,804 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 411.04 MB 2025-02-14 04:39:44,804 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29156.57 MB 2025-02-14 04:39:44,822 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:39:44,822 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:39:44,822 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:39:44,822 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,822 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 28861.68 MB 2025-02-14 04:39:44,822 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29089.37 MB 2025-02-14 04:39:44,822 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 227.70 MB 2025-02-14 04:39:44,822 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34521.22 MB 2025-02-14 04:39:44,822 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34521.22 MB 2025-02-14 04:39:44,822 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:39:44,822 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29326.08 MB 2025-02-14 04:39:44,823 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:39:44,823 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:39:44,823 - resource_logging.py:150 - __exit__ - DEBUG - Time: 19.46 seconds 2025-02-14 04:39:44,823 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:44,823 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16794.23 MB 2025-02-14 04:39:44,823 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29289.85 MB 2025-02-14 04:39:44,823 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 12495.63 MB 2025-02-14 04:39:44,823 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 53387.20 MB 2025-02-14 04:39:44,823 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34521.22 MB 2025-02-14 04:39:44,823 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -18865.98 MB 2025-02-14 04:39:44,823 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29326.08 MB 2025-02-14 04:39:45,089 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:39:45,090 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:39:45,090 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.26 seconds 2025-02-14 04:39:45,090 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:45,090 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29289.85 MB 2025-02-14 04:39:45,090 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21789.47 MB 2025-02-14 04:39:45,090 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7500.38 MB 2025-02-14 04:39:45,090 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34521.22 MB 2025-02-14 04:39:45,090 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34521.22 MB 2025-02-14 04:39:45,090 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:39:45,090 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31794.15 MB 2025-02-14 04:39:45,107 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8138, cut from 8140 2025-02-14 04:39:45,108 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:39:45,114 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:39:45,114 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:39:45,114 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:39:45,114 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:39:45,114 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21789.47 MB 2025-02-14 04:39:45,114 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30203.45 MB 2025-02-14 04:39:45,114 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8413.98 MB 2025-02-14 04:39:45,114 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34521.22 MB 2025-02-14 04:39:45,114 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 44979.72 MB 2025-02-14 04:39:45,114 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10458.50 MB 2025-02-14 04:39:45,114 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30203.45 MB 2025-02-14 04:39:45,270 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7930] 2025-02-14 04:39:45,272 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:45,272 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:39:45,273 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:45,273 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:39:45,277 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:39:45,278 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:45,278 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:39:45,278 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:39:59,737 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:59,737 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:39:59,742 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:39:59,745 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:59,745 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1131, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:39:59,746 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:39:59,746 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1131, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:40:17,434 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:40:17,435 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:40:17,435 - resource_logging.py:150 - __exit__ - DEBUG - Time: 17.68 seconds 2025-02-14 04:40:17,435 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:17,435 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20849.70 MB 2025-02-14 04:40:17,435 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24853.16 MB 2025-02-14 04:40:17,435 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4003.46 MB 2025-02-14 04:40:17,435 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 57526.98 MB 2025-02-14 04:40:17,435 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 31379.69 MB 2025-02-14 04:40:17,435 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -26147.29 MB 2025-02-14 04:40:17,435 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 33718.45 MB 2025-02-14 04:40:17,504 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:40:17,504 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:40:17,504 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.07 seconds 2025-02-14 04:40:17,504 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:17,504 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24853.16 MB 2025-02-14 04:40:17,504 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21657.57 MB 2025-02-14 04:40:17,504 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -3195.59 MB 2025-02-14 04:40:17,504 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31379.69 MB 2025-02-14 04:40:17,504 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 42362.47 MB 2025-02-14 04:40:17,504 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10982.79 MB 2025-02-14 04:40:17,504 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 36991.68 MB 2025-02-14 04:40:19,439 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:40:19,440 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:40:19,440 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.93 seconds 2025-02-14 04:40:19,440 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,440 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21657.57 MB 2025-02-14 04:40:19,440 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22188.41 MB 2025-02-14 04:40:19,440 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:40:19,440 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 42362.47 MB 2025-02-14 04:40:19,440 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26698.84 MB 2025-02-14 04:40:19,440 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -15663.63 MB 2025-02-14 04:40:19,440 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26168.78 MB 2025-02-14 04:40:19,458 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:40:19,458 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:40:19,458 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:40:19,458 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,458 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22188.41 MB 2025-02-14 04:40:19,458 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24077.94 MB 2025-02-14 04:40:19,458 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:40:19,458 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26698.84 MB 2025-02-14 04:40:19,458 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 28586.28 MB 2025-02-14 04:40:19,458 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1887.44 MB 2025-02-14 04:40:19,458 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25495.37 MB 2025-02-14 04:40:19,665 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:40:19,665 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:40:19,665 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.21 seconds 2025-02-14 04:40:19,665 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,665 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24077.94 MB 2025-02-14 04:40:19,665 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26319.80 MB 2025-02-14 04:40:19,665 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:40:19,665 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 28586.28 MB 2025-02-14 04:40:19,665 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34248.59 MB 2025-02-14 04:40:19,665 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 5662.31 MB 2025-02-14 04:40:19,665 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31864.08 MB 2025-02-14 04:40:19,666 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:40:19,666 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:40:19,666 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:40:19,666 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,666 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22188.41 MB 2025-02-14 04:40:19,666 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26319.80 MB 2025-02-14 04:40:19,666 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:40:19,666 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26698.84 MB 2025-02-14 04:40:19,666 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34248.59 MB 2025-02-14 04:40:19,666 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 7549.75 MB 2025-02-14 04:40:19,666 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31864.08 MB 2025-02-14 04:40:19,828 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:40:19,828 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:40:19,828 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:40:19,828 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,828 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 27853.34 MB 2025-02-14 04:40:19,828 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 28620.34 MB 2025-02-14 04:40:19,828 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:40:19,828 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34248.59 MB 2025-02-14 04:40:19,828 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34661.73 MB 2025-02-14 04:40:19,828 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 413.14 MB 2025-02-14 04:40:19,828 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29328.13 MB 2025-02-14 04:40:19,846 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:40:19,847 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:40:19,847 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:40:19,847 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,847 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29033.23 MB 2025-02-14 04:40:19,847 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29262.36 MB 2025-02-14 04:40:19,847 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 229.13 MB 2025-02-14 04:40:19,847 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34661.73 MB 2025-02-14 04:40:19,847 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34661.73 MB 2025-02-14 04:40:19,847 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:40:19,847 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29497.32 MB 2025-02-14 04:40:19,848 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:40:19,848 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:40:19,848 - resource_logging.py:150 - __exit__ - DEBUG - Time: 20.10 seconds 2025-02-14 04:40:19,848 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:19,848 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16909.20 MB 2025-02-14 04:40:19,848 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29462.95 MB 2025-02-14 04:40:19,848 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 12553.74 MB 2025-02-14 04:40:19,848 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 57526.98 MB 2025-02-14 04:40:19,848 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34661.73 MB 2025-02-14 04:40:19,848 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -22865.25 MB 2025-02-14 04:40:19,848 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29497.32 MB 2025-02-14 04:40:20,115 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:40:20,115 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:40:20,115 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:40:20,115 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:20,115 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29462.95 MB 2025-02-14 04:40:20,115 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21905.97 MB 2025-02-14 04:40:20,115 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7556.97 MB 2025-02-14 04:40:20,115 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34661.73 MB 2025-02-14 04:40:20,115 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 34661.73 MB 2025-02-14 04:40:20,115 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:40:20,115 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 31968.47 MB 2025-02-14 04:40:20,133 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8142, cut from 8144 2025-02-14 04:40:20,133 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:40:20,139 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:40:20,139 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:40:20,139 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:40:20,139 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:20,139 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21905.97 MB 2025-02-14 04:40:20,139 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30324.13 MB 2025-02-14 04:40:20,139 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8418.15 MB 2025-02-14 04:40:20,139 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34661.73 MB 2025-02-14 04:40:20,139 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 45124.42 MB 2025-02-14 04:40:20,139 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10462.69 MB 2025-02-14 04:40:20,139 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30324.13 MB 2025-02-14 04:40:20,296 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7934] 2025-02-14 04:40:20,297 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:20,297 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:40:20,298 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:20,298 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:40:20,303 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:40:20,304 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:20,304 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:40:20,304 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:40:37,572 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:37,572 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:40:37,580 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:40:37,586 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:37,586 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 310, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:40:37,588 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:37,588 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 310, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:40:42,518 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:40:42,518 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:40:42,519 - resource_logging.py:150 - __exit__ - DEBUG - Time: 4.92 seconds 2025-02-14 04:40:42,519 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:42,519 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15128.84 MB 2025-02-14 04:40:42,519 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16225.91 MB 2025-02-14 04:40:42,519 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1097.07 MB 2025-02-14 04:40:42,519 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 57677.97 MB 2025-02-14 04:40:42,519 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19897.78 MB 2025-02-14 04:40:42,519 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -37780.19 MB 2025-02-14 04:40:42,519 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25054.00 MB 2025-02-14 04:40:42,539 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:40:42,539 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:40:42,539 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:40:42,539 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:42,539 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16225.91 MB 2025-02-14 04:40:42,539 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16702.55 MB 2025-02-14 04:40:42,539 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 476.64 MB 2025-02-14 04:40:42,539 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19897.78 MB 2025-02-14 04:40:42,539 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23089.64 MB 2025-02-14 04:40:42,539 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3191.87 MB 2025-02-14 04:40:42,539 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20491.20 MB 2025-02-14 04:40:43,983 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:40:43,983 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:40:43,983 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.44 seconds 2025-02-14 04:40:43,983 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:43,983 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16702.55 MB 2025-02-14 04:40:43,983 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17103.34 MB 2025-02-14 04:40:43,983 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 400.79 MB 2025-02-14 04:40:43,983 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23089.64 MB 2025-02-14 04:40:43,983 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20921.19 MB 2025-02-14 04:40:43,983 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -2168.46 MB 2025-02-14 04:40:43,983 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21043.90 MB 2025-02-14 04:40:43,994 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:40:43,994 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:40:43,994 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:40:43,994 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:43,994 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17103.34 MB 2025-02-14 04:40:43,994 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18530.97 MB 2025-02-14 04:40:43,994 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1427.64 MB 2025-02-14 04:40:43,994 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20921.19 MB 2025-02-14 04:40:43,995 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21634.22 MB 2025-02-14 04:40:43,995 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 713.03 MB 2025-02-14 04:40:43,995 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19601.13 MB 2025-02-14 04:40:44,153 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:40:44,154 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:40:44,154 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:40:44,154 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,154 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18530.97 MB 2025-02-14 04:40:44,154 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20224.12 MB 2025-02-14 04:40:44,154 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1693.14 MB 2025-02-14 04:40:44,154 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21634.22 MB 2025-02-14 04:40:44,154 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25912.41 MB 2025-02-14 04:40:44,154 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4278.19 MB 2025-02-14 04:40:44,154 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24412.13 MB 2025-02-14 04:40:44,154 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:40:44,154 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:40:44,154 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.17 seconds 2025-02-14 04:40:44,154 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,154 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17103.34 MB 2025-02-14 04:40:44,154 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20224.12 MB 2025-02-14 04:40:44,154 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3120.78 MB 2025-02-14 04:40:44,154 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20921.19 MB 2025-02-14 04:40:44,154 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25912.41 MB 2025-02-14 04:40:44,154 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 4991.22 MB 2025-02-14 04:40:44,154 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24412.13 MB 2025-02-14 04:40:44,332 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:40:44,332 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:40:44,332 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.17 seconds 2025-02-14 04:40:44,332 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,332 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21381.94 MB 2025-02-14 04:40:44,332 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21961.03 MB 2025-02-14 04:40:44,332 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 579.09 MB 2025-02-14 04:40:44,332 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 25912.41 MB 2025-02-14 04:40:44,332 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26222.79 MB 2025-02-14 04:40:44,332 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 310.38 MB 2025-02-14 04:40:44,332 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22495.41 MB 2025-02-14 04:40:44,353 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:40:44,353 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:40:44,353 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:40:44,353 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,353 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22272.76 MB 2025-02-14 04:40:44,353 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22494.21 MB 2025-02-14 04:40:44,353 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 221.45 MB 2025-02-14 04:40:44,354 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26222.79 MB 2025-02-14 04:40:44,354 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26222.79 MB 2025-02-14 04:40:44,354 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:40:44,354 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22600.51 MB 2025-02-14 04:40:44,355 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:40:44,355 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:40:44,355 - resource_logging.py:150 - __exit__ - DEBUG - Time: 6.76 seconds 2025-02-14 04:40:44,356 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,356 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14048.77 MB 2025-02-14 04:40:44,356 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22694.86 MB 2025-02-14 04:40:44,356 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8646.09 MB 2025-02-14 04:40:44,356 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 57677.97 MB 2025-02-14 04:40:44,356 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26222.79 MB 2025-02-14 04:40:44,356 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -31455.18 MB 2025-02-14 04:40:44,356 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22694.86 MB 2025-02-14 04:40:44,641 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:40:44,641 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:40:44,641 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.28 seconds 2025-02-14 04:40:44,641 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,641 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22694.86 MB 2025-02-14 04:40:44,641 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 25702.63 MB 2025-02-14 04:40:44,641 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3007.77 MB 2025-02-14 04:40:44,641 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26222.79 MB 2025-02-14 04:40:44,641 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 26893.88 MB 2025-02-14 04:40:44,641 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 671.09 MB 2025-02-14 04:40:44,641 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26003.91 MB 2025-02-14 04:40:44,667 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8145, cut from 8147 2025-02-14 04:40:44,668 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:40:44,741 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:40:44,741 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:40:44,741 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.10 seconds 2025-02-14 04:40:44,741 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:40:44,741 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18584.19 MB 2025-02-14 04:40:44,741 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 27006.15 MB 2025-02-14 04:40:44,741 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8421.96 MB 2025-02-14 04:40:44,741 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 26893.88 MB 2025-02-14 04:40:44,741 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 37358.67 MB 2025-02-14 04:40:44,741 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10464.79 MB 2025-02-14 04:40:44,741 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27006.15 MB 2025-02-14 04:40:44,990 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7937] 2025-02-14 04:40:44,993 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:44,993 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:40:44,995 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:44,995 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:40:45,002 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:40:45,004 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:40:45,004 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:40:45,004 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:41:39,431 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:39,431 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:41:39,436 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:41:39,440 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:39,440 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 331, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:41:39,442 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:39,442 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 331, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:41:44,528 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:41:44,528 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:41:44,528 - resource_logging.py:150 - __exit__ - DEBUG - Time: 5.08 seconds 2025-02-14 04:41:44,528 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,528 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15275.17 MB 2025-02-14 04:41:44,528 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16447.48 MB 2025-02-14 04:41:44,528 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1172.31 MB 2025-02-14 04:41:44,528 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 45730.50 MB 2025-02-14 04:41:44,528 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20222.84 MB 2025-02-14 04:41:44,528 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -25507.66 MB 2025-02-14 04:41:44,528 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 25426.82 MB 2025-02-14 04:41:44,541 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:41:44,541 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:41:44,541 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:41:44,541 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,541 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16447.48 MB 2025-02-14 04:41:44,541 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15082.76 MB 2025-02-14 04:41:44,541 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -1364.71 MB 2025-02-14 04:41:44,541 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20222.84 MB 2025-02-14 04:41:44,541 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20222.84 MB 2025-02-14 04:41:44,541 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:41:44,541 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17273.05 MB 2025-02-14 04:41:44,810 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:41:44,810 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:41:44,810 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.27 seconds 2025-02-14 04:41:44,810 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,810 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15082.76 MB 2025-02-14 04:41:44,810 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15157.08 MB 2025-02-14 04:41:44,810 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 74.32 MB 2025-02-14 04:41:44,810 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20222.84 MB 2025-02-14 04:41:44,810 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19050.53 MB 2025-02-14 04:41:44,810 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -1172.31 MB 2025-02-14 04:41:44,810 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18656.89 MB 2025-02-14 04:41:44,815 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:41:44,815 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:41:44,815 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-14 04:41:44,815 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,815 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15157.01 MB 2025-02-14 04:41:44,815 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15421.49 MB 2025-02-14 04:41:44,815 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 264.47 MB 2025-02-14 04:41:44,815 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19050.53 MB 2025-02-14 04:41:44,815 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19050.53 MB 2025-02-14 04:41:44,815 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:41:44,815 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15619.93 MB 2025-02-14 04:41:44,869 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:41:44,869 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:41:44,870 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds 2025-02-14 04:41:44,870 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,870 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15421.49 MB 2025-02-14 04:41:44,870 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15742.74 MB 2025-02-14 04:41:44,870 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 321.25 MB 2025-02-14 04:41:44,870 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19050.53 MB 2025-02-14 04:41:44,870 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19050.53 MB 2025-02-14 04:41:44,870 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:41:44,870 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16511.54 MB 2025-02-14 04:41:44,870 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:41:44,870 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:41:44,870 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds 2025-02-14 04:41:44,870 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,870 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15157.01 MB 2025-02-14 04:41:44,870 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15742.74 MB 2025-02-14 04:41:44,870 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 585.73 MB 2025-02-14 04:41:44,870 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19050.53 MB 2025-02-14 04:41:44,870 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19050.53 MB 2025-02-14 04:41:44,870 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:41:44,870 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16511.54 MB 2025-02-14 04:41:44,900 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:41:44,900 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:41:44,900 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds 2025-02-14 04:41:44,900 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,900 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16053.38 MB 2025-02-14 04:41:44,900 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16188.29 MB 2025-02-14 04:41:44,900 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 134.91 MB 2025-02-14 04:41:44,900 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19050.53 MB 2025-02-14 04:41:44,900 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19130.22 MB 2025-02-14 04:41:44,900 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 79.69 MB 2025-02-14 04:41:44,900 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16287.38 MB 2025-02-14 04:41:44,916 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:41:44,916 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:41:44,916 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:41:44,916 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,916 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16273.63 MB 2025-02-14 04:41:44,916 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16410.07 MB 2025-02-14 04:41:44,916 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 136.44 MB 2025-02-14 04:41:44,916 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19130.22 MB 2025-02-14 04:41:44,916 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19132.32 MB 2025-02-14 04:41:44,917 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2.10 MB 2025-02-14 04:41:44,917 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16410.07 MB 2025-02-14 04:41:44,918 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:41:44,918 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:41:44,918 - resource_logging.py:150 - __exit__ - DEBUG - Time: 5.47 seconds 2025-02-14 04:41:44,918 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:44,918 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14121.94 MB 2025-02-14 04:41:44,918 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16531.58 MB 2025-02-14 04:41:44,918 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2409.64 MB 2025-02-14 04:41:44,918 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 45730.50 MB 2025-02-14 04:41:44,918 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19132.32 MB 2025-02-14 04:41:44,918 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -26598.18 MB 2025-02-14 04:41:44,918 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16531.58 MB 2025-02-14 04:41:45,072 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:41:45,072 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:41:45,072 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.15 seconds 2025-02-14 04:41:45,072 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:45,072 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16531.58 MB 2025-02-14 04:41:45,072 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16270.51 MB 2025-02-14 04:41:45,072 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -261.07 MB 2025-02-14 04:41:45,072 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19132.32 MB 2025-02-14 04:41:45,072 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19132.32 MB 2025-02-14 04:41:45,072 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:41:45,072 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18413.75 MB 2025-02-14 04:41:45,083 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 4927, cut from 4929 2025-02-14 04:41:45,084 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:41:45,088 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:41:45,088 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:41:45,088 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:41:45,088 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:41:45,088 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16270.51 MB 2025-02-14 04:41:45,088 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21370.66 MB 2025-02-14 04:41:45,088 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 5100.15 MB 2025-02-14 04:41:45,088 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19132.32 MB 2025-02-14 04:41:45,088 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25472.01 MB 2025-02-14 04:41:45,088 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6339.69 MB 2025-02-14 04:41:45,088 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21370.66 MB 2025-02-14 04:41:45,186 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 4719] 2025-02-14 04:41:45,187 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:45,187 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:41:45,188 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:45,188 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:41:45,193 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:41:45,194 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:45,194 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:41:45,194 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The final rate for this video is 2 ('] 2025-02-14 04:41:50,049 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:50,050 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-14 04:41:50,057 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-14 04:41:50,064 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:50,064 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 1260, 3, 384, 384]), torch.float32, cuda:0] 2025-02-14 04:41:50,066 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:41:50,066 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 1260, 3, 378, 378]), torch.float32, cuda:0] 2025-02-14 04:42:09,581 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-14 04:42:09,582 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 856 2025-02-14 04:42:09,582 - resource_logging.py:150 - __exit__ - DEBUG - Time: 19.51 seconds 2025-02-14 04:42:09,582 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:09,582 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21748.59 MB 2025-02-14 04:42:09,582 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26207.66 MB 2025-02-14 04:42:09,582 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4459.07 MB 2025-02-14 04:42:09,582 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30542.92 MB 2025-02-14 04:42:09,582 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 31511.81 MB 2025-02-14 04:42:09,582 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 968.88 MB 2025-02-14 04:42:09,582 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 35071.14 MB 2025-02-14 04:42:09,738 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-14 04:42:09,738 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 862 2025-02-14 04:42:09,738 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds 2025-02-14 04:42:09,738 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:09,738 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 26207.66 MB 2025-02-14 04:42:09,738 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22328.20 MB 2025-02-14 04:42:09,738 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -3879.46 MB 2025-02-14 04:42:09,738 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31511.81 MB 2025-02-14 04:42:09,738 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 47133.49 MB 2025-02-14 04:42:09,738 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 15621.69 MB 2025-02-14 04:42:09,738 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 39558.25 MB 2025-02-14 04:42:11,664 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-14 04:42:11,664 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-14 04:42:11,664 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.92 seconds 2025-02-14 04:42:11,664 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:11,664 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22328.20 MB 2025-02-14 04:42:11,664 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22859.04 MB 2025-02-14 04:42:11,664 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 530.84 MB 2025-02-14 04:42:11,664 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 47133.49 MB 2025-02-14 04:42:11,665 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25931.28 MB 2025-02-14 04:42:11,665 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -21202.21 MB 2025-02-14 04:42:11,665 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26839.41 MB 2025-02-14 04:42:11,678 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-14 04:42:11,678 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 918 2025-02-14 04:42:11,678 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds 2025-02-14 04:42:11,678 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:11,678 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22859.04 MB 2025-02-14 04:42:11,678 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 24748.57 MB 2025-02-14 04:42:11,679 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1889.53 MB 2025-02-14 04:42:11,679 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 25931.28 MB 2025-02-14 04:42:11,679 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 29234.30 MB 2025-02-14 04:42:11,679 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 3303.01 MB 2025-02-14 04:42:11,679 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26166.00 MB 2025-02-14 04:42:11,883 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-14 04:42:11,883 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 936 2025-02-14 04:42:11,883 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.20 seconds 2025-02-14 04:42:11,883 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:11,883 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 24748.57 MB 2025-02-14 04:42:11,883 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26990.43 MB 2025-02-14 04:42:11,883 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2241.86 MB 2025-02-14 04:42:11,883 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 29234.30 MB 2025-02-14 04:42:11,883 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35368.47 MB 2025-02-14 04:42:11,883 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6134.17 MB 2025-02-14 04:42:11,883 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32534.71 MB 2025-02-14 04:42:11,884 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA 2025-02-14 04:42:11,884 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 913 2025-02-14 04:42:11,884 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.22 seconds 2025-02-14 04:42:11,884 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:11,884 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22859.04 MB 2025-02-14 04:42:11,884 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26990.43 MB 2025-02-14 04:42:11,884 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4131.39 MB 2025-02-14 04:42:11,884 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 25931.28 MB 2025-02-14 04:42:11,884 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35368.47 MB 2025-02-14 04:42:11,884 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 9437.18 MB 2025-02-14 04:42:11,884 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32534.71 MB 2025-02-14 04:42:12,104 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding 2025-02-14 04:42:12,104 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1079 2025-02-14 04:42:12,104 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.21 seconds 2025-02-14 04:42:12,104 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:12,104 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 28523.97 MB 2025-02-14 04:42:12,104 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29290.97 MB 2025-02-14 04:42:12,104 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 767.00 MB 2025-02-14 04:42:12,104 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 35368.47 MB 2025-02-14 04:42:12,104 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35783.70 MB 2025-02-14 04:42:12,104 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 415.24 MB 2025-02-14 04:42:12,104 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 29998.76 MB 2025-02-14 04:42:12,122 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC 2025-02-14 04:42:12,122 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1380 2025-02-14 04:42:12,122 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:42:12,122 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:12,122 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 29703.86 MB 2025-02-14 04:42:12,122 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 29931.84 MB 2025-02-14 04:42:12,122 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 227.98 MB 2025-02-14 04:42:12,122 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 35783.70 MB 2025-02-14 04:42:12,122 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35783.70 MB 2025-02-14 04:42:12,122 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:42:12,122 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30164.29 MB 2025-02-14 04:42:12,124 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal 2025-02-14 04:42:12,124 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 2025-02-14 04:42:12,124 - resource_logging.py:150 - __exit__ - DEBUG - Time: 22.06 seconds 2025-02-14 04:42:12,124 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:12,124 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17358.65 MB 2025-02-14 04:42:12,124 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30131.73 MB 2025-02-14 04:42:12,124 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 12773.08 MB 2025-02-14 04:42:12,124 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30542.92 MB 2025-02-14 04:42:12,124 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35783.70 MB 2025-02-14 04:42:12,124 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 5240.78 MB 2025-02-14 04:42:12,124 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30164.29 MB 2025-02-14 04:42:12,389 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward 2025-02-14 04:42:12,389 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 2025-02-14 04:42:12,389 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.26 seconds 2025-02-14 04:42:12,389 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:12,389 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 30131.73 MB 2025-02-14 04:42:12,389 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22344.75 MB 2025-02-14 04:42:12,389 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -7786.98 MB 2025-02-14 04:42:12,389 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 35783.70 MB 2025-02-14 04:42:12,389 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 35783.70 MB 2025-02-14 04:42:12,389 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-14 04:42:12,389 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 32628.65 MB 2025-02-14 04:42:12,407 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8114, cut from 8116 2025-02-14 04:42:12,407 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2,'] 2025-02-14 04:42:12,413 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits 2025-02-14 04:42:12,413 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 2025-02-14 04:42:12,413 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds 2025-02-14 04:42:12,413 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-14 04:42:12,413 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 22344.75 MB 2025-02-14 04:42:12,413 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 30733.90 MB 2025-02-14 04:42:12,413 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 8389.15 MB 2025-02-14 04:42:12,414 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 35783.70 MB 2025-02-14 04:42:12,414 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 44126.18 MB 2025-02-14 04:42:12,414 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 8342.47 MB 2025-02-14 04:42:12,414 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 30733.90 MB 2025-02-14 04:42:12,568 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7906] 2025-02-14 04:42:12,569 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:42:12,569 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0] 2025-02-14 04:42:12,570 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:42:12,570 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0] 2025-02-14 04:42:12,575 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237] 2025-02-14 04:42:12,576 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-14 04:42:12,576 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0] 2025-02-14 04:42:12,576 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2,']