2025-02-15 03:00:20,262 - training_args.py:2100 - _setup_devices - INFO - PyTorch: setting up devices
2025-02-15 03:00:20,999 - configuration_utils.py:731 - _get_config_dict - INFO - loading configuration file ./checkpoints/longvu_llama3_2/config.json
2025-02-15 03:00:21,002 - configuration_utils.py:800 - from_dict - INFO - Model config CambrianConfig { "_name_or_path": "/tmp/iopath_cache/manifold_cache/tree/users/shenx/finetune/09281004-cambrian_llama3_2_t576_ov", "architectures": [ "CambrianLlamaForCausalLM" ], "attention_bias": false, "attention_dropout": 0.0, "bos_token_id": 128000, "connect_layer": 2, "connector_depth": 3, "connector_only": true, "dino_threshold": 0.83, "drop_threshold": 0.8, "eos_token_id": [ 128001, 128008, 128009 ], "frame_pos": false, "freeze_mm_mlp_adapter": false, "hidden_act": "silu", "hidden_size": 3072, "highres": true, "highres_connect": false, "image_aspect_ratio": "pad", "image_position": 91, "image_token_len": 144, "initializer_range": 0.02, "intermediate_size": 8192, "is_image_newline": true, "is_st_sampler": false, "lowres_token": 8, "max_position_embeddings": 131072, "mlp_bias": false, "mm_patch_merge_type": "flat", "mm_projector_lr": null, "mm_projector_type": "sva", "mm_use_im_patch_token": false, "mm_use_im_start_end": false, "mm_vision_sampler_lr": null, "mm_vision_select_feature": "patch", "mm_vision_select_layer": -2, "mm_vision_tower_aux_list": [ "siglip/CLIP-ViT-SO400M-14-384", "facebook/dinov2-giant-res378" ], "mm_vision_tower_aux_token_len_list": [ 576, 576 ], "mm_vision_tower_lr": null, "model_type": "cambrian_llama", "num_attention_heads": 24, "num_hidden_layers": 28, "num_key_value_heads": 8, "num_of_vision_sampler_layers": 10, "num_query_group": 1, "pretraining_tp": 1, "query_num_list": [ 144 ], "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 32.0, "high_freq_factor": 4.0, "low_freq_factor": 1.0, "original_max_position_embeddings": 8192, "rope_type": "llama3" }, "rope_theta": 500000.0, "spmd_debug": null, "spmd_fsdp_sharding": null, "spmd_mesh": null, "start_of_vision_sampler_layers": 0, "stride_of_vision_sampler_layers": 3, "tie_word_embeddings": false, "tokenizer_model_max_length": 8192, "tokenizer_padding_side": "right", "torch_dtype": "float32", "transformers_version": "4.43.1", "tune_mm_mlp_adapter": false, "unfreeze_mm_vision_tower": false, "use_cache": false, "use_mm_proj": true, "vision_hidden_size": 1024, "vision_tower_aux_token_len_list": [ 576, 576 ], "vocab_size": 128256 }
2025-02-15 03:00:21,002 - modeling_utils.py:3618 - from_pretrained - INFO - loading weights file ./checkpoints/longvu_llama3_2/pytorch_model.bin
2025-02-15 03:00:21,063 - configuration_utils.py:1038 - from_dict - INFO - Generate config GenerationConfig { "bos_token_id": 128000, "eos_token_id": [ 128001, 128008, 128009 ], "use_cache": false }
2025-02-15 03:00:21,501 - configuration_utils.py:733 - _get_config_dict - INFO - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/config.json
2025-02-15 03:00:21,505 - configuration_utils.py:800 - from_dict - INFO - Model config Dinov2Config { "apply_layernorm": true, "architectures": [ "Dinov2Model" ], "attention_probs_dropout_prob": 0.0, "drop_path_rate": 0.0, "hidden_act": "gelu", "hidden_dropout_prob": 0.0, "hidden_size": 1536, "image_size": 518, "initializer_range": 0.02, "layer_norm_eps": 1e-06, "layerscale_value": 1.0, "mlp_ratio": 4, "model_type": "dinov2", "num_attention_heads": 24, "num_channels": 3, "num_hidden_layers": 40, "out_features": [ "stage40" ], "out_indices": [ 40 ], "patch_size": 14, "qkv_bias": true, "reshape_hidden_states": true, "stage_names": [ "stem", "stage1", "stage2", "stage3", "stage4", "stage5", "stage6", "stage7", "stage8", "stage9", "stage10", "stage11", "stage12", "stage13", "stage14", "stage15", "stage16", "stage17", "stage18", "stage19", "stage20", "stage21", "stage22", "stage23", "stage24", "stage25", "stage26", "stage27", "stage28", "stage29", "stage30", "stage31", "stage32", "stage33", "stage34", "stage35", "stage36", "stage37", "stage38", "stage39", "stage40" ], "torch_dtype": "float32", "transformers_version": "4.43.1", "use_swiglu_ffn": true }
2025-02-15 03:00:22,830 - modeling_utils.py:4450 - _load_pretrained_model - INFO - All model checkpoint weights were used when initializing CambrianLlamaForCausalLM.
2025-02-15 03:00:22,830 - modeling_utils.py:4458 - _load_pretrained_model - INFO - All the weights of CambrianLlamaForCausalLM were initialized from the model checkpoint at ./checkpoints/longvu_llama3_2. If your task is similar to the task the model of the checkpoint was trained on, you can already use CambrianLlamaForCausalLM for predictions without further training.
2025-02-15 03:00:22,838 - configuration_utils.py:991 - from_pretrained - INFO - loading configuration file ./checkpoints/longvu_llama3_2/generation_config.json
2025-02-15 03:00:22,838 - configuration_utils.py:1038 - from_dict - INFO - Generate config GenerationConfig { "bos_token_id": 128000, "do_sample": true, "eos_token_id": [ 128001, 128008, 128009 ], "temperature": 0.6, "top_p": 0.9 }
2025-02-15 03:00:23,018 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file tokenizer.json
2025-02-15 03:00:23,018 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file added_tokens.json
2025-02-15 03:00:23,018 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file special_tokens_map.json
2025-02-15 03:00:23,018 - tokenization_utils_base.py:2287 - from_pretrained - INFO - loading file tokenizer_config.json
2025-02-15 03:00:23,245 - tokenization_utils_base.py:2533 - _from_pretrained - INFO - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2025-02-15 03:00:23,627 - configuration_utils.py:733 - _get_config_dict - INFO - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--google--siglip-so400m-patch14-384/snapshots/9fdffc58afc957d1a03a25b10dba0329ab15c2a3/config.json
2025-02-15 03:00:23,629 - configuration_utils.py:800 - from_dict - INFO - Model config SiglipVisionConfig { "attention_dropout": 0.0, "hidden_act": "gelu_pytorch_tanh", "hidden_size": 1152, "image_size": 384, "intermediate_size": 4304, "layer_norm_eps": 1e-06, "model_type": "siglip_vision_model", "num_attention_heads": 16, "num_channels": 3, "num_hidden_layers": 27, "patch_size": 14, "transformers_version": "4.43.1" }
2025-02-15 03:00:23,630 - modeling_utils.py:3621 - from_pretrained - INFO - loading weights file model.safetensors from cache at /root/.cache/huggingface/hub/models--google--siglip-so400m-patch14-384/snapshots/9fdffc58afc957d1a03a25b10dba0329ab15c2a3/model.safetensors
2025-02-15 03:00:23,783 - modeling_utils.py:4440 - _load_pretrained_model - INFO - Some weights of the model checkpoint at google/siglip-so400m-patch14-384 were not used when initializing SiglipVisionModel: ['logit_bias', 'logit_scale', 'text_model.embeddings.position_embedding.weight', 'text_model.embeddings.token_embedding.weight',
(per-layer entries for text_model.encoder.layers.0 through text_model.encoder.layers.26 omitted: each layer lists layer_norm1, layer_norm2, mlp.fc1, mlp.fc2, self_attn.k_proj, self_attn.out_proj, self_attn.q_proj and self_attn.v_proj, in .bias and .weight variants)
'text_model.final_layer_norm.bias', 'text_model.final_layer_norm.weight', 'text_model.head.bias', 'text_model.head.weight']
- This IS expected if you are initializing SiglipVisionModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing SiglipVisionModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2025-02-15 03:00:23,785 - modeling_utils.py:4458 - _load_pretrained_model - INFO - All the weights of SiglipVisionModel were initialized from the model checkpoint at google/siglip-so400m-patch14-384. If your task is similar to the task the model of the checkpoint was trained on, you can already use SiglipVisionModel for predictions without further training.
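The long "some weights were not used" warning above is benign here: google/siglip-so400m-patch14-384 ships the full dual-tower SigLIP checkpoint, but only the vision tower is instantiated, so every text_model.* tensor (plus logit_scale and logit_bias) necessarily has nowhere to go. A minimal sketch of the same load:

```python
from transformers import SiglipVisionModel

# Loading the vision tower alone from the dual-encoder checkpoint reproduces
# the warning: the text tower's weights exist in the file but are unused.
tower = SiglipVisionModel.from_pretrained("google/siglip-so400m-patch14-384")
print(sum(p.numel() for p in tower.parameters()) / 1e6)  # ~400M params (the "SO400M" tower)
```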
2025-02-15 03:00:23,970 - image_processing_base.py:375 - get_image_processor_dict - INFO - loading configuration file preprocessor_config.json from cache at /root/.cache/huggingface/hub/models--google--siglip-so400m-patch14-384/snapshots/9fdffc58afc957d1a03a25b10dba0329ab15c2a3/preprocessor_config.json
2025-02-15 03:00:23,971 - image_processing_base.py:429 - from_dict - INFO - Image processor SiglipImageProcessor { "do_convert_rgb": null, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.5, 0.5, 0.5 ], "image_processor_type": "SiglipImageProcessor", "image_std": [ 0.5, 0.5, 0.5 ], "processor_class": "SiglipProcessor", "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "height": 384, "width": 384 } }
2025-02-15 03:00:24,332 - configuration_utils.py:733 - _get_config_dict - INFO - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/config.json
2025-02-15 03:00:24,336 - configuration_utils.py:800 - from_dict - INFO - Model config Dinov2Config { "apply_layernorm": true, "architectures": [ "Dinov2Model" ], "attention_probs_dropout_prob": 0.0, "drop_path_rate": 0.0, "hidden_act": "gelu", "hidden_dropout_prob": 0.0, "hidden_size": 1536, "image_size": 518, "initializer_range": 0.02, "layer_norm_eps": 1e-06, "layerscale_value": 1.0, "mlp_ratio": 4, "model_type": "dinov2", "num_attention_heads": 24, "num_channels": 3, "num_hidden_layers": 40, "out_features": [ "stage40" ], "out_indices": [ 40 ], "patch_size": 14, "qkv_bias": true, "reshape_hidden_states": true, "stage_names": [ "stem", "stage1", "stage2", "stage3", "stage4", "stage5", "stage6", "stage7", "stage8", "stage9", "stage10", "stage11", "stage12", "stage13", "stage14", "stage15", "stage16", "stage17", "stage18", "stage19", "stage20", "stage21", "stage22", "stage23", "stage24", "stage25", "stage26", "stage27", "stage28", "stage29", "stage30", "stage31", "stage32", "stage33", "stage34", "stage35", "stage36", "stage37", "stage38", "stage39", "stage40" ], "torch_dtype": "float32", "transformers_version": "4.43.1", "use_swiglu_ffn": true }
2025-02-15 03:00:24,336 - modeling_utils.py:3621 - from_pretrained - INFO - loading weights file model.safetensors from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/model.safetensors
2025-02-15 03:00:24,681 - modeling_utils.py:4450 - _load_pretrained_model - INFO - All model checkpoint weights were used when initializing Dinov2Model.
2025-02-15 03:00:24,682 - modeling_utils.py:4458 - _load_pretrained_model - INFO - All the weights of Dinov2Model were initialized from the model checkpoint at facebook/dinov2-giant. If your task is similar to the task the model of the checkpoint was trained on, you can already use Dinov2Model for predictions without further training.
2025-02-15 03:00:24,861 - image_processing_base.py:375 - get_image_processor_dict - INFO - loading configuration file preprocessor_config.json from cache at /root/.cache/huggingface/hub/models--facebook--dinov2-giant/snapshots/611a9d42f2335e0f921f1e313ad3c1b7178d206d/preprocessor_config.json
2025-02-15 03:00:24,864 - image_processing_base.py:429 - from_dict - INFO - Image processor BitImageProcessor { "crop_size": { "height": 378, "width": 378 }, "do_center_crop": true, "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.485, 0.456, 0.406 ], "image_processor_type": "BitImageProcessor", "image_std": [ 0.229, 0.224, 0.225 ], "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "shortest_edge": 378 } }
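Taken together, the two processor configs above explain the per-frame tensor shapes logged during training (3x384x384 for the SigLIP branch, 3x378x378 for the DINOv2 branch). A small sketch; the 378x378 size is passed explicitly in case the hub defaults for facebook/dinov2-giant differ from the values this run logged:

```python
from PIL import Image
from transformers import AutoImageProcessor

frame = Image.new("RGB", (1280, 720))  # placeholder video frame

siglip_proc = AutoImageProcessor.from_pretrained("google/siglip-so400m-patch14-384")
dino_proc = AutoImageProcessor.from_pretrained(
    "facebook/dinov2-giant",
    size={"shortest_edge": 378},
    crop_size={"height": 378, "width": 378},  # match the BitImageProcessor config logged above
)

x0 = siglip_proc(images=frame, return_tensors="pt").pixel_values  # [1, 3, 384, 384]
x1 = dino_proc(images=frame, return_tensors="pt").pixel_values    # [1, 3, 378, 378]
```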
2025-02-15 03:00:25,675 - finetune_llama.py:1239 - train - INFO - Total params: 3264865280
2025-02-15 03:00:25,675 - finetune_llama.py:1240 - train - INFO - Trainable params: 12589056
2025-02-15 03:00:25,675 - finetune_llama.py:1241 - train - INFO - LM head params: 394002432
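These counters are presumably computed along these lines (a sketch, assuming `model` is the CambrianLlamaForCausalLM instance loaded as sketched earlier). Note that 394,002,432 is exactly vocab_size x hidden_size from the config above:

```python
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
lm_head = sum(p.numel() for p in model.get_output_embeddings().parameters())
print(total, trainable, lm_head)  # 3264865280, 12589056, 394002432 per the log
assert lm_head == 128256 * 3072   # vocab_size x hidden_size, bias-free projection
```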
parallel, distributed & accumulation) = 1 2025-02-15 03:00:28,627 - trainer.py:2141 - _inner_training_loop - INFO - Gradient Accumulation steps = 1 2025-02-15 03:00:28,627 - trainer.py:2142 - _inner_training_loop - INFO - Total optimization steps = 4 2025-02-15 03:00:28,628 - trainer.py:2143 - _inner_training_loop - INFO - Number of trainable parameters = 406,591,488 2025-02-15 03:00:36,846 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-15 03:00:36,846 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-15 03:00:36,915 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-15 03:00:36,923 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-15 03:00:36,923 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 61, 3, 384, 384]), torch.float32, cuda:0] 2025-02-15 03:00:36,925 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-15 03:00:36,925 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 61, 3, 378, 378]), torch.float32, cuda:0] 2025-02-15 03:00:38,152 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-15 03:00:38,152 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871 2025-02-15 03:00:38,152 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.22 seconds 2025-02-15 03:00:38,152 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:00:38,153 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 11700.29 MB 2025-02-15 03:00:38,153 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 11949.72 MB 2025-02-15 03:00:38,153 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 249.43 MB 2025-02-15 03:00:38,153 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 12475.96 MB 2025-02-15 03:00:38,153 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 14038.34 MB 2025-02-15 03:00:38,153 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1562.38 MB 2025-02-15 03:00:38,153 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20545.37 MB 2025-02-15 03:00:38,183 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-15 03:00:38,183 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-15 03:00:38,183 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds 2025-02-15 03:00:38,183 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:00:38,183 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 11949.65 MB 2025-02-15 03:00:38,183 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 12054.38 MB 2025-02-15 03:00:38,183 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 104.72 MB 2025-02-15 03:00:38,183 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 14038.34 MB 2025-02-15 03:00:38,183 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 12691.96 MB 2025-02-15 03:00:38,183 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 
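The header's numbers are internally consistent, and they also reconcile the two different "trainable" figures: the Trainer's 406,591,488 equals the 12,589,056 trainable parameters reported by finetune_llama.py plus the 394,002,432 LM-head parameters, which suggests the LM head is left unfrozen. A quick check using only values from the log:

```python
num_examples, num_epochs = 2, 2
total_train_batch = 1          # per-device batch x grad accumulation x num devices
steps = (num_examples // total_train_batch) * num_epochs
assert steps == 4                               # "Total optimization steps = 4"
assert 12_589_056 + 394_002_432 == 406_591_488  # "Number of trainable parameters"
```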
2025-02-15 03:00:36,846 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:36,846 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:36,915 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:00:36,923 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:36,923 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 61, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:00:36,925 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:36,925 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 61, 3, 378, 378]), torch.float32, cuda:0]
2025-02-15 03:00:38,152 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871 | Time: 1.22 s | Device: cuda:0 | Allocated: 11700.29 MB -> 11949.72 MB (net +249.43 MB) | Reserved: 12475.96 MB -> 14038.34 MB (net +1562.38 MB) | Peak allocated: 20545.37 MB
2025-02-15 03:00:38,183 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 | Time: 0.03 s | Device: cuda:0 | Allocated: 11949.65 MB -> 12054.38 MB (net +104.72 MB) | Reserved: 14038.34 MB -> 12691.96 MB (net -1346.37 MB) | Peak allocated: 12378.25 MB
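Each profiling block in this log comes from resource_logging's section context manager. The real implementation lives in resource_logging.py and is not shown here; judging from the fields it prints, it is roughly equivalent to this sketch:

```python
import time
import torch

class Section:
    """Times a code block and reports CUDA memory deltas, resource_logging-style."""

    def __init__(self, name: str, device: str = "cuda:0"):
        self.name, self.device = name, device

    def __enter__(self):
        torch.cuda.reset_peak_memory_stats(self.device)
        self.t0 = time.perf_counter()
        self.alloc0 = torch.cuda.memory_allocated(self.device)
        self.reserved0 = torch.cuda.memory_reserved(self.device)
        return self

    def __exit__(self, *exc):
        mb = 1024**2
        alloc1 = torch.cuda.memory_allocated(self.device)
        reserved1 = torch.cuda.memory_reserved(self.device)
        print(f"Section name: {self.name}")
        print(f"Time: {time.perf_counter() - self.t0:.2f} seconds")
        print(f"Net allocated change: {(alloc1 - self.alloc0) / mb:.2f} MB")
        print(f"Net reserved change: {(reserved1 - self.reserved0) / mb:.2f} MB")
        print(f"Peak allocated: {torch.cuda.max_memory_allocated(self.device) / mb:.2f} MB")

# usage, mirroring the section names in this log:
#   with Section("encode_images:dino"):
#       features = dino_tower(pixel_values)
```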
2025-02-15 03:00:38,566 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892 | Time: 0.38 s | Device: cuda:0 | Allocated: 12054.38 MB -> 12135.33 MB (net +80.95 MB) | Reserved: 12691.96 MB -> 13075.74 MB (net +383.78 MB) | Peak allocated: 15950.82 MB
2025-02-15 03:00:38,576 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933 | Time: 0.00 s | Device: cuda:0 | Allocated: 12135.33 MB -> 12423.41 MB (net +288.08 MB) | Reserved: 13075.74 MB -> 13077.84 MB (net +2.10 MB) | Peak allocated: 12639.58 MB
2025-02-15 03:00:38,670 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951 | Time: 0.09 s | Device: cuda:0 | Allocated: 12423.41 MB -> 12773.35 MB (net +349.93 MB) | Reserved: 13077.84 MB -> 13948.16 MB (net +870.32 MB) | Peak allocated: 13613.94 MB
2025-02-15 03:00:38,671 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928 | Time: 0.10 s | Device: cuda:0 | Allocated: 12135.33 MB -> 12773.35 MB (net +638.02 MB) | Reserved: 13075.74 MB -> 13948.16 MB (net +872.42 MB) | Peak allocated: 13613.94 MB
2025-02-15 03:00:38,723 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094 | Time: 0.05 s | Device: cuda:0 | Allocated: 13111.15 MB -> 13258.89 MB (net +147.74 MB) | Reserved: 13948.16 MB -> 14042.53 MB (net +94.37 MB) | Peak allocated: 13366.83 MB
2025-02-15 03:00:38,789 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC | File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395 | Time: 0.06 s | Device: cuda:0 | Allocated: 13351.85 MB -> 13498.95 MB (net +147.10 MB) | Reserved: 14042.53 MB -> 14042.53 MB (net +0.00 MB) | Peak allocated: 13498.95 MB
2025-02-15 03:00:38,791 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal | File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309 | Time: 1.86 s | Device: cuda:0 | Allocated: 11487.76 MB -> 13630.79 MB (net +2143.03 MB) | Reserved: 12475.96 MB -> 14042.53 MB (net +1566.57 MB) | Peak allocated: 13630.79 MB
2025-02-15 03:00:38,857 - logging.py:328 - warning_once - WARNING - The attention layers in this model are transitioning from computing the RoPE embeddings internally through `position_ids` (2D tensor with the indexes of the tokens), to using externally computed `position_embeddings` (Tuple of tensors, containing cos and sin). In v4.45 `position_ids` will be removed and `position_embeddings` will be mandatory.
2025-02-15 03:00:39,054 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianLlamaForCausalLM -> forward -> model.forward | File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390 | Time: 0.26 s | Device: cuda:0 | Allocated: 13630.79 MB -> 13854.27 MB (net +223.48 MB) | Reserved: 14042.53 MB -> 14835.25 MB (net +792.72 MB) | Peak allocated: 14052.12 MB
2025-02-15 03:00:39,068 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 5347, cut from 5349
2025-02-15 03:00:39,075 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:00:39,086 - resource_logging.py:148-158 - __exit__ - DEBUG - Section: CambrianLlamaForCausalLM -> forward -> lm_head, logits | File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456 | Time: 0.03 s | Device: cuda:0 | Allocated: 13854.27 MB -> 19387.34 MB (net +5533.07 MB) | Reserved: 14835.25 MB -> 21713.91 MB (net +6878.66 MB) | Peak allocated: 19387.34 MB
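The lm_head section is by far the largest allocation of the step, which is plausible: projecting the full multimodal sequence ("cut from 5349" above) onto the 128,256-token vocabulary in float32 materializes a ~2.6 GB logits buffer, so two such intermediates (e.g. an upcast plus a copy) would roughly account for the +5533 MB. A back-of-envelope check using only numbers from the log:

```python
seq_len, vocab_size, fp32_bytes = 5349, 128256, 4
logits_mb = seq_len * vocab_size * fp32_bytes / 1024**2
print(f"{logits_mb:.0f} MB")  # ~2617 MB per full-vocabulary float32 logits tensor
```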
resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:39,262 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:00:39,262 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:00:40,010 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:40,010 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:40,018 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:00:40,025 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:40,026 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 61, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:00:40,027 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:40,027 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 61, 3, 378, 378]), torch.float32, cuda:0]
2025-02-15 03:00:40,984 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino
2025-02-15 03:00:40,984 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871
2025-02-15 03:00:40,984 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.95 seconds
2025-02-15 03:00:40,984 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:40,984 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13394.29 MB
2025-02-15 03:00:40,984 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13610.16 MB
2025-02-15 03:00:40,984 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 215.88 MB
2025-02-15 03:00:40,984 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27216.84 MB
2025-02-15 03:00:40,984 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15030.29 MB
2025-02-15 03:00:40,984 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -12186.55 MB
2025-02-15 03:00:40,984 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22205.81 MB
2025-02-15 03:00:40,991 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame
2025-02-15 03:00:40,992 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877
2025-02-15 03:00:40,992 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:40,992 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:40,992 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13610.16 MB
2025-02-15 03:00:40,992 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13714.76 MB
2025-02-15 03:00:40,992 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 104.59 MB
2025-02-15 03:00:40,992 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15030.29 MB
2025-02-15 03:00:40,992 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 14814.28 MB
2025-02-15 03:00:40,992 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -216.01 MB
2025-02-15 03:00:40,992 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14038.63 MB
2025-02-15 03:00:41,302 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip
2025-02-15 03:00:41,302 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892
2025-02-15 03:00:41,302 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.31 seconds
2025-02-15 03:00:41,302 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,302 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13714.76 MB
2025-02-15 03:00:41,302 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13795.71 MB
2025-02-15 03:00:41,302 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 80.95 MB
2025-02-15 03:00:41,302 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 14814.28 MB
2025-02-15 03:00:41,302 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15053.36 MB
2025-02-15 03:00:41,302 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 239.08 MB
2025-02-15 03:00:41,302 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17611.20 MB
2025-02-15 03:00:41,311 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1
2025-02-15 03:00:41,311 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933
2025-02-15 03:00:41,311 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:41,311 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,311 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13795.64 MB
2025-02-15 03:00:41,311 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14083.73 MB
2025-02-15 03:00:41,311 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 288.08 MB
2025-02-15 03:00:41,311 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15053.36 MB
2025-02-15 03:00:41,311 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15055.45 MB
2025-02-15 03:00:41,311 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2.10 MB
2025-02-15 03:00:41,311 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14299.89 MB
2025-02-15 03:00:41,384 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group
2025-02-15 03:00:41,384 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951
2025-02-15 03:00:41,384 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.07 seconds
2025-02-15 03:00:41,384 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,384 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14083.73 MB
2025-02-15 03:00:41,385 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14434.45 MB
2025-02-15 03:00:41,385 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 350.72 MB
2025-02-15 03:00:41,385 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15055.45 MB
2025-02-15 03:00:41,385 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15707.67 MB
2025-02-15 03:00:41,385 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 652.21 MB
2025-02-15 03:00:41,385 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15275.05 MB
2025-02-15 03:00:41,386 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:00:41,386 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:00:41,386 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds
2025-02-15 03:00:41,386 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,386 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13795.64 MB
2025-02-15 03:00:41,386 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14434.45 MB
2025-02-15 03:00:41,386 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 638.80 MB
2025-02-15 03:00:41,386 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15053.36 MB
2025-02-15 03:00:41,386 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15707.67 MB
2025-02-15 03:00:41,386 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 654.31 MB
2025-02-15 03:00:41,386 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15275.05 MB
2025-02-15 03:00:41,439 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:00:41,439 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:00:41,439 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds
2025-02-15 03:00:41,439 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,439 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14773.16 MB
2025-02-15 03:00:41,439 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14920.89 MB
2025-02-15 03:00:41,439 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 147.74 MB
2025-02-15 03:00:41,439 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15707.67 MB
2025-02-15 03:00:41,439 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15802.04 MB
2025-02-15 03:00:41,439 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 94.37 MB
2025-02-15 03:00:41,439 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15028.83 MB
2025-02-15 03:00:41,447 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:00:41,448 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:00:41,448 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:41,448 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,448 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15013.85 MB
2025-02-15 03:00:41,448 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15160.83 MB
2025-02-15 03:00:41,448 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 146.97 MB
2025-02-15 03:00:41,448 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15802.04 MB
2025-02-15 03:00:41,448 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15802.04 MB
2025-02-15 03:00:41,448 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:00:41,448 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15160.83 MB
2025-02-15 03:00:41,450 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:00:41,450 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:00:41,450 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.42 seconds
2025-02-15 03:00:41,450 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,450 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13181.76 MB
2025-02-15 03:00:41,450 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15292.67 MB
2025-02-15 03:00:41,450 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2110.91 MB
2025-02-15 03:00:41,450 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27216.84 MB
2025-02-15 03:00:41,450 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15802.04 MB
2025-02-15 03:00:41,450 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -11414.80 MB
2025-02-15 03:00:41,450 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15292.67 MB
2025-02-15 03:00:41,657 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:00:41,657 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:00:41,657 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.20 seconds
2025-02-15 03:00:41,657 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,657 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15292.67 MB
2025-02-15 03:00:41,658 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 17268.97 MB
2025-02-15 03:00:41,658 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1976.30 MB
2025-02-15 03:00:41,658 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15802.04 MB
2025-02-15 03:00:41,658 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18444.45 MB
2025-02-15 03:00:41,658 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2642.41 MB
2025-02-15 03:00:41,658 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17467.07 MB
2025-02-15 03:00:41,671 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 5347, cut from 5349
2025-02-15 03:00:41,672 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:00:41,677 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:00:41,677 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:00:41,677 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds
2025-02-15 03:00:41,677 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:41,677 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 17268.97 MB
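[Editor's sketch] The eleven __exit__ lines that repeat throughout this log (resource_logging.py:148-158) come from a timing-and-memory context manager in resource_logging.py. Its implementation is not shown here; the code below is a minimal reconstruction that would emit exactly these fields, assuming the peak counter is reset on block entry (which would explain why "Peak allocated" can exceed both the before and after figures). Everything except the logged field labels, including the class name TrackResource, is an assumption.

import logging
import time

import torch

logger = logging.getLogger(__name__)
MB = 1024 ** 2

class TrackResource:
    # Hypothetical stand-in for the context manager in resource_logging.py;
    # only the logged field labels are taken from this log.
    def __init__(self, name, file="Unknown", line="Unknown", device="cuda:0"):
        self.name, self.file, self.line, self.device = name, file, line, device

    def __enter__(self):
        # Assumed: per-block reset, so "Peak allocated" is the peak inside the block.
        torch.cuda.reset_peak_memory_stats(self.device)
        self.start = time.time()
        self.alloc_before = torch.cuda.memory_allocated(self.device)
        self.reserved_before = torch.cuda.memory_reserved(self.device)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        alloc_after = torch.cuda.memory_allocated(self.device)
        reserved_after = torch.cuda.memory_reserved(self.device)
        logger.debug("Section name: %s", self.name)
        logger.debug("File: %s, Line: %s", self.file, self.line)
        logger.debug("Time: %.2f seconds", time.time() - self.start)
        logger.debug("Device: %s", self.device)
        logger.debug("Allocated before block: %.2f MB", self.alloc_before / MB)
        logger.debug("Allocated after block: %.2f MB", alloc_after / MB)
        logger.debug("Net allocated change: %.2f MB", (alloc_after - self.alloc_before) / MB)
        logger.debug("Reserved before block: %.2f MB", self.reserved_before / MB)
        logger.debug("Reserved after block: %.2f MB", reserved_after / MB)
        logger.debug("Net reserved change: %.2f MB", (reserved_after - self.reserved_before) / MB)
        logger.debug("Peak allocated: %.2f MB", torch.cuda.max_memory_allocated(self.device) / MB)

The large negative "Net reserved change" readings (e.g. -12186.55 MB around encode_images:dino) show the reserved pool shrinking between sections; since PyTorch's caching allocator does not normally return blocks to the driver on its own, this is consistent with an explicit torch.cuda.empty_cache() call somewhere in the block, though the log does not show one.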
2025-02-15 03:00:41,677 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 22802.04 MB
2025-02-15 03:00:41,677 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 5533.07 MB
2025-02-15 03:00:41,677 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18444.45 MB
2025-02-15 03:00:41,677 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 25323.11 MB
2025-02-15 03:00:41,678 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6878.66 MB
2025-02-15 03:00:41,678 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22802.04 MB
2025-02-15 03:00:41,843 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 5139]
2025-02-15 03:00:41,846 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:41,846 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:41,847 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:41,848 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:00:41,855 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:00:41,857 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:41,857 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:00:41,857 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:00:53,237 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:53,237 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:53,242 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:00:53,243 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:53,243 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 60, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:00:53,244 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:53,244 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 60, 3, 378, 378]), torch.float32, cuda:0]
2025-02-15 03:00:54,169 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino
2025-02-15 03:00:54,169 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871
2025-02-15 03:00:54,169 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.92 seconds
2025-02-15 03:00:54,169 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,169 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18525.01 MB
2025-02-15 03:00:54,169 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18737.34 MB
2025-02-15 03:00:54,169 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 212.34 MB
2025-02-15 03:00:54,169 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30828.13 MB
2025-02-15 03:00:54,169 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 22680.70 MB
2025-02-15 03:00:54,169 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -8147.44 MB
2025-02-15 03:00:54,169 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27193.53 MB
2025-02-15 03:00:54,175 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame
2025-02-15 03:00:54,175 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877
2025-02-15 03:00:54,175 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:54,176 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,176 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18737.34 MB
2025-02-15 03:00:54,176 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18840.22 MB
2025-02-15 03:00:54,176 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 102.88 MB
2025-02-15 03:00:54,176 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22680.70 MB
2025-02-15 03:00:54,176 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19929.24 MB
2025-02-15 03:00:54,176 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -2751.46 MB
2025-02-15 03:00:54,176 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19158.79 MB
2025-02-15 03:00:54,465 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip
2025-02-15 03:00:54,465 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892
2025-02-15 03:00:54,465 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.29 seconds
2025-02-15 03:00:54,465 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,465 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18840.22 MB
2025-02-15 03:00:54,465 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18919.85 MB
2025-02-15 03:00:54,465 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 79.63 MB
2025-02-15 03:00:54,465 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19929.24 MB
2025-02-15 03:00:54,465 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20306.72 MB
2025-02-15 03:00:54,465 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 377.49 MB
2025-02-15 03:00:54,465 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22672.47 MB
2025-02-15 03:00:54,470 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1
2025-02-15 03:00:54,470 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933
2025-02-15 03:00:54,470 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:54,470 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,470 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18919.78 MB
2025-02-15 03:00:54,470 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19203.14 MB
2025-02-15 03:00:54,470 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 283.36 MB
2025-02-15 03:00:54,470 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20306.72 MB
2025-02-15 03:00:54,470 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20308.82 MB
2025-02-15 03:00:54,470 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2.10 MB
2025-02-15 03:00:54,470 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19415.76 MB
2025-02-15 03:00:54,529 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group
2025-02-15 03:00:54,529 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951
2025-02-15 03:00:54,530 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds
2025-02-15 03:00:54,530 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,530 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19203.14 MB
2025-02-15 03:00:54,530 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19547.34 MB
2025-02-15 03:00:54,530 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 344.20 MB
2025-02-15 03:00:54,530 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20308.82 MB
2025-02-15 03:00:54,530 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20950.55 MB
2025-02-15 03:00:54,530 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 641.73 MB
2025-02-15 03:00:54,530 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20376.67 MB
2025-02-15 03:00:54,530 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:00:54,530 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:00:54,530 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds
2025-02-15 03:00:54,530 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,530 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18919.78 MB
2025-02-15 03:00:54,530 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19547.34 MB
2025-02-15 03:00:54,530 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 627.56 MB
2025-02-15 03:00:54,530 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20306.72 MB
2025-02-15 03:00:54,530 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20950.55 MB
2025-02-15 03:00:54,530 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 643.83 MB
2025-02-15 03:00:54,530 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20376.67 MB
2025-02-15 03:00:54,562 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:00:54,562 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:00:54,562 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds
2025-02-15 03:00:54,562 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,562 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19700.69 MB
2025-02-15 03:00:54,562 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19845.76 MB
2025-02-15 03:00:54,562 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 145.07 MB
2025-02-15 03:00:54,562 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20950.55 MB
2025-02-15 03:00:54,562 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21042.82 MB
2025-02-15 03:00:54,562 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 92.27 MB
2025-02-15 03:00:54,562 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19951.93 MB
2025-02-15 03:00:54,567 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:00:54,567 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:00:54,567 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:54,567 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,567 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19937.20 MB
2025-02-15 03:00:54,567 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20066.10 MB
2025-02-15 03:00:54,567 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 128.90 MB
2025-02-15 03:00:54,567 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21042.82 MB
2025-02-15 03:00:54,567 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21042.82 MB
2025-02-15 03:00:54,567 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:00:54,567 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20066.10 MB
2025-02-15 03:00:54,568 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:00:54,568 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:00:54,568 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.32 seconds
2025-02-15 03:00:54,568 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,568 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18315.96 MB
2025-02-15 03:00:54,568 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20195.88 MB
2025-02-15 03:00:54,568 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1879.91 MB
2025-02-15 03:00:54,568 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 30828.13 MB
2025-02-15 03:00:54,568 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21042.82 MB
2025-02-15 03:00:54,568 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -9785.31 MB
2025-02-15 03:00:54,568 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20195.88 MB
2025-02-15 03:00:54,730 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:00:54,730 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:00:54,730 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds
2025-02-15 03:00:54,730 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,730 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20195.88 MB
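[Editor's sketch] The debug_tensor helper (resource_logging.py:42-45) prints a tensor summary as [shape, dtype, device]; line 44 covers the numpy/CPU variant that appears later in compute_metrics, and the preceding "File: Unknown, Line: Unknown" header shows its caller lookup falling back. A plausible reconstruction, with every implementation detail assumed:

import inspect
import logging

import numpy as np
import torch

logger = logging.getLogger(__name__)

def debug_tensor(name, value):
    # Assumed caller lookup; in this log it evidently fails and prints "Unknown".
    frame = inspect.currentframe().f_back
    info = inspect.getframeinfo(frame) if frame else None
    logger.debug("File: %s, Line: %s",
                 info.filename if info else "Unknown",
                 info.lineno if info else "Unknown")
    if isinstance(value, torch.Tensor):
        # e.g. "outs: [torch.Size([1, 12]), torch.int64, cuda:0]"
        logger.debug("%s: [%s, %s, %s]", name, value.shape, value.dtype, value.device)
    elif isinstance(value, np.ndarray):
        # e.g. "inputs[0]: [(8192,), int64, CPU]"
        logger.debug("%s: [%s, %s, CPU]", name, value.shape, value.dtype)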
2025-02-15 03:00:54,730 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20260.72 MB
2025-02-15 03:00:54,730 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 64.84 MB
2025-02-15 03:00:54,730 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21042.82 MB
2025-02-15 03:00:54,730 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21218.98 MB
2025-02-15 03:00:54,730 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 176.16 MB
2025-02-15 03:00:54,730 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20649.78 MB
2025-02-15 03:00:54,742 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 5263, cut from 5265
2025-02-15 03:00:54,742 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The video rate for this video is 2 (']
2025-02-15 03:00:54,747 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:00:54,747 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:00:54,747 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds
2025-02-15 03:00:54,747 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:54,747 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20260.72 MB
2025-02-15 03:00:54,747 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 21312.91 MB
2025-02-15 03:00:54,747 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1052.19 MB
2025-02-15 03:00:54,747 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21218.98 MB
2025-02-15 03:00:54,747 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 27988.59 MB
2025-02-15 03:00:54,747 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6769.61 MB
2025-02-15 03:00:54,747 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 24321.76 MB
2025-02-15 03:00:54,851 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 5055]
2025-02-15 03:00:54,852 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,852 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:54,853 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,853 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:00:54,858 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:00:54,859 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,859 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:00:54,859 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The video rate for this video is 2 (']
2025-02-15 03:00:54,860 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,860 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:54,860 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,860 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:54,866 - mm_trainer.py:995 - prediction_step - DEBUG - Assistant token at position 295
2025-02-15 03:00:54,867 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,867 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:54,867 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,867 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:54,867 - mm_trainer.py:767 - evaluation_loop - DEBUG - main_input_name: input_ids
2025-02-15 03:00:54,868 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,868 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['input_ids']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:54,868 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,868 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['attention_mask']: [torch.Size([1, 8192]), torch.bool, cuda:0]
2025-02-15 03:00:54,868 - mm_trainer.py:773 - evaluation_loop - DEBUG - type(inputs_decode):
2025-02-15 03:00:54,869 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,869 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs_decode: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:54,873 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,873 - resource_logging.py:45 - debug_tensor - DEBUG - Before accelerator.pad_across_processes: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:54,874 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,874 - resource_logging.py:45 - debug_tensor - DEBUG - Before gather_function: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:54,876 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,876 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:54,877 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,877 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:54,887 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,887 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:54,894 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:00:54,895 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,896 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 107, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:00:54,897 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:54,897 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 107, 3, 378, 378]), torch.float32, cuda:0]
2025-02-15 03:00:56,540 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino
2025-02-15 03:00:56,540 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871
2025-02-15 03:00:56,540 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.64 seconds
2025-02-15 03:00:56,540 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:56,540 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18974.17 MB
2025-02-15 03:00:56,540 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19352.84 MB
2025-02-15 03:00:56,540 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 378.67 MB
2025-02-15 03:00:56,540 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27988.59 MB
2025-02-15 03:00:56,540 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20449.33 MB
2025-02-15 03:00:56,540 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -7539.26 MB
2025-02-15 03:00:56,540 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 28219.05 MB
2025-02-15 03:00:56,544 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame
2025-02-15 03:00:56,544 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877
2025-02-15 03:00:56,544 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:56,544 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:56,544 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19352.84 MB
2025-02-15 03:00:56,544 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19537.22 MB
2025-02-15 03:00:56,544 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 184.38 MB
2025-02-15 03:00:56,544 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20449.33 MB
2025-02-15 03:00:56,544 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20828.91 MB
2025-02-15 03:00:56,544 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 379.58 MB
2025-02-15 03:00:56,544 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20105.28 MB
2025-02-15 03:00:57,057 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip
2025-02-15 03:00:57,057 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892
2025-02-15 03:00:57,057 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.51 seconds
2025-02-15 03:00:57,057 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,057 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19537.22 MB
2025-02-15 03:00:57,057 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19679.22 MB
2025-02-15 03:00:57,057 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 142.00 MB
2025-02-15 03:00:57,057 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20828.91 MB
2025-02-15 03:00:57,057 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20828.91 MB
2025-02-15 03:00:57,057 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:00:57,057 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23626.04 MB
2025-02-15 03:00:57,063 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1
2025-02-15 03:00:57,063 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933
2025-02-15 03:00:57,063 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:57,063 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,063 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19679.22 MB
2025-02-15 03:00:57,063 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20184.55 MB
2025-02-15 03:00:57,063 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 505.33 MB
2025-02-15 03:00:57,063 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20828.91 MB
2025-02-15 03:00:57,063 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21338.52 MB
2025-02-15 03:00:57,063 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 509.61 MB
2025-02-15 03:00:57,063 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20563.72 MB
2025-02-15 03:00:57,169 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group
2025-02-15 03:00:57,169 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951
2025-02-15 03:00:57,169 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.10 seconds
2025-02-15 03:00:57,169 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,169 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20184.55 MB
2025-02-15 03:00:57,169 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20798.32 MB
2025-02-15 03:00:57,170 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 613.77 MB
2025-02-15 03:00:57,170 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21338.52 MB
2025-02-15 03:00:57,170 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 22988.98 MB
2025-02-15 03:00:57,170 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1650.46 MB
2025-02-15 03:00:57,170 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22267.34 MB
2025-02-15 03:00:57,170 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:00:57,170 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:00:57,170 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.11 seconds
2025-02-15 03:00:57,170 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,170 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19679.22 MB
2025-02-15 03:00:57,170 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20798.32 MB
2025-02-15 03:00:57,170 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1119.10 MB
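[Editor's sketch] The debug_tensor shapes make it easy to sanity-check the allocator figures: a dense tensor's footprint is just the product of its dimensions times the element size. A small helper (names assumed, arithmetic exact):

from functools import reduce
from operator import mul

import torch

ELEMENT_BYTES = {torch.float32: 4, torch.float16: 2, torch.int64: 8, torch.bool: 1}

def footprint_mb(shape, dtype):
    # MiB, matching the MB convention used by the resource_logging output.
    return reduce(mul, shape, 1) * ELEMENT_BYTES[dtype] / 2**20

print(footprint_mb((1, 107, 3, 378, 378), torch.float32))  # images_1 -> ~174.96 MB
print(footprint_mb((1, 237, 128256), torch.float32))       # orig_logits -> ~115.96 MB

By the same arithmetic, float32 logits over the full 8192-token window would cost 8192 x 128256 x 4 bytes, roughly 4.0 GB, which is the scale of the multi-GB swings logged around the "lm_head, logits" blocks.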
2025-02-15 03:00:57,170 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20828.91 MB
2025-02-15 03:00:57,170 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 22988.98 MB
2025-02-15 03:00:57,170 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2160.07 MB
2025-02-15 03:00:57,170 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22267.34 MB
2025-02-15 03:00:57,220 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:00:57,220 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:00:57,220 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds
2025-02-15 03:00:57,220 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,220 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21071.80 MB
2025-02-15 03:00:57,220 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16404.61 MB
2025-02-15 03:00:57,220 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -4667.18 MB
2025-02-15 03:00:57,220 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22988.98 MB
2025-02-15 03:00:57,220 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23060.28 MB
2025-02-15 03:00:57,221 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 71.30 MB
2025-02-15 03:00:57,221 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21233.14 MB
2025-02-15 03:00:57,227 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:00:57,227 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:00:57,227 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:00:57,227 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,227 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16567.66 MB
2025-02-15 03:00:57,227 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16767.30 MB
2025-02-15 03:00:57,227 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 199.64 MB
2025-02-15 03:00:57,227 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23060.28 MB
2025-02-15 03:00:57,227 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23060.28 MB
2025-02-15 03:00:57,227 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:00:57,227 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16767.30 MB
2025-02-15 03:00:57,228 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:00:57,229 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:00:57,229 - resource_logging.py:150 - __exit__ - DEBUG - Time: 2.33 seconds
2025-02-15 03:00:57,229 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,229 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 21929.31 MB
2025-02-15 03:00:57,229 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16968.20 MB
2025-02-15 03:00:57,229 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: -4961.11 MB
2025-02-15 03:00:57,229 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 27988.59 MB
2025-02-15 03:00:57,229 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23060.28 MB
2025-02-15 03:00:57,229 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -4928.31 MB
2025-02-15 03:00:57,229 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16968.20 MB
2025-02-15 03:00:57,490 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:00:57,490 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:00:57,490 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.26 seconds
2025-02-15 03:00:57,490 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,490 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14061.01 MB
2025-02-15 03:00:57,490 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14161.39 MB
2025-02-15 03:00:57,490 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 100.38 MB
2025-02-15 03:00:57,490 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23060.28 MB
2025-02-15 03:00:57,490 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 23060.28 MB
2025-02-15 03:00:57,490 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:00:57,490 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14763.67 MB
2025-02-15 03:00:57,508 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8155, cut from 8157
2025-02-15 03:00:57,508 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:00:57,514 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:00:57,514 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:00:57,514 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds
2025-02-15 03:00:57,514 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:00:57,514 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14161.39 MB
2025-02-15 03:00:57,514 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18352.28 MB
2025-02-15 03:00:57,514 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4190.89 MB
2025-02-15 03:00:57,514 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 23060.28 MB
2025-02-15 03:00:57,514 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 31444.70 MB
2025-02-15 03:00:57,514 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 8384.41 MB
2025-02-15 03:00:57,515 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22542.66 MB
2025-02-15 03:00:57,675 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7947]
2025-02-15 03:00:57,677 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,677 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
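[Editor's sketch] The "Found assistant token at index 8155, cut from 8157" lines (cambrian_llama.py:481) show the forward pass locating the assistant marker in the token sequence and keeping only what follows, two positions past the marker. The actual function and the marker's token id are not shown in this log; the sketch below merely reproduces the reported arithmetic.

import torch

def cut_after_assistant(ids: torch.Tensor, assistant_token_id: int) -> torch.Tensor:
    # ids: 1-D token sequence; assistant_token_id: hypothetical marker id.
    hits = (ids == assistant_token_id).nonzero(as_tuple=True)[0]
    idx = int(hits[-1])   # "Found assistant token at index idx"
    return ids[idx + 2:]  # "cut from idx + 2", skipping the marker and its separator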
2025-02-15 03:00:57,677 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,677 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:00:57,682 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:00:57,683 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,683 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:00:57,683 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:00:57,684 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,684 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:57,684 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,684 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:57,690 - mm_trainer.py:995 - prediction_step - DEBUG - Assistant token at position 295
2025-02-15 03:00:57,691 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,691 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:57,691 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,691 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:57,691 - mm_trainer.py:767 - evaluation_loop - DEBUG - main_input_name: input_ids
2025-02-15 03:00:57,692 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,692 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['input_ids']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:57,692 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,692 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['attention_mask']: [torch.Size([1, 8192]), torch.bool, cuda:0]
2025-02-15 03:00:57,692 - mm_trainer.py:773 - evaluation_loop - DEBUG - type(inputs_decode):
2025-02-15 03:00:57,693 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,693 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs_decode: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:57,696 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,696 - resource_logging.py:45 - debug_tensor - DEBUG - Before accelerator.pad_across_processes: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:57,697 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,697 - resource_logging.py:45 - debug_tensor - DEBUG - Before gather_function: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:57,698 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,698 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:00:57,699 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,699 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:00:57,805 - finetune_llama.py:467 - compute_metrics - INFO - In compute_metrics()
2025-02-15 03:00:57,806 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,806 - resource_logging.py:44 - debug_tensor - DEBUG - inputs[0]: [(8192,), int64, CPU]
2025-02-15 03:00:57,806 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,806 - resource_logging.py:44 - debug_tensor - DEBUG - inputs[1]: [(8192,), int64, CPU]
2025-02-15 03:00:57,807 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,807 - resource_logging.py:44 - debug_tensor - DEBUG - masks[0]: [(8192,), bool, CPU]
2025-02-15 03:00:57,807 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,807 - resource_logging.py:44 - debug_tensor - DEBUG - masks[1]: [(8192,), bool, CPU]
2025-02-15 03:00:57,808 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,808 - resource_logging.py:45 - debug_tensor - DEBUG - preds: [torch.Size([2, 237, 128256]), torch.float32, cpu]
2025-02-15 03:00:57,809 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,809 - resource_logging.py:45 - debug_tensor - DEBUG - labels: [torch.Size([2, 8192]), torch.int64, cpu]
2025-02-15 03:00:57,809 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,809 - resource_logging.py:45 - debug_tensor - DEBUG - attention_mask: [torch.Size([2, 8192]), torch.bool, cpu]
2025-02-15 03:00:57,810 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:00:57,810 - resource_logging.py:45 - debug_tensor - DEBUG - input_ids: [torch.Size([2, 8192]), torch.int64, cpu]
2025-02-15 03:00:57,811 - finetune_llama.py:501 - compute_metrics - DEBUG - batch 0: output_range=[225, 237]
2025-02-15 03:00:57,814 - finetune_llama.py:504 - compute_metrics - DEBUG - batch 0: cur_outputs=tensor([[ 791, 2835, 4478, 369, 420, 2835, 374, 220, 17, 320, 128009, 128006]])
2025-02-15 03:00:57,814 - finetune_llama.py:507 - compute_metrics - DEBUG - batch 0: decoded_outputs=['The video rate for this video is 2 (']
2025-02-15 03:00:57,814 - finetune_llama.py:509 - compute_metrics - DEBUG - batch 0: decoded_labels=['\n\nThe engagement label of the video is 2.']
2025-02-15 03:00:57,816 - finetune_llama.py:501 - compute_metrics - DEBUG - batch 1: output_range=[225, 237]
2025-02-15 03:00:57,818 - finetune_llama.py:504 - compute_metrics - DEBUG - batch 1: cur_outputs=tensor([[ 17, 1620, 4478, 369, 420, 2835, 374, 220, 17, 320, 128009, 128006]])
2025-02-15 03:00:57,818 - finetune_llama.py:507 - compute_metrics - DEBUG - batch 1: decoded_outputs=['2 final rate for this video is 2 (']
2025-02-15 03:00:57,818 - finetune_llama.py:509 - compute_metrics - DEBUG - batch 1: decoded_labels=['\n\nThe engagement label of the video is 2.']
2025-02-15 03:00:57,818 - finetune_llama.py:518 - compute_metrics - DEBUG - pred_labels=[2, 2]
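[Editor's sketch] compute_metrics evidently reduces each decoded string to an integer engagement label: both 'The video rate for this video is 2 (' and '\n\nThe engagement label of the video is 2.' map to 2 (the trailing ids 128009 and 128006 in cur_outputs are Llama-3's <|eot_id|> and <|start_header_id|>). The real parser in finetune_llama.py is not shown; a minimal stand-in that reproduces pred_labels=[2, 2] and gold_labels=[2, 2]:

import re

def extract_label(text):
    # Hypothetical: take the last integer in the decoded string.
    match = re.search(r"(\d+)(?!.*\d)", text)
    return int(match.group(1)) if match else None

decoded_outputs = ["2 final rate for this video is 2 ("]
decoded_labels = ["\n\nThe engagement label of the video is 2."]
pred_labels = [extract_label(s) for s in decoded_outputs]  # [2]
gold_labels = [extract_label(s) for s in decoded_labels]   # [2]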
compute_metrics - DEBUG - pred_labels=[2, 2] 2025-02-15 03:00:57,818 - finetune_llama.py:519 - compute_metrics - DEBUG - gold_labels=[2, 2] 2025-02-15 03:01:06,059 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-15 03:01:06,059 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0] 2025-02-15 03:01:06,067 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224 2025-02-15 03:01:06,073 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-15 03:01:06,074 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 61, 3, 384, 384]), torch.float32, cuda:0] 2025-02-15 03:01:06,075 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown 2025-02-15 03:01:06,075 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 61, 3, 378, 378]), torch.float32, cuda:0] 2025-02-15 03:01:07,034 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino 2025-02-15 03:01:07,034 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871 2025-02-15 03:01:07,034 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.95 seconds 2025-02-15 03:01:07,034 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:01:07,034 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18686.48 MB 2025-02-15 03:01:07,034 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18902.36 MB 2025-02-15 03:01:07,034 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 215.88 MB 2025-02-15 03:01:07,034 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31444.70 MB 2025-02-15 03:01:07,034 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20346.57 MB 2025-02-15 03:01:07,034 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -11098.13 MB 2025-02-15 03:01:07,034 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27498.01 MB 2025-02-15 03:01:07,041 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame 2025-02-15 03:01:07,041 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877 2025-02-15 03:01:07,041 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-15 03:01:07,041 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:01:07,041 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18902.36 MB 2025-02-15 03:01:07,041 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19006.95 MB 2025-02-15 03:01:07,041 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 104.59 MB 2025-02-15 03:01:07,042 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20346.57 MB 2025-02-15 03:01:07,042 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19966.98 MB 2025-02-15 03:01:07,042 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -379.58 MB 2025-02-15 03:01:07,042 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19330.83 MB 2025-02-15 03:01:07,347 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> 
prepare_inputs_labels_for_multimodal -> encode_images:siglip 2025-02-15 03:01:07,348 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892 2025-02-15 03:01:07,348 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.30 seconds 2025-02-15 03:01:07,348 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:01:07,348 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19006.89 MB 2025-02-15 03:01:07,348 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19087.84 MB 2025-02-15 03:01:07,348 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 80.95 MB 2025-02-15 03:01:07,348 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19966.98 MB 2025-02-15 03:01:07,348 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20206.06 MB 2025-02-15 03:01:07,348 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 239.08 MB 2025-02-15 03:01:07,348 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22903.94 MB 2025-02-15 03:01:07,355 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1 2025-02-15 03:01:07,355 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933 2025-02-15 03:01:07,355 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds 2025-02-15 03:01:07,355 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:01:07,355 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19087.84 MB 2025-02-15 03:01:07,355 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19375.92 MB 2025-02-15 03:01:07,355 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 288.08 MB 2025-02-15 03:01:07,356 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20206.06 MB 2025-02-15 03:01:07,356 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20206.06 MB 2025-02-15 03:01:07,356 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB 2025-02-15 03:01:07,356 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19592.09 MB 2025-02-15 03:01:07,430 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group 2025-02-15 03:01:07,430 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951 2025-02-15 03:01:07,430 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.07 seconds 2025-02-15 03:01:07,430 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0 2025-02-15 03:01:07,430 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19375.92 MB 2025-02-15 03:01:07,430 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19727.13 MB 2025-02-15 03:01:07,430 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 351.21 MB 2025-02-15 03:01:07,430 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20206.06 MB 2025-02-15 03:01:07,430 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21074.28 MB 2025-02-15 03:01:07,430 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 868.22 MB 2025-02-15 03:01:07,430 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20567.73 MB 2025-02-15 
2025-02-15 03:01:07,432 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:01:07,432 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:01:07,432 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds
2025-02-15 03:01:07,432 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:07,432 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19087.84 MB
2025-02-15 03:01:07,432 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19727.13 MB
2025-02-15 03:01:07,432 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 639.29 MB
2025-02-15 03:01:07,432 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20206.06 MB
2025-02-15 03:01:07,432 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21074.28 MB
2025-02-15 03:01:07,432 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 868.22 MB
2025-02-15 03:01:07,432 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20567.73 MB
2025-02-15 03:01:07,482 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:01:07,482 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:01:07,482 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds
2025-02-15 03:01:07,482 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:07,482 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20065.84 MB
2025-02-15 03:01:07,482 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20213.58 MB
2025-02-15 03:01:07,482 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 147.74 MB
2025-02-15 03:01:07,483 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21074.28 MB
2025-02-15 03:01:07,483 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21147.68 MB
2025-02-15 03:01:07,483 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 73.40 MB
2025-02-15 03:01:07,483 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20321.52 MB
2025-02-15 03:01:07,491 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:01:07,491 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:01:07,491 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:07,491 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:07,491 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20306.54 MB
2025-02-15 03:01:07,491 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20453.38 MB
2025-02-15 03:01:07,491 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 146.84 MB
2025-02-15 03:01:07,491 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21147.68 MB
2025-02-15 03:01:07,491 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21147.68 MB
2025-02-15 03:01:07,491 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:07,491 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20453.38 MB
2025-02-15 03:01:07,493 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:01:07,493 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:01:07,493 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.41 seconds
2025-02-15 03:01:07,493 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:07,493 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18473.96 MB
2025-02-15 03:01:07,493 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20585.22 MB
2025-02-15 03:01:07,493 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2111.26 MB
2025-02-15 03:01:07,493 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 31444.70 MB
2025-02-15 03:01:07,493 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21147.68 MB
2025-02-15 03:01:07,493 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -10297.02 MB
2025-02-15 03:01:07,493 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20585.22 MB
2025-02-15 03:01:07,687 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:01:07,687 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:01:07,687 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.19 seconds
2025-02-15 03:01:07,687 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:07,687 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18830.22 MB
2025-02-15 03:01:07,687 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20806.52 MB
2025-02-15 03:01:07,687 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1976.30 MB
2025-02-15 03:01:07,687 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21147.68 MB
2025-02-15 03:01:07,687 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21852.32 MB
2025-02-15 03:01:07,687 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 704.64 MB
2025-02-15 03:01:07,687 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21004.37 MB
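The blocks above come from a timing/memory context manager in resource_logging.py whose __exit__ emits one DEBUG record per field. The sketch below is a minimal reconstruction from those fields only; the class name, constructor arguments, and call-site reporting are assumptions, not the project's actual implementation.

```python
import logging
import time

import torch

logger = logging.getLogger(__name__)


class LogResourceUsage:
    """Context manager that logs wall time and CUDA memory around a block.

    Minimal sketch matching the fields printed by resource_logging.py's
    __exit__; the real implementation in LongVidLLaMA may differ.
    """

    def __init__(self, section_name: str, device: str = "cuda:0") -> None:
        self.section_name = section_name
        self.device = device

    def __enter__(self):
        torch.cuda.reset_peak_memory_stats(self.device)  # so "Peak allocated" is per-block
        self.alloc_before = torch.cuda.memory_allocated(self.device)
        self.reserved_before = torch.cuda.memory_reserved(self.device)
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        elapsed = time.perf_counter() - self.start
        alloc_after = torch.cuda.memory_allocated(self.device)
        reserved_after = torch.cuda.memory_reserved(self.device)
        peak = torch.cuda.max_memory_allocated(self.device)
        mb = 1024 ** 2
        logger.debug("Section name: %s", self.section_name)
        logger.debug("Time: %.2f seconds", elapsed)
        logger.debug("Device: %s", self.device)
        logger.debug("Allocated before block: %.2f MB", self.alloc_before / mb)
        logger.debug("Allocated after block: %.2f MB", alloc_after / mb)
        logger.debug("Net allocated change: %.2f MB", (alloc_after - self.alloc_before) / mb)
        logger.debug("Reserved before block: %.2f MB", self.reserved_before / mb)
        logger.debug("Reserved after block: %.2f MB", reserved_after / mb)
        logger.debug("Net reserved change: %.2f MB", (reserved_after - self.reserved_before) / mb)
        logger.debug("Peak allocated: %.2f MB", peak / mb)
```

Under this reading, the strongly negative "Net reserved change" values seen above (e.g. -10297.02 MB) would correspond to the CUDA caching allocator returning reserved-but-unused memory to the driver during the block, for instance after a torch.cuda.empty_cache() call.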
2025-02-15 03:01:07,701 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 5347, cut from 5349
2025-02-15 03:01:07,701 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:01:07,707 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:01:07,707 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:01:07,707 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds
2025-02-15 03:01:07,707 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:07,707 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20806.52 MB
2025-02-15 03:01:07,707 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26339.59 MB
2025-02-15 03:01:07,707 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 5533.07 MB
2025-02-15 03:01:07,707 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21852.32 MB
2025-02-15 03:01:07,707 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 28730.98 MB
2025-02-15 03:01:07,707 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6878.66 MB
2025-02-15 03:01:07,707 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26339.59 MB
2025-02-15 03:01:07,872 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 5139]
2025-02-15 03:01:07,874 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:07,874 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:07,876 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:07,876 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:01:07,883 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:01:07,885 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:07,886 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:01:07,886 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:01:08,481 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:08,481 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:08,488 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:01:08,493 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:08,493 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 61, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:01:08,495 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:08,495 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 61, 3, 378, 378]), torch.float32, cuda:0]
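The debug_tensor records interleaved above log each tensor as name: [shape, dtype, device]. A plausible helper matching that output format (hypothetical; the "File: Unknown, Line: Unknown" lines suggest the real helper in resource_logging.py also tries, and here fails, to report its call site from the stack):

```python
import logging

import torch

logger = logging.getLogger(__name__)


def debug_tensor(name: str, t: torch.Tensor) -> None:
    """Log a tensor as "name: [shape, dtype, device]".

    Sketch of the helper behind the debug_tensor records, e.g.
    "orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]".
    """
    logger.debug("%s: [%s, %s, %s]", name, t.shape, t.dtype, t.device)
```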
2025-02-15 03:01:09,460 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino
2025-02-15 03:01:09,460 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871
2025-02-15 03:01:09,460 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.96 seconds
2025-02-15 03:01:09,460 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,460 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18686.48 MB
2025-02-15 03:01:09,460 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18902.36 MB
2025-02-15 03:01:09,460 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 215.88 MB
2025-02-15 03:01:09,460 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34233.91 MB
2025-02-15 03:01:09,460 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19966.98 MB
2025-02-15 03:01:09,460 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -14266.93 MB
2025-02-15 03:01:09,460 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 27498.01 MB
2025-02-15 03:01:09,464 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame
2025-02-15 03:01:09,464 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877
2025-02-15 03:01:09,464 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:09,465 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,465 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18902.36 MB
2025-02-15 03:01:09,465 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19006.95 MB
2025-02-15 03:01:09,465 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 104.59 MB
2025-02-15 03:01:09,465 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19966.98 MB
2025-02-15 03:01:09,465 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 19966.98 MB
2025-02-15 03:01:09,465 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:09,465 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19330.83 MB
2025-02-15 03:01:09,774 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip
2025-02-15 03:01:09,774 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892
2025-02-15 03:01:09,774 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.31 seconds
2025-02-15 03:01:09,774 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,775 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19006.95 MB
2025-02-15 03:01:09,775 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19087.91 MB
2025-02-15 03:01:09,775 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 80.95 MB
2025-02-15 03:01:09,775 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 19966.98 MB
2025-02-15 03:01:09,775 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20350.76 MB
2025-02-15 03:01:09,775 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 383.78 MB
2025-02-15 03:01:09,775 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22904.00 MB
2025-02-15 03:01:09,782 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1
2025-02-15 03:01:09,783 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933
2025-02-15 03:01:09,783 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:09,783 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,783 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19087.84 MB
2025-02-15 03:01:09,783 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19375.92 MB
2025-02-15 03:01:09,783 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 288.08 MB
2025-02-15 03:01:09,783 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20350.76 MB
2025-02-15 03:01:09,783 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 20350.76 MB
2025-02-15 03:01:09,783 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:09,783 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19592.09 MB
2025-02-15 03:01:09,860 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group
2025-02-15 03:01:09,860 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951
2025-02-15 03:01:09,860 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.07 seconds
2025-02-15 03:01:09,860 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,860 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19375.92 MB
2025-02-15 03:01:09,860 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19726.64 MB
2025-02-15 03:01:09,860 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 350.72 MB
2025-02-15 03:01:09,860 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20350.76 MB
2025-02-15 03:01:09,860 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21074.28 MB
2025-02-15 03:01:09,860 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 723.52 MB
2025-02-15 03:01:09,860 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20567.73 MB
2025-02-15 03:01:09,862 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:01:09,862 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:01:09,862 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.08 seconds
2025-02-15 03:01:09,862 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,862 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 19087.84 MB
2025-02-15 03:01:09,862 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 19726.64 MB
2025-02-15 03:01:09,862 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 638.80 MB
2025-02-15 03:01:09,862 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 20350.76 MB
2025-02-15 03:01:09,862 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21074.28 MB
2025-02-15 03:01:09,862 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 723.52 MB
2025-02-15 03:01:09,862 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20567.73 MB
2025-02-15 03:01:09,913 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:01:09,913 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:01:09,913 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds
2025-02-15 03:01:09,913 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,913 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20065.35 MB
2025-02-15 03:01:09,913 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20213.09 MB
2025-02-15 03:01:09,913 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 147.74 MB
2025-02-15 03:01:09,913 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21074.28 MB
2025-02-15 03:01:09,913 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21147.68 MB
2025-02-15 03:01:09,913 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 73.40 MB
2025-02-15 03:01:09,913 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20321.03 MB
2025-02-15 03:01:09,921 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:01:09,921 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:01:09,921 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:09,921 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,921 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20306.05 MB
2025-02-15 03:01:09,921 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20452.76 MB
2025-02-15 03:01:09,921 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 146.71 MB
2025-02-15 03:01:09,921 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21147.68 MB
2025-02-15 03:01:09,921 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21147.68 MB
2025-02-15 03:01:09,921 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:09,921 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20452.76 MB
2025-02-15 03:01:09,923 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:01:09,923 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:01:09,923 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.42 seconds
2025-02-15 03:01:09,923 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:09,923 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 18473.96 MB
2025-02-15 03:01:09,923 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20584.60 MB
2025-02-15 03:01:09,923 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2110.64 MB
2025-02-15 03:01:09,923 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34233.91 MB
2025-02-15 03:01:09,923 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21147.68 MB
2025-02-15 03:01:09,923 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -13086.23 MB
2025-02-15 03:01:09,924 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 20584.60 MB
2025-02-15 03:01:10,119 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:01:10,119 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:01:10,119 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.19 seconds
2025-02-15 03:01:10,119 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:10,119 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20584.60 MB
2025-02-15 03:01:10,119 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 20806.52 MB
2025-02-15 03:01:10,119 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 221.92 MB
2025-02-15 03:01:10,119 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21147.68 MB
2025-02-15 03:01:10,119 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 21940.40 MB
2025-02-15 03:01:10,119 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 792.72 MB
2025-02-15 03:01:10,119 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 21243.85 MB
2025-02-15 03:01:10,133 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 5347, cut from 5349
2025-02-15 03:01:10,133 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:01:10,138 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:01:10,138 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:01:10,138 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds
2025-02-15 03:01:10,139 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:10,139 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 20806.52 MB
2025-02-15 03:01:10,139 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 26340.04 MB
2025-02-15 03:01:10,139 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 5533.53 MB
2025-02-15 03:01:10,139 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 21940.40 MB
2025-02-15 03:01:10,139 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 28819.06 MB
2025-02-15 03:01:10,139 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6878.66 MB
2025-02-15 03:01:10,139 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 26340.04 MB
2025-02-15 03:01:10,305 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 5139]
2025-02-15 03:01:10,308 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:10,308 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:10,310 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:10,310 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:01:10,317 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:01:10,319 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:10,319 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:01:10,319 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:01:10,908 - trainer.py:3503 - _save - INFO - Saving model checkpoint to ./checkpoints/cambrian_llama3_2/checkpoint-4
2025-02-15 03:01:10,913 - configuration_utils.py:472 - save_pretrained - INFO - Configuration saved in ./checkpoints/cambrian_llama3_2/checkpoint-4/config.json
2025-02-15 03:01:10,913 - configuration_utils.py:807 - save_pretrained - INFO - Configuration saved in ./checkpoints/cambrian_llama3_2/checkpoint-4/generation_config.json
2025-02-15 03:01:19,804 - modeling_utils.py:2750 - save_pretrained - INFO - The model is bigger than the maximum size per checkpoint (5GB) and is going to be split in 2 checkpoint shards. You can find where each parameters has been saved in the index located at ./checkpoints/cambrian_llama3_2/checkpoint-4/model.safetensors.index.json.
2025-02-15 03:01:19,807 - tokenization_utils_base.py:2702 - save_pretrained - INFO - tokenizer config file saved in ./checkpoints/cambrian_llama3_2/checkpoint-4/tokenizer_config.json
2025-02-15 03:01:19,807 - tokenization_utils_base.py:2711 - save_pretrained - INFO - Special tokens file saved in ./checkpoints/cambrian_llama3_2/checkpoint-4/special_tokens_map.json
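The sharding message comes from Hugging Face transformers' save_pretrained: a model larger than max_shard_size (default "5GB") is split into shards plus a model.safetensors.index.json mapping each parameter to its shard. A sketch of producing the same checkpoint layout outside the Trainer; `model` and `tokenizer` stand for the objects loaded earlier and are assumed to be in scope:

```python
# Sketch: reproduce the sharded checkpoint save seen in the log.
# Assumes `model` (a PreTrainedModel) and `tokenizer` are already loaded.
save_dir = "./checkpoints/cambrian_llama3_2/checkpoint-4"
model.save_pretrained(
    save_dir,
    max_shard_size="5GB",     # the 5GB-per-shard limit mentioned in the log
    safe_serialization=True,  # *.safetensors shards + model.safetensors.index.json
)
tokenizer.save_pretrained(save_dir)  # tokenizer_config.json, special_tokens_map.json, ...
```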
2025-02-15 03:01:31,436 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:31,436 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:31,441 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:01:31,443 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:31,443 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 60, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:01:31,444 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:31,444 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 60, 3, 378, 378]), torch.float32, cuda:0]
2025-02-15 03:01:32,355 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino
2025-02-15 03:01:32,355 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871
2025-02-15 03:01:32,355 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.91 seconds
2025-02-15 03:01:32,355 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,355 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13600.06 MB
2025-02-15 03:01:32,355 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13812.39 MB
2025-02-15 03:01:32,355 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 212.34 MB
2025-02-15 03:01:32,355 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34321.99 MB
2025-02-15 03:01:32,355 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15466.50 MB
2025-02-15 03:01:32,355 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -18855.49 MB
2025-02-15 03:01:32,355 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22267.86 MB
2025-02-15 03:01:32,360 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame
2025-02-15 03:01:32,360 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877
2025-02-15 03:01:32,360 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:32,360 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,360 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13812.33 MB
2025-02-15 03:01:32,360 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13915.20 MB
2025-02-15 03:01:32,360 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 102.88 MB
2025-02-15 03:01:32,360 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15466.50 MB
2025-02-15 03:01:32,360 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15089.01 MB
2025-02-15 03:01:32,360 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -377.49 MB
2025-02-15 03:01:32,360 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14233.77 MB
2025-02-15 03:01:32,648 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip
2025-02-15 03:01:32,648 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892
2025-02-15 03:01:32,648 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.29 seconds
2025-02-15 03:01:32,648 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,648 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13915.20 MB
2025-02-15 03:01:32,648 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13994.83 MB
2025-02-15 03:01:32,648 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 79.63 MB
2025-02-15 03:01:32,648 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15089.01 MB
2025-02-15 03:01:32,648 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15321.79 MB
2025-02-15 03:01:32,648 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 232.78 MB
2025-02-15 03:01:32,648 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17746.57 MB
2025-02-15 03:01:32,653 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1
2025-02-15 03:01:32,653 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933
2025-02-15 03:01:32,653 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:32,653 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,653 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13994.83 MB
2025-02-15 03:01:32,653 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14278.19 MB
2025-02-15 03:01:32,653 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 283.36 MB
2025-02-15 03:01:32,653 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15321.79 MB
2025-02-15 03:01:32,653 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15321.79 MB
2025-02-15 03:01:32,653 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:32,653 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14490.81 MB
2025-02-15 03:01:32,712 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group
2025-02-15 03:01:32,712 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951
2025-02-15 03:01:32,712 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds
2025-02-15 03:01:32,712 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,712 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14278.19 MB
2025-02-15 03:01:32,712 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14622.39 MB
2025-02-15 03:01:32,712 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 344.20 MB
2025-02-15 03:01:32,712 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15321.79 MB
2025-02-15 03:01:32,712 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15892.22 MB
2025-02-15 03:01:32,712 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 570.43 MB
2025-02-15 03:01:32,712 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15450.31 MB
2025-02-15 03:01:32,713 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:01:32,713 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:01:32,713 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.06 seconds
2025-02-15 03:01:32,713 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,713 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13994.83 MB
2025-02-15 03:01:32,713 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14622.39 MB
2025-02-15 03:01:32,713 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 627.56 MB
2025-02-15 03:01:32,713 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15321.79 MB
2025-02-15 03:01:32,713 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15892.22 MB
2025-02-15 03:01:32,713 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 570.43 MB
2025-02-15 03:01:32,713 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15450.31 MB
2025-02-15 03:01:32,742 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:01:32,742 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:01:32,742 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.03 seconds
2025-02-15 03:01:32,742 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,742 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14775.74 MB
2025-02-15 03:01:32,742 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14920.29 MB
2025-02-15 03:01:32,742 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 144.54 MB
2025-02-15 03:01:32,742 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15892.22 MB
2025-02-15 03:01:32,742 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15984.49 MB
2025-02-15 03:01:32,742 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 92.27 MB
2025-02-15 03:01:32,742 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15026.45 MB
2025-02-15 03:01:32,747 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:01:32,747 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:01:32,747 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:32,747 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,747 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15011.72 MB
2025-02-15 03:01:32,747 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15140.50 MB
2025-02-15 03:01:32,747 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 128.78 MB
2025-02-15 03:01:32,747 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15984.49 MB
2025-02-15 03:01:32,747 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15984.49 MB
2025-02-15 03:01:32,747 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:32,747 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15140.50 MB
2025-02-15 03:01:32,748 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:01:32,749 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:01:32,749 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.30 seconds
2025-02-15 03:01:32,749 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,749 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13391.01 MB
2025-02-15 03:01:32,749 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15270.27 MB
2025-02-15 03:01:32,749 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1879.26 MB
2025-02-15 03:01:32,749 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 34321.99 MB
2025-02-15 03:01:32,749 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15984.49 MB
2025-02-15 03:01:32,749 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -18337.50 MB
2025-02-15 03:01:32,749 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15270.27 MB
2025-02-15 03:01:32,907 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:01:32,907 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:01:32,907 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.16 seconds
2025-02-15 03:01:32,907 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,907 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13615.16 MB
2025-02-15 03:01:32,908 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 13680.01 MB
2025-02-15 03:01:32,908 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 64.84 MB
2025-02-15 03:01:32,908 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15984.49 MB
2025-02-15 03:01:32,908 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15984.49 MB
2025-02-15 03:01:32,908 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:32,908 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14069.07 MB
2025-02-15 03:01:32,919 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 5263, cut from 5265
2025-02-15 03:01:32,920 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['The video rate for this video is 2 (']
2025-02-15 03:01:32,924 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:01:32,924 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:01:32,924 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.01 seconds
2025-02-15 03:01:32,924 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:32,924 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13680.01 MB
2025-02-15 03:01:32,924 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16387.43 MB
2025-02-15 03:01:32,924 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 2707.42 MB
2025-02-15 03:01:32,924 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15984.49 MB
2025-02-15 03:01:32,924 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 22754.10 MB
2025-02-15 03:01:32,924 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 6769.61 MB
2025-02-15 03:01:32,924 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 19094.86 MB
2025-02-15 03:01:33,025 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 5055]
2025-02-15 03:01:33,027 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,027 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:33,028 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,028 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:01:33,032 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:01:33,033 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,033 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:01:33,033 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['The video rate for this video is 2 (']
2025-02-15 03:01:33,034 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,034 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:33,035 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,035 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:33,040 - mm_trainer.py:995 - prediction_step - DEBUG - Assistant token at position 295
2025-02-15 03:01:33,041 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,041 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:33,041 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,041 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:33,041 - mm_trainer.py:767 - evaluation_loop - DEBUG - main_input_name: input_ids
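For each evaluation sample the forward pass logs an output range (here [225, 237]), greedy-decodes the logits over that span, and prints the result ('The video rate for this video is 2 ('). A rough reconstruction of that debug step; the function and argument names are assumptions, not the actual code in cambrian_llama.py:

```python
import torch


def decode_output_span(logits: torch.Tensor, tokenizer, start: int, end: int):
    """Greedy-decode the answer span of the logits for debug logging.

    logits: [batch, seq_len, vocab_size], e.g. [1, 237, 128256];
    (start, end) is the logged output range, e.g. (225, 237),
    yielding outs of shape [1, 12] as in the records above.
    """
    outs = logits[:, start:end, :].argmax(dim=-1)  # [batch, end - start] token ids
    return tokenizer.batch_decode(outs, skip_special_tokens=True)
```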
2025-02-15 03:01:33,042 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,042 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['input_ids']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:33,042 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,042 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['attention_mask']: [torch.Size([1, 8192]), torch.bool, cuda:0]
2025-02-15 03:01:33,042 - mm_trainer.py:773 - evaluation_loop - DEBUG - type(inputs_decode):
2025-02-15 03:01:33,043 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,043 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs_decode: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:33,045 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,045 - resource_logging.py:45 - debug_tensor - DEBUG - Before accelerator.pad_across_processes: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:33,046 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,046 - resource_logging.py:45 - debug_tensor - DEBUG - Before gather_function: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:33,047 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,047 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:33,048 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,048 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:33,056 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,056 - resource_logging.py:45 - debug_tensor - DEBUG - In compute_loss(): inputs['labels']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:33,063 - mm_trainer.py:618 - compute_loss - DEBUG - In compute_loss(): assistant token at position 224
2025-02-15 03:01:33,065 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,065 - resource_logging.py:45 - debug_tensor - DEBUG - images_0: [torch.Size([1, 107, 3, 384, 384]), torch.float32, cuda:0]
2025-02-15 03:01:33,067 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:33,067 - resource_logging.py:45 - debug_tensor - DEBUG - images_1: [torch.Size([1, 107, 3, 378, 378]), torch.float32, cuda:0]
2025-02-15 03:01:34,696 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:dino
2025-02-15 03:01:34,696 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 871
2025-02-15 03:01:34,696 - resource_logging.py:150 - __exit__ - DEBUG - Time: 1.63 seconds
2025-02-15 03:01:34,696 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:34,696 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14049.29 MB
2025-02-15 03:01:34,696 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14428.87 MB
2025-02-15 03:01:34,696 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 379.58 MB
2025-02-15 03:01:34,696 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22754.10 MB
2025-02-15 03:01:34,696 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15428.75 MB
2025-02-15 03:01:34,696 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -7325.35 MB
2025-02-15 03:01:34,696 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 23294.16 MB
2025-02-15 03:01:34,699 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> select_frame
2025-02-15 03:01:34,699 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 877
2025-02-15 03:01:34,699 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:34,699 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:34,699 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14428.87 MB
2025-02-15 03:01:34,699 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14612.33 MB
2025-02-15 03:01:34,699 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 183.46 MB
2025-02-15 03:01:34,699 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15428.75 MB
2025-02-15 03:01:34,699 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15808.33 MB
2025-02-15 03:01:34,699 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 379.58 MB
2025-02-15 03:01:34,699 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15181.32 MB
2025-02-15 03:01:35,216 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> encode_images:siglip
2025-02-15 03:01:35,216 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 892
2025-02-15 03:01:35,216 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.52 seconds
2025-02-15 03:01:35,216 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,216 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14612.33 MB
2025-02-15 03:01:35,216 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14754.33 MB
2025-02-15 03:01:35,216 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 142.00 MB
2025-02-15 03:01:35,216 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15808.33 MB
2025-02-15 03:01:35,216 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 15428.75 MB
2025-02-15 03:01:35,216 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -379.58 MB
2025-02-15 03:01:35,216 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 18701.16 MB
2025-02-15 03:01:35,222 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> mm_projector_aux_0/1
2025-02-15 03:01:35,222 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 933
2025-02-15 03:01:35,222 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:35,222 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,222 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14754.33 MB
2025-02-15 03:01:35,222 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15259.66 MB
2025-02-15 03:01:35,222 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 505.33 MB
2025-02-15 03:01:35,222 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15428.75 MB
2025-02-15 03:01:35,222 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 16190.01 MB
2025-02-15 03:01:35,222 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 761.27 MB
2025-02-15 03:01:35,223 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 15638.83 MB
2025-02-15 03:01:35,328 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA -> query_group
2025-02-15 03:01:35,328 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 951
2025-02-15 03:01:35,328 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.10 seconds
2025-02-15 03:01:35,328 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,328 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 15259.66 MB
2025-02-15 03:01:35,328 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15873.43 MB
2025-02-15 03:01:35,328 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 613.77 MB
2025-02-15 03:01:35,328 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 16190.01 MB
2025-02-15 03:01:35,328 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17968.40 MB
2025-02-15 03:01:35,328 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 1778.38 MB
2025-02-15 03:01:35,328 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17342.45 MB
2025-02-15 03:01:35,329 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> SVA
2025-02-15 03:01:35,329 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 928
2025-02-15 03:01:35,329 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.11 seconds
2025-02-15 03:01:35,329 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,329 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14754.33 MB
2025-02-15 03:01:35,329 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 15873.43 MB
2025-02-15 03:01:35,329 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 1119.10 MB
2025-02-15 03:01:35,329 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 15428.75 MB
2025-02-15 03:01:35,329 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 17968.40 MB
2025-02-15 03:01:35,329 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 2539.65 MB
2025-02-15 03:01:35,329 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 17342.45 MB
2025-02-15 03:01:35,379 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> rearrange_vision_tower+padding
2025-02-15 03:01:35,379 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1094
2025-02-15 03:01:35,379 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.05 seconds
2025-02-15 03:01:35,379 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,379 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16146.91 MB
2025-02-15 03:01:35,379 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16404.68 MB
2025-02-15 03:01:35,379 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 257.77 MB
2025-02-15 03:01:35,379 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 17968.40 MB
2025-02-15 03:01:35,379 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18134.07 MB
2025-02-15 03:01:35,379 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 165.68 MB
2025-02-15 03:01:35,379 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16594.01 MB
2025-02-15 03:01:35,385 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianMetaForCausalLM -> prepare_inputs_labels_for_multimodal -> Embedding+Cross-modal+STC
2025-02-15 03:01:35,385 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/cambrian_arch.py, Line: 1395
2025-02-15 03:01:35,385 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.00 seconds
2025-02-15 03:01:35,385 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,385 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 16567.73 MB
2025-02-15 03:01:35,385 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16767.37 MB
2025-02-15 03:01:35,385 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 199.64 MB
2025-02-15 03:01:35,385 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18134.07 MB
2025-02-15 03:01:35,385 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18134.07 MB
2025-02-15 03:01:35,385 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:35,385 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16767.37 MB
2025-02-15 03:01:35,386 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> prepare_inputs_labels_for_multimodal
2025-02-15 03:01:35,386 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 309
2025-02-15 03:01:35,386 - resource_logging.py:150 - __exit__ - DEBUG - Time: 2.32 seconds
2025-02-15 03:01:35,386 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,386 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 13676.49 MB
2025-02-15 03:01:35,387 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 16968.27 MB
2025-02-15 03:01:35,387 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 3291.78 MB
2025-02-15 03:01:35,387 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 22754.10 MB
2025-02-15 03:01:35,387 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18134.07 MB
2025-02-15 03:01:35,387 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: -4620.03 MB
2025-02-15 03:01:35,387 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 16968.27 MB
2025-02-15 03:01:35,645 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> model.forward
2025-02-15 03:01:35,645 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 390
2025-02-15 03:01:35,645 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.26 seconds
2025-02-15 03:01:35,645 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,645 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14061.07 MB
2025-02-15 03:01:35,645 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 14161.45 MB
2025-02-15 03:01:35,645 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 100.38 MB
2025-02-15 03:01:35,645 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18134.07 MB
2025-02-15 03:01:35,645 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 18134.07 MB
2025-02-15 03:01:35,645 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 0.00 MB
2025-02-15 03:01:35,645 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 14763.74 MB
2025-02-15 03:01:35,663 - cambrian_llama.py:481 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): Found assistant token at index 8155, cut from 8157
2025-02-15 03:01:35,663 - cambrian_llama.py:487 - forward - INFO - In CambrianLlamaForCausalLM.forward(): Decoded assistant outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:01:35,669 - resource_logging.py:148 - __exit__ - DEBUG - Section name: CambrianLlamaForCausalLM -> forward -> lm_head, logits
2025-02-15 03:01:35,669 - resource_logging.py:149 - __exit__ - DEBUG - File: /root/hcmus/LongVidLLaMA/longvu/language_model/cambrian_llama.py, Line: 456
2025-02-15 03:01:35,669 - resource_logging.py:150 - __exit__ - DEBUG - Time: 0.02 seconds
2025-02-15 03:01:35,669 - resource_logging.py:151 - __exit__ - DEBUG - Device: cuda:0
2025-02-15 03:01:35,669 - resource_logging.py:152 - __exit__ - DEBUG - Allocated before block: 14161.45 MB
2025-02-15 03:01:35,669 - resource_logging.py:153 - __exit__ - DEBUG - Allocated after block: 18352.35 MB
2025-02-15 03:01:35,669 - resource_logging.py:154 - __exit__ - DEBUG - Net allocated change: 4190.89 MB
2025-02-15 03:01:35,669 - resource_logging.py:155 - __exit__ - DEBUG - Reserved before block: 18134.07 MB
2025-02-15 03:01:35,669 - resource_logging.py:156 - __exit__ - DEBUG - Reserved after block: 28615.64 MB
2025-02-15 03:01:35,669 - resource_logging.py:157 - __exit__ - DEBUG - Net reserved change: 10481.57 MB
2025-02-15 03:01:35,669 - resource_logging.py:158 - __exit__ - DEBUG - Peak allocated: 22542.73 MB
2025-02-15 03:01:35,828 - cambrian_llama.py:512 - forward - DEBUG - sample 0: correct range [16, 7947]
2025-02-15 03:01:35,830 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,830 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:35,831 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,831 - resource_logging.py:45 - debug_tensor - DEBUG - In CambrianLlamaForCausalLM.forward(): orig_labels: [torch.Size([1, 238]), torch.int64, cuda:0]
2025-02-15 03:01:35,836 - cambrian_llama.py:529 - forward - DEBUG - In CambrianLlamaForCausalLM.forward(): sample 0: output range: [225, 237]
2025-02-15 03:01:35,838 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,838 - resource_logging.py:45 - debug_tensor - DEBUG - outs: [torch.Size([1, 12]), torch.int64, cuda:0]
2025-02-15 03:01:35,838 - cambrian_llama.py:533 - forward - INFO - sample 0: decoded outputs: ['2 final rate for this video is 2 (']
2025-02-15 03:01:35,839 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,839 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:35,839 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,839 - resource_logging.py:45 - debug_tensor - DEBUG - In prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:35,846 - mm_trainer.py:995 - prediction_step - DEBUG - Assistant token at position 295
2025-02-15 03:01:35,846 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,846 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:35,847 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,847 - resource_logging.py:45 - debug_tensor - DEBUG - After prediction_step: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:35,847 - mm_trainer.py:767 - evaluation_loop - DEBUG - main_input_name: input_ids
2025-02-15 03:01:35,847 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,847 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['input_ids']: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:35,848 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,848 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs['attention_mask']: [torch.Size([1, 8192]), torch.bool, cuda:0]
2025-02-15 03:01:35,848 - mm_trainer.py:773 - evaluation_loop - DEBUG - type(inputs_decode):
2025-02-15 03:01:35,848 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,848 - resource_logging.py:45 - debug_tensor - DEBUG - In evaluation_loop(): inputs_decode: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:35,851 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,851 - resource_logging.py:45 - debug_tensor - DEBUG - Before accelerator.pad_across_processes: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:35,851 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,851 - resource_logging.py:45 - debug_tensor - DEBUG - Before gather_function: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:35,852 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,853 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: logits: [torch.Size([1, 237, 128256]), torch.float32, cuda:0]
2025-02-15 03:01:35,854 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,854 - resource_logging.py:45 - debug_tensor - DEBUG - Add to all_preds: labels: [torch.Size([1, 8192]), torch.int64, cuda:0]
2025-02-15 03:01:35,956 - finetune_llama.py:467 - compute_metrics - INFO - In compute_metrics()
2025-02-15 03:01:35,957 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,957 - resource_logging.py:44 - debug_tensor - DEBUG - inputs[0]: [(8192,), int64, CPU]
2025-02-15 03:01:35,958 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,958 - resource_logging.py:44 - debug_tensor - DEBUG - inputs[1]: [(8192,), int64, CPU]
2025-02-15 03:01:35,958 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,958 - resource_logging.py:44 - debug_tensor - DEBUG - masks[0]: [(8192,), bool, CPU]
2025-02-15 03:01:35,959 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,959 - resource_logging.py:44 - debug_tensor - DEBUG - masks[1]: [(8192,), bool, CPU]
2025-02-15 03:01:35,960 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,960 - resource_logging.py:45 - debug_tensor - DEBUG - preds: [torch.Size([2, 237, 128256]), torch.float32, cpu]
2025-02-15 03:01:35,960 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,960 - resource_logging.py:45 - debug_tensor - DEBUG - labels: [torch.Size([2, 8192]), torch.int64, cpu]
2025-02-15 03:01:35,961 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,961 - resource_logging.py:45 - debug_tensor - DEBUG - attention_mask: [torch.Size([2, 8192]), torch.bool, cpu]
2025-02-15 03:01:35,961 - resource_logging.py:42 - debug_tensor - DEBUG - File: Unknown, Line: Unknown
2025-02-15 03:01:35,961 - resource_logging.py:45 - debug_tensor - DEBUG - input_ids: [torch.Size([2, 8192]), torch.int64, cpu]
2025-02-15 03:01:35,962 - finetune_llama.py:501 - compute_metrics - DEBUG - batch 0: output_range=[225, 237]
2025-02-15 03:01:35,965 - finetune_llama.py:504 - compute_metrics - DEBUG - batch 0: cur_outputs=tensor([[ 791, 2835, 4478, 369, 420, 2835, 374, 220, 17, 320, 128009, 128006]])
2025-02-15 03:01:35,965 - finetune_llama.py:507 - compute_metrics - DEBUG - batch 0: decoded_outputs=['The video rate for this video is 2 (']
2025-02-15 03:01:35,965 - finetune_llama.py:509 - compute_metrics - DEBUG - batch 0: decoded_labels=['\n\nThe engagement label of the video is 2.']
2025-02-15 03:01:35,967 - finetune_llama.py:501 - compute_metrics - DEBUG - batch 1: output_range=[225, 237]
2025-02-15 03:01:35,969 - finetune_llama.py:504 - compute_metrics - DEBUG - batch 1: cur_outputs=tensor([[ 17, 1620, 4478, 369, 420, 2835, 374, 220, 17, 320, 128009, 128006]])
2025-02-15 03:01:35,969 - finetune_llama.py:507 - compute_metrics - DEBUG - batch 1: decoded_outputs=['2 final rate for this video is 2 (']
2025-02-15 03:01:35,969 - finetune_llama.py:509 - compute_metrics - DEBUG - batch 1: decoded_labels=['\n\nThe engagement label of the video is 2.']
2025-02-15 03:01:35,969 - finetune_llama.py:518 - compute_metrics - DEBUG - pred_labels=[2, 2]
2025-02-15 03:01:35,969 - finetune_llama.py:519 - compute_metrics - DEBUG - gold_labels=[2, 2]
2025-02-15 03:01:35,975 - trainer.py:2394 - _inner_training_loop - INFO - Training completed. Do not forget to share your model on huggingface.co/models =)
2025-02-15 03:01:35,986 - configuration_utils.py:472 - save_pretrained - INFO - Configuration saved in ./checkpoints/cambrian_llama3_2/config.json
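In compute_metrics() the decoded prediction and label strings are reduced to integer engagement labels (pred_labels=[2, 2] and gold_labels=[2, 2] above). One simple extraction consistent with the strings in this log, though not necessarily identical to the logic in finetune_llama.py:

```python
import re


def extract_label(text: str) -> "int | None":
    """Return the first integer found in a decoded string, or None.

    Examples from the log: 'The video rate for this video is 2 (' -> 2,
    and 'The engagement label of the video is 2.' -> 2.
    """
    m = re.search(r"\d+", text)
    return int(m.group()) if m else None


decoded_outputs = ["The video rate for this video is 2 (",
                   "2 final rate for this video is 2 ("]
decoded_labels = ["\n\nThe engagement label of the video is 2."] * 2

pred_labels = [extract_label(s) for s in decoded_outputs]  # [2, 2]
gold_labels = [extract_label(s) for s in decoded_labels]   # [2, 2]
```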