[2025-03-20 23:52:38,528][00031] Saving configuration to /kaggle/working/train_dir/default_experiment/config.json... |
|
[2025-03-20 23:52:38,530][00031] Rollout worker 0 uses device cpu |
|
[2025-03-20 23:52:38,531][00031] Rollout worker 1 uses device cpu |
|
[2025-03-20 23:52:38,532][00031] Rollout worker 2 uses device cpu |
|
[2025-03-20 23:52:38,533][00031] Rollout worker 3 uses device cpu |
|
[2025-03-20 23:52:38,533][00031] Rollout worker 4 uses device cpu |
|
[2025-03-20 23:52:38,535][00031] Rollout worker 5 uses device cpu |
|
[2025-03-20 23:52:38,536][00031] Rollout worker 6 uses device cpu |
|
[2025-03-20 23:52:38,536][00031] Rollout worker 7 uses device cpu |
|
[2025-03-20 23:52:38,665][00031] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-20 23:52:38,666][00031] InferenceWorker_p0-w0: min num requests: 2 |
|
[2025-03-20 23:52:38,711][00031] Starting all processes... |
|
[2025-03-20 23:52:38,712][00031] Starting process learner_proc0 |
|
[2025-03-20 23:52:38,804][00031] Starting all processes... |
|
[2025-03-20 23:52:38,811][00031] Starting process inference_proc0-0 |
|
[2025-03-20 23:52:38,812][00031] Starting process rollout_proc0 |
|
[2025-03-20 23:52:38,813][00031] Starting process rollout_proc1 |
|
[2025-03-20 23:52:38,813][00031] Starting process rollout_proc2 |
|
[2025-03-20 23:52:38,813][00031] Starting process rollout_proc3 |
|
[2025-03-20 23:52:38,814][00031] Starting process rollout_proc4 |
|
[2025-03-20 23:52:38,815][00031] Starting process rollout_proc5 |
|
[2025-03-20 23:52:38,815][00031] Starting process rollout_proc6 |
|
[2025-03-20 23:52:38,818][00031] Starting process rollout_proc7 |
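
One learner, one inference worker and eight rollout workers is Sample Factory's asynchronous PPO process layout. For orientation, a hypothetical relaunch sketch; the environment name and entry point are assumptions (neither appears in the log), while the worker count, train_dir and experiment name are taken from it:

```python
# Hypothetical relaunch of a run like this one (a sketch, not the exact
# command behind this log). Assumes Sample Factory 2's VizDoom example.
import sys

from sf_examples.vizdoom.train_vizdoom import main

sys.argv = [
    "train_vizdoom",
    "--env=doom_health_gathering_supreme",  # assumption: env not named in the log
    "--num_workers=8",                      # matches rollout_proc0..7 above
    "--train_dir=/kaggle/working/train_dir",
    "--experiment=default_experiment",
]
sys.exit(main())
```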
|
[2025-03-20 23:52:46,134][00213] Worker 2 uses CPU cores [2] |
|
[2025-03-20 23:52:46,654][00196] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-20 23:52:46,657][00196] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2025-03-20 23:52:46,704][00196] Num visible devices: 1 |
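
The learner pins itself to a physical GPU by exporting CUDA_VISIBLE_DEVICES before CUDA initializes, which is why GPU 0 "actually maps to" local index 0 and only one device is visible. A minimal sketch of the mechanism (not Sample Factory's code):

```python
import os

# Must be set before torch initializes CUDA in this process.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

print(torch.cuda.device_count())  # -> 1: only the selected GPU is visible
device = torch.device("cuda:0")   # physical GPU 0, remapped to local index 0
```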
|
[2025-03-20 23:52:46,717][00196] Starting seed is not provided |
|
[2025-03-20 23:52:46,718][00196] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-20 23:52:46,718][00196] Initializing actor-critic model on device cuda:0 |
|
[2025-03-20 23:52:46,719][00196] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-20 23:52:46,725][00196] RunningMeanStd input shape: (1,) |
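
The two RunningMeanStd modules keep streaming statistics for the (3, 72, 128) observations and the scalar returns. A sketch of the usual streaming mean/variance update such a module performs (illustrative; the library's in-place TorchScript version differs in detail):

```python
import numpy as np

class RunningMeanStd:
    """Streaming mean/variance in the spirit of the log's RunningMeanStd.
    Illustrative sketch, not Sample Factory's implementation."""

    def __init__(self, shape, eps=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = eps

    def update(self, batch):
        # Chan et al. parallel-variance merge of a batch into the running stats.
        b_mean, b_var, b_count = batch.mean(0), batch.var(0), batch.shape[0]
        delta = b_mean - self.mean
        tot = self.count + b_count
        self.mean += delta * b_count / tot
        m_a = self.var * self.count
        m_b = b_var * b_count
        self.var = (m_a + m_b + delta**2 * self.count * b_count / tot) / tot
        self.count = tot

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)
```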
|
[2025-03-20 23:52:46,793][00196] ConvEncoder: input_channels=3 |
|
[2025-03-20 23:52:46,980][00210] Worker 0 uses CPU cores [0] |
|
[2025-03-20 23:52:47,234][00217] Worker 7 uses CPU cores [3] |
|
[2025-03-20 23:52:47,376][00212] Worker 1 uses CPU cores [1] |
|
[2025-03-20 23:52:47,394][00196] Conv encoder output size: 512 |
|
[2025-03-20 23:52:47,394][00196] Policy head output size: 512 |
|
[2025-03-20 23:52:47,488][00196] Created Actor Critic model with architecture:
[2025-03-20 23:52:47,488][00196] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
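
The repr above fixes the shape flow: a 3-channel 72x128 frame through three Conv2d+ELU stages, an MLP to 512 features, a GRU(512, 512) core, then a 1-unit value head and 5 action logits. A PyTorch sketch that reproduces this flow; the conv filter sizes are assumptions (Sample Factory's default Atari-style stack), since the repr hides them, and the normalizers are omitted:

```python
import torch
import torch.nn as nn

class ActorCriticSketch(nn.Module):
    # Illustrative reconstruction of the printed architecture. Kernel sizes
    # and strides are assumed; the log only confirms layer types and the
    # 512-dim encoder/core outputs.
    def __init__(self, num_actions=5):
        super().__init__()
        self.conv_head = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, 3, stride=2), nn.ELU(),
        )
        with torch.no_grad():  # infer the flattened conv output size
            n = self.conv_head(torch.zeros(1, 3, 72, 128)).numel()
        self.mlp = nn.Sequential(nn.Linear(n, 512), nn.ELU())
        self.core = nn.GRU(512, 512)                       # GRU(512, 512) above
        self.critic_linear = nn.Linear(512, 1)             # value head
        self.action_logits = nn.Linear(512, num_actions)   # 5 action logits

    def forward(self, obs, rnn_state=None):
        x = self.mlp(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
        x = x.squeeze(0)
        return self.action_logits(x), self.critic_linear(x), rnn_state
```

With these assumed filters the conv head emits a 128x3x6 map (2304 features) ahead of the 512-unit MLP, consistent with the "Conv encoder output size: 512" line once the MLP is applied.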
|
[2025-03-20 23:52:47,496][00215] Worker 5 uses CPU cores [1] |
|
[2025-03-20 23:52:47,501][00211] Worker 3 uses CPU cores [3] |
|
[2025-03-20 23:52:47,539][00209] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-20 23:52:47,539][00209] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2025-03-20 23:52:47,568][00216] Worker 6 uses CPU cores [2] |
|
[2025-03-20 23:52:47,571][00209] Num visible devices: 1 |
|
[2025-03-20 23:52:47,592][00214] Worker 4 uses CPU cores [0] |
|
[2025-03-20 23:52:47,798][00196] Using optimizer <class 'torch.optim.adam.Adam'> |
|
[2025-03-20 23:52:49,651][00196] No checkpoints found |
|
[2025-03-20 23:52:49,651][00196] Did not load from checkpoint, starting from scratch! |
|
[2025-03-20 23:52:49,651][00196] Initialized policy 0 weights for model version 0 |
|
[2025-03-20 23:52:49,655][00196] LearnerWorker_p0 finished initialization! |
|
[2025-03-20 23:52:49,656][00196] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-20 23:52:49,751][00209] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-20 23:52:49,752][00209] RunningMeanStd input shape: (1,) |
|
[2025-03-20 23:52:49,765][00209] ConvEncoder: input_channels=3 |
|
[2025-03-20 23:52:49,879][00209] Conv encoder output size: 512 |
|
[2025-03-20 23:52:49,880][00209] Policy head output size: 512 |
|
[2025-03-20 23:52:49,956][00031] Inference worker 0-0 is ready! |
|
[2025-03-20 23:52:49,956][00031] All inference workers are ready! Signal rollout workers to start! |
|
[2025-03-20 23:52:50,070][00213] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,078][00212] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,078][00211] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,076][00215] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,080][00214] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,080][00216] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,081][00217] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-20 23:52:50,086][00210] Doom resolution: 160x120, resize resolution: (128, 72) |
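
All eight workers render Doom at 160x120 and downscale to (128, 72), i.e. width 128 by height 72, matching the (3, 72, 128) CHW input the learner reported. A sketch of that preprocessing step (OpenCV is my choice here, not necessarily what the library uses):

```python
import cv2
import numpy as np

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Resize a 120x160x3 Doom frame to the network's 72x128 input.
    Note cv2.resize takes (width, height), hence the (128, 72) tuple."""
    resized = cv2.resize(frame, (128, 72), interpolation=cv2.INTER_AREA)
    return resized.transpose(2, 0, 1)  # HWC -> CHW to match (3, 72, 128)
```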
|
[2025-03-20 23:52:50,671][00210] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:50,671][00217] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:50,769][00212] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:51,014][00213] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:51,023][00216] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:51,102][00217] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:51,202][00212] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:51,406][00210] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:51,559][00217] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:51,818][00214] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:51,820][00213] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:51,934][00217] Decorrelating experience for 96 frames... |
|
[2025-03-20 23:52:51,947][00216] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:52,278][00212] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:52,420][00211] Decorrelating experience for 0 frames... |
|
[2025-03-20 23:52:52,632][00213] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:52,724][00214] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:52,868][00211] Decorrelating experience for 32 frames... |
|
[2025-03-20 23:52:52,903][00212] Decorrelating experience for 96 frames... |
|
[2025-03-20 23:52:53,299][00214] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:53,344][00216] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:53,405][00211] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:53,545][00213] Decorrelating experience for 96 frames... |
|
[2025-03-20 23:52:53,890][00031] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 12. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-03-20 23:52:53,892][00031] Avg episode reward: [(0, '1.280')] |
|
[2025-03-20 23:52:53,999][00210] Decorrelating experience for 64 frames... |
|
[2025-03-20 23:52:54,225][00211] Decorrelating experience for 96 frames... |
|
[2025-03-20 23:52:54,372][00214] Decorrelating experience for 96 frames... |
|
[2025-03-20 23:52:54,852][00216] Decorrelating experience for 96 frames... |
|
[2025-03-20 23:52:55,269][00210] Decorrelating experience for 96 frames... |
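
Before real collection starts, each worker steps its environments with random actions for a short warm-up, logging progress in 32-frame increments, so the eight workers don't begin rollouts from identical, synchronized states. A minimal sketch of the idea (the per-worker frame budgets and env API are simplifications):

```python
def decorrelate_experience(env, total_frames=96, log_every=32):
    """Warm an env up with random actions before rollouts begin (sketch)."""
    obs = env.reset()
    for frame in range(total_frames + 1):
        if frame % log_every == 0:
            print(f"Decorrelating experience for {frame} frames...")
        if frame == total_frames:
            break
        obs, reward, done, info = env.step(env.action_space.sample())
        if done:
            obs = env.reset()
```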
|
[2025-03-20 23:52:55,749][00196] Signal inference workers to stop experience collection... |
|
[2025-03-20 23:52:55,759][00209] InferenceWorker_p0-w0: stopping experience collection |
|
[2025-03-20 23:52:57,987][00196] Signal inference workers to resume experience collection... |
|
[2025-03-20 23:52:57,988][00209] InferenceWorker_p0-w0: resuming experience collection |
|
[2025-03-20 23:52:58,654][00031] Heartbeat connected on Batcher_0 |
|
[2025-03-20 23:52:58,658][00031] Heartbeat connected on LearnerWorker_p0 |
|
[2025-03-20 23:52:58,671][00031] Heartbeat connected on InferenceWorker_p0-w0 |
|
[2025-03-20 23:52:58,684][00031] Heartbeat connected on RolloutWorker_w0 |
|
[2025-03-20 23:52:58,693][00031] Heartbeat connected on RolloutWorker_w2 |
|
[2025-03-20 23:52:58,706][00031] Heartbeat connected on RolloutWorker_w4 |
|
[2025-03-20 23:52:58,711][00031] Heartbeat connected on RolloutWorker_w6 |
|
[2025-03-20 23:52:58,726][00031] Heartbeat connected on RolloutWorker_w1 |
|
[2025-03-20 23:52:58,727][00031] Heartbeat connected on RolloutWorker_w3 |
|
[2025-03-20 23:52:58,731][00031] Heartbeat connected on RolloutWorker_w7 |
|
[2025-03-20 23:52:58,892][00031] Fps is (10 sec: 2456.4, 60 sec: 2456.4, 300 sec: 2456.4). Total num frames: 12288. Throughput: 0: 473.8. Samples: 2382. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) |
|
[2025-03-20 23:52:58,895][00031] Avg episode reward: [(0, '2.841')] |
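
These periodic status lines compute FPS over sliding 10-, 60- and 300-second windows from (timestamp, total frames) snapshots, which is why the very first report at 23:52:53 shows nan for every window. A minimal sketch of that bookkeeping (illustrative, not Sample Factory's actual code):

```python
import time
from collections import deque

class FpsMonitor:
    """Sliding-window FPS like the '(10 sec: ..., 60 sec: ..., 300 sec: ...)'
    readouts above."""

    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.history = deque()  # (timestamp, total_frames) snapshots

    def report(self, total_frames):
        now = time.time()
        fps = {}
        for w in self.windows:
            # Oldest snapshot still inside this window, if any.
            past = next(((t, f) for t, f in self.history if now - t <= w), None)
            # No snapshot yet -> nan, matching the very first report.
            fps[w] = ((total_frames - past[1]) / max(now - past[0], 1e-8)
                      if past else float("nan"))
        self.history.append((now, total_frames))
        while self.history and now - self.history[0][0] > max(self.windows):
            self.history.popleft()  # drop snapshots past the longest window
        return fps
```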
|
[2025-03-20 23:53:02,646][00209] Updated weights for policy 0, policy_version 10 (0.0151) |
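
Between status reports, the inference worker refreshes its copy of the policy from the learner roughly every 10 policy versions; the trailing number appears to be the time the update took in seconds. A hedged sketch of version-gated syncing (the shared-state handle is an assumption; Sample Factory itself uses shared memory):

```python
import time

def maybe_sync_weights(model, shared_state, local_version):
    """Pull newer learner weights into an inference worker (sketch).
    shared_state is an assumed dict holding the learner's latest
    parameters and version number."""
    latest = shared_state["policy_version"]
    if latest > local_version:
        t0 = time.time()
        model.load_state_dict(shared_state["state_dict"])
        print(f"Updated weights for policy 0, policy_version {latest} "
              f"({time.time() - t0:.4f})")
        local_version = latest
    return local_version
```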
|
[2025-03-20 23:53:03,890][00031] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 49152. Throughput: 0: 952.6. Samples: 9538. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:53:03,894][00031] Avg episode reward: [(0, '4.181')] |
|
[2025-03-20 23:53:07,505][00209] Updated weights for policy 0, policy_version 20 (0.0015) |
|
[2025-03-20 23:53:08,890][00031] Fps is (10 sec: 8194.0, 60 sec: 6280.5, 300 sec: 6280.5). Total num frames: 94208. Throughput: 0: 1473.3. Samples: 22112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:53:08,895][00031] Avg episode reward: [(0, '4.555')] |
|
[2025-03-20 23:53:12,075][00209] Updated weights for policy 0, policy_version 30 (0.0019) |
|
[2025-03-20 23:53:13,890][00031] Fps is (10 sec: 8601.6, 60 sec: 6758.4, 300 sec: 6758.4). Total num frames: 135168. Throughput: 0: 1432.6. Samples: 28664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:53:13,891][00031] Avg episode reward: [(0, '4.422')] |
|
[2025-03-20 23:53:13,951][00196] Saving new best policy, reward=4.422! |
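
Whenever the reported average episode reward beats the running best, the learner snapshots the policy, as in the line above. A trivial sketch of that bookkeeping:

```python
class BestPolicyTracker:
    """Mirrors the 'Saving new best policy, reward=X!' lines (sketch)."""

    def __init__(self):
        self.best_reward = float("-inf")

    def maybe_save(self, avg_reward, save_fn):
        if avg_reward > self.best_reward:
            self.best_reward = avg_reward
            save_fn()  # e.g. write a best_*.pth under the experiment dir
            print(f"Saving new best policy, reward={avg_reward:.3f}!")
```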
|
[2025-03-20 23:53:16,779][00209] Updated weights for policy 0, policy_version 40 (0.0015) |
|
[2025-03-20 23:53:18,890][00031] Fps is (10 sec: 8601.1, 60 sec: 7208.8, 300 sec: 7208.8). Total num frames: 180224. Throughput: 0: 1680.6. Samples: 42028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:53:18,892][00031] Avg episode reward: [(0, '4.515')] |
|
[2025-03-20 23:53:18,894][00196] Saving new best policy, reward=4.515! |
|
[2025-03-20 23:53:21,353][00209] Updated weights for policy 0, policy_version 50 (0.0015) |
|
[2025-03-20 23:53:23,890][00031] Fps is (10 sec: 9010.9, 60 sec: 7509.2, 300 sec: 7509.2). Total num frames: 225280. Throughput: 0: 1842.3. Samples: 55282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:53:23,892][00031] Avg episode reward: [(0, '4.373')] |
|
[2025-03-20 23:53:26,024][00209] Updated weights for policy 0, policy_version 60 (0.0018) |
|
[2025-03-20 23:53:28,890][00031] Fps is (10 sec: 8602.0, 60 sec: 7606.9, 300 sec: 7606.9). Total num frames: 266240. Throughput: 0: 1769.1. Samples: 61930. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:53:28,891][00031] Avg episode reward: [(0, '4.372')] |
|
[2025-03-20 23:53:31,341][00209] Updated weights for policy 0, policy_version 70 (0.0017) |
|
[2025-03-20 23:53:33,890][00031] Fps is (10 sec: 8192.4, 60 sec: 7680.0, 300 sec: 7680.0). Total num frames: 307200. Throughput: 0: 1840.9. Samples: 73648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:53:33,891][00031] Avg episode reward: [(0, '4.470')] |
|
[2025-03-20 23:53:35,957][00209] Updated weights for policy 0, policy_version 80 (0.0017) |
|
[2025-03-20 23:53:38,890][00031] Fps is (10 sec: 8601.6, 60 sec: 7827.9, 300 sec: 7827.9). Total num frames: 352256. Throughput: 0: 1932.4. Samples: 86972. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-03-20 23:53:38,892][00031] Avg episode reward: [(0, '4.562')] |
|
[2025-03-20 23:53:38,893][00196] Saving new best policy, reward=4.562! |
|
[2025-03-20 23:53:40,696][00209] Updated weights for policy 0, policy_version 90 (0.0020) |
|
[2025-03-20 23:53:43,890][00031] Fps is (10 sec: 8601.5, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 393216. Throughput: 0: 2021.0. Samples: 93324. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:53:43,892][00031] Avg episode reward: [(0, '4.680')] |
|
[2025-03-20 23:53:43,956][00196] Saving new best policy, reward=4.680! |
|
[2025-03-20 23:53:45,357][00209] Updated weights for policy 0, policy_version 100 (0.0016) |
|
[2025-03-20 23:53:48,890][00031] Fps is (10 sec: 8601.6, 60 sec: 7968.6, 300 sec: 7968.6). Total num frames: 438272. Throughput: 0: 2155.6. Samples: 106540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-20 23:53:48,892][00031] Avg episode reward: [(0, '4.689')] |
|
[2025-03-20 23:53:48,894][00196] Saving new best policy, reward=4.689! |
|
[2025-03-20 23:53:50,095][00209] Updated weights for policy 0, policy_version 110 (0.0017) |
|
[2025-03-20 23:53:53,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8055.5, 300 sec: 8055.5). Total num frames: 483328. Throughput: 0: 2172.0. Samples: 119850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:53:53,892][00031] Avg episode reward: [(0, '4.561')] |
|
[2025-03-20 23:53:54,569][00209] Updated weights for policy 0, policy_version 120 (0.0016) |
|
[2025-03-20 23:53:58,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8601.9, 300 sec: 8129.0). Total num frames: 528384. Throughput: 0: 2177.5. Samples: 126652. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2025-03-20 23:53:58,891][00031] Avg episode reward: [(0, '4.622')] |
|
[2025-03-20 23:53:59,067][00209] Updated weights for policy 0, policy_version 130 (0.0015) |
|
[2025-03-20 23:54:03,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8601.6, 300 sec: 8075.0). Total num frames: 565248. Throughput: 0: 2145.1. Samples: 138556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:54:03,891][00031] Avg episode reward: [(0, '4.946')] |
|
[2025-03-20 23:54:03,900][00196] Saving new best policy, reward=4.946! |
|
[2025-03-20 23:54:04,401][00209] Updated weights for policy 0, policy_version 140 (0.0014) |
|
[2025-03-20 23:54:08,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8601.6, 300 sec: 8137.4). Total num frames: 610304. Throughput: 0: 2142.7. Samples: 151702. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-20 23:54:08,892][00031] Avg episode reward: [(0, '4.952')] |
|
[2025-03-20 23:54:08,895][00196] Saving new best policy, reward=4.952! |
|
[2025-03-20 23:54:09,162][00209] Updated weights for policy 0, policy_version 150 (0.0018) |
|
[2025-03-20 23:54:13,618][00209] Updated weights for policy 0, policy_version 160 (0.0017) |
|
[2025-03-20 23:54:13,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8669.9, 300 sec: 8192.0). Total num frames: 655360. Throughput: 0: 2140.7. Samples: 158262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:54:13,891][00031] Avg episode reward: [(0, '5.600')] |
|
[2025-03-20 23:54:13,896][00196] Saving new best policy, reward=5.600! |
|
[2025-03-20 23:54:18,170][00209] Updated weights for policy 0, policy_version 170 (0.0016) |
|
[2025-03-20 23:54:18,891][00031] Fps is (10 sec: 9010.4, 60 sec: 8669.8, 300 sec: 8240.1). Total num frames: 700416. Throughput: 0: 2180.9. Samples: 171792. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:54:18,893][00031] Avg episode reward: [(0, '5.819')] |
|
[2025-03-20 23:54:18,899][00196] Saving new best policy, reward=5.819! |
|
[2025-03-20 23:54:22,770][00209] Updated weights for policy 0, policy_version 180 (0.0023) |
|
[2025-03-20 23:54:23,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8283.0). Total num frames: 745472. Throughput: 0: 2185.1. Samples: 185302. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:54:23,892][00031] Avg episode reward: [(0, '5.945')] |
|
[2025-03-20 23:54:23,899][00196] Saving new best policy, reward=5.945! |
|
[2025-03-20 23:54:27,279][00209] Updated weights for policy 0, policy_version 190 (0.0016) |
|
[2025-03-20 23:54:28,890][00031] Fps is (10 sec: 9012.1, 60 sec: 8738.1, 300 sec: 8321.3). Total num frames: 790528. Throughput: 0: 2194.8. Samples: 192088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-20 23:54:28,892][00031] Avg episode reward: [(0, '5.375')] |
|
[2025-03-20 23:54:31,971][00209] Updated weights for policy 0, policy_version 200 (0.0015) |
|
[2025-03-20 23:54:33,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8314.9). Total num frames: 831488. Throughput: 0: 2198.5. Samples: 205472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:54:33,893][00031] Avg episode reward: [(0, '6.221')] |
|
[2025-03-20 23:54:33,901][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000203_831488.pth... |
|
[2025-03-20 23:54:33,985][00196] Saving new best policy, reward=6.221! |
|
[2025-03-20 23:54:37,191][00209] Updated weights for policy 0, policy_version 210 (0.0017) |
|
[2025-03-20 23:54:38,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8309.0). Total num frames: 872448. Throughput: 0: 2168.2. Samples: 217418. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:54:38,891][00031] Avg episode reward: [(0, '6.641')] |
|
[2025-03-20 23:54:38,893][00196] Saving new best policy, reward=6.641! |
|
[2025-03-20 23:54:41,755][00209] Updated weights for policy 0, policy_version 220 (0.0016) |
|
[2025-03-20 23:54:43,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8340.9). Total num frames: 917504. Throughput: 0: 2163.4. Samples: 224004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:54:43,891][00031] Avg episode reward: [(0, '5.867')] |
|
[2025-03-20 23:54:46,276][00209] Updated weights for policy 0, policy_version 230 (0.0016) |
|
[2025-03-20 23:54:48,890][00031] Fps is (10 sec: 9011.0, 60 sec: 8738.1, 300 sec: 8370.1). Total num frames: 962560. Throughput: 0: 2201.8. Samples: 237638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:54:48,892][00031] Avg episode reward: [(0, '6.875')] |
|
[2025-03-20 23:54:48,895][00196] Saving new best policy, reward=6.875! |
|
[2025-03-20 23:54:50,909][00209] Updated weights for policy 0, policy_version 240 (0.0023) |
|
[2025-03-20 23:54:53,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8396.8). Total num frames: 1007616. Throughput: 0: 2204.4. Samples: 250900. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:54:53,891][00031] Avg episode reward: [(0, '7.446')] |
|
[2025-03-20 23:54:53,900][00196] Saving new best policy, reward=7.446! |
|
[2025-03-20 23:54:55,549][00209] Updated weights for policy 0, policy_version 250 (0.0019) |
|
[2025-03-20 23:54:58,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8738.1, 300 sec: 8421.4). Total num frames: 1052672. Throughput: 0: 2206.6. Samples: 257560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:54:58,892][00031] Avg episode reward: [(0, '8.269')] |
|
[2025-03-20 23:54:58,893][00196] Saving new best policy, reward=8.269! |
|
[2025-03-20 23:55:00,292][00209] Updated weights for policy 0, policy_version 260 (0.0017) |
|
[2025-03-20 23:55:03,890][00031] Fps is (10 sec: 8601.0, 60 sec: 8806.3, 300 sec: 8412.5). Total num frames: 1093632. Throughput: 0: 2197.9. Samples: 270698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:55:03,892][00031] Avg episode reward: [(0, '7.845')] |
|
[2025-03-20 23:55:04,872][00209] Updated weights for policy 0, policy_version 270 (0.0017) |
|
[2025-03-20 23:55:08,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8738.1, 300 sec: 8404.4). Total num frames: 1134592. Throughput: 0: 2157.3. Samples: 282380. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:55:08,892][00031] Avg episode reward: [(0, '8.440')] |
|
[2025-03-20 23:55:08,895][00196] Saving new best policy, reward=8.440! |
|
[2025-03-20 23:55:10,203][00209] Updated weights for policy 0, policy_version 280 (0.0015) |
|
[2025-03-20 23:55:13,890][00031] Fps is (10 sec: 8601.9, 60 sec: 8738.1, 300 sec: 8426.0). Total num frames: 1179648. Throughput: 0: 2151.3. Samples: 288898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:55:13,892][00031] Avg episode reward: [(0, '8.213')] |
|
[2025-03-20 23:55:14,817][00209] Updated weights for policy 0, policy_version 290 (0.0014) |
|
[2025-03-20 23:55:18,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8738.3, 300 sec: 8446.2). Total num frames: 1224704. Throughput: 0: 2156.3. Samples: 302504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:55:18,891][00031] Avg episode reward: [(0, '8.699')] |
|
[2025-03-20 23:55:18,893][00196] Saving new best policy, reward=8.699! |
|
[2025-03-20 23:55:19,371][00209] Updated weights for policy 0, policy_version 300 (0.0019) |
|
[2025-03-20 23:55:23,890][00031] Fps is (10 sec: 8601.9, 60 sec: 8669.9, 300 sec: 8437.8). Total num frames: 1265664. Throughput: 0: 2185.8. Samples: 315780. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:55:23,891][00031] Avg episode reward: [(0, '9.882')] |
|
[2025-03-20 23:55:23,940][00196] Saving new best policy, reward=9.882! |
|
[2025-03-20 23:55:23,942][00209] Updated weights for policy 0, policy_version 310 (0.0014) |
|
[2025-03-20 23:55:28,487][00209] Updated weights for policy 0, policy_version 320 (0.0017) |
|
[2025-03-20 23:55:28,890][00031] Fps is (10 sec: 8601.3, 60 sec: 8669.8, 300 sec: 8456.2). Total num frames: 1310720. Throughput: 0: 2188.0. Samples: 322464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:55:28,891][00031] Avg episode reward: [(0, '9.444')] |
|
[2025-03-20 23:55:33,125][00209] Updated weights for policy 0, policy_version 330 (0.0016) |
|
[2025-03-20 23:55:33,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8473.6). Total num frames: 1355776. Throughput: 0: 2181.2. Samples: 335790. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:55:33,891][00031] Avg episode reward: [(0, '8.868')] |
|
[2025-03-20 23:55:37,741][00209] Updated weights for policy 0, policy_version 340 (0.0015) |
|
[2025-03-20 23:55:38,890][00031] Fps is (10 sec: 8601.8, 60 sec: 8738.1, 300 sec: 8465.1). Total num frames: 1396736. Throughput: 0: 2184.3. Samples: 349194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-20 23:55:38,891][00031] Avg episode reward: [(0, '9.003')] |
|
[2025-03-20 23:55:42,967][00209] Updated weights for policy 0, policy_version 350 (0.0019) |
|
[2025-03-20 23:55:43,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8457.0). Total num frames: 1437696. Throughput: 0: 2152.1. Samples: 354404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:55:43,892][00031] Avg episode reward: [(0, '8.760')] |
|
[2025-03-20 23:55:47,523][00209] Updated weights for policy 0, policy_version 360 (0.0017) |
|
[2025-03-20 23:55:48,890][00031] Fps is (10 sec: 8601.3, 60 sec: 8669.8, 300 sec: 8472.9). Total num frames: 1482752. Throughput: 0: 2160.4. Samples: 367914. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:55:48,893][00031] Avg episode reward: [(0, '9.336')] |
|
[2025-03-20 23:55:52,207][00209] Updated weights for policy 0, policy_version 370 (0.0015) |
|
[2025-03-20 23:55:53,890][00031] Fps is (10 sec: 9010.9, 60 sec: 8669.8, 300 sec: 8487.8). Total num frames: 1527808. Throughput: 0: 2195.0. Samples: 381154. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:55:53,892][00031] Avg episode reward: [(0, '9.449')] |
|
[2025-03-20 23:55:56,655][00209] Updated weights for policy 0, policy_version 380 (0.0018) |
|
[2025-03-20 23:55:58,890][00031] Fps is (10 sec: 9011.5, 60 sec: 8669.9, 300 sec: 8502.0). Total num frames: 1572864. Throughput: 0: 2200.8. Samples: 387934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:55:58,893][00031] Avg episode reward: [(0, '10.815')] |
|
[2025-03-20 23:55:58,894][00196] Saving new best policy, reward=10.815! |
|
[2025-03-20 23:56:01,423][00209] Updated weights for policy 0, policy_version 390 (0.0015) |
|
[2025-03-20 23:56:03,890][00031] Fps is (10 sec: 9011.4, 60 sec: 8738.2, 300 sec: 8515.4). Total num frames: 1617920. Throughput: 0: 2193.2. Samples: 401198. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:56:03,892][00031] Avg episode reward: [(0, '12.911')] |
|
[2025-03-20 23:56:03,902][00196] Saving new best policy, reward=12.911! |
|
[2025-03-20 23:56:06,074][00209] Updated weights for policy 0, policy_version 400 (0.0019) |
|
[2025-03-20 23:56:08,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8528.1). Total num frames: 1662976. Throughput: 0: 2191.9. Samples: 414414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-20 23:56:08,892][00031] Avg episode reward: [(0, '13.627')] |
|
[2025-03-20 23:56:08,893][00196] Saving new best policy, reward=13.627! |
|
[2025-03-20 23:56:10,793][00209] Updated weights for policy 0, policy_version 410 (0.0015) |
|
[2025-03-20 23:56:13,890][00031] Fps is (10 sec: 8192.1, 60 sec: 8669.9, 300 sec: 8499.2). Total num frames: 1699840. Throughput: 0: 2182.9. Samples: 420694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:56:13,891][00031] Avg episode reward: [(0, '13.192')] |
|
[2025-03-20 23:56:15,967][00209] Updated weights for policy 0, policy_version 420 (0.0018) |
|
[2025-03-20 23:56:18,890][00031] Fps is (10 sec: 8191.8, 60 sec: 8669.8, 300 sec: 8511.7). Total num frames: 1744896. Throughput: 0: 2159.7. Samples: 432978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:56:18,893][00031] Avg episode reward: [(0, '12.600')] |
|
[2025-03-20 23:56:20,412][00209] Updated weights for policy 0, policy_version 430 (0.0018) |
|
[2025-03-20 23:56:23,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8523.6). Total num frames: 1789952. Throughput: 0: 2161.6. Samples: 446468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:56:23,891][00031] Avg episode reward: [(0, '13.233')] |
|
[2025-03-20 23:56:25,083][00209] Updated weights for policy 0, policy_version 440 (0.0019) |
|
[2025-03-20 23:56:28,890][00031] Fps is (10 sec: 9011.4, 60 sec: 8738.2, 300 sec: 8534.9). Total num frames: 1835008. Throughput: 0: 2197.2. Samples: 453276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:56:28,894][00031] Avg episode reward: [(0, '14.886')] |
|
[2025-03-20 23:56:28,896][00196] Saving new best policy, reward=14.886! |
|
[2025-03-20 23:56:29,641][00209] Updated weights for policy 0, policy_version 450 (0.0020) |
|
[2025-03-20 23:56:33,890][00031] Fps is (10 sec: 9011.1, 60 sec: 8738.1, 300 sec: 8545.7). Total num frames: 1880064. Throughput: 0: 2193.6. Samples: 466624. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:56:33,893][00031] Avg episode reward: [(0, '14.378')] |
|
[2025-03-20 23:56:33,907][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000459_1880064.pth... |
|
[2025-03-20 23:56:34,225][00209] Updated weights for policy 0, policy_version 460 (0.0021) |
|
[2025-03-20 23:56:38,841][00209] Updated weights for policy 0, policy_version 470 (0.0018) |
|
[2025-03-20 23:56:38,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8556.1). Total num frames: 1925120. Throughput: 0: 2193.9. Samples: 479878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:56:38,892][00031] Avg episode reward: [(0, '14.785')] |
|
[2025-03-20 23:56:43,539][00209] Updated weights for policy 0, policy_version 480 (0.0018) |
|
[2025-03-20 23:56:43,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8806.4, 300 sec: 8548.2). Total num frames: 1966080. Throughput: 0: 2190.4. Samples: 486502. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:56:43,894][00031] Avg episode reward: [(0, '15.454')] |
|
[2025-03-20 23:56:43,907][00196] Saving new best policy, reward=15.454! |
|
[2025-03-20 23:56:48,608][00209] Updated weights for policy 0, policy_version 490 (0.0016) |
|
[2025-03-20 23:56:48,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8738.2, 300 sec: 8540.6). Total num frames: 2007040. Throughput: 0: 2162.8. Samples: 498522. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-20 23:56:48,891][00031] Avg episode reward: [(0, '16.388')] |
|
[2025-03-20 23:56:48,893][00196] Saving new best policy, reward=16.388! |
|
[2025-03-20 23:56:53,287][00209] Updated weights for policy 0, policy_version 500 (0.0018) |
|
[2025-03-20 23:56:53,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.2, 300 sec: 8550.4). Total num frames: 2052096. Throughput: 0: 2164.8. Samples: 511828. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:56:53,891][00031] Avg episode reward: [(0, '16.824')] |
|
[2025-03-20 23:56:53,898][00196] Saving new best policy, reward=16.824! |
|
[2025-03-20 23:56:57,932][00209] Updated weights for policy 0, policy_version 510 (0.0017) |
|
[2025-03-20 23:56:58,890][00031] Fps is (10 sec: 9010.7, 60 sec: 8738.1, 300 sec: 8559.8). Total num frames: 2097152. Throughput: 0: 2175.8. Samples: 518604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:56:58,893][00031] Avg episode reward: [(0, '17.144')] |
|
[2025-03-20 23:56:58,898][00196] Saving new best policy, reward=17.144! |
|
[2025-03-20 23:57:02,564][00209] Updated weights for policy 0, policy_version 520 (0.0017) |
|
[2025-03-20 23:57:03,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8552.4). Total num frames: 2138112. Throughput: 0: 2196.2. Samples: 531806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:57:03,892][00031] Avg episode reward: [(0, '17.762')] |
|
[2025-03-20 23:57:03,903][00196] Saving new best policy, reward=17.762! |
|
[2025-03-20 23:57:07,113][00209] Updated weights for policy 0, policy_version 530 (0.0015) |
|
[2025-03-20 23:57:08,890][00031] Fps is (10 sec: 8602.0, 60 sec: 8669.9, 300 sec: 8561.4). Total num frames: 2183168. Throughput: 0: 2193.9. Samples: 545194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:57:08,893][00031] Avg episode reward: [(0, '17.381')] |
|
[2025-03-20 23:57:11,879][00209] Updated weights for policy 0, policy_version 540 (0.0015) |
|
[2025-03-20 23:57:13,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8570.1). Total num frames: 2228224. Throughput: 0: 2186.3. Samples: 551658. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) |
|
[2025-03-20 23:57:13,893][00031] Avg episode reward: [(0, '16.917')] |
|
[2025-03-20 23:57:16,463][00209] Updated weights for policy 0, policy_version 550 (0.0017) |
|
[2025-03-20 23:57:18,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.2, 300 sec: 8563.0). Total num frames: 2269184. Throughput: 0: 2175.1. Samples: 564504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:57:18,892][00031] Avg episode reward: [(0, '16.238')] |
|
[2025-03-20 23:57:21,613][00209] Updated weights for policy 0, policy_version 560 (0.0017) |
|
[2025-03-20 23:57:23,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8571.3). Total num frames: 2314240. Throughput: 0: 2158.6. Samples: 577014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) |
|
[2025-03-20 23:57:23,892][00031] Avg episode reward: [(0, '18.369')] |
|
[2025-03-20 23:57:23,900][00196] Saving new best policy, reward=18.369! |
|
[2025-03-20 23:57:26,205][00209] Updated weights for policy 0, policy_version 570 (0.0015) |
|
[2025-03-20 23:57:28,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8564.4). Total num frames: 2355200. Throughput: 0: 2160.6. Samples: 583730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:57:28,892][00031] Avg episode reward: [(0, '19.551')] |
|
[2025-03-20 23:57:28,894][00196] Saving new best policy, reward=19.551! |
|
[2025-03-20 23:57:30,803][00209] Updated weights for policy 0, policy_version 580 (0.0022) |
|
[2025-03-20 23:57:33,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8572.3). Total num frames: 2400256. Throughput: 0: 2190.8. Samples: 597106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:57:33,891][00031] Avg episode reward: [(0, '16.258')] |
|
[2025-03-20 23:57:35,353][00209] Updated weights for policy 0, policy_version 590 (0.0016) |
|
[2025-03-20 23:57:38,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8580.0). Total num frames: 2445312. Throughput: 0: 2196.6. Samples: 610674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:57:38,892][00031] Avg episode reward: [(0, '17.396')] |
|
[2025-03-20 23:57:39,921][00209] Updated weights for policy 0, policy_version 600 (0.0015) |
|
[2025-03-20 23:57:43,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8587.5). Total num frames: 2490368. Throughput: 0: 2193.1. Samples: 617294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:57:43,892][00031] Avg episode reward: [(0, '19.002')] |
|
[2025-03-20 23:57:44,471][00209] Updated weights for policy 0, policy_version 610 (0.0019) |
|
[2025-03-20 23:57:48,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8594.7). Total num frames: 2535424. Throughput: 0: 2199.3. Samples: 630776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:57:48,891][00031] Avg episode reward: [(0, '18.454')] |
|
[2025-03-20 23:57:49,083][00209] Updated weights for policy 0, policy_version 620 (0.0017) |
|
[2025-03-20 23:57:53,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8691.9). Total num frames: 2576384. Throughput: 0: 2163.7. Samples: 642562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:57:53,891][00031] Avg episode reward: [(0, '19.292')] |
|
[2025-03-20 23:57:54,296][00209] Updated weights for policy 0, policy_version 630 (0.0015) |
|
[2025-03-20 23:57:58,876][00209] Updated weights for policy 0, policy_version 640 (0.0015) |
|
[2025-03-20 23:57:58,892][00031] Fps is (10 sec: 8599.4, 60 sec: 8737.8, 300 sec: 8719.5). Total num frames: 2621440. Throughput: 0: 2171.7. Samples: 649390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:57:58,894][00031] Avg episode reward: [(0, '20.406')] |
|
[2025-03-20 23:57:58,899][00196] Saving new best policy, reward=20.406! |
|
[2025-03-20 23:58:03,580][00209] Updated weights for policy 0, policy_version 650 (0.0017) |
|
[2025-03-20 23:58:03,890][00031] Fps is (10 sec: 8601.5, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 2662400. Throughput: 0: 2181.0. Samples: 662650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:58:03,891][00031] Avg episode reward: [(0, '22.802')] |
|
[2025-03-20 23:58:03,902][00196] Saving new best policy, reward=22.802! |
|
[2025-03-20 23:58:08,216][00209] Updated weights for policy 0, policy_version 660 (0.0017) |
|
[2025-03-20 23:58:08,890][00031] Fps is (10 sec: 8603.8, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 2707456. Throughput: 0: 2200.3. Samples: 676026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:58:08,893][00031] Avg episode reward: [(0, '22.575')] |
|
[2025-03-20 23:58:12,783][00209] Updated weights for policy 0, policy_version 670 (0.0016) |
|
[2025-03-20 23:58:13,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 2752512. Throughput: 0: 2195.7. Samples: 682538. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:58:13,891][00031] Avg episode reward: [(0, '22.777')] |
|
[2025-03-20 23:58:17,311][00209] Updated weights for policy 0, policy_version 680 (0.0016) |
|
[2025-03-20 23:58:18,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8719.6). Total num frames: 2797568. Throughput: 0: 2200.3. Samples: 696118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:58:18,891][00031] Avg episode reward: [(0, '21.434')] |
|
[2025-03-20 23:58:21,942][00209] Updated weights for policy 0, policy_version 690 (0.0017) |
|
[2025-03-20 23:58:23,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 2838528. Throughput: 0: 2178.3. Samples: 708698. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:58:23,891][00031] Avg episode reward: [(0, '23.660')] |
|
[2025-03-20 23:58:23,901][00196] Saving new best policy, reward=23.660! |
|
[2025-03-20 23:58:27,108][00209] Updated weights for policy 0, policy_version 700 (0.0018) |
|
[2025-03-20 23:58:28,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 2879488. Throughput: 0: 2167.2. Samples: 714820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:58:28,891][00031] Avg episode reward: [(0, '23.117')] |
|
[2025-03-20 23:58:31,697][00209] Updated weights for policy 0, policy_version 710 (0.0018) |
|
[2025-03-20 23:58:33,890][00031] Fps is (10 sec: 8601.5, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 2924544. Throughput: 0: 2163.2. Samples: 728122. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:58:33,891][00031] Avg episode reward: [(0, '22.923')] |
|
[2025-03-20 23:58:33,902][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000714_2924544.pth... |
|
[2025-03-20 23:58:34,003][00196] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000203_831488.pth |
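
Checkpoints are written as checkpoint_<version>_<env_steps>.pth and older ones are pruned, so only the most recent few survive. A sketch of that rotation; keep_last=2 is inferred from the log, where two checkpoints are retained at any time:

```python
from pathlib import Path

import torch

def save_and_rotate(ckpt_dir: Path, version: int, env_steps: int,
                    state: dict, keep_last: int = 2) -> None:
    """Write checkpoint_<version>_<env_steps>.pth and prune older files (sketch)."""
    path = ckpt_dir / f"checkpoint_{version:09d}_{env_steps}.pth"
    torch.save(state, path)
    # Zero-padded versions make lexicographic order match chronological order.
    for old in sorted(ckpt_dir.glob("checkpoint_*.pth"))[:-keep_last]:
        print(f"Removing {old}")
        old.unlink()
```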
|
[2025-03-20 23:58:36,269][00209] Updated weights for policy 0, policy_version 720 (0.0014) |
|
[2025-03-20 23:58:38,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8733.5). Total num frames: 2969600. Throughput: 0: 2200.7. Samples: 741594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:58:38,891][00031] Avg episode reward: [(0, '22.964')] |
|
[2025-03-20 23:58:40,971][00209] Updated weights for policy 0, policy_version 730 (0.0020) |
|
[2025-03-20 23:58:43,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8733.5). Total num frames: 3014656. Throughput: 0: 2193.5. Samples: 748092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-20 23:58:43,892][00031] Avg episode reward: [(0, '23.219')] |
|
[2025-03-20 23:58:45,580][00209] Updated weights for policy 0, policy_version 740 (0.0018) |
|
[2025-03-20 23:58:48,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8733.5). Total num frames: 3059712. Throughput: 0: 2195.2. Samples: 761432. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:58:48,891][00031] Avg episode reward: [(0, '23.719')] |
|
[2025-03-20 23:58:48,895][00196] Saving new best policy, reward=23.719! |
|
[2025-03-20 23:58:50,278][00209] Updated weights for policy 0, policy_version 750 (0.0019) |
|
[2025-03-20 23:58:53,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 3100672. Throughput: 0: 2188.4. Samples: 774506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:58:53,892][00031] Avg episode reward: [(0, '22.639')] |
|
[2025-03-20 23:58:55,247][00209] Updated weights for policy 0, policy_version 760 (0.0016) |
|
[2025-03-20 23:58:58,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8670.2, 300 sec: 8733.5). Total num frames: 3141632. Throughput: 0: 2166.0. Samples: 780008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:58:58,892][00031] Avg episode reward: [(0, '21.204')] |
|
[2025-03-20 23:59:00,219][00209] Updated weights for policy 0, policy_version 770 (0.0018) |
|
[2025-03-20 23:59:03,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8719.6). Total num frames: 3182592. Throughput: 0: 2150.4. Samples: 792886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:59:03,892][00031] Avg episode reward: [(0, '19.295')] |
|
[2025-03-20 23:59:04,761][00209] Updated weights for policy 0, policy_version 780 (0.0016) |
|
[2025-03-20 23:59:08,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8733.5). Total num frames: 3231744. Throughput: 0: 2171.4. Samples: 806410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:59:08,891][00031] Avg episode reward: [(0, '19.920')] |
|
[2025-03-20 23:59:09,491][00209] Updated weights for policy 0, policy_version 790 (0.0017) |
|
[2025-03-20 23:59:13,890][00031] Fps is (10 sec: 9011.1, 60 sec: 8669.8, 300 sec: 8719.6). Total num frames: 3272704. Throughput: 0: 2181.6. Samples: 812994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:59:13,892][00031] Avg episode reward: [(0, '22.246')] |
|
[2025-03-20 23:59:13,973][00209] Updated weights for policy 0, policy_version 800 (0.0015) |
|
[2025-03-20 23:59:18,496][00209] Updated weights for policy 0, policy_version 810 (0.0017) |
|
[2025-03-20 23:59:18,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8719.6). Total num frames: 3317760. Throughput: 0: 2187.4. Samples: 826554. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-20 23:59:18,891][00031] Avg episode reward: [(0, '24.452')] |
|
[2025-03-20 23:59:18,927][00196] Saving new best policy, reward=24.452! |
|
[2025-03-20 23:59:23,206][00209] Updated weights for policy 0, policy_version 820 (0.0019) |
|
[2025-03-20 23:59:23,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 3362816. Throughput: 0: 2182.0. Samples: 839784. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-20 23:59:23,891][00031] Avg episode reward: [(0, '23.698')] |
|
[2025-03-20 23:59:28,103][00209] Updated weights for policy 0, policy_version 830 (0.0018) |
|
[2025-03-20 23:59:28,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 3403776. Throughput: 0: 2186.6. Samples: 846490. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-20 23:59:28,895][00031] Avg episode reward: [(0, '23.043')] |
|
[2025-03-20 23:59:33,012][00209] Updated weights for policy 0, policy_version 840 (0.0020) |
|
[2025-03-20 23:59:33,890][00031] Fps is (10 sec: 8601.4, 60 sec: 8738.1, 300 sec: 8733.5). Total num frames: 3448832. Throughput: 0: 2154.5. Samples: 858384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:59:33,892][00031] Avg episode reward: [(0, '22.872')] |
|
[2025-03-20 23:59:37,599][00209] Updated weights for policy 0, policy_version 850 (0.0018) |
|
[2025-03-20 23:59:38,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8719.6). Total num frames: 3489792. Throughput: 0: 2162.5. Samples: 871818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-20 23:59:38,892][00031] Avg episode reward: [(0, '21.611')] |
|
[2025-03-20 23:59:42,356][00209] Updated weights for policy 0, policy_version 860 (0.0020) |
|
[2025-03-20 23:59:43,890][00031] Fps is (10 sec: 8601.8, 60 sec: 8669.9, 300 sec: 8719.6). Total num frames: 3534848. Throughput: 0: 2181.8. Samples: 878188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:59:43,891][00031] Avg episode reward: [(0, '19.733')] |
|
[2025-03-20 23:59:47,120][00209] Updated weights for policy 0, policy_version 870 (0.0015) |
|
[2025-03-20 23:59:48,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8705.7). Total num frames: 3575808. Throughput: 0: 2187.0. Samples: 891302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:59:48,891][00031] Avg episode reward: [(0, '17.790')] |
|
[2025-03-20 23:59:51,811][00209] Updated weights for policy 0, policy_version 880 (0.0017) |
|
[2025-03-20 23:59:53,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 3620864. Throughput: 0: 2175.7. Samples: 904316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-20 23:59:53,892][00031] Avg episode reward: [(0, '21.014')] |
|
[2025-03-20 23:59:56,362][00209] Updated weights for policy 0, policy_version 890 (0.0016) |
|
[2025-03-20 23:59:58,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 3665920. Throughput: 0: 2177.4. Samples: 910978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-20 23:59:58,894][00031] Avg episode reward: [(0, '24.367')] |
|
[2025-03-21 00:00:01,498][00209] Updated weights for policy 0, policy_version 900 (0.0017) |
|
[2025-03-21 00:00:03,890][00031] Fps is (10 sec: 8191.9, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 3702784. Throughput: 0: 2142.4. Samples: 922960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:00:03,893][00031] Avg episode reward: [(0, '26.571')] |
|
[2025-03-21 00:00:03,902][00196] Saving new best policy, reward=26.571! |
|
[2025-03-21 00:00:06,280][00209] Updated weights for policy 0, policy_version 910 (0.0017) |
|
[2025-03-21 00:00:08,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8601.6, 300 sec: 8705.7). Total num frames: 3747840. Throughput: 0: 2142.2. Samples: 936184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:00:08,891][00031] Avg episode reward: [(0, '26.168')] |
|
[2025-03-21 00:00:10,874][00209] Updated weights for policy 0, policy_version 920 (0.0019) |
|
[2025-03-21 00:00:13,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 3792896. Throughput: 0: 2138.9. Samples: 942740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 3.0) |
|
[2025-03-21 00:00:13,891][00031] Avg episode reward: [(0, '24.613')] |
|
[2025-03-21 00:00:15,418][00209] Updated weights for policy 0, policy_version 930 (0.0016) |
|
[2025-03-21 00:00:18,890][00031] Fps is (10 sec: 9011.1, 60 sec: 8669.9, 300 sec: 8719.6). Total num frames: 3837952. Throughput: 0: 2174.3. Samples: 956226. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-21 00:00:18,892][00031] Avg episode reward: [(0, '26.720')] |
|
[2025-03-21 00:00:18,895][00196] Saving new best policy, reward=26.720! |
|
[2025-03-21 00:00:20,113][00209] Updated weights for policy 0, policy_version 940 (0.0016) |
|
[2025-03-21 00:00:23,891][00031] Fps is (10 sec: 9010.5, 60 sec: 8669.7, 300 sec: 8719.6). Total num frames: 3883008. Throughput: 0: 2169.3. Samples: 969438. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-21 00:00:23,892][00031] Avg episode reward: [(0, '25.791')] |
|
[2025-03-21 00:00:24,729][00209] Updated weights for policy 0, policy_version 950 (0.0016) |
|
[2025-03-21 00:00:28,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 3928064. Throughput: 0: 2177.3. Samples: 976166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:00:28,894][00031] Avg episode reward: [(0, '25.840')] |
|
[2025-03-21 00:00:29,403][00209] Updated weights for policy 0, policy_version 960 (0.0019) |
|
[2025-03-21 00:00:33,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8669.8, 300 sec: 8719.6). Total num frames: 3969024. Throughput: 0: 2180.4. Samples: 989422. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:00:33,893][00031] Avg episode reward: [(0, '26.366')] |
|
[2025-03-21 00:00:33,902][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000969_3969024.pth... |
|
[2025-03-21 00:00:33,996][00196] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000459_1880064.pth |
|
[2025-03-21 00:00:34,571][00209] Updated weights for policy 0, policy_version 970 (0.0015) |
|
[2025-03-21 00:00:38,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8719.6). Total num frames: 4009984. Throughput: 0: 2155.2. Samples: 1001298. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:00:38,891][00031] Avg episode reward: [(0, '25.201')] |
|
[2025-03-21 00:00:39,210][00209] Updated weights for policy 0, policy_version 980 (0.0018) |
|
[2025-03-21 00:00:43,890][00031] Fps is (10 sec: 8192.6, 60 sec: 8601.6, 300 sec: 8705.7). Total num frames: 4050944. Throughput: 0: 2153.6. Samples: 1007892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:00:43,892][00031] Avg episode reward: [(0, '25.953')] |
|
[2025-03-21 00:00:43,918][00209] Updated weights for policy 0, policy_version 990 (0.0015) |
|
[2025-03-21 00:00:48,415][00209] Updated weights for policy 0, policy_version 1000 (0.0016) |
|
[2025-03-21 00:00:48,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 4100096. Throughput: 0: 2187.6. Samples: 1021400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:00:48,894][00031] Avg episode reward: [(0, '25.095')] |
|
[2025-03-21 00:00:53,041][00209] Updated weights for policy 0, policy_version 1010 (0.0016) |
|
[2025-03-21 00:00:53,891][00031] Fps is (10 sec: 9419.3, 60 sec: 8737.9, 300 sec: 8719.6). Total num frames: 4145152. Throughput: 0: 2191.3. Samples: 1034798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:00:53,893][00031] Avg episode reward: [(0, '26.122')] |
|
[2025-03-21 00:00:57,515][00209] Updated weights for policy 0, policy_version 1020 (0.0015) |
|
[2025-03-21 00:00:58,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8738.2, 300 sec: 8719.6). Total num frames: 4190208. Throughput: 0: 2195.9. Samples: 1041556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:00:58,891][00031] Avg episode reward: [(0, '27.882')] |
|
[2025-03-21 00:00:58,897][00196] Saving new best policy, reward=27.882! |
|
[2025-03-21 00:01:02,208][00209] Updated weights for policy 0, policy_version 1030 (0.0016) |
|
[2025-03-21 00:01:03,890][00031] Fps is (10 sec: 8602.9, 60 sec: 8806.4, 300 sec: 8705.7). Total num frames: 4231168. Throughput: 0: 2191.6. Samples: 1054848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:01:03,892][00031] Avg episode reward: [(0, '29.140')] |
|
[2025-03-21 00:01:03,900][00196] Saving new best policy, reward=29.140! |
|
[2025-03-21 00:01:07,419][00209] Updated weights for policy 0, policy_version 1040 (0.0019) |
|
[2025-03-21 00:01:08,890][00031] Fps is (10 sec: 8191.9, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 4272128. Throughput: 0: 2160.8. Samples: 1066674. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-21 00:01:08,894][00031] Avg episode reward: [(0, '27.887')] |
|
[2025-03-21 00:01:12,087][00209] Updated weights for policy 0, policy_version 1050 (0.0018) |
|
[2025-03-21 00:01:13,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 4313088. Throughput: 0: 2157.5. Samples: 1073254. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-21 00:01:13,892][00031] Avg episode reward: [(0, '24.894')] |
|
[2025-03-21 00:01:16,687][00209] Updated weights for policy 0, policy_version 1060 (0.0018) |
|
[2025-03-21 00:01:18,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 4358144. Throughput: 0: 2160.6. Samples: 1086648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:01:18,894][00031] Avg episode reward: [(0, '23.185')] |
|
[2025-03-21 00:01:21,376][00209] Updated weights for policy 0, policy_version 1070 (0.0018) |
|
[2025-03-21 00:01:23,892][00031] Fps is (10 sec: 9008.9, 60 sec: 8669.6, 300 sec: 8705.7). Total num frames: 4403200. Throughput: 0: 2191.8. Samples: 1099934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:01:23,894][00031] Avg episode reward: [(0, '26.089')] |
|
[2025-03-21 00:01:25,877][00209] Updated weights for policy 0, policy_version 1080 (0.0018) |
|
[2025-03-21 00:01:28,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 4448256. Throughput: 0: 2194.6. Samples: 1106650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:01:28,891][00031] Avg episode reward: [(0, '26.792')] |
|
[2025-03-21 00:01:30,500][00209] Updated weights for policy 0, policy_version 1090 (0.0018) |
|
[2025-03-21 00:01:33,890][00031] Fps is (10 sec: 9013.0, 60 sec: 8738.2, 300 sec: 8705.7). Total num frames: 4493312. Throughput: 0: 2190.3. Samples: 1119964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-21 00:01:33,893][00031] Avg episode reward: [(0, '27.977')] |
|
[2025-03-21 00:01:35,076][00209] Updated weights for policy 0, policy_version 1100 (0.0018) |
|
[2025-03-21 00:01:38,891][00031] Fps is (10 sec: 8600.4, 60 sec: 8737.9, 300 sec: 8705.7). Total num frames: 4534272. Throughput: 0: 2189.0. Samples: 1133304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:01:38,894][00031] Avg episode reward: [(0, '28.550')] |
|
[2025-03-21 00:01:40,332][00209] Updated weights for policy 0, policy_version 1110 (0.0021) |
|
[2025-03-21 00:01:43,890][00031] Fps is (10 sec: 8192.3, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 4575232. Throughput: 0: 2155.8. Samples: 1138566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:01:43,892][00031] Avg episode reward: [(0, '28.701')] |
|
[2025-03-21 00:01:44,888][00209] Updated weights for policy 0, policy_version 1120 (0.0016) |
|
[2025-03-21 00:01:48,890][00031] Fps is (10 sec: 8602.4, 60 sec: 8669.8, 300 sec: 8705.7). Total num frames: 4620288. Throughput: 0: 2160.4. Samples: 1152066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:01:48,892][00031] Avg episode reward: [(0, '25.380')] |
|
[2025-03-21 00:01:49,565][00209] Updated weights for policy 0, policy_version 1130 (0.0017) |
|
[2025-03-21 00:01:53,890][00031] Fps is (10 sec: 9011.4, 60 sec: 8670.1, 300 sec: 8705.8). Total num frames: 4665344. Throughput: 0: 2189.7. Samples: 1165212. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-21 00:01:53,891][00031] Avg episode reward: [(0, '24.286')] |
|
[2025-03-21 00:01:54,227][00209] Updated weights for policy 0, policy_version 1140 (0.0014) |
|
[2025-03-21 00:01:58,758][00209] Updated weights for policy 0, policy_version 1150 (0.0016) |
|
[2025-03-21 00:01:58,890][00031] Fps is (10 sec: 9011.4, 60 sec: 8669.8, 300 sec: 8719.6). Total num frames: 4710400. Throughput: 0: 2192.9. Samples: 1171936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-21 00:01:58,892][00031] Avg episode reward: [(0, '26.960')] |
|
[2025-03-21 00:02:03,439][00209] Updated weights for policy 0, policy_version 1160 (0.0015) |
|
[2025-03-21 00:02:03,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 4751360. Throughput: 0: 2187.5. Samples: 1185084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:03,892][00031] Avg episode reward: [(0, '30.019')] |
|
[2025-03-21 00:02:03,900][00196] Saving new best policy, reward=30.019! |
|
[2025-03-21 00:02:08,171][00209] Updated weights for policy 0, policy_version 1170 (0.0018) |
|
[2025-03-21 00:02:08,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 4796416. Throughput: 0: 2186.2. Samples: 1198308. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-21 00:02:08,892][00031] Avg episode reward: [(0, '28.972')] |
|
[2025-03-21 00:02:13,363][00209] Updated weights for policy 0, policy_version 1180 (0.0018) |
|
[2025-03-21 00:02:13,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 4837376. Throughput: 0: 2176.4. Samples: 1204590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:13,891][00031] Avg episode reward: [(0, '25.194')] |
|
[2025-03-21 00:02:17,927][00209] Updated weights for policy 0, policy_version 1190 (0.0015) |
|
[2025-03-21 00:02:18,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 4882432. Throughput: 0: 2156.2. Samples: 1216994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 3.0) |
|
[2025-03-21 00:02:18,892][00031] Avg episode reward: [(0, '23.589')] |
|
[2025-03-21 00:02:22,520][00209] Updated weights for policy 0, policy_version 1200 (0.0020) |
|
[2025-03-21 00:02:23,890][00031] Fps is (10 sec: 9011.0, 60 sec: 8738.5, 300 sec: 8719.6). Total num frames: 4927488. Throughput: 0: 2157.2. Samples: 1230376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:23,892][00031] Avg episode reward: [(0, '23.189')] |
|
[2025-03-21 00:02:27,084][00209] Updated weights for policy 0, policy_version 1210 (0.0017) |
|
[2025-03-21 00:02:28,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8719.6). Total num frames: 4972544. Throughput: 0: 2190.6. Samples: 1237144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:28,891][00031] Avg episode reward: [(0, '25.520')] |
|
[2025-03-21 00:02:31,795][00209] Updated weights for policy 0, policy_version 1220 (0.0016) |
|
[2025-03-21 00:02:33,890][00031] Fps is (10 sec: 8601.8, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 5013504. Throughput: 0: 2183.5. Samples: 1250322. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-21 00:02:33,891][00031] Avg episode reward: [(0, '25.585')] |
|
[2025-03-21 00:02:33,901][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth... |
|
[2025-03-21 00:02:33,900][00031] Components not started: RolloutWorker_w5, wait_time=600.0 seconds |
|
[2025-03-21 00:02:34,000][00196] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000714_2924544.pth |
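
These "Saving .../Removing ..." pairs implement rolling checkpoints: the newest checkpoint is written, then the oldest one beyond a keep limit is deleted. A minimal sketch under the naming scheme visible in the log (the keep_count of 2 and the contents of state are assumptions):

import glob
import os
import torch

def save_rolling_checkpoint(state, ckpt_dir, policy_version, env_steps, keep_count=2):
    # name mirrors the log's pattern: checkpoint_<version>_<env_steps>.pth
    path = os.path.join(ckpt_dir, f"checkpoint_{policy_version:09d}_{env_steps}.pth")
    torch.save(state, path)
    # zero-padded names sort chronologically, so drop everything
    # older than the newest keep_count checkpoints
    for old in sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))[:-keep_count]:
        os.remove(old)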
|
[2025-03-21 00:02:36,344][00209] Updated weights for policy 0, policy_version 1230 (0.0019) |
|
[2025-03-21 00:02:38,890][00031] Fps is (10 sec: 8601.5, 60 sec: 8738.3, 300 sec: 8705.7). Total num frames: 5058560. Throughput: 0: 2187.8. Samples: 1263662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:38,892][00031] Avg episode reward: [(0, '26.212')] |
|
[2025-03-21 00:02:40,957][00209] Updated weights for policy 0, policy_version 1240 (0.0017) |
|
[2025-03-21 00:02:43,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8738.2, 300 sec: 8691.8). Total num frames: 5099520. Throughput: 0: 2184.8. Samples: 1270252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:02:43,895][00031] Avg episode reward: [(0, '27.944')] |
|
[2025-03-21 00:02:46,282][00209] Updated weights for policy 0, policy_version 1250 (0.0016) |
|
[2025-03-21 00:02:48,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8691.9). Total num frames: 5140480. Throughput: 0: 2157.2. Samples: 1282160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:48,891][00031] Avg episode reward: [(0, '28.365')] |
|
[2025-03-21 00:02:50,953][00209] Updated weights for policy 0, policy_version 1260 (0.0018) |
|
[2025-03-21 00:02:53,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8691.9). Total num frames: 5185536. Throughput: 0: 2156.5. Samples: 1295350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:02:53,892][00031] Avg episode reward: [(0, '27.965')] |
|
[2025-03-21 00:02:55,486][00209] Updated weights for policy 0, policy_version 1270 (0.0017) |
|
[2025-03-21 00:02:58,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 5230592. Throughput: 0: 2167.2. Samples: 1302116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:02:58,892][00031] Avg episode reward: [(0, '27.812')] |
|
[2025-03-21 00:03:00,065][00209] Updated weights for policy 0, policy_version 1280 (0.0017) |
|
[2025-03-21 00:03:03,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 5275648. Throughput: 0: 2187.2. Samples: 1315418. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:03:03,893][00031] Avg episode reward: [(0, '25.359')] |
|
[2025-03-21 00:03:04,783][00209] Updated weights for policy 0, policy_version 1290 (0.0018) |
|
[2025-03-21 00:03:08,890][00031] Fps is (10 sec: 9011.3, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 5320704. Throughput: 0: 2187.6. Samples: 1328818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:03:08,893][00031] Avg episode reward: [(0, '23.290')] |
|
[2025-03-21 00:03:09,429][00209] Updated weights for policy 0, policy_version 1300 (0.0017) |
|
[2025-03-21 00:03:13,890][00031] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8691.9). Total num frames: 5361664. Throughput: 0: 2183.9. Samples: 1335418. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:03:13,891][00031] Avg episode reward: [(0, '23.815')] |
|
[2025-03-21 00:03:13,936][00209] Updated weights for policy 0, policy_version 1310 (0.0015) |
|
[2025-03-21 00:03:18,890][00031] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8691.9). Total num frames: 5402624. Throughput: 0: 2173.2. Samples: 1348116. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:03:18,892][00031] Avg episode reward: [(0, '22.840')] |
|
[2025-03-21 00:03:19,197][00209] Updated weights for policy 0, policy_version 1320 (0.0021) |
|
[2025-03-21 00:03:23,685][00209] Updated weights for policy 0, policy_version 1330 (0.0017) |
|
[2025-03-21 00:03:23,890][00031] Fps is (10 sec: 8601.5, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 5447680. Throughput: 0: 2157.0. Samples: 1360728. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:03:23,891][00031] Avg episode reward: [(0, '25.047')] |
|
[2025-03-21 00:03:28,336][00209] Updated weights for policy 0, policy_version 1340 (0.0017) |
|
[2025-03-21 00:03:28,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8705.7). Total num frames: 5492736. Throughput: 0: 2161.6. Samples: 1367522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:03:28,891][00031] Avg episode reward: [(0, '27.608')] |
|
[2025-03-21 00:03:32,931][00209] Updated weights for policy 0, policy_version 1350 (0.0017) |
|
[2025-03-21 00:03:33,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 5537792. Throughput: 0: 2193.4. Samples: 1380862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:03:33,892][00031] Avg episode reward: [(0, '27.232')] |
|
[2025-03-21 00:03:37,458][00209] Updated weights for policy 0, policy_version 1360 (0.0017) |
|
[2025-03-21 00:03:38,890][00031] Fps is (10 sec: 9010.9, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 5582848. Throughput: 0: 2199.8. Samples: 1394340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:03:38,892][00031] Avg episode reward: [(0, '23.707')] |
|
[2025-03-21 00:03:42,131][00209] Updated weights for policy 0, policy_version 1370 (0.0018) |
|
[2025-03-21 00:03:43,890][00031] Fps is (10 sec: 8601.5, 60 sec: 8738.1, 300 sec: 8691.8). Total num frames: 5623808. Throughput: 0: 2195.7. Samples: 1400922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:03:43,891][00031] Avg episode reward: [(0, '22.824')] |
|
[2025-03-21 00:03:46,674][00209] Updated weights for policy 0, policy_version 1380 (0.0017) |
|
[2025-03-21 00:03:48,890][00031] Fps is (10 sec: 8601.8, 60 sec: 8806.4, 300 sec: 8705.7). Total num frames: 5668864. Throughput: 0: 2198.8. Samples: 1414362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:03:48,892][00031] Avg episode reward: [(0, '24.680')] |
|
[2025-03-21 00:03:51,975][00209] Updated weights for policy 0, policy_version 1390 (0.0018) |
|
[2025-03-21 00:03:53,890][00031] Fps is (10 sec: 8601.2, 60 sec: 8738.1, 300 sec: 8705.7). Total num frames: 5709824. Throughput: 0: 2161.7. Samples: 1426098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:03:53,891][00031] Avg episode reward: [(0, '25.548')] |
|
[2025-03-21 00:03:56,647][00209] Updated weights for policy 0, policy_version 1400 (0.0015) |
|
[2025-03-21 00:03:58,890][00031] Fps is (10 sec: 8191.8, 60 sec: 8669.8, 300 sec: 8705.7). Total num frames: 5750784. Throughput: 0: 2161.3. Samples: 1432678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-21 00:03:58,892][00031] Avg episode reward: [(0, '26.836')] |
|
[2025-03-21 00:04:01,353][00209] Updated weights for policy 0, policy_version 1410 (0.0021) |
|
[2025-03-21 00:04:03,890][00031] Fps is (10 sec: 8602.0, 60 sec: 8669.9, 300 sec: 8691.8). Total num frames: 5795840. Throughput: 0: 2167.0. Samples: 1445630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-21 00:04:03,891][00031] Avg episode reward: [(0, '27.570')] |
|
[2025-03-21 00:04:06,186][00209] Updated weights for policy 0, policy_version 1420 (0.0017) |
|
[2025-03-21 00:04:08,890][00031] Fps is (10 sec: 8601.8, 60 sec: 8601.6, 300 sec: 8691.9). Total num frames: 5836800. Throughput: 0: 2177.2. Samples: 1458702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:04:08,892][00031] Avg episode reward: [(0, '27.254')] |
|
[2025-03-21 00:04:10,817][00209] Updated weights for policy 0, policy_version 1430 (0.0017) |
|
[2025-03-21 00:04:13,890][00031] Fps is (10 sec: 8601.6, 60 sec: 8669.8, 300 sec: 8691.8). Total num frames: 5881856. Throughput: 0: 2171.7. Samples: 1465250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:04:13,891][00031] Avg episode reward: [(0, '27.562')] |
|
[2025-03-21 00:04:15,379][00209] Updated weights for policy 0, policy_version 1440 (0.0018) |
|
[2025-03-21 00:04:18,890][00031] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8691.9). Total num frames: 5926912. Throughput: 0: 2173.6. Samples: 1478672. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-21 00:04:18,891][00031] Avg episode reward: [(0, '29.306')] |
|
[2025-03-21 00:04:20,052][00209] Updated weights for policy 0, policy_version 1450 (0.0016) |
|
[2025-03-21 00:04:23,890][00031] Fps is (10 sec: 8601.0, 60 sec: 8669.8, 300 sec: 8691.8). Total num frames: 5967872. Throughput: 0: 2148.6. Samples: 1491028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-21 00:04:23,894][00031] Avg episode reward: [(0, '29.435')] |
|
[2025-03-21 00:04:25,197][00209] Updated weights for policy 0, policy_version 1460 (0.0023) |
|
[2025-03-21 00:04:27,864][00196] Stopping Batcher_0... |
|
[2025-03-21 00:04:27,864][00196] Loop batcher_evt_loop terminating... |
|
[2025-03-21 00:04:27,865][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth... |
|
[2025-03-21 00:04:27,864][00031] Component Batcher_0 stopped! |
|
[2025-03-21 00:04:27,866][00031] Component RolloutWorker_w5 process died already! Don't wait for it. |
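
RolloutWorker_w5 never started (see the earlier "Components not started: RolloutWorker_w5, wait_time=600.0 seconds" entry), so training proceeded on seven of the eight requested rollout workers, and the runner skips joining the dead process here. A sketch of this kind of liveness-aware shutdown using the standard multiprocessing API; stop_component and its timeout are illustrative names:

from multiprocessing import Process

def stop_component(name, proc: Process, timeout=5.0):
    if not proc.is_alive():
        # matches the log: the process died earlier, so don't block on it
        print(f"Component {name} process died already! Don't wait for it.")
        return
    proc.join(timeout=timeout)
    if proc.is_alive():
        proc.terminate()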
|
[2025-03-21 00:04:27,898][00209] Weights refcount: 2 0 |
|
[2025-03-21 00:04:27,900][00209] Stopping InferenceWorker_p0-w0... |
|
[2025-03-21 00:04:27,900][00209] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-03-21 00:04:27,901][00031] Component InferenceWorker_p0-w0 stopped! |
|
[2025-03-21 00:04:27,931][00031] Component RolloutWorker_w1 stopped! |
|
[2025-03-21 00:04:27,934][00212] Stopping RolloutWorker_w1... |
|
[2025-03-21 00:04:27,935][00212] Loop rollout_proc1_evt_loop terminating... |
|
[2025-03-21 00:04:27,951][00196] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000969_3969024.pth |
|
[2025-03-21 00:04:27,965][00196] Saving new best policy, reward=30.402! |
|
[2025-03-21 00:04:27,975][00216] Stopping RolloutWorker_w6... |
|
[2025-03-21 00:04:27,976][00216] Loop rollout_proc6_evt_loop terminating... |
|
[2025-03-21 00:04:27,975][00031] Component RolloutWorker_w6 stopped! |
|
[2025-03-21 00:04:27,980][00213] Stopping RolloutWorker_w2... |
|
[2025-03-21 00:04:27,980][00031] Component RolloutWorker_w2 stopped! |
|
[2025-03-21 00:04:27,981][00213] Loop rollout_proc2_evt_loop terminating... |
|
[2025-03-21 00:04:28,076][00196] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth... |
|
[2025-03-21 00:04:28,123][00214] Stopping RolloutWorker_w4... |
|
[2025-03-21 00:04:28,124][00214] Loop rollout_proc4_evt_loop terminating... |
|
[2025-03-21 00:04:28,123][00031] Component RolloutWorker_w4 stopped! |
|
[2025-03-21 00:04:28,131][00210] Stopping RolloutWorker_w0... |
|
[2025-03-21 00:04:28,131][00210] Loop rollout_proc0_evt_loop terminating... |
|
[2025-03-21 00:04:28,131][00031] Component RolloutWorker_w0 stopped! |
|
[2025-03-21 00:04:28,138][00211] Stopping RolloutWorker_w3... |
|
[2025-03-21 00:04:28,139][00211] Loop rollout_proc3_evt_loop terminating... |
|
[2025-03-21 00:04:28,138][00031] Component RolloutWorker_w3 stopped! |
|
[2025-03-21 00:04:28,158][00031] Component RolloutWorker_w7 stopped! |
|
[2025-03-21 00:04:28,159][00217] Stopping RolloutWorker_w7... |
|
[2025-03-21 00:04:28,160][00217] Loop rollout_proc7_evt_loop terminating... |
|
[2025-03-21 00:04:28,197][00196] Stopping LearnerWorker_p0... |
|
[2025-03-21 00:04:28,198][00196] Loop learner_proc0_evt_loop terminating... |
|
[2025-03-21 00:04:28,197][00031] Component LearnerWorker_p0 stopped! |
|
[2025-03-21 00:04:28,198][00031] Waiting for process learner_proc0 to stop... |
|
[2025-03-21 00:04:29,505][00031] Waiting for process inference_proc0-0 to join... |
|
[2025-03-21 00:04:29,510][00031] Waiting for process rollout_proc0 to join... |
|
[2025-03-21 00:04:29,816][00031] Waiting for process rollout_proc1 to join... |
|
[2025-03-21 00:04:29,819][00031] Waiting for process rollout_proc2 to join... |
|
[2025-03-21 00:04:29,934][00031] Waiting for process rollout_proc3 to join... |
|
[2025-03-21 00:04:29,936][00031] Waiting for process rollout_proc4 to join... |
|
[2025-03-21 00:04:29,937][00031] Waiting for process rollout_proc5 to join... |
|
[2025-03-21 00:04:29,938][00031] Waiting for process rollout_proc6 to join... |
|
[2025-03-21 00:04:29,962][00031] Waiting for process rollout_proc7 to join... |
|
[2025-03-21 00:04:29,963][00031] Batcher 0 profile tree view: |
|
batching: 36.8708, releasing_batches: 0.0387 |
|
[2025-03-21 00:04:29,964][00031] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0000

  wait_policy_total: 18.5605

update_model: 9.7350

  weight_update: 0.0015

one_step: 0.0028

  handle_policy_step: 632.3479

    deserialize: 18.6638, stack: 4.1556, obs_to_device_normalize: 152.6385, forward: 301.7508, send_messages: 31.4225

    prepare_outputs: 91.5719

      to_cpu: 57.8595
|
[2025-03-21 00:04:29,965][00031] Learner 0 profile tree view: |
|
misc: 0.0078, prepare_batch: 17.4161

train: 73.2391

  epoch_init: 0.0087, minibatch_init: 0.0091, losses_postprocess: 0.7866, kl_divergence: 0.7831, after_optimizer: 32.9973

  calculate_losses: 24.3027

    losses_init: 0.0057, forward_head: 1.2705, bptt_initial: 16.9615, tail: 1.0647, advantages_returns: 0.2736, losses: 2.3759

    bptt: 2.0419

      bptt_forward_core: 1.9426

  update: 13.7216

    clip: 1.2010
|
[2025-03-21 00:04:29,967][00031] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.3301, enqueue_policy_requests: 14.4142, env_step: 514.5388, overhead: 13.0106, complete_rollouts: 1.8910

save_policy_outputs: 18.1673

  split_output_tensors: 7.2651
|
[2025-03-21 00:04:29,967][00031] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.3332, enqueue_policy_requests: 15.0279, env_step: 495.7572, overhead: 13.8787, complete_rollouts: 2.2632

save_policy_outputs: 18.9713

  split_output_tensors: 7.6620
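
The profile tree views above come from nested named timers: each scope accumulates wall-clock seconds and is printed indented under its parent. A minimal sketch of such a timing tree built with a context manager (an illustration, not Sample Factory's actual Timing class):

import time
from collections import defaultdict
from contextlib import contextmanager

class TimingTree:
    def __init__(self):
        self.totals = defaultdict(float)  # "train/calculate_losses" -> seconds
        self.stack = []

    @contextmanager
    def timeit(self, name):
        self.stack.append(name)
        key = "/".join(self.stack)
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[key] += time.perf_counter() - start
            self.stack.pop()

    def report(self):
        for key in sorted(self.totals):
            depth = key.count("/")
            print("  " * depth + f"{key.split('/')[-1]}: {self.totals[key]:.4f}")

Nesting the scopes, e.g. timeit("train") around timeit("calculate_losses"), reproduces the train/calculate_losses hierarchy shown in the Learner profile above.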
|
[2025-03-21 00:04:29,968][00031] Loop Runner_EvtLoop terminating... |
|
[2025-03-21 00:04:29,970][00031] Runner profile tree view: |
|
main_loop: 711.2594 |
|
[2025-03-21 00:04:29,971][00031] Collected {0: 6004736}, FPS: 8442.4 |
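
The final figure is consistent with the runner profile: total collected frames divided by main-loop time, 6004736 / 711.2594 ≈ 8442.4, is exactly the reported overall FPS. As a quick check:

total_frames = 6_004_736          # from "Collected {0: 6004736}"
main_loop_seconds = 711.2594      # from the Runner profile tree view

print(total_frames / main_loop_seconds)  # ~8442.4, as logged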
|
[2025-03-21 00:08:33,688][00031] Loading existing experiment configuration from /kaggle/working/train_dir/default_experiment/config.json |
|
[2025-03-21 00:08:33,689][00031] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-21 00:08:33,690][00031] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-21 00:08:33,691][00031] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-21 00:08:33,691][00031] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-21 00:08:33,692][00031] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-21 00:08:33,693][00031] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-21 00:08:33,694][00031] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-21 00:08:33,696][00031] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-21 00:08:33,696][00031] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-21 00:08:33,697][00031] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-21 00:08:33,698][00031] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-21 00:08:33,699][00031] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-21 00:08:33,700][00031] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-21 00:08:33,700][00031] Using frameskip 1 and render_action_repeat=4 for evaluation |
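
The block of "Overriding arg ..." / "Adding new argument ..." lines shows the evaluation script merging the saved config.json with command-line values: keys already in the saved config are overridden, and keys absent from it are added with a warning. A minimal sketch of that merge, assuming the config is a flat JSON dict:

import json

def load_config_with_overrides(config_path, cli_args: dict):
    with open(config_path) as f:
        cfg = json.load(f)
    for key, value in cli_args.items():
        if key in cfg:
            print(f"Overriding arg '{key}' with value {value!r} passed from command line")
        else:
            print(f"Adding new argument '{key}'={value!r} that is not in the saved config file!")
        cfg[key] = value
    return cfg

# usage mirroring this run's overrides
cfg = load_config_with_overrides(
    "/kaggle/working/train_dir/default_experiment/config.json",
    {"num_workers": 1, "no_render": True, "save_video": True, "max_num_episodes": 10},
)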
|
[2025-03-21 00:08:33,730][00031] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-21 00:08:33,733][00031] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-21 00:08:33,735][00031] RunningMeanStd input shape: (1,) |
|
[2025-03-21 00:08:33,749][00031] ConvEncoder: input_channels=3 |
|
[2025-03-21 00:08:33,848][00031] Conv encoder output size: 512 |
|
[2025-03-21 00:08:33,848][00031] Policy head output size: 512 |
|
[2025-03-21 00:08:34,076][00031] Loading state from checkpoint /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth... |
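
Restoring the policy for evaluation is a torch.load / load_state_dict round trip on the checkpoint picked above. A hedged sketch; the "model" key is an assumption about the checkpoint's internal layout:

import torch

def load_policy(actor_critic: torch.nn.Module, ckpt_path: str) -> torch.nn.Module:
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # assumed layout: the actor-critic weights live under a "model" key
    actor_critic.load_state_dict(checkpoint["model"])
    return actor_critic.eval()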
|
[2025-03-21 00:08:34,917][00031] Num frames 100... |
|
[2025-03-21 00:08:35,051][00031] Num frames 200... |
|
[2025-03-21 00:08:35,204][00031] Num frames 300... |
|
[2025-03-21 00:08:35,334][00031] Num frames 400... |
|
[2025-03-21 00:08:35,472][00031] Num frames 500... |
|
[2025-03-21 00:08:35,606][00031] Num frames 600... |
|
[2025-03-21 00:08:35,720][00031] Avg episode rewards: #0: 11.400, true rewards: #0: 6.400 |
|
[2025-03-21 00:08:35,721][00031] Avg episode reward: 11.400, avg true_objective: 6.400 |
|
[2025-03-21 00:08:35,804][00031] Num frames 700... |
|
[2025-03-21 00:08:35,939][00031] Num frames 800... |
|
[2025-03-21 00:08:36,074][00031] Num frames 900... |
|
[2025-03-21 00:08:36,188][00031] Avg episode rewards: #0: 7.705, true rewards: #0: 4.705 |
|
[2025-03-21 00:08:36,189][00031] Avg episode reward: 7.705, avg true_objective: 4.705 |
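
The "Avg episode rewards" entries are cumulative means over all episodes finished so far, not per-episode scores: episode 1 scored 11.400, and an episode-2 score of 4.010 yields (11.400 + 4.010) / 2 = 7.705, the value printed above. In code:

episode_rewards = []

def report(new_reward):
    episode_rewards.append(new_reward)
    avg = sum(episode_rewards) / len(episode_rewards)
    print(f"Avg episode reward: {avg:.3f}")

report(11.400)  # Avg episode reward: 11.400
report(4.010)   # Avg episode reward: 7.705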
|
[2025-03-21 00:08:36,257][00031] Num frames 1000... |
|
[2025-03-21 00:08:36,371][00031] Num frames 1100... |
|
[2025-03-21 00:08:36,492][00031] Num frames 1200... |
|
[2025-03-21 00:08:36,614][00031] Num frames 1300... |
|
[2025-03-21 00:08:36,698][00031] Avg episode rewards: #0: 6.417, true rewards: #0: 4.417 |
|
[2025-03-21 00:08:36,699][00031] Avg episode reward: 6.417, avg true_objective: 4.417 |
|
[2025-03-21 00:08:36,794][00031] Num frames 1400... |
|
[2025-03-21 00:08:36,921][00031] Num frames 1500... |
|
[2025-03-21 00:08:37,037][00031] Num frames 1600... |
|
[2025-03-21 00:08:37,164][00031] Num frames 1700... |
|
[2025-03-21 00:08:37,289][00031] Num frames 1800... |
|
[2025-03-21 00:08:37,404][00031] Num frames 1900... |
|
[2025-03-21 00:08:37,527][00031] Num frames 2000... |
|
[2025-03-21 00:08:37,644][00031] Num frames 2100... |
|
[2025-03-21 00:08:37,764][00031] Num frames 2200... |
|
[2025-03-21 00:08:37,883][00031] Num frames 2300... |
|
[2025-03-21 00:08:38,006][00031] Num frames 2400... |
|
[2025-03-21 00:08:38,127][00031] Num frames 2500... |
|
[2025-03-21 00:08:38,250][00031] Num frames 2600... |
|
[2025-03-21 00:08:38,321][00031] Avg episode rewards: #0: 12.283, true rewards: #0: 6.532 |
|
[2025-03-21 00:08:38,323][00031] Avg episode reward: 12.283, avg true_objective: 6.532 |
|
[2025-03-21 00:08:38,426][00031] Num frames 2700... |
|
[2025-03-21 00:08:38,545][00031] Num frames 2800... |
|
[2025-03-21 00:08:38,671][00031] Num frames 2900... |
|
[2025-03-21 00:08:38,791][00031] Num frames 3000... |
|
[2025-03-21 00:08:38,921][00031] Num frames 3100... |
|
[2025-03-21 00:08:39,050][00031] Num frames 3200... |
|
[2025-03-21 00:08:39,194][00031] Num frames 3300... |
|
[2025-03-21 00:08:39,314][00031] Num frames 3400... |
|
[2025-03-21 00:08:39,437][00031] Num frames 3500... |
|
[2025-03-21 00:08:39,557][00031] Num frames 3600... |
|
[2025-03-21 00:08:39,675][00031] Num frames 3700... |
|
[2025-03-21 00:08:39,793][00031] Num frames 3800... |
|
[2025-03-21 00:08:39,915][00031] Num frames 3900... |
|
[2025-03-21 00:08:40,035][00031] Num frames 4000... |
|
[2025-03-21 00:08:40,159][00031] Num frames 4100... |
|
[2025-03-21 00:08:40,283][00031] Num frames 4200... |
|
[2025-03-21 00:08:40,405][00031] Num frames 4300... |
|
[2025-03-21 00:08:40,524][00031] Num frames 4400... |
|
[2025-03-21 00:08:40,646][00031] Num frames 4500... |
|
[2025-03-21 00:08:40,768][00031] Num frames 4600... |
|
[2025-03-21 00:08:40,891][00031] Num frames 4700... |
|
[2025-03-21 00:08:40,962][00031] Avg episode rewards: #0: 20.426, true rewards: #0: 9.426 |
|
[2025-03-21 00:08:40,963][00031] Avg episode reward: 20.426, avg true_objective: 9.426 |
|
[2025-03-21 00:08:41,065][00031] Num frames 4800... |
|
[2025-03-21 00:08:41,188][00031] Num frames 4900... |
|
[2025-03-21 00:08:41,313][00031] Num frames 5000... |
|
[2025-03-21 00:08:41,436][00031] Num frames 5100... |
|
[2025-03-21 00:08:41,558][00031] Num frames 5200... |
|
[2025-03-21 00:08:41,683][00031] Num frames 5300... |
|
[2025-03-21 00:08:41,811][00031] Num frames 5400... |
|
[2025-03-21 00:08:41,937][00031] Num frames 5500... |
|
[2025-03-21 00:08:42,064][00031] Num frames 5600... |
|
[2025-03-21 00:08:42,183][00031] Num frames 5700... |
|
[2025-03-21 00:08:42,303][00031] Num frames 5800... |
|
[2025-03-21 00:08:42,423][00031] Num frames 5900... |
|
[2025-03-21 00:08:42,548][00031] Num frames 6000... |
|
[2025-03-21 00:08:42,673][00031] Num frames 6100... |
|
[2025-03-21 00:08:42,801][00031] Num frames 6200... |
|
[2025-03-21 00:08:42,927][00031] Num frames 6300... |
|
[2025-03-21 00:08:43,055][00031] Num frames 6400... |
|
[2025-03-21 00:08:43,180][00031] Num frames 6500... |
|
[2025-03-21 00:08:43,307][00031] Num frames 6600... |
|
[2025-03-21 00:08:43,403][00031] Avg episode rewards: #0: 25.221, true rewards: #0: 11.055 |
|
[2025-03-21 00:08:43,404][00031] Avg episode reward: 25.221, avg true_objective: 11.055 |
|
[2025-03-21 00:08:43,481][00031] Num frames 6700... |
|
[2025-03-21 00:08:43,601][00031] Num frames 6800... |
|
[2025-03-21 00:08:43,720][00031] Num frames 6900... |
|
[2025-03-21 00:08:43,840][00031] Num frames 7000... |
|
[2025-03-21 00:08:43,966][00031] Num frames 7100... |
|
[2025-03-21 00:08:44,094][00031] Num frames 7200... |
|
[2025-03-21 00:08:44,218][00031] Num frames 7300... |
|
[2025-03-21 00:08:44,342][00031] Num frames 7400... |
|
[2025-03-21 00:08:44,466][00031] Num frames 7500... |
|
[2025-03-21 00:08:44,584][00031] Num frames 7600... |
|
[2025-03-21 00:08:44,703][00031] Num frames 7700... |
|
[2025-03-21 00:08:44,820][00031] Num frames 7800... |
|
[2025-03-21 00:08:44,935][00031] Avg episode rewards: #0: 25.356, true rewards: #0: 11.213 |
|
[2025-03-21 00:08:44,936][00031] Avg episode reward: 25.356, avg true_objective: 11.213 |
|
[2025-03-21 00:08:44,996][00031] Num frames 7900... |
|
[2025-03-21 00:08:45,115][00031] Num frames 8000... |
|
[2025-03-21 00:08:45,235][00031] Num frames 8100... |
|
[2025-03-21 00:08:45,360][00031] Num frames 8200... |
|
[2025-03-21 00:08:45,484][00031] Num frames 8300... |
|
[2025-03-21 00:08:45,612][00031] Num frames 8400... |
|
[2025-03-21 00:08:45,738][00031] Num frames 8500... |
|
[2025-03-21 00:08:45,861][00031] Num frames 8600... |
|
[2025-03-21 00:08:45,988][00031] Num frames 8700... |
|
[2025-03-21 00:08:46,113][00031] Num frames 8800... |
|
[2025-03-21 00:08:46,233][00031] Num frames 8900... |
|
[2025-03-21 00:08:46,350][00031] Num frames 9000... |
|
[2025-03-21 00:08:46,482][00031] Avg episode rewards: #0: 26.457, true rewards: #0: 11.332 |
|
[2025-03-21 00:08:46,483][00031] Avg episode reward: 26.457, avg true_objective: 11.332 |
|
[2025-03-21 00:08:46,524][00031] Num frames 9100... |
|
[2025-03-21 00:08:46,636][00031] Num frames 9200... |
|
[2025-03-21 00:08:46,754][00031] Num frames 9300... |
|
[2025-03-21 00:08:46,871][00031] Num frames 9400... |
|
[2025-03-21 00:08:46,987][00031] Num frames 9500... |
|
[2025-03-21 00:08:47,103][00031] Num frames 9600... |
|
[2025-03-21 00:08:47,224][00031] Num frames 9700... |
|
[2025-03-21 00:08:47,287][00031] Avg episode rewards: #0: 25.007, true rewards: #0: 10.784 |
|
[2025-03-21 00:08:47,288][00031] Avg episode reward: 25.007, avg true_objective: 10.784 |
|
[2025-03-21 00:08:47,398][00031] Num frames 9800... |
|
[2025-03-21 00:08:47,525][00031] Num frames 9900... |
|
[2025-03-21 00:08:47,649][00031] Num frames 10000... |
|
[2025-03-21 00:08:47,773][00031] Num frames 10100... |
|
[2025-03-21 00:08:47,898][00031] Num frames 10200... |
|
[2025-03-21 00:08:48,022][00031] Num frames 10300... |
|
[2025-03-21 00:08:48,145][00031] Num frames 10400... |
|
[2025-03-21 00:08:48,268][00031] Num frames 10500... |
|
[2025-03-21 00:08:48,388][00031] Num frames 10600... |
|
[2025-03-21 00:08:48,508][00031] Num frames 10700... |
|
[2025-03-21 00:08:48,610][00031] Avg episode rewards: #0: 24.938, true rewards: #0: 10.738 |
|
[2025-03-21 00:08:48,611][00031] Avg episode reward: 24.938, avg true_objective: 10.738 |
|
[2025-03-21 00:09:25,266][00031] Replay video saved to /kaggle/working/train_dir/default_experiment/replay.mp4! |
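
Each "Num frames N..." line marks another hundred frames of the current evaluation episode; once the 10 episodes finish, the collected RGB frames are encoded to replay.mp4. A sketch of the video-writing step, assuming imageio with ffmpeg support is available (35 fps is VizDoom's native frame rate):

import imageio

def save_replay(frames, path="replay.mp4", fps=35):
    # frames: list of HxWx3 uint8 RGB arrays collected during evaluation
    imageio.mimwrite(path, frames, fps=fps)
    print(f"Replay video saved to {path}!")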
|
[2025-03-21 00:10:17,506][00031] Loading existing experiment configuration from /kaggle/working/train_dir/default_experiment/config.json |
|
[2025-03-21 00:10:17,507][00031] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-21 00:10:17,508][00031] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-21 00:10:17,509][00031] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-21 00:10:17,510][00031] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-21 00:10:17,511][00031] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-21 00:10:17,512][00031] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-03-21 00:10:17,513][00031] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-21 00:10:17,514][00031] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-03-21 00:10:17,515][00031] Adding new argument 'hf_repository'='salym/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2025-03-21 00:10:17,516][00031] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-21 00:10:17,517][00031] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-21 00:10:17,517][00031] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-21 00:10:17,518][00031] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-21 00:10:17,519][00031] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-21 00:10:17,542][00031] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-21 00:10:17,544][00031] RunningMeanStd input shape: (1,) |
|
[2025-03-21 00:10:17,555][00031] ConvEncoder: input_channels=3 |
|
[2025-03-21 00:10:17,590][00031] Conv encoder output size: 512 |
|
[2025-03-21 00:10:17,591][00031] Policy head output size: 512 |
|
[2025-03-21 00:10:17,610][00031] Loading state from checkpoint /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth... |
|
[2025-03-21 00:10:18,061][00031] Num frames 100... |
|
[2025-03-21 00:10:18,191][00031] Num frames 200... |
|
[2025-03-21 00:10:18,311][00031] Num frames 300... |
|
[2025-03-21 00:10:18,435][00031] Num frames 400... |
|
[2025-03-21 00:10:18,557][00031] Num frames 500... |
|
[2025-03-21 00:10:18,681][00031] Num frames 600... |
|
[2025-03-21 00:10:18,803][00031] Num frames 700... |
|
[2025-03-21 00:10:18,939][00031] Num frames 800... |
|
[2025-03-21 00:10:19,070][00031] Num frames 900... |
|
[2025-03-21 00:10:19,200][00031] Num frames 1000... |
|
[2025-03-21 00:10:19,321][00031] Num frames 1100... |
|
[2025-03-21 00:10:19,438][00031] Num frames 1200... |
|
[2025-03-21 00:10:19,553][00031] Num frames 1300... |
|
[2025-03-21 00:10:19,669][00031] Num frames 1400... |
|
[2025-03-21 00:10:19,784][00031] Num frames 1500... |
|
[2025-03-21 00:10:19,879][00031] Avg episode rewards: #0: 32.360, true rewards: #0: 15.360 |
|
[2025-03-21 00:10:19,880][00031] Avg episode reward: 32.360, avg true_objective: 15.360 |
|
[2025-03-21 00:10:19,953][00031] Num frames 1600... |
|
[2025-03-21 00:10:20,069][00031] Num frames 1700... |
|
[2025-03-21 00:10:20,188][00031] Num frames 1800... |
|
[2025-03-21 00:10:20,303][00031] Num frames 1900... |
|
[2025-03-21 00:10:20,419][00031] Num frames 2000... |
|
[2025-03-21 00:10:20,539][00031] Num frames 2100... |
|
[2025-03-21 00:10:20,661][00031] Num frames 2200... |
|
[2025-03-21 00:10:20,783][00031] Num frames 2300... |
|
[2025-03-21 00:10:20,907][00031] Num frames 2400... |
|
[2025-03-21 00:10:21,031][00031] Num frames 2500... |
|
[2025-03-21 00:10:21,160][00031] Num frames 2600... |
|
[2025-03-21 00:10:21,283][00031] Avg episode rewards: #0: 30.775, true rewards: #0: 13.275 |
|
[2025-03-21 00:10:21,284][00031] Avg episode reward: 30.775, avg true_objective: 13.275 |
|
[2025-03-21 00:10:21,339][00031] Num frames 2700... |
|
[2025-03-21 00:10:21,456][00031] Num frames 2800... |
|
[2025-03-21 00:10:21,571][00031] Num frames 2900... |
|
[2025-03-21 00:10:21,692][00031] Num frames 3000... |
|
[2025-03-21 00:10:21,813][00031] Num frames 3100... |
|
[2025-03-21 00:10:21,931][00031] Num frames 3200... |
|
[2025-03-21 00:10:22,050][00031] Num frames 3300... |
|
[2025-03-21 00:10:22,167][00031] Num frames 3400... |
|
[2025-03-21 00:10:22,283][00031] Num frames 3500... |
|
[2025-03-21 00:10:22,401][00031] Num frames 3600... |
|
[2025-03-21 00:10:22,521][00031] Num frames 3700... |
|
[2025-03-21 00:10:22,638][00031] Num frames 3800... |
|
[2025-03-21 00:10:22,757][00031] Num frames 3900... |
|
[2025-03-21 00:10:22,882][00031] Avg episode rewards: #0: 32.197, true rewards: #0: 13.197 |
|
[2025-03-21 00:10:22,882][00031] Avg episode reward: 32.197, avg true_objective: 13.197 |
|
[2025-03-21 00:10:22,930][00031] Num frames 4000... |
|
[2025-03-21 00:10:23,046][00031] Num frames 4100... |
|
[2025-03-21 00:10:23,162][00031] Num frames 4200... |
|
[2025-03-21 00:10:23,280][00031] Num frames 4300... |
|
[2025-03-21 00:10:23,398][00031] Num frames 4400... |
|
[2025-03-21 00:10:23,519][00031] Num frames 4500... |
|
[2025-03-21 00:10:23,635][00031] Num frames 4600... |
|
[2025-03-21 00:10:23,799][00031] Avg episode rewards: #0: 27.488, true rewards: #0: 11.737 |
|
[2025-03-21 00:10:23,800][00031] Avg episode reward: 27.488, avg true_objective: 11.737 |
|
[2025-03-21 00:10:23,807][00031] Num frames 4700... |
|
[2025-03-21 00:10:23,928][00031] Num frames 4800... |
|
[2025-03-21 00:10:24,044][00031] Num frames 4900... |
|
[2025-03-21 00:10:24,162][00031] Num frames 5000... |
|
[2025-03-21 00:10:24,278][00031] Num frames 5100... |
|
[2025-03-21 00:10:24,394][00031] Num frames 5200... |
|
[2025-03-21 00:10:24,509][00031] Num frames 5300... |
|
[2025-03-21 00:10:24,625][00031] Num frames 5400... |
|
[2025-03-21 00:10:24,743][00031] Num frames 5500... |
|
[2025-03-21 00:10:24,867][00031] Num frames 5600... |
|
[2025-03-21 00:10:24,970][00031] Avg episode rewards: #0: 26.274, true rewards: #0: 11.274 |
|
[2025-03-21 00:10:24,971][00031] Avg episode reward: 26.274, avg true_objective: 11.274 |
|
[2025-03-21 00:10:25,048][00031] Num frames 5700... |
|
[2025-03-21 00:10:25,185][00031] Num frames 5800... |
|
[2025-03-21 00:10:25,305][00031] Num frames 5900... |
|
[2025-03-21 00:10:25,423][00031] Num frames 6000... |
|
[2025-03-21 00:10:25,545][00031] Num frames 6100... |
|
[2025-03-21 00:10:25,671][00031] Num frames 6200... |
|
[2025-03-21 00:10:25,800][00031] Num frames 6300... |
|
[2025-03-21 00:10:25,928][00031] Num frames 6400... |
|
[2025-03-21 00:10:26,052][00031] Num frames 6500... |
|
[2025-03-21 00:10:26,181][00031] Num frames 6600... |
|
[2025-03-21 00:10:26,307][00031] Num frames 6700... |
|
[2025-03-21 00:10:26,433][00031] Num frames 6800... |
|
[2025-03-21 00:10:26,557][00031] Num frames 6900... |
|
[2025-03-21 00:10:26,676][00031] Num frames 7000... |
|
[2025-03-21 00:10:26,800][00031] Num frames 7100... |
|
[2025-03-21 00:10:26,920][00031] Num frames 7200... |
|
[2025-03-21 00:10:27,039][00031] Num frames 7300... |
|
[2025-03-21 00:10:27,160][00031] Num frames 7400... |
|
[2025-03-21 00:10:27,276][00031] Num frames 7500... |
|
[2025-03-21 00:10:27,395][00031] Num frames 7600... |
|
[2025-03-21 00:10:27,512][00031] Num frames 7700... |
|
[2025-03-21 00:10:27,609][00031] Avg episode rewards: #0: 31.395, true rewards: #0: 12.895 |
|
[2025-03-21 00:10:27,610][00031] Avg episode reward: 31.395, avg true_objective: 12.895 |
|
[2025-03-21 00:10:27,681][00031] Num frames 7800... |
|
[2025-03-21 00:10:27,797][00031] Num frames 7900... |
|
[2025-03-21 00:10:27,912][00031] Num frames 8000... |
|
[2025-03-21 00:10:28,026][00031] Num frames 8100... |
|
[2025-03-21 00:10:28,106][00031] Avg episode rewards: #0: 27.459, true rewards: #0: 11.601 |
|
[2025-03-21 00:10:28,106][00031] Avg episode reward: 27.459, avg true_objective: 11.601 |
|
[2025-03-21 00:10:28,197][00031] Num frames 8200... |
|
[2025-03-21 00:10:28,318][00031] Num frames 8300... |
|
[2025-03-21 00:10:28,438][00031] Num frames 8400... |
|
[2025-03-21 00:10:28,559][00031] Num frames 8500... |
|
[2025-03-21 00:10:28,684][00031] Num frames 8600... |
|
[2025-03-21 00:10:28,803][00031] Num frames 8700... |
|
[2025-03-21 00:10:28,929][00031] Num frames 8800... |
|
[2025-03-21 00:10:29,062][00031] Num frames 8900... |
|
[2025-03-21 00:10:29,194][00031] Num frames 9000... |
|
[2025-03-21 00:10:29,311][00031] Num frames 9100... |
|
[2025-03-21 00:10:29,426][00031] Num frames 9200... |
|
[2025-03-21 00:10:29,545][00031] Num frames 9300... |
|
[2025-03-21 00:10:29,671][00031] Num frames 9400... |
|
[2025-03-21 00:10:29,795][00031] Num frames 9500... |
|
[2025-03-21 00:10:29,914][00031] Num frames 9600... |
|
[2025-03-21 00:10:30,032][00031] Num frames 9700... |
|
[2025-03-21 00:10:30,154][00031] Num frames 9800... |
|
[2025-03-21 00:10:30,282][00031] Num frames 9900... |
|
[2025-03-21 00:10:30,404][00031] Num frames 10000... |
|
[2025-03-21 00:10:30,528][00031] Num frames 10100... |
|
[2025-03-21 00:10:30,662][00031] Num frames 10200... |
|
[2025-03-21 00:10:30,745][00031] Avg episode rewards: #0: 31.276, true rewards: #0: 12.776 |
|
[2025-03-21 00:10:30,746][00031] Avg episode reward: 31.276, avg true_objective: 12.776 |
|
[2025-03-21 00:10:30,853][00031] Num frames 10300... |
|
[2025-03-21 00:10:30,982][00031] Num frames 10400... |
|
[2025-03-21 00:10:31,110][00031] Num frames 10500... |
|
[2025-03-21 00:10:31,234][00031] Num frames 10600... |
|
[2025-03-21 00:10:31,359][00031] Num frames 10700... |
|
[2025-03-21 00:10:31,485][00031] Num frames 10800... |
|
[2025-03-21 00:10:31,651][00031] Avg episode rewards: #0: 29.325, true rewards: #0: 12.103 |
|
[2025-03-21 00:10:31,652][00031] Avg episode reward: 29.325, avg true_objective: 12.103 |
|
[2025-03-21 00:10:31,661][00031] Num frames 10900... |
|
[2025-03-21 00:10:31,787][00031] Num frames 11000... |
|
[2025-03-21 00:10:31,907][00031] Num frames 11100... |
|
[2025-03-21 00:10:32,030][00031] Num frames 11200... |
|
[2025-03-21 00:10:32,148][00031] Num frames 11300... |
|
[2025-03-21 00:10:32,268][00031] Num frames 11400... |
|
[2025-03-21 00:10:32,386][00031] Num frames 11500... |
|
[2025-03-21 00:10:32,510][00031] Num frames 11600... |
|
[2025-03-21 00:10:32,635][00031] Num frames 11700... |
|
[2025-03-21 00:10:32,757][00031] Num frames 11800... |
|
[2025-03-21 00:10:32,879][00031] Num frames 11900... |
|
[2025-03-21 00:10:33,001][00031] Num frames 12000... |
|
[2025-03-21 00:10:33,129][00031] Num frames 12100... |
|
[2025-03-21 00:10:33,281][00031] Avg episode rewards: #0: 29.273, true rewards: #0: 12.173 |
|
[2025-03-21 00:10:33,282][00031] Avg episode reward: 29.273, avg true_objective: 12.173 |
|
[2025-03-21 00:11:13,840][00031] Replay video saved to /kaggle/working/train_dir/default_experiment/replay.mp4! |
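
This second run was launched with push_to_hub=True and hf_repository='salym/rl_course_vizdoom_health_gathering_supreme', so after the replay is written the experiment directory (config, checkpoints, replay.mp4) gets uploaded to the Hugging Face Hub. One way to perform that upload with the huggingface_hub client; this is a sketch, not necessarily the exact call the enjoy script makes:

from huggingface_hub import HfApi

repo_id = "salym/rl_course_vizdoom_health_gathering_supreme"

api = HfApi()
api.create_repo(repo_id=repo_id, exist_ok=True)
api.upload_folder(
    folder_path="/kaggle/working/train_dir/default_experiment",
    repo_id=repo_id,
    repo_type="model",
)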
|
|