rajveer43 committed · verified
Commit dfb71eb · Parent(s): 8fe26d1

Upload wandb/output.log with huggingface_hub

Files changed (1)
  1. wandb/output.log +146 -0
wandb/output.log ADDED
@@ -0,0 +1,146 @@
+ Epoch 0: 100%|██████████| 1/1 [00:09<00:00, 9.95s/it, loss=2.3673]
+ Epoch 0 metrics: {'epoch': 0, 'loss': 2.3673205375671387, 'val_loss': 2.359609603881836}
+ Epoch 1: 100%|██████████| 1/1 [00:05<00:00, 5.72s/it, loss=2.3390]
+ Epoch 1 metrics: {'epoch': 1, 'loss': 2.33897066116333, 'val_loss': 2.3596091270446777}
+ Epoch 2: 100%|██████████| 1/1 [00:07<00:00, 7.63s/it, loss=2.3395]
+ Epoch 2 metrics: {'epoch': 2, 'loss': 2.3394534587860107, 'val_loss': 2.359609365463257}
+ Epoch 3: 100%|██████████| 1/1 [00:07<00:00, 7.26s/it, loss=2.3447]
+ Epoch 3 metrics: {'epoch': 3, 'loss': 2.3446695804595947, 'val_loss': 2.359609365463257}
+ Epoch 4: 100%|██████████| 1/1 [00:05<00:00, 5.76s/it, loss=2.3448]
+ Epoch 4 metrics: {'epoch': 4, 'loss': 2.3447532653808594, 'val_loss': 2.359609365463257}
+ Epoch 5: 100%|██████████| 1/1 [00:05<00:00, 5.67s/it, loss=2.3693]
+ Epoch 5 metrics: {'epoch': 5, 'loss': 2.3692800998687744, 'val_loss': 2.3596091270446777}
+ Epoch 6: 100%|██████████| 1/1 [00:06<00:00, 6.38s/it, loss=2.3505]
+ Epoch 6 metrics: {'epoch': 6, 'loss': 2.3504748344421387, 'val_loss': 2.359609603881836}
+ Epoch 7: 100%|██████████| 1/1 [00:06<00:00, 6.72s/it, loss=2.3486]
+ Epoch 7 metrics: {'epoch': 7, 'loss': 2.3485639095306396, 'val_loss': 2.359609603881836}
+ Epoch 8: 100%|██████████| 1/1 [00:05<00:00, 5.68s/it, loss=2.3788]
+ Epoch 8 metrics: {'epoch': 8, 'loss': 2.3788001537323, 'val_loss': 2.359609365463257}
+ Epoch 9: 100%|██████████| 1/1 [00:06<00:00, 6.99s/it, loss=2.3707]
+ Epoch 9 metrics: {'epoch': 9, 'loss': 2.3706822395324707, 'val_loss': 2.359609365463257}
+ Traceback (most recent call last):
+   File "<string>", line 1, in <module>
+ NameError: name 'prepare_model_for_serving' is not defined
+ <ipython-input-38-01a39a3e998e>:31: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+   model = torch.load(model_path)
+ <ipython-input-39-9183a6f3edb8>:31: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+   model = torch.load(model_path)
+ /bin/bash: line 1: torch-model-archiver: command not found
+ Collecting torchserve
+   Downloading torchserve-0.12.0-py3-none-any.whl.metadata (1.4 kB)
+ Collecting torch-model-archiver
+   Downloading torch_model_archiver-0.12.0-py3-none-any.whl.metadata (1.4 kB)
+ Requirement already satisfied: Pillow in /usr/local/lib/python3.11/dist-packages (from torchserve) (11.1.0)
+ Requirement already satisfied: psutil in /usr/local/lib/python3.11/dist-packages (from torchserve) (5.9.5)
+ Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from torchserve) (24.2)
+ Requirement already satisfied: wheel in /usr/local/lib/python3.11/dist-packages (from torchserve) (0.45.1)
+ Collecting enum-compat (from torch-model-archiver)
+   Downloading enum_compat-0.0.3-py3-none-any.whl.metadata (954 bytes)
+ Downloading torchserve-0.12.0-py3-none-any.whl (42.2 MB)
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.2/42.2 MB 8.0 MB/s eta 0:00:00
+ Downloading torch_model_archiver-0.12.0-py3-none-any.whl (16 kB)
+ Downloading enum_compat-0.0.3-py3-none-any.whl (1.3 kB)
+ Installing collected packages: enum-compat, torchserve, torch-model-archiver
+ Successfully installed enum-compat-0.0.3 torch-model-archiver-0.12.0 torchserve-0.12.0
+ Traceback (most recent call last):
+   File "/usr/local/bin/torch-model-archiver", line 8, in <module>
+     sys.exit(generate_model_archive())
+              ^^^^^^^^^^^^^^^^^^^^^^^^
+   File "/usr/local/lib/python3.11/dist-packages/model_archiver/model_packaging.py", line 72, in generate_model_archive
+     package_model(config, manifest=manifest)
+   File "/usr/local/lib/python3.11/dist-packages/model_archiver/model_packaging.py", line 45, in package_model
+     model_path = ModelExportUtils.copy_artifacts(
+                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+   File "/usr/local/lib/python3.11/dist-packages/model_archiver/model_packaging_utils.py", line 215, in copy_artifacts
+     shutil.copy(path, model_path)
+   File "/usr/lib/python3.11/shutil.py", line 431, in copy
+     copyfile(src, dst, follow_symlinks=follow_symlinks)
+   File "/usr/lib/python3.11/shutil.py", line 256, in copyfile
+     with open(src, 'rb') as fsrc:
+          ^^^^^^^^^^^^^^^
+ FileNotFoundError: [Errno 2] No such file or directory: 'titan_handler.py'
+ <ipython-input-43-06d5ff2de07d>:18: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+   model = torch.load(model_path)
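The repeated `FutureWarning` above comes from loading the checkpoint as a full pickled module. A minimal sketch of the pattern the warning recommends, assuming the checkpoint can be re-saved as a `state_dict` — the file name `titan_weights.pt` and the `Linear(4, 2)` layer are hypothetical stand-ins, since the log does not show the real `model_path` or architecture:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Save only the state_dict (plain tensors), not the pickled module object.
torch.save(model.state_dict(), "titan_weights.pt")

# weights_only=True restricts unpickling to tensors and simple containers,
# which avoids arbitrary-code execution and silences the FutureWarning.
state = torch.load("titan_weights.pt", weights_only=True)

# Rebuild the module and load the weights into it.
restored = nn.Linear(4, 2)
restored.load_state_dict(state)
```

Loading a whole `nn.Module` with `weights_only=True` would fail unless its classes are allowlisted via `torch.serialization.add_safe_globals`, which is why the state_dict round-trip is the simpler route.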
+ WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
+ 2025-01-18T18:01:30,224 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
+ nvidia-smi not available or failed: Cannot run program "nvidia-smi": error=2, No such file or directory
+ 2025-01-18T18:01:30,334 [DEBUG] main org.pytorch.serve.util.ConfigManager - xpu-smi not available or failed: Cannot run program "xpu-smi": error=2, No such file or directory
+ 2025-01-18T18:01:30,338 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties
+ 2025-01-18T18:01:30,489 [INFO ] main org.pytorch.serve.util.TokenAuthorization -
+ ######
+ TorchServe now enforces token authorization by default.
+ This requires the correct token to be provided when calling an API.
+ Key file located at /content/key_file.json
+ Check token authorization documentation for information: https://github.com/pytorch/serve/blob/master/docs/token_authorization_api.md
+ ######
+
+ 2025-01-18T18:01:30,489 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
+ 2025-01-18T18:01:30,882 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /usr/local/lib/python3.11/dist-packages/ts/configs/metrics.yaml
+ 2025-01-18T18:01:31,174 [INFO ] main org.pytorch.serve.ModelServer -
+ Torchserve version: 0.12.0
+ TS Home: /usr/local/lib/python3.11/dist-packages
+ Current directory: /content
+ Temp directory: /tmp
+ Metrics config path: /usr/local/lib/python3.11/dist-packages/ts/configs/metrics.yaml
+ Number of GPUs: 0
+ Number of CPUs: 2
+ Max heap size: 3246 M
+ Python executable: /usr/bin/python3
+ Config file: N/A
+ Inference address: http://127.0.0.1:8080
+ Management address: http://127.0.0.1:8081
+ Metrics address: http://127.0.0.1:8082
+ Model Store: /content/model_store
+ Initial Models: titan=titan.mar
+ Log dir: /content/logs
+ Metrics dir: /content/logs
+ Netty threads: 0
+ Netty client threads: 0
+ Default workers per model: 2
+ Blacklist Regex: N/A
+ Maximum Response Size: 6553500
+ Maximum Request Size: 6553500
+ Limit Maximum Image Pixels: true
+ Prefer direct buffer: false
+ Allowed Urls: [file://.*|http(s)?://.*]
+ Custom python dependency for model allowed: false
+ Enable metrics API: true
+ Metrics mode: LOG
+ Disable system metrics: false
+ Workflow Store: /content/model_store
+ CPP log config: N/A
+ Model config: N/A
+ System metrics command: default
+ Model API enabled: false
+ 2025-01-18T18:01:31,235 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: titan.mar
+ 2025-01-18T18:02:24,534 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model titan
+ 2025-01-18T18:02:24,535 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model titan
+ 2025-01-18T18:02:24,535 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model titan loaded.
+ 2025-01-18T18:02:24,536 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: titan, count: 2
+ 2025-01-18T18:02:24,575 [DEBUG] W-9000-titan_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/usr/bin/python3, /usr/local/lib/python3.11/dist-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /tmp/.ts.sock.9000, --metrics-config, /usr/local/lib/python3.11/dist-packages/ts/configs/metrics.yaml]
+ 2025-01-18T18:02:24,581 [DEBUG] W-9001-titan_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/usr/bin/python3, /usr/local/lib/python3.11/dist-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /tmp/.ts.sock.9001, --metrics-config, /usr/local/lib/python3.11/dist-packages/ts/configs/metrics.yaml]
+ 2025-01-18T18:02:24,602 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
+ 2025-01-18T18:02:26,923 [INFO ] main org.pytorch.serve.ModelServer - Torchserve stopped.
+ java.io.IOException: Failed to bind to address: http://127.0.0.1:8080
+     at org.pytorch.serve.ModelServer.initializeServer(ModelServer.java:354)
+     at org.pytorch.serve.ModelServer.startRESTserver(ModelServer.java:415)
+     at org.pytorch.serve.ModelServer.startAndWait(ModelServer.java:124)
+     at org.pytorch.serve.ModelServer.main(ModelServer.java:105)
+ Caused by: io.netty.channel.unix.Errors$NativeIoException: bind(..) failed: Address already in use
+ COMMAND PID USER FD  TYPE DEVICE   SIZE/OFF NODE NAME
+ node    6   root 21u IPv6 19939    0t0      TCP  *:8080 (LISTEN)
+ node    6   root 26u IPv6 1016000  0t0      TCP  17c4989733c5:8080->172.28.0.1:39786 (ESTABLISHED)
+ node    6   root 28u IPv6 1017108  0t0      TCP  17c4989733c5:8080->172.28.0.1:45550 (ESTABLISHED)
+ node    6   root 29u IPv6 25417    0t0      TCP  17c4989733c5:8080->172.28.0.1:55124 (ESTABLISHED)
+ node    6   root 31u IPv6 867221   0t0      TCP  17c4989733c5:8080->172.28.0.1:50646 (ESTABLISHED)
+ usage: torchserve [-h] [-v | --start | --stop] [--ts-config TS_CONFIG] [--model-store MODEL_STORE]
+                   [--workflow-store WORKFLOW_STORE]
+                   [--models MODEL_PATH1 MODEL_NAME=MODEL_PATH2... [MODEL_PATH1 MODEL_NAME=MODEL_PATH2... ...]]
+                   [--log-config LOG_CONFIG] [--cpp-log-config CPP_LOG_CONFIG] [--foreground]
+                   [--no-config-snapshots] [--plugins-path PLUGINS_PATH] [--disable-token-auth]
+                   [--enable-model-api]
+ torchserve: error: unrecognized arguments: --inference-address http://127.0.0.1:8083
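The error above shows that TorchServe 0.12 has no `--inference-address` CLI flag; per the TorchServe configuration docs, the listen addresses are set in a `config.properties` file passed via the `--ts-config` flag that the usage message does list. A minimal sketch, assuming the goal is simply to move off the already-bound port 8080 (ports 8083–8085 are arbitrary choices, not taken from the log):

```properties
# config.properties — hypothetical port choices to avoid the occupied 8080
inference_address=http://127.0.0.1:8083
management_address=http://127.0.0.1:8084
metrics_address=http://127.0.0.1:8085
```

The server would then be restarted with something like `torchserve --start --ts-config config.properties --model-store /content/model_store --models titan=titan.mar`, reusing the model store and model name shown earlier in this log.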
+ /bin/bash: line 1: nano: command not found
+ /usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_deprecation.py:131: FutureWarning: 'Repository' (from 'huggingface_hub.repository') is deprecated and will be removed from version '1.0'. Please prefer the http-based alternatives instead. Given its large adoption in legacy code, the complete removal is only planned on next major release.
+ For more details, please read https://huggingface.co/docs/huggingface_hub/concepts/git_vs_http.
+   warnings.warn(warning_message, FutureWarning)