runtime error

Exit code: 1. Reason:
..............................................................
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 2048
llama_context: n_ctx_per_seq = 2048
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: kv_unified = false
llama_context: freq_base = 10000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (2048) < n_ctx_train (1048576) -- the full capacity of the model will not be utilized
llama_context: CPU output buffer size = 0.38 MiB
llama_kv_cache: the V embeddings have different sizes across layers and FA is not enabled - padding V cache to 512
llama_kv_cache: CPU KV buffer size = 16.00 MiB
llama_kv_cache: size = 16.00 MiB ( 2048 cells, 4 layers, 1/1 seqs), K (f16): 8.00 MiB, V (f16): 8.00 MiB
llama_memory_recurrent: CPU RS buffer size = 73.79 MiB
llama_memory_recurrent: size = 73.79 MiB ( 1 cells, 40 layers, 1 seqs), R (f32): 1.79 MiB, S (f32): 72.00 MiB
llama_context: CPU compute buffer size = 210.01 MiB
llama_context: graph nodes = 2310
llama_context: graph splits = 1
common_init_from_params: added <|end_of_text|> logit bias = -inf
common_init_from_params: added <|fim_pad|> logit bias = -inf
common_init_from_params: setting dry_penalty_last_n to ctx_size = 2048
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
* Running on local URL: http://0.0.0.0:7860, with SSR ⚡ (experimental, to disable set `ssr_mode=False` in `launch()`)
Traceback (most recent call last):
  File "/home/user/app/app.py", line 205, in <module>
    demo.queue().launch()
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2709, in launch
    raise ValueError(
ValueError: When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost.
