Santa Cruz AI Community
AI & ML interests
None defined yet.
SantaCruzAI's activity
Post
I'm getting this all of a sudden. I even generated a new token, but I still get a 401. Is anyone else seeing this?
Exception:401 Client Error: Unauthorized for url: https://router.huggingface.co/hf-inference/models/meta-llama/Llama-3.2-3B-Instruct/v1/chat/completions (Request ID: Root=1-67dc6b20-3a4697761ad9315c06ca928a;d914bcf1-063a-4df2-acc2-8e0170ddccb3)
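A quick way to check whether the token itself is the problem is to hit the whoami endpoint directly. A minimal sketch, assuming the huggingface_hub client library is installed ("hf_xxx" is a hypothetical placeholder for the real token):
#script...
from huggingface_hub import HfApi

# "hf_xxx" is a hypothetical placeholder; substitute the actual token.
api = HfApi(token="hf_xxx")

# whoami() raises an HTTPError with a 401 status if the token is rejected,
# and returns account details if it is accepted.
print(api.whoami())
If whoami() succeeds but the chat-completions call still returns 401, the cause may be the token's scopes or gated-repo access (meta-llama models require accepting the license) rather than the token value itself.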
Post
Model is always disabled?
#script...
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "distilbert/distilgpt2",
    token="xxxxxx",
)
That loads the model fine. But when it's used with the index returned from VectorStoreIndex for Qdrant, like this:
#script...
query_engine = index_from_nodes.as_query_engine(llm=model, streaming=True)
response = query_engine.query("What is formula 1?")
response.print_response_stream()
It errors out as if the LLM were disabled:
AssertionError Traceback (most recent call last)
Cell In[34], line 1
----> 1 query_engine = index_from_nodes.as_query_engine(llm=model, streaming=True)
3 response = query_engine.query(
4 "What is formula 1?"
5 )
7 response.print_response_stream()
File ~/miniconda/lib/python3.9/site-packages/llama_index/core/indices/base.py:376, in BaseIndex.as_query_engine(self, llm, **kwargs)
370 from llama_index.core.query_engine.retriever_query_engine import (
371 RetrieverQueryEngine,
372 )
374 retriever = self.as_retriever(**kwargs)
375 llm = (
--> 376 resolve_llm(llm, callback_manager=self._callback_manager)
377 if llm
378 else Settings.llm
379 )
381 return RetrieverQueryEngine.from_args(
382 retriever,
383 llm=llm,
384 **kwargs,
385 )
File ~/miniconda/lib/python3.9/site-packages/llama_index/core/llms/utils.py:102, in resolve_llm(llm, callback_manager)
99 print("LLM is explicitly disabled. Using MockLLM.")
100 llm = MockLLM()
--> 102 assert isinstance(llm, LLM)
104 llm.callback_manager = callback_manager or Settings.callback_manager
106 return llm
AssertionError:
So why is the LLM disabled?
Thanks!
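For what it's worth: in the traceback, resolve_llm ends with assert isinstance(llm, LLM), and a raw AutoModelForCausalLM is not an instance of LlamaIndex's LLM class, so the assert fails; the model isn't "disabled" so much as unrecognized. A minimal sketch of wrapping it instead, assuming the llama-index-llms-huggingface integration package is installed:
#script...
from llama_index.llms.huggingface import HuggingFaceLLM

# Wrap the checkpoint in LlamaIndex's own LLM interface so that
# resolve_llm's isinstance(llm, LLM) check passes.
llm = HuggingFaceLLM(
    model_name="distilbert/distilgpt2",
    tokenizer_name="distilbert/distilgpt2",
)

query_engine = index_from_nodes.as_query_engine(llm=llm, streaming=True)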
Post
Hello Everyone,
I signed up for Pro and created a ZeroGPU Space with the default Gradio chatbot project. When the Space builds, it won't even start the sample Gradio app. Pretty disappointing that it fails right out of the box...
Has anyone encountered this yet?
Thanks...
This is the output. It's odd, since it seems to be just a warning, so why wouldn't it start?
/usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:228: UserWarning: The 'tuples' format for chatbot messages is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
  warnings.warn(
* Running on local URL: http://0.0.0.0:7860, with SSR ⚡
To create a public link, set share=True in launch().
Stopping Node.js server...
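The deprecation warning itself can be silenced by opting into the newer message format. A minimal sketch, assuming a ChatInterface-style app (the respond function is a hypothetical stand-in for the Space's real chat handler):
#script...
import gradio as gr

def respond(message, history):
    # Hypothetical echo handler standing in for the Space's real logic.
    return f"You said: {message}"

# type="messages" opts into openai-style {'role': ..., 'content': ...} dicts
# instead of the deprecated (user, bot) tuples, which removes the warning.
demo = gr.ChatInterface(fn=respond, type="messages")
demo.launch()
That said, a UserWarning alone shouldn't stop a Space; the "Stopping Node.js server..." line suggests the SSR front end is being shut down, so the actual failure may lie elsewhere in the build.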