Practical .NET, software, and AI tutorials by Gerard Beckerleg, a developer in Sydney.
- LangGraph raises TypeError on function node output
This morning my LangGraph flow crashed after I added a new tool node.

    TypeError: cannot unpack non-iterable ToolInvocation object

Environment: Python 3.12, langgraph 0.3.8, langchain 0.2.1. The node function returned a ToolInvocation instead of a (ToolInvocation, kwargs) tuple; the API changed in 0.3.7. Fix:
- HuggingFace pipeline returns empty string with Gemini 1.5 Pro
Last week a quick text-generation pipeline came back blank on every call.

    from transformers import pipeline
    gen = pipeline("text-generation", model="google/gemini-1.5-pro")
    print(gen("hi"))

Result:

    [{'generated_text': ''}]

Environment: Transformers 4.42.1, Python 3.10, macOS Sonoma. No errors were raised. The network call to Google succeeded and the payload returned text.
- vLLM tokenizer mismatch on finetuned Mistral model
Spent the afternoon benchmarking a Mistral-7B finetune with vLLM. The first prompt returned gibberish tokens.

    from vllm import LLM
    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2", tokenizer="mistralai/Mistral-7B-Instruct-v0.2")
    print(llm.generate("Hello"))

The output contained repeating � characters. Environment: Ubuntu 22.04, CUDA 12.1, vllm 0.4.3, transformers 4.40.0, Python 3.11. The checkpoint was trained with --trust-remote-code.
- OpenRouter API response missing 'text' in JSON payload
Testing OpenRouter with GPT-4 Turbo. The request looked fine, but the completion came back without any message content:

    {
      "id": "chatcmpl-xxxx",
      "choices": [
        {"index": 0, "logprobs": null, "finish_reason": "stop"}
      ],
      "model": "gpt-4-turbo-preview",
      "usage": {...}
    }

There is no choices[0].message.content. The cURL request:

    curl https://openrouter.ai/api/v1/chat/completions \
      -H "Authorization: Bearer $OPENROUTER_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model":"gpt-4-turbo-preview","messages":[{"role":"user","content":"hi"}]}'

It turns out OpenRouter now requires the HTTP-Referer (your site) and X-Title (app name) headers for non-localhost keys.
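To make the header fix concrete, here is a minimal sketch that builds the same request with the two extra headers and reads the completion defensively, so a missing message shows up as None rather than a KeyError. The helper names (build_request, extract_content) and the example site URL and app title are placeholders of mine, not part of OpenRouter's API; only the endpoint, the header names, and the payload shape come from the post.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(api_key, prompt, site_url, app_title):
    """Build (but do not send) a chat completion request with the extra headers."""
    body = json.dumps({
        "model": "gpt-4-turbo-preview",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "HTTP-Referer": site_url,  # your site (placeholder value)
        "X-Title": app_title,      # app name (placeholder value)
    }
    return urllib.request.Request(
        OPENROUTER_URL, data=body, headers=headers, method="POST"
    )


def extract_content(payload):
    """Return choices[0].message.content, or None if any part is missing."""
    choices = payload.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")
```

Sending it would be a matter of passing the request to urllib.request.urlopen and running json.load on the response; with the payload shown above, extract_content returns None instead of raising.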
- langchain.tools.SerpAPIWrapper throws Invalid API Key despite valid key
Spotted this today while wiring up a quick search tool.

    from langchain.tools import SerpAPIWrapper
    tool = SerpAPIWrapper()
    print(tool.run("Sydney weather"))

The console spat back:

    SerpApiError: Invalid API key

The key worked in curl and the quota was fine. Environment: Python 3.11, langchain 0.1.12, serpapi 0.6.2. The environment variable SERPAPI_API_KEY was set.
- Gemini Pro API throws 403 in Postman with valid key
Tried hitting the Gemini Pro generateContent endpoint (v1beta/models/gemini-pro:generateContent) in Postman. Request and headers:

    POST https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=AIzaSy...
    Content-Type: application/json

Response:

    {
      "error": {
        "code": 403,
        "message": "The caller does not have permission",
        "status": "PERMISSION_DENIED"
      }
    }

The key was freshly created, billing was enabled, and the Generative Language API was enabled.
- Mistral inference on local GPU hits OOM with 13B model
Last week I tried the 13B Mistral model on a single RTX 3060 (12 GB). python main.py crashed instantly:

    RuntimeError: CUDA out of memory. Tried to allocate 10.2 GiB

torch.cuda.mem_get_info() showed only 11.7 GB free.
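A back-of-the-envelope check shows why a 12 GB card struggles here. The sketch below only estimates the memory taken by the weights themselves at a few precisions; it ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function names and the list of precisions are my own illustration, not something taken from main.py.

```python
GIB = 2 ** 30


def weight_memory_gib(n_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


def fits(n_params, bytes_per_param, vram_gib):
    """True if the weights alone fit in vram_gib (ignores activations and KV cache)."""
    return weight_memory_gib(n_params, bytes_per_param) <= vram_gib


if __name__ == "__main__":
    n = 13_000_000_000  # 13B parameters, the model size mentioned in the post
    for label, width in [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
        size = weight_memory_gib(n, width)
        print(f"{label}: {size:.1f} GiB, fits in 12 GiB: {fits(n, width, 12)}")
```

By this estimate the fp16 weights alone come to roughly 24 GiB, about twice the card's memory, and 8-bit weights still land slightly above 12 GiB. Only around 4 bits per parameter leaves headroom on this GPU.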
- Ollama install on Windows missing symlink permissions
Setting up Ollama on Windows 11 this morning. The installer finished, but every ollama run threw:

    open C:\Users\Me\.ollama\models\registry.json: CreateFile symlink \?\C:\Users\Me\.ollama\models\registry.json: Access is denied.

This was in a fresh PowerShell session at an Admin prompt. Windows Developer Mode was disabled.
- langchain.chains.ConversationalRetrievalChain() fails with FAISS store
Tried to wire up ConversationalRetrievalChain with a FAISS vector store. The constructor crashed:

    ValueError: Provided embeddings are incompatible with stored index dimension

The embeddings were built with sentence-transformers/all-MiniLM-L6-v2 (dimension 384). Environment: LangChain 0.0.286, faiss-cpu 1.7.4, Python 3.11 on macOS.
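The error boils down to the stored index and the embedding model disagreeing on vector size. A small pre-flight check can surface that before the chain is built. The sketch below works on plain integers and a callable so it runs without FAISS or LangChain installed; both helper names are hypothetical, and how you obtain the two numbers from your own store is an assumption to verify against your installed versions.

```python
def check_dimensions(index_dim, embedding_dim):
    """Raise a descriptive error if the stored index and the embedder disagree."""
    if index_dim != embedding_dim:
        raise ValueError(
            f"Index stores {index_dim}-dimensional vectors but the embedding "
            f"model produces {embedding_dim}-dimensional ones; rebuild the index "
            f"or load the model that created it."
        )
    return True


def embedding_dim_of(embed_query):
    """Measure an embedder's output size by embedding a short probe string."""
    return len(embed_query("dimension probe"))
```

With a FAISS-backed store, the index dimension is usually available as the d attribute of the underlying faiss index, and the embedder's size can be measured by passing its embed_query method to embedding_dim_of. For all-MiniLM-L6-v2 the second number should be 384, so any other index dimension points to an index built with a different model.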
- Gradio app stalls on LLaMA 2 inference with 8‑bit quant
Spun up a quick Gradio demo around LLaMA-2-7B-chat-hf using 8-bit quantisation. The prompt box froze after I hit “Submit”; no tokens were returned.

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-chat-hf",
        load_in_8bit=True,
        device_map="auto"
    )

Environment: RTX 3090 GPU, CUDA 11.8, torch 2.0.1+cu118, bitsandbytes 0.41.0, transformers 4.31.0, Gradio 3.34.0, Linux Mint 21.2.