Joshua Opolko

Building JOSIE: Advanced AI Workflow with Persistent Memory and Live Data Access

Build an uncensored AI assistant with persistent memory using n8n automation, vector databases, and privacy-focused web search

Traditional AI assistants lose memory between sessions and have limited web access. JOSIE solves both problems with a three-tier architecture using open-source tools: n8n workflow automation, Qdrant vector database for persistent memory, and SearXNG for unlimited web search, with complete data privacy.

Key takeaways

What is JOSIE's architecture?

JOSIE’s architecture demonstrates how modern autonomous AI systems combine multiple protocols and components into a coherent, privacy-preserving whole. Similar to Agent Zero’s approach to multi-agent hierarchies and protocol integration, JOSIE leverages standardized HTTP interfaces for tool integration and workflow coordination.

The three layers are independent but complementary. Ollama handles local inference and embedding generation, keeping all compute on your own hardware. Qdrant stores and retrieves vector representations of every past interaction, giving the system a semantic long-term memory that persists between restarts. n8n ties the layers together with a visual workflow engine that decides, on each incoming query, whether to pull from stored memory, hit the web via SearXNG, or search uploaded documents. Because each component communicates over standard HTTP APIs, you can swap individual pieces without rebuilding the whole pipeline.

Component 1: JOSIEFIED-Qwen3:8b Language Model

Built on Alibaba’s Qwen3 with 8.19B parameters and 40K context window:

Component 2: Qdrant Vector Database for Persistent Memory

Qdrant creates persistent memory through semantic search:

Component 3: SearXNG Privacy-Focused Web Search

SearXNG provides unlimited, privacy-focused web search:

What workflow features does JOSIE include?

The n8n workflow is the decision layer that makes JOSIE feel coherent rather than just fast. Every incoming message passes through a routing switch that categorizes the request and selects the right tool automatically. Three capabilities work together: smart routing picks the appropriate data source without manual prompting, session state persists conversation context across multiple exchanges, and graceful error handling degrades cleanly when any one component is unavailable. The result is an assistant that behaves consistently even when hardware or network conditions vary.

What are JOSIE's real-world use cases?

Personal knowledge management: Upload PDFs and query them conversationally. Every exchange is embedded and stored, so you can ask JOSIE what you read three weeks ago and get a semantically relevant answer rather than a keyword hit. The interaction history becomes its own searchable knowledge base over time.

Business intelligence: Maintain project context across weeks of meetings, combine stored internal knowledge with live web data, and give customer-support workflows a memory that persists between agent restarts. Because everything stays local, sensitive business data never passes through a third-party API.

Education and research: Track learning progress across sessions, surface relevant passages from uploaded academic papers, and combine stored domain knowledge with current information retrieved via SearXNG. JOSIE keeps pace with fast-moving fields because its web access is unlimited and uncharged per query.

How does JOSIE protect your privacy and security?

JOSIE prioritizes user privacy through local processing and encrypted storage. For complete context on AI data privacy challenges and best practices, including GDPR compliance, consent frameworks, and privacy-preserving technologies, see our comprehensive guide on data protection in AI systems.

How do you install JOSIE step by step?

Prerequisites: Docker and roughly 16 GB of RAM to run the 8B model alongside Qdrant comfortably. JOSIE is built bottom-up: stand up the three services, then let n8n orchestrate them.

Step 1: The brain (Ollama plus the JOSIEFIED model)

Ollama runs the language model and the embedding model locally, so nothing leaves your machine. Full walkthrough in my Ollama self-hosting guide.

# install Ollama (Linux / macOS)
curl -fsSL https://ollama.com/install.sh | sh

# the uncensored, tool-tuned model that reasons (8B, 40K context)
ollama pull goekdenizguelmez/JOSIEFIED-Qwen3:8b

# the embedding model that powers memory search
ollama pull mxbai-embed-large

Step 2: The memory (Qdrant vector database)

Qdrant stores every conversation and document as vectors, so JOSIE recalls by meaning, not keyword. One container:

# run Qdrant, persisting data to ./qdrant_storage
docker run -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant

The dashboard lives at http://localhost:6333/dashboard. See the Qdrant quickstart for cloud and clustering options.

Step 3: The eyes (SearXNG private web search)

SearXNG gives JOSIE unlimited, untracked web access with a clean JSON API. Full setup in my SearXNG self-hosting guide. Enable the JSON format so n8n can read results:

# searxng/settings.yml
search:
  formats:
    - html
    - json

Step 4: The conductor (n8n wires it together)

n8n is the workflow engine that routes each query to the right tool: web search, memory, or document lookup. Stand it up with my n8n self-hosting guide, then build the JOSIE flow:

# run n8n, persisting workflows to a named volume
docker volume create n8n_data
docker run -d --name n8n -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n

In n8n, add credentials for Ollama (http://localhost:11434), Qdrant (http://localhost:6333), and SearXNG, then wire four moves:

  1. Trigger to router. A chat trigger feeds a Switch node that reads the query and picks the tool.
  2. Remember. A Qdrant node retrieves the top matches (top-K 3 for documents, 50 for conversation history) and passes them to the model as context.
  3. Search when needed. For fresh facts, an HTTP Request node hits SearXNG's JSON API and hands results to the model.
  4. Answer and store. The model replies, and every exchange is embedded with mxbai-embed-large and written back to Qdrant, so JOSIE remembers next time.

Performance: fully local, sub-second vector search, and Kubernetes-ready when you outgrow a single box.

Frequently asked questions

What does JOSIE stand for?

JOSIE is short for "Just One Super Intelligent Entity." It is a self-hosted AI assistant that combines a local language model, a vector-database memory, and private web search into one n8n workflow.

What hardware do I need to run JOSIE?

A machine with Docker and about 16 GB of RAM runs the 8B JOSIEFIED-Qwen3 model alongside Qdrant comfortably. A GPU with at least 11 GB of VRAM works well, and smaller quantized tags such as q4_k_m let JOSIE run on less RAM.

How does JOSIE remember across sessions?

Every conversation and uploaded document is embedded with mxbai-embed-large and stored in Qdrant. On each new query, JOSIE retrieves the most semantically relevant chunks and feeds them back as context, so memory persists between sessions instead of resetting.

Is JOSIE private?

Yes. The model, embeddings, memory, and search all run on your own hardware. No prompts, documents, or queries leave your network, which makes air-gapped, GDPR-friendly deployments possible.

Can I swap in a different model?

Any Ollama-compatible model works. Point the n8n Ollama node at a different tag, such as a larger Qwen3 or a Llama 3 variant. JOSIEFIED-Qwen3:8b is the default because it is tool-tuned and lightly filtered, with a 40K-token context window.

How is JOSIE different from ChatGPT or a cloud assistant?

Cloud assistants forget between sessions, cap web access, and send your data to a vendor. JOSIE keeps persistent memory, gives unlimited private web search through SearXNG, and runs entirely local with no per-token cost or lock-in. Uncensored models will generally carry out restricted coding techniques that mainstream cloud models refuse.

Conclusion

JOSIE delivers privacy-preserving AI that grows smarter with every interaction. Combining advanced language models, persistent memory, and unlimited web access creates autonomous assistance without vendor lock-in. Start with the n8n documentation and explore AI workflow automation.


Essential Resources


Last updated: June 2026