# Roadmap

Synced with README.md - this is the source of truth for feature planning.
## Completed

- Preact + Hono migration from Next.js
- Streaming chat with Vercel AI SDK
- Backend-first persistence (SQLite as source of truth)
- Multi-provider support (OpenAI, Anthropic, Ollama, custom APIs)
- Admin panel for providers, models, and users
- Role-based access control (admin/member/readonly)
- File attachments with preview/download
- Markdown, code highlighting (Shiki), LaTeX rendering
- Docker deployment with optional HTTPS (Caddy)
- Voice input/output (speech-to-text, text-to-speech)
- Bun SQLite migration (from better-sqlite3)
- models.dev integration for auto-discovery
- Prompt caching for Anthropic models
- Removed Dexie/IndexedDB dual-source-of-truth architecture
- Keyboard shortcuts (Ctrl+B sidebar, Ctrl+Shift+O new chat, Ctrl+K search)
- Theming system (Catppuccin palettes, dark/light modes)
- Font themes and font customization
- Syntax highlighting theme selector (Shiki theme picker)
- Code block font selection
- Tabbed user settings panel (Account, Appearance, History & Attachments tabs)
- Settings UI improvements (navigation, admin settings)
- White labeling (custom app name, logo icon selection from 49 icons, persisted settings)
- Auto title generation (AI-generated on first message, with fallback)
- Regenerate response (reload last assistant message)
- Reasoning display (collapsible `<think>` blocks for DeepSeek R1/o1)
- Enhanced error toasts (categorized errors with copy action)
- Chat folders and pinned chats
- Date grouping in sidebar (Today, Yesterday, Previous 7/30 Days, Older)
- Chat search (fuzzy title search with highlights)
- Image generation (Replicate/FLUX, OpenAI DALL-E 3, OpenRouter — mode switcher in navbar, aspect ratio selector)
- Chat import from ChatGPT JSON exports (validate → preview → import, atomic transaction)
## In Progress

- Tool calling implementation (infrastructure ready, DB flags exist, no runtime wiring)
- UX polish & setup/install experience improvements
## Quick Wins

Low-effort, high-impact features identified by analyzing other Ollama UIs.
### Chat UX Improvements

- Auto title generation - AI-generated titles on first message (lumina pattern)
- Regenerate response - Re-run the last assistant message (Vercel AI SDK `reload()`)
- Keyboard shortcuts - Ctrl+B sidebar, Ctrl+Shift+O new chat, Ctrl+K search
- Reasoning display - Collapsible `<think>` blocks for DeepSeek R1/o1 models
- Inline message editing - Edit past user messages (creates a new branch)
- Message rating - Thumbs up/down on assistant responses
### Code Block Enhancements

- Copy button - One-click copy with visual feedback
- Download as file - Save code blocks with proper extension
- Line numbers toggle - Optional line numbers display
- Word wrap toggle - Handle long lines
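The "download as file" item above could be sketched as below; the extension mapping and the `filenameFor`/`downloadCodeBlock` helper names are illustrative assumptions, not existing code:

```javascript
// Illustrative language -> file extension mapping (extend as needed)
const EXTENSIONS = {
  javascript: 'js', typescript: 'ts', python: 'py',
  rust: 'rs', go: 'go', bash: 'sh', sql: 'sql', json: 'json',
};

// Fall back to .txt for unknown or missing languages
const filenameFor = (lang, base = 'snippet') =>
  `${base}.${EXTENSIONS[lang?.toLowerCase()] ?? 'txt'}`;

// Browser-side download via a temporary object URL
const downloadCodeBlock = (code, lang) => {
  const blob = new Blob([code], { type: 'text/plain' });
  const url = URL.createObjectURL(blob);
  const a = document.createElement('a');
  a.href = url;
  a.download = filenameFor(lang);
  a.click();
  URL.revokeObjectURL(url);
};
```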
### Error Handling

- Enhanced error toasts - Categorized errors with copy action (retry button still TODO)
- Offline detection - Show “You appear to be offline” in errors
- Actionable suggestions - “Check if Ollama is running”
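The three items above could share one pure categorizer; the category names, message strings, and `categorizeError` signature here are assumptions for illustration (in the browser, `navigator.onLine` would feed the `online` flag):

```javascript
// Hypothetical error categorizer: maps a failed request to a toast payload.
const categorizeError = (err, online = true) => {
  if (!online) {
    return { category: 'offline', message: 'You appear to be offline.' };
  }
  const text = String(err?.message ?? err);
  // Connection refused / unreachable server -> actionable suggestion
  if (/ECONNREFUSED|Failed to fetch/i.test(text)) {
    return {
      category: 'connection',
      message: 'Could not reach the model server.',
      suggestion: 'Check if Ollama is running.',
    };
  }
  if (/\b(401|403)\b/.test(text)) {
    return { category: 'auth', message: 'Authentication failed.', suggestion: 'Verify your API key.' };
  }
  return { category: 'unknown', message: text };
};
```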
## Planned

### Prompt Templates

Reusable system prompts for common use cases.
- Create and manage prompt templates
- Global templates (admin) vs personal templates (user)
- Variable substitution (`{{variable}}` syntax)
- Quick access via `/` command in chat input
- Template categories/tags
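The substitution step could be a one-liner; this sketch (the `substituteVariables` name is hypothetical) deliberately leaves unfilled placeholders intact so the user can see what still needs a value:

```javascript
// Replace {{variable}} placeholders with supplied values;
// placeholders with no matching value are left as-is (a design assumption).
const substituteVariables = (template, values) =>
  template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in values ? String(values[name]) : match);
```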
### Conversation Branching

Explore alternative responses without losing history (ai-ui's killer feature).
- Tree-based message structure (parentId, children[])
- Branch navigation UI (“1 of 3” with arrows)
- Regenerate creates new branch (non-destructive)
- Edit past message creates new branch
- Active path tracking per conversation
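The "1 of 3" indicator above reduces to a pure sibling lookup over the flat message list; this `branchPosition` helper and its message shape (`id`, `parentId`) are assumptions based on the tree structure described:

```javascript
// Given all messages and the currently shown message, report its
// position among siblings (messages sharing the same parentId).
const branchPosition = (messages, messageId) => {
  const msg = messages.find(m => m.id === messageId);
  if (!msg) return null;
  const siblings = messages.filter(m => m.parentId === msg.parentId);
  return {
    index: siblings.findIndex(m => m.id === messageId) + 1, // 1-based for display
    total: siblings.length,
  };
};
```

The UI can then render `${index} of ${total}` with prev/next arrows that switch the active child at that fork.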
### Pull Model from UI

Download Ollama models directly, without the terminal (nextjs-ollama-llm-ui).
- Model pull form with validation
- Real-time progress tracking (streaming)
- Non-blocking (continue chatting while downloading)
- Link to Ollama library for discovery
- Toast notifications on completion
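The real-time progress item maps onto Ollama's NDJSON pull stream, whose progress lines carry `status`, `total`, and `completed` fields. A minimal per-line parser (the `parsePullProgress` name is illustrative) might look like:

```javascript
// Parse one NDJSON line from Ollama's /api/pull stream and derive a percentage.
const parsePullProgress = (line) => {
  const data = JSON.parse(line);
  const percent = data.total
    ? Math.round(((data.completed ?? 0) / data.total) * 100)
    : null; // status-only lines (e.g. manifest/verify steps) have no totals
  return { status: data.status, percent };
};
```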
### Projects / Folders

Chat organization with folders (full CRUD, colors, collapse state, move chats).
- Create and manage folders (color, name, collapse/expand)
- Move chats between folders
- Folder-scoped chat view (`/folder/:id`)
- Custom system prompt per folder/project
- Document upload + RAG context (KnowledgeBaseSelector stub exists, no backend)
### Knowledge Bases

Reusable document collections (separate from projects).
- Create named knowledge bases
- Upload and embed documents
- Reference in chat with `#knowledge-base-name`
- Attach to projects or use standalone
- Access control (private, shared, public)
### Model Arena / Evaluation

Compare models and collect feedback (Open WebUI pattern).
- Side-by-side response comparison
- Blind arena mode (hide model names)
- Collect user preferences
- Admin analytics on model performance
- Export evaluation data
### Image Generation & Editing

- OpenAI DALL-E 3
- Replicate (FLUX 1.1 Pro default) + OpenRouter image models
- Mode switcher in navbar, aspect ratio selector (7 ratios)
- Generated images saved to DB with prompt/model metadata
- Google Gemini image generation
- ComfyUI integration (local)
- AUTOMATIC1111/Stable Diffusion WebUI (local)
- Prompt-based image editing workflows
- Generated image gallery / management UI
### Web Search for RAG

- Pluggable search provider architecture
- SearXNG support (self-hosted, privacy-first)
- Brave Search, Kagi, DuckDuckGo
- Tavily, Perplexity (AI-native search)
- Google PSE, Bing (if needed)
- Search results injection into chat context
- Toggle search on/off per message
### Web Browsing Capability

- URL fetching with `#` command (e.g., `#https://example.com`)
- Web content extraction (Jina Reader or Readability.js)
- HTML to markdown conversion
- Automatic context injection from URLs
- Link preview in input area
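The `#` command above needs the URLs pulled out of the input before sending; a sketch of that parsing step (the `extractUrlCommands` name and return shape are assumptions):

```javascript
// Extract #https://... references from the chat input so their contents
// can be fetched and injected as context; returns the cleaned text too.
const extractUrlCommands = (input) => {
  const urls = [];
  const text = input
    .replace(/#(https?:\/\/\S+)/g, (_, url) => {
      urls.push(url);
      return '';
    })
    .replace(/\s{2,}/g, ' ') // collapse gaps left by removed commands
    .trim();
  return { urls, text };
};
```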
### Chat Organization

- Chat folders - Organize the sidebar with folders
- Pinned chats - Keep important chats at top
- Chat tags - Flexible categorization
- Search chats - Fuzzy title search (full-text message search still TODO)
- Date grouping - Today, Yesterday, Previous 7 Days, Previous 30 Days, Older
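The date buckets listed above can be computed from the start of the current day; this sketch takes `now` as a parameter for testability (the `dateGroup` name is illustrative):

```javascript
const DAY = 86_400_000; // one day in milliseconds

// Bucket a chat's last-activity timestamp into the sidebar groups.
const dateGroup = (timestamp, now = Date.now()) => {
  const startOfToday = new Date(now);
  startOfToday.setHours(0, 0, 0, 0);
  const t = startOfToday.getTime();
  if (timestamp >= t) return 'Today';
  if (timestamp >= t - DAY) return 'Yesterday';
  if (timestamp >= t - 7 * DAY) return 'Previous 7 Days';
  if (timestamp >= t - 30 * DAY) return 'Previous 30 Days';
  return 'Older';
};
```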
### Chat Export/Import

- Import from ChatGPT JSON exports (validate → preview → atomic import, 50MB limit)
- Export to JSON format (faster-chat native)
- Export to Markdown format
- Selective export (single chat, date range, all)
- Import from Claude exports
- Import from Open WebUI exports
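The Markdown export item could be as simple as the sketch below; the heading and role-label formatting are assumptions, not a defined export format:

```javascript
// Serialize one chat to Markdown: title as a heading, then
// alternating bold role labels and message bodies.
const chatToMarkdown = (chat) => {
  const lines = [`# ${chat.title}`, ''];
  for (const msg of chat.messages) {
    const label = msg.role === 'user' ? 'User' : 'Assistant';
    lines.push(`**${label}:**`, '', msg.content, '');
  }
  return lines.join('\n');
};
```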
### Enhanced RBAC & Admin Monitoring

- Fine-grained permissions system
- Model access restrictions by role
- Feature toggles per role (image gen, file upload, web search)
- Rate limits per role/user
- Admin-only model creation/pulling for Ollama
- Group-based permissions (e.g., “Kids” group limited models)
- API key scoping
- API usage monitoring - Track per-user usage, quota limits, request logging in admin panel
- Usage analytics dashboard - Visualize model usage, API costs, user activity
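The per-role rate limits above could start as a fixed-window counter in memory; the role limits and `makeRateLimiter` factory here are purely illustrative (a real deployment would likely persist counts or use a token bucket):

```javascript
// Hypothetical per-role request limits (requests per minute).
const LIMITS = { admin: Infinity, member: 60, readonly: 10 };

// Fixed-window limiter keyed by user id; returns true if the request is allowed.
const makeRateLimiter = (limits = LIMITS) => {
  const windows = new Map(); // userId -> { minute, count }
  return (userId, role, now = Date.now()) => {
    const minute = Math.floor(now / 60_000);
    const w = windows.get(userId);
    if (!w || w.minute !== minute) {
      windows.set(userId, { minute, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= (limits[role] ?? 0);
  };
};
```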
### Access Control Pattern

Flexible permission model for resources (from Open WebUI).

```js
access_control: {
  read:  { group_ids: [...], user_ids: [...] },
  write: { group_ids: [...], user_ids: [...] }
}
// null = public (all users)
// {}   = private (owner only)
```

Apply to: knowledge bases, prompt templates, shared chats, projects.
### Multilingual Support (i18n)

- Internationalization infrastructure
- Language detection
- UI string extraction and translation system
- RTL language support
- Community contribution workflow for translations
- Per-user language preference
### Settings & UX

- Temperature/system prompt per chat (not just per model)
- Advanced stats display - Token metrics (tokens/sec, time-to-first-token, estimated tokens)
- Hide personal info toggle - Option to hide user name/email from UI
- Resizable sidebar with state persistence (improvements)
- Stream token usage metrics in real-time
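The token metrics above are simple arithmetic over stream timestamps; this `streamStats` helper and its input shape are assumptions for illustration:

```javascript
// Compute display stats from stream timing (all timestamps in ms).
// tokens/sec is measured over the generation phase only, i.e. from
// the first token to the last, which excludes prompt-processing time.
const streamStats = ({ startedAt, firstTokenAt, finishedAt, tokenCount }) => ({
  timeToFirstTokenMs: firstTokenAt - startedAt,
  tokensPerSecond: tokenCount / ((finishedAt - firstTokenAt) / 1000),
});
```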
### Advanced Capabilities

- Local RAG with vector search (ChromaDB or similar)
- Multi-modal requests (vision, audio)
- Conversation sharing (public links)
- MCP (Model Context Protocol) integration
### Infrastructure

- PostgreSQL backend option
- Plugin system for extensions
- Mobile app (Capacitor)
- WebSocket for real-time updates (typing indicators, multi-user)
## Implementation Notes

### Quick Wins Implementation

Auto Title Generation (from lumina):

```js
// After the first message, before storing
if (chat.messages.length === 0) {
  const titleReq = {
    system: "Create a 2-5 word title summarizing this input. No prose, just the title.",
    prompt: userMessage,
    model: currentModel,
    stream: false
  };
  const title = await generateTitle(titleReq);
  chat.title = title.replace(/<think>[\s\S]*?<\/think>/g, ''); // Remove think blocks
}
```

Keyboard Shortcuts (from chatbot-ollama):
```js
const handleKeyDown = (e) => {
  if (e.key === 'Enter' && !e.shiftKey && !isMobile()) {
    e.preventDefault();
    handleSend();
  } else if (e.key === 'Escape') {
    inputRef.current?.blur();
  } else if (e.key === 'ArrowUp' && !content && !isStreaming) {
    // Recall last user message when input is empty
    const lastUserMsg = messages.findLast(m => m.role === 'user');
    if (lastUserMsg) setContent(lastUserMsg.content);
  }
};
```

Reasoning Display (from lumina/nextjs-ollama-llm-ui):
```js
// Parse <think> blocks from content into typed segments
const splitByThinkBlocks = (text) => {
  // Handle incomplete tags during streaming
  if (text.includes('<think>') && !text.includes('</think>')) {
    text += '</think>';
  }
  // Split on <think>…</think>, keeping the captured content;
  // odd indices are think blocks, even indices are regular markdown
  return text.split(/<think>([\s\S]*?)<\/think>/g)
    .map((t, i) => ({ type: i % 2 ? 'think' : 'markdown', text: t }))
    .filter(p => p.text);
};
```
```jsx
// Render with a collapsible <details> element
<details>
  <summary>🧠 Reasoning</summary>
  <Markdown>{thinkContent}</Markdown>
</details>
```

### Conversation Branching

Schema Extension:
```sql
ALTER TABLE messages ADD COLUMN parent_id TEXT REFERENCES messages(id);
-- children[] derived from query: SELECT id FROM messages WHERE parent_id = ?
```

Client State:
```js
// Track which branch is active at each fork point
const activeBranches = new Map(); // messageId -> activeChildId

// Build the visible path by walking the tree
const buildActivePath = (messages, activeBranches) => {
  const path = [];
  let current = findRoot(messages);
  while (current) {
    path.push(current);
    if (!current.children?.length) break;
    const activeChild = activeBranches.get(current.id) ?? current.children.at(-1);
    current = messages[activeChild];
  }
  return path;
};
```

### Prompt Templates

Schema:
```sql
CREATE TABLE prompt_templates (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id), -- null for global
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  description TEXT,
  variables TEXT,          -- JSON array of variable names
  is_global BOOLEAN DEFAULT FALSE,
  access_control TEXT,     -- JSON access control object
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);
```

Variable Substitution:
```js
// Template: "You are a {{role}} helping with {{task}}"
const parseVariables = (content) => {
  const regex = /\{\{(\w+)\}\}/g;
  const vars = [];
  let match;
  while ((match = regex.exec(content)) !== null) {
    vars.push(match[1]);
  }
  return vars;
};
```

### Projects System

Schema:
```sql
CREATE TABLE projects (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL REFERENCES users(id),
  name TEXT NOT NULL,
  description TEXT,
  system_prompt TEXT,
  access_control TEXT, -- JSON
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);

CREATE TABLE project_documents (
  id TEXT PRIMARY KEY,
  project_id TEXT NOT NULL REFERENCES projects(id),
  file_id TEXT NOT NULL REFERENCES files(id),
  embedding_status TEXT DEFAULT 'pending', -- pending, processing, complete, failed
  created_at TEXT DEFAULT (datetime('now'))
);

-- Add project_id to chats
ALTER TABLE chats ADD COLUMN project_id TEXT REFERENCES projects(id);
```

### Knowledge Bases

Schema:
```sql
CREATE TABLE knowledge_bases (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL REFERENCES users(id),
  name TEXT NOT NULL,
  description TEXT,
  access_control TEXT, -- JSON: {read: {group_ids, user_ids}, write: {...}}
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);

CREATE TABLE knowledge_base_documents (
  id TEXT PRIMARY KEY,
  knowledge_base_id TEXT NOT NULL REFERENCES knowledge_bases(id),
  file_id TEXT NOT NULL REFERENCES files(id),
  chunk_count INTEGER DEFAULT 0,
  embedding_status TEXT DEFAULT 'pending',
  created_at TEXT DEFAULT (datetime('now'))
);
```

### Pull Model Feature

API Route:
```js
// POST /api/ollama/pull
app.post('/api/ollama/pull', async (c) => {
  const { name } = await c.req.json();
  const response = await fetch(`${OLLAMA_URL}/api/pull`, {
    method: 'POST',
    body: JSON.stringify({ name }),
  });

  // Stream progress back to the client
  return new Response(response.body, {
    headers: { 'Content-Type': 'application/x-ndjson' }
  });
});
```

Frontend (throttled progress):
```js
import { throttle } from 'lodash-es';

const throttledSetProgress = throttle((progress) => {
  setDownloadProgress(progress);
}, 200); // Max 5 updates/second
```

### Access Control Pattern

Helper Functions:
```js
const canRead = (resource, user, userGroups) => {
  if (resource.access_control === null) return true; // public
  if (resource.user_id === user.id) return true;     // owner
  const ac = resource.access_control;
  if (!ac.read) return false;
  if (ac.read.user_ids?.includes(user.id)) return true;
  if (ac.read.group_ids?.some(gid => userGroups.includes(gid))) return true;
  return false;
};

const canWrite = (resource, user, userGroups) => {
  if (resource.user_id === user.id) return true; // owner can always write
  const ac = resource.access_control;
  if (!ac?.write) return false;
  if (ac.write.user_ids?.includes(user.id)) return true;
  if (ac.write.group_ids?.some(gid => userGroups.includes(gid))) return true;
  return false;
};
```

### Model Arena

Schema:
```sql
CREATE TABLE evaluations (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL REFERENCES users(id),
  chat_id TEXT REFERENCES chats(id),
  message_id TEXT REFERENCES messages(id),
  type TEXT NOT NULL,  -- 'rating', 'comparison', 'arena'
  data TEXT NOT NULL,  -- JSON: {rating, model_id, winner, reason, comment}
  meta TEXT,           -- JSON: {arena: bool, tags: [...]}
  created_at TEXT DEFAULT (datetime('now'))
);
```

## Principles

- Privacy-conscious: Self-hosted, your data on your server (not third-party clouds)
- Offline-capable: Works with local models (Ollama, LM Studio), syncs when online
- Provider-agnostic: No vendor lock-in
- Fast iteration: Speed over ceremony, no TypeScript
- Simple code: Small components, derived state, delete aggressively
- Lean over bloated: Pick features strategically, don’t become Open WebUI
- Truly open source: MIT license, no restrictions
## Feature Sources

Features in this roadmap were informed by analyzing these projects:
| Project | Key Features Adopted |
|---|---|
| ai-ui | Conversation branching, inline editing, multi-provider architecture |
| chatbot-ollama | Keyboard shortcuts, code block features, error handling patterns |
| lumina | Auto title generation, reasoning display, clean API patterns |
| nextjs-ollama-llm-ui | Pull model UI, voice input, regenerate response |
| Open WebUI | Prompt templates, knowledge bases, access control pattern, evaluations |
See `~/Projects/0_CHATS/*_IMPLEMENTATION_NOTES.md` for a detailed analysis of each.