What is FastChat
FastChat is a way to easily host LLMs and interact with them from the command line, through its web client, or as an OpenAI-compatible API server. For the RECAP use case we will focus on interfacing with the model through the API server. See the FastChat docs for more information: https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md In this case, we use LiteLLM's custom model server option. See here for more information: https://litellm.vercel.app/docs/providers/custom_openai_proxy
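For reference, a FastChat OpenAI-compatible server is typically started as three processes, per the FastChat docs linked above. This is a sketch; the model path and port are examples to adjust to your setup:

```bash
# Start the FastChat controller (one process per terminal).
python3 -m fastchat.serve.controller

# Start a model worker; replace the model path with the model you want to serve.
python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5

# Start the OpenAI-compatible API server. Binding to 0.0.0.0 (rather than
# localhost) makes it reachable from Docker containers on the same machine.
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
```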
Set RECAP to use FastChat Server
On the LLM page in the Admin Panel, add a Custom LLM Provider with the following settings.
Note that the Provider Name is OpenAI, since FastChat provides an OpenAI-compatible API.
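For example, assuming the FastChat server sketched above is running on port 8000 (an assumed setup; adjust the host, port, and model to match yours), the settings might look like:

- Provider Name: openai
- API Base: http://host.docker.internal:8000/v1
- API Key: any placeholder value, if the form requires one, since FastChat does not validate the key by default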
Hints:
- To point to other services running locally outside of RECAP (e.g. accessible at http://localhost on the host machine), use http://host.docker.internal instead, since http://localhost inside a Docker container refers to the container itself.
- Don't forget to include the /v1 suffix in the API Base.
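A quick way to sanity-check the API Base before saving is to hit the endpoint directly. This sketch assumes the example server and model from above; adjust both to your setup:

```bash
# Query the OpenAI-compatible chat completions endpoint directly.
# Run this from the host; from inside a RECAP container, swap
# localhost for host.docker.internal.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.5",
    "messages": [{"role": "user", "content": "Say hello."}]
  }'
```

A valid JSON chat completion in the response confirms the server is up and the base URL (including /v1) is correct.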