lit-ollama
A drop-in replacement for Ollama, built on LitServe.
Features
- LitGPT model support: Load and serve any LitGPT-compatible model using the standard Ollama interface
- Ollama-compatible API: Full compatibility with the Ollama API specification, allowing you to use any Ollama client without modifications
- LitServe powered: Built on LitServe for high-performance model serving with auto-batching and GPU acceleration
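Because the server advertises full Ollama API compatibility, requests should follow the shapes defined in the public Ollama API spec. As a sketch, here is what a non-streaming `/api/generate` request body looks like under that assumption (the payload fields come from Ollama's spec, not from this project's code):

```python
import json

def build_generate_request(model: str, prompt: str) -> str:
    """Build an Ollama-style /api/generate request body (JSON string)."""
    payload = {
        "model": model,        # model name as the server knows it
        "prompt": prompt,      # the text to complete
        "stream": False,       # request one JSON response, not a stream
    }
    return json.dumps(payload)

body = build_generate_request("meta-llama/Llama-3.2-1B-Instruct", "Hello!")
print(body)
```

Any existing Ollama client builds this payload for you; the sketch only illustrates what crosses the wire.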
Installation
With pip:
python -m pip install lit-ollama
With uv:
uv add lit-ollama
How to use it
Run it like any other LitServe server:
import litserve as ls

from lit_ollama.server.api import LitOllamaAPI

# Replace "mock" with a real model name to serve an actual model
api = LitOllamaAPI("mock")
server = ls.LitServer(
    api,
    accelerator="auto",  # pick CPU/GPU automatically
    devices="auto",
    callbacks=None,
    middlewares=None,
)
server.run()
Start the server with a specific model:
python server.py --model "meta-llama/Llama-3.2-1B-Instruct"
Test the server with the bundled client:
python client.py
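If you prefer to roll your own client, a minimal stand-in might look like the sketch below. It assumes the server listens on `localhost:8000` (LitServe's default port) and exposes Ollama's `/api/chat` endpoint; both the address and the response shape are assumptions, not taken from this project's code:

```python
import json
import urllib.request

def make_chat_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an Ollama-style /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # single JSON response instead of a stream
    }
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = make_chat_request("http://localhost:8000", "mock", "Say hi")
# To actually send it (requires a running server):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

Because the API is Ollama-compatible, an off-the-shelf Ollama client pointed at the same address should work just as well.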
Docs
uv run mkdocs build -f ./mkdocs.yml -d ./_build/
Update template
copier update --trust -A --vcs-ref=HEAD