Ollama now supports multimodal models with v0.1.15! This allows models to answer your prompts using what they see in an image. To try it, install Ollama, open a terminal, and type `ollama run llava`. Then type your prompt and drag and drop an image into the terminal.
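For example, a session might look like this (the image path and question here are just placeholders; dragging an image onto the terminal inserts its path into the prompt for you):

```
ollama run llava
>>> What is in this picture? ./photo.jpg
```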
There is a new `images` parameter for both Ollama's Generate API and Chat API. The `images` parameter takes a list of base64-encoded PNG or JPEG images. Ollama supports image sizes up to 100MB.
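Here is a minimal sketch of how the `images` parameter fits into each request, assuming a local Ollama server on the default port (11434) and a hypothetical image file named `photo.jpg`. The same base64 string goes at the top level of a Generate request, and inside an individual message for Chat:

```python
import base64
import json
import urllib.request

# Read the image and base64-encode it, as the API expects.
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

def post(url, body):
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Generate API: `images` is a top-level field alongside the prompt.
result = post("http://localhost:11434/api/generate", {
    "model": "llava",
    "prompt": "What is in this picture?",
    "images": [image_b64],
    "stream": False,
})
print(result["response"])

# Chat API: `images` attaches to an individual message instead.
result = post("http://localhost:11434/api/chat", {
    "model": "llava",
    "messages": [
        {"role": "user", "content": "What is in this picture?", "images": [image_b64]}
    ],
    "stream": False,
})
print(result["message"]["content"])
```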
Examples here: https://github.com/jmorganca/ollama/releases/tag/v0.1.15
In the background, Ollama will download the LLaVA 7B model and run it. Want another parameter size? Try the 13B model using `ollama run llava:13b`. To see more about the LLaVA model, visit https://ollama.ai/library/llava.
More multimodal models are becoming available:
BakLLaVA 7B: `ollama run bakllava`
Learn more: https://ollama.ai/library/bakllava
Thank you to the @LLaVAAI team!