Releases: ollama/ollama

v0.1.3

13 Oct 23:59

What's Changed

  • Improved various API error messages to be easier to read
  • Improved GPU allocation for older GPUs to fix "out of memory" errors
  • Fixed issue where setting num_gpu to 0 would result in an error
  • Ollama for macOS will now always update to the latest version, even if an earlier update was downloaded beforehand

Full Changelog: v0.1.2...v0.1.3

v0.1.2

12 Oct 18:46

New Models

  • Zephyr: a fine-tuned 7B version of Mistral that was trained on a mix of publicly available, synthetic datasets and performs as well as Llama 2 70B on many benchmarks
  • Mistral OpenOrca: a 7-billion-parameter model fine-tuned on top of the Mistral 7B model using the OpenOrca dataset
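
Both models can be pulled and run by name (assuming the zephyr and mistral-openorca tags in the Ollama library):

ollama run zephyr
ollama run mistral-openorca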

Examples

Ollama's examples have been updated with several new additions.

What's Changed

  • Download speeds for ollama pull have been significantly improved, from 60MB/s to over 1.5GB/s (25x faster) on fast network connections
  • The API now supports non-streaming responses. Set the stream parameter to false and endpoints will return data in a single response:
    curl -X POST http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "Why is the sky blue?",
      "stream": false
    }'
    
  • Ollama can now be used with HTTP proxies (using HTTP_PROXY=http://<proxy>) and HTTPS proxies (using HTTPS_PROXY=https://<proxy>); a sketch follows this list
  • Fixed a "token too long" error when generating a response
  • q8_0, q5_0, q5_1, and f32 models will now use GPU on Linux
  • Revised the help text in ollama run to be easier to read
  • Renamed the runner subprocess to ollama-runner
  • ollama create will now show feedback when reading model metadata
  • Fixed a "not found" error that would show when running ollama pull
  • Improved video memory allocation on Linux to fix errors when using Nvidia GPUs
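
A minimal sketch of running the server behind an HTTPS proxy (proxy.example.com:8443 is a placeholder):

HTTPS_PROXY=https://proxy.example.com:8443 ollama serve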

Full Changelog: v0.1.1...v0.1.2

v0.1.1

02 Oct 21:47

What's Changed

  • Cancellable responses: Ctrl+C will now cancel responses when running ollama run
  • Exit ollama run sessions with Ctrl+D or /bye
  • Improved error messages for unknown /slash commands when using ollama run
  • Various improvements to the Linux install script for distro compatibility and to fix bugs
  • Fixed install issues on Fedora
  • Fixed issue where specifying the library/ prefix in ollama run would cause an error
  • Fixed highlight color for placeholder text in ollama run
  • Fixed issue where auto updater would not restart when clicking "Restart to Update"
  • Ollama will now clean up subdirectories in ~/.ollama/models
  • Ollama will now show a default message when ollama show would otherwise return an empty result

Full Changelog: v0.1.0...v0.1.1

v0.1.0

23 Sep 13:22

Ollama for Linux

Ollama for Linux is now available, with GPU acceleration enabled out-of-the-box for Nvidia GPUs.

💯 Ollama will run on cloud servers with multiple GPUs attached
🤖 Ollama will run on WSL 2 with GPU support
😍 Ollama maximizes the number of GPU layers to load to increase performance without crashing
🤩 Ollama supports everything from CPU-only machines and small hobby gaming GPUs up to powerful workstation graphics cards like the H100

Download

curl https://ollama.ai/install.sh | sh

Manual install steps are also available.

Changelog

  • Ollama will now automatically offload as much of the running model as is supported by your GPU for maximum performance without any crashes
  • Fixed an issue where characters would be erased when running ollama run
  • Added a new community project by @TwanLuttik in #574

Full Changelog: v0.0.21...v0.1.0

v0.0.21

23 Sep 04:01

What's Changed

  • Fixed an issue where empty responses would be returned if a template was provided in the API request but not a prompt
  • Fixed an issue where the "Send a message" placeholder would show when writing multi line prompts with ollama run
  • Fixed an issue where multi-line prompts in ollama run wouldn't be submitted when pressing Return

Full Changelog: v0.0.20...v0.0.21

v0.0.20

22 Sep 22:04

What's Changed

  • ollama run has a new & improved experience:
    • Models will now be loaded immediately, making even the first prompt much faster
    • Added hint text
    • Ollama will now fit words in the available width of the terminal for better readability
  • OLLAMA_HOST now supports IPv6 hostnames (a sketch follows this list)
  • ollama run will now automatically pull models if they don't exist when using a remote instance of Ollama
  • Sending an empty prompt field to /api/generate will now load the model so the next request is fast (an example follows this list)
  • Fixed an issue where ollama create would not correctly detect falcon model sizes
  • Added a simple Python client to access Ollama in api/client.py by @pdevine
  • Improved progress reporting for ollama pull and ollama push
  • Fixed an issue where ollama create would add empty layers
  • Fixed an issue when running Ollama on Windows (compiled from source)
  • Fixed an error when running ollama push
  • Made the community projects list more readable, by @jamesbraza
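
A sketch of binding the server to an IPv6 loopback address (the bracketed address and port are illustrative, assuming host:port syntax):

OLLAMA_HOST=[::1]:11434 ollama serve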
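
And a sketch of preloading a model with an empty prompt so the next request is fast (llama2 as an example model):

curl -X POST http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": ""}'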

Full Changelog: v0.0.19...v0.0.20

v0.0.19

11 Sep 20:39

What's Changed

  • Updated Docker image for Ollama: docker pull ollama/ollama
  • Ability to import and use models in the GGUF file format (a sketch follows this list)
  • Fixed an issue where ollama push would error on long-running uploads
  • Ollama will now automatically clean up unused data locally
  • Improved build instructions by @apepper
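
A minimal sketch of importing a GGUF file via a Modelfile (my-model.gguf and my-model are placeholder names). Create a Modelfile pointing at the local file:

FROM ./my-model.gguf

Then create and run the model:

ollama create my-model -f Modelfile
ollama run my-model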

Full Changelog: v0.0.18...v0.0.19

v0.0.18

06 Sep 20:54

What's Changed

  • New ollama show command for viewing details about a model:
    • See a system prompt for a model: ollama show --system orca-mini
    • View a model's parameters: ollama show --parameters codellama
    • View a model's default prompt template: ollama show --template llama2
    • View a Modelfile for a model: ollama show --modelfile llama2
  • Minor improvements to model loading and generation time
  • Fixed an issue where large prompts would cause codellama and similar models to show an error
  • Fixed compatibility issues with macOS 11 Big Sur
  • Fixed an issue where characters would be escaped in prompts causing escaped characters like &amp; in the output
  • Fixed several issues with building from source on Windows and Linux
  • New sentiments example by @technovangelist
  • Fixed num_keep parameter not working properly
  • Fixed issue where Modelfile parameters would not be honored at runtime
  • Added missing options params to the embeddings docs by @herrjemand
  • Fixed issue where ollama list would error when there were no models to show

When building from source, Ollama will require running go generate to generate dependencies:

git clone https://github.com/jmorganca/ollama
cd ollama
go generate ./...
go build .

Note: cmake is required to build dependencies. On macOS it can be installed with brew install cmake, and on other platforms via the installer or well-known package managers.

Full Changelog: v0.0.17...v0.0.18

v0.0.17

30 Aug 18:21

What's Changed

  • Multiple models can be removed together: ollama rm mario:latest orca-mini:3b
  • ollama list will now show a unique ID for each model based on its contents
  • Fixed a bug where a prompt wasn't set by default, causing an error when running a model created with ollama create
  • Fixed a crash when running 34B parameter models on hardware with insufficient memory to run them
  • Fixed issue where non-quantized f16 models would not run
  • Improved network performance of ollama push
  • Fixed issue where stop sequences (such as \n) wouldn't be honored

New Contributors

  • @sqs made their first contribution in #415

Full Changelog: v0.0.16...v0.0.17

v0.0.16

26 Aug 01:49

What's Changed

  • Ollama version can be checked by running ollama -v or ollama --version
  • Support for 34B models such as codellama
  • Model names or paths with https:// in front of them will now work when running ollama run

Full Changelog: v0.0.15...v0.0.16