Ollama

Ollama is a platform that enables users to interact with Large Language Models (LLMs) via an Application Programming Interface (API). It is a powerful tool for generating text, answering questions, and performing complex natural language processing tasks. Ollama provides access to a variety of open LLMs, allowing developers and researchers to integrate sophisticated language understanding and generation capabilities into their applications, such as chatbots, content creation tools, and research projects. With its easy-to-use API, Ollama streamlines the process of leveraging advanced Artificial Intelligence (AI) models, making it accessible to a wide range of users in different fields. For detailed information, please see the official Ollama documentation.

We provide an Ollama wrapper script that allows you to start an Ollama server on Minerva’s compute node and access it from your local machine through an API endpoint. This setup enables computationally expensive LLM tasks to be performed on Minerva, while you can easily access the results from your local machine. Below are step-by-step usage instructions.
 

Usage

The Ollama wrapper script is available on the login nodes at the following location:

/usr/local/bin/minerva-ollama-web.sh

To start the script, run minerva-ollama-web.sh on a login node.

Example:

sh minerva-ollama-web.sh 
[INFO] Image not specified, check if previously used 
[INFO] Found previously used image /hpc/users/hasans10/minerva_jobs/ollama_jobs/ollama3.sif. Using it. 
[INFO] Project is not specified, or is acc_null, using 1st avail project. 
[INFO] Project to use is acc_hpcstaff 
[INFO] Parameters used are:  
[INFO] -n 4 
[INFO] -M 3000 
[INFO] -W 6:00 
[INFO] -P acc_hpcstaff 
[INFO] -J ollama 
[INFO] -q gpu 
[INFO] -R v100 
[INFO] -g 1
[INFO] -o /hpc/users/hasans10/minerva_jobs/ollama_jobs 
[INFO] -i /hpc/users/hasans10/minerva_jobs/ollama_jobs/ollama3.sif 
[INFO] Submitting Ollama job... 
Job <136273998> is submitted to queue . 
[INFO] Waiting for the job to start... 
Job <136273998> is not found 
Ollama is running. 
[INFO] Ollama is started on compute node 4*lg03a12, port 11434 
Access Ollama using the following URL:  http://10.95.46.104:53998 

By default, this script starts an Ollama server on one of Minerva’s GPU nodes. If the script runs successfully, it prints a URL such as http://10.95.46.104:53998. If you paste the URL into a web browser, you will see the message ‘Ollama is running’.

This indicates that the service is up and running, and you can interact with it. The generated URL will also serve as your API endpoint.
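If you want to use the endpoint programmatically, you can parse it out of the script's log. A minimal sketch, assuming the log format shown above (the extract_endpoint helper is hypothetical, not part of the wrapper script):

```python
import re

# Hypothetical helper: pull the API endpoint out of minerva-ollama-web.sh's
# console output so it can be passed to an Ollama client programmatically.
def extract_endpoint(script_output):
    match = re.search(r'Access Ollama using the following URL:\s*(http://\S+)',
                      script_output)
    return match.group(1) if match else None

log_line = "Access Ollama using the following URL:  http://10.95.46.104:53998"
print(extract_endpoint(log_line))  # -> http://10.95.46.104:53998
```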

Ollama is configured to run on a GPU node by default, as CPU execution is slow. The script is flexible, allowing you to change various parameters. To see the usage instructions, please run the following command:

minerva-ollama-web.sh --help
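For example, the parameters printed in the log above (cores, memory, wall time, queue, GPU type) can be overridden on the command line. The sketch below uses the flag names from the example output, with illustrative resource values; consult --help for the authoritative list:

```shell
# Illustrative values only; flag names taken from the script's parameter log.
# Requests 8 cores, 8000 MB memory, a 12-hour wall time, and one V100 GPU.
sh minerva-ollama-web.sh -n 8 -M 8000 -W 12:00 -q gpu -R v100 -g 1
```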

Install Ollama Python Library

You can access Ollama through the Python library. If you don’t have the Ollama Python library installed, use the following commands to install it on Minerva:

module load python/3.10.14
pip install --user ollama==0.3.1

After starting the Ollama server on Minerva, you can also access it from your local machine. To install the Ollama Python library on your local machine, use the following command:

pip install ollama

For more details, visit the Ollama Python library GitHub page.

Download Model and Chat

You can download a model using the URL received after job submission. For example:

from ollama import Client 
ollama_client = Client(host='http://10.95.46.104:53998') 
ollama_client.pull('tinyllama') 

You can then chat with a model as follows:

stream = ollama_client.chat( 
    model='tinyllama', 
    messages=[{'role': 'user', 'content': 'What are the main causes of cardiovascular disease?'}], 
    stream=True, 
) 

for chunk in stream: 
    print(chunk['message']['content'], end='', flush=True) 

Sample Output:

There are several causes of cardiovascular disease, which includes: 
1. Hypercholesterolemia or high cholesterol: This is a major cause of cardiovascular disease (CVD). High levels of bad (Low-density lipoprotein) ("LDL") cholesterol can lead to thickening and plaque buildup in the blood vessels, which can eventually block or rupture them. 
2. Hypertriglyceridemia: This is another major cause of CVD, especially when triglycerides (fatty substances) are present in large amounts. 
3. Hypertension (high blood pressure): High blood pressure can damage the arteries and reduce blood flow to the heart and other organs. 
4. Diabetes: This condition results in high levels of glucose (sugar) in the bloodstream, which can cause plaque buildup in the arteries. 
5. Smoking and alcohol use: Both smoking and excessive alcohol consumption can damage the lining of blood vessels, leading to inflammation and increased risk of CVD. 
6. Physical inactivity: A sedentary lifestyle is associated with an increased risk of CVD due to reduced blood flow and oxygen supply to organs and muscles. 
7. Obesity: Excess body fat can cause a buildup of plaque in the arteries, leading to CVD. 
8. Atherosclerosis: This is the thickening and hardening of artery walls due to inflammation and plaque accumulation. 

Change Ollama Work Directory

By default, Ollama stores models in your HOME directory. Some Ollama models are quite large and may exceed the 20GB size limit of your HOME directory. To avoid this issue, you can use your project directory (or another directory with sufficient space) as the Ollama work directory. For example, you can change the work directory as shown below.

sh minerva-ollama-web.sh -o /sc/arion/projects/ 

or

sh minerva-ollama-web.sh --ollamaworkdir /sc/arion/projects/
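Since large models can easily exceed a directory quota, it can help to verify free space before choosing a work directory. A minimal sketch (the path below is a placeholder; substitute your own project directory):

```python
import shutil

# Check free space in a candidate Ollama work directory before passing it
# to the wrapper script with -o / --ollamaworkdir.
workdir = "/"  # placeholder; e.g. your project directory under /sc/arion/projects/
free_gb = shutil.disk_usage(workdir).free / 1e9
print(f"{free_gb:.1f} GB free in {workdir}")
```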