Gemma-3-27B¶
Information
Gemma-3-27B is a powerful language model that requires significant computational resources for local deployment via the Ollama platform. The model has high hardware requirements, particularly for GPU memory. Deployment is based on Ubuntu 22.04 with modern NVIDIA graphics accelerators. Integration with Open WebUI provides a convenient interface for interacting with the model while maintaining full control over data and request processing.
Main Features of Gemma-3-27B¶
- High-performance architecture: The model has 27 billion parameters and is optimized for handling complex tasks with high accuracy using modern technologies;
- Integration with Open Web UI: Provides a modern web interface for convenient interaction with the model through port 8080, ensuring full control over data and request processing;
- Scalability: Supports multi-card configurations and load distribution across multiple GPUs for optimal performance;
- Security and control: Full local deployment ensures data confidentiality, while OLLAMA_HOST and OLLAMA_ORIGINS settings guarantee network security;
- Performance: Uses LLAMA_FLASH_ATTENTION technology to accelerate request processing and optimize model operation;
- Reliability: An integrated system of automatic restarts for containers and services ensures stable operation.

Examples of use:
- Customer support: Automating responses to user questions;
- Education: Creating educational materials, assisting in solving tasks;
- Marketing: Generating advertising texts, analyzing reviews;
- Software development: Creating and documenting code.
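The network-security settings mentioned above can be sketched as follows. This is a minimal illustration, not the appliance's actual startup script: the values are assumptions based on Ollama's defaults (API on port 11434), and note that recent Ollama builds name the flash-attention variable OLLAMA_FLASH_ATTENTION, while this article refers to it as LLAMA_FLASH_ATTENTION.

```python
# Sketch: environment variables commonly used to configure an Ollama server.
# All values here are illustrative assumptions, not the deployed configuration.
import os

ollama_env = {
    "OLLAMA_HOST": "0.0.0.0:11434",              # address the Ollama API listens on
    "OLLAMA_ORIGINS": "http://localhost:8080",   # allowed origin for the Open WebUI frontend
    "OLLAMA_FLASH_ATTENTION": "1",               # enable flash attention for faster inference
}

# Merge with the current environment, e.g. to launch `ollama serve` via subprocess.
env = {**os.environ, **ollama_env}
# subprocess.run(["ollama", "serve"], env=env)   # requires Ollama to be installed
print(env["OLLAMA_HOST"])
```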
Deployment Features¶
| ID | Compatible OS | VM | BM | VGPU | GPU | Min CPU (cores) | Min RAM (GB) | Min HDD/SSD (GB) | Active |
|---|---|---|---|---|---|---|---|---|---|
| 250 | Ubuntu 22.04 | - | - | + | + | 4 | 32 | - | Yes |
- Installation time: 15-30 minutes together with the OS;
- The Ollama server loads and runs LLM in memory;
- Open WebUI is deployed as a web application connected to the Ollama server;
- Users interact with LLM through the Open WebUI web interface, sending requests and receiving responses;
- All computations and data processing occur locally on the server. Administrators can configure LLM for specific tasks using OpenWebUI tools.
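The request flow described above (Open WebUI forwarding user prompts to the Ollama server) can be sketched from a client's perspective. The model tag `gemma3:27b` and port 11434 (Ollama's default API port) are assumptions; the article itself only specifies Open WebUI on port 8080.

```python
# Minimal sketch of a request to the local Ollama API that Open WebUI talks to.
# Model tag and port are assumptions based on Ollama defaults.
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    """Payload shape for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_generate_request("gemma3:27b", "Summarize this customer review: ...")

# Uncomment on the deployed server; requires Ollama to be running locally.
# req = urllib.request.Request(
#     "http://127.0.0.1:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["response"])
print(json.dumps(payload))
```

Since all computation happens locally, no prompt or response ever leaves the server, which is what gives the administrator full control over data processing.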
System Requirements and Technical Specifications¶
- Graphics accelerator with CUDA support (any one of the following options; more video memory is better):
- 2x NVIDIA A4000 (16/24 GB video memory each)
- 2x NVIDIA A5000 (24 GB video memory each)
- 1x NVIDIA A6000 (48 GB video memory)
- 1x NVIDIA RTX 5090 (32 GB video memory)
- Disk space: SSD of sufficient size for the system and model;
- Software: NVIDIA drivers and CUDA;
- Video memory consumption: 28 GB with a 2K token context;
- System monitoring: Automatic checks of drivers and containers.
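The ~28 GB figure above can be sanity-checked with back-of-the-envelope arithmetic. The bytes-per-parameter value depends on quantization and is an assumption; the article's measured 28 GB at a 2K-token context remains the authoritative number.

```python
# Rough VRAM estimate for a 27B-parameter model: weights plus a fixed
# allowance for KV cache and runtime buffers. The overhead figure is a
# simplifying assumption, not a measurement.
def estimate_vram_gb(params_billion: float, bytes_per_param: float,
                     overhead_gb: float = 2.0) -> float:
    """Model weights + rough allowance for KV cache and buffers, in GB."""
    return params_billion * bytes_per_param + overhead_gb

# 8-bit quantization: 27e9 params * 1 byte ~= 27 GB of weights
print(estimate_vram_gb(27, 1.0))  # -> 29.0, in line with the observed ~28 GB
```

This is also why the article recommends multi-GPU configurations or a single 48 GB card: a single 16 GB accelerator cannot hold the model at this precision, so the weights must be split across cards.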
Getting Started After Deploying Gemma-3-27B¶
After payment, an email will be sent to the registered address indicating that the server is ready for work. It will include the VPS IP address, the login and password for accessing the server, and a link to the Open WebUI panel. Clients of our company manage their equipment through Invapi, the server management panel and API.
- Authentication data for accessing the server's operating system (e.g., via SSH) will be sent to you in the received email.
- Link for accessing the Ollama control panel with the Open WebUI web interface: located in the webpanel tag on the Info >> Tags tab of the Invapi control panel. The exact link, in the form https://gemma<Server_ID_from_Invapi>.hostkey.in, is sent in the email when the server is handed over.
After clicking the link from the webpanel tag, a Get started with Open WebUI login window will open, where you need to enter a name, email, and password for your chatbot's admin account, then press the Create Admin Account button.
Attention
After registering the first user, the system automatically assigns them an administrator role. To ensure security and control over the registration process, all subsequent registration requests must be approved in OpenWebUI from the administrator account.
Note
Detailed information about features of working with Ollama control panel with Open WebUI can be found in the article AI Chatbot on Your Own Server.
Note
For optimal performance, it is recommended to use a GPU with more than the minimum required 16 GB of video memory. This provides a buffer for processing large contexts and parallel requests. Detailed information about Ollama's main settings and Open WebUI can be found in Ollama developers' documentation and Open WebUI developers' documentation.
Order a Server with Gemma-3-27B Using API¶
To install this software using the API, follow these instructions.
Some of the content on this page was created or translated using AI.