Gemma-3-27B¶
Information
Gemma-3-27B is a powerful language model that requires significant computational resources for local deployment via the Ollama platform. The model has high hardware requirements, particularly for GPU memory. Deployment is based on Ubuntu 22.04 with modern NVIDIA graphics accelerators. Integration with Open WebUI provides a convenient interface for interacting with the model while maintaining full control over data and request processing.
Main Features of Gemma-3-27B¶
- High-performance architecture: The model has 27 billion parameters and is optimized for handling complex tasks with high accuracy using modern technologies;
- Integration with Open Web UI: Provides a modern web interface for convenient interaction with the model through port 8080, ensuring full control over data and request processing;
- Scalability: Supports multi-card configurations and load distribution across multiple GPUs for optimal performance;
- Security and control: Full local deployment ensures data confidentiality, while OLLAMA_HOST and OLLAMA_ORIGINS settings guarantee network security;
- Performance: Uses LLAMA_FLASH_ATTENTION technology to accelerate request processing and optimize model operation;
- Reliability: An integrated system of automatic restarts for containers and services ensures stable operation.

Examples of use:
- Customer support: Automating responses to user questions;
- Education: Creating educational materials, assisting in solving tasks;
- Marketing: Generating advertising texts, analyzing reviews;
- Software development: Creating and documenting code.
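The network-security settings mentioned above can be sketched as follows. This is a minimal illustration, not the appliance's actual startup script: the values are assumptions based on Ollama's defaults (API on port 11434), and note that recent Ollama builds name the flash-attention variable OLLAMA_FLASH_ATTENTION, while this article refers to it as LLAMA_FLASH_ATTENTION.

```python
# Sketch: environment variables commonly used to configure an Ollama server.
# All values here are illustrative assumptions, not the deployed configuration.
import os

ollama_env = {
    "OLLAMA_HOST": "0.0.0.0:11434",              # address the Ollama API listens on
    "OLLAMA_ORIGINS": "http://localhost:8080",   # allowed origin for the Open WebUI frontend
    "OLLAMA_FLASH_ATTENTION": "1",               # enable flash attention for faster inference
}

# Merge with the current environment, e.g. to launch `ollama serve` via subprocess.
env = {**os.environ, **ollama_env}
# subprocess.run(["ollama", "serve"], env=env)   # requires Ollama to be installed
print(env["OLLAMA_HOST"])
```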
Deployment Features¶
| ID | Compatible OS | VM | BM | VGPU | GPU | Min CPU (cores) | Min RAM (GB) | Min HDD/SSD (GB) | Active |
|---|---|---|---|---|---|---|---|---|---|
| 250 | Ubuntu 22.04 | - | - | + | + | 4 | 32 | - | Yes |
- Installation time: 15-30 minutes together with the OS;
- The Ollama server loads and runs LLM in memory;
- Open WebUI is deployed as a web application connected to the Ollama server;
- Users interact with LLM through the Open WebUI web interface, sending requests and receiving responses;
- All computations and data processing occur locally on the server. Administrators can configure LLM for specific tasks using OpenWebUI tools.
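The request flow described above (Open WebUI forwarding user prompts to the Ollama server) can be sketched from a client's perspective. The model tag `gemma3:27b` and port 11434 (Ollama's default API port) are assumptions; the article itself only specifies Open WebUI on port 8080.

```python
# Minimal sketch of a request to the local Ollama API that Open WebUI talks to.
# Model tag and port are assumptions based on Ollama defaults.
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    """Payload shape for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_generate_request("gemma3:27b", "Summarize this customer review: ...")

# Uncomment on the deployed server; requires Ollama to be running locally.
# req = urllib.request.Request(
#     "http://127.0.0.1:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["response"])
print(json.dumps(payload))
```

Since all computation happens locally, no prompt or response ever leaves the server, which is what gives the administrator full control over data processing.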
System Requirements and Technical Specifications¶
- Graphics accelerator with CUDA support (any one of the following options; more video memory is better):
- 2x NVIDIA A4000 (16/24 GB video memory each)
- 2x NVIDIA A5000 (24 GB video memory each)
- 1x NVIDIA A6000 (48 GB video memory)
- 1x NVIDIA RTX 5090 (32 GB video memory)
- Disk space: SSD of sufficient size for the system and model;
- Software: NVIDIA drivers and CUDA;
- Video memory consumption: 28 GB with a 2K token context;
- System monitoring: Automatic checks of drivers and containers.
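The ~28 GB figure above can be sanity-checked with back-of-the-envelope arithmetic. The bytes-per-parameter value depends on quantization and is an assumption; the article's measured 28 GB at a 2K-token context remains the authoritative number.

```python
# Rough VRAM estimate for a 27B-parameter model: weights plus a fixed
# allowance for KV cache and runtime buffers. The overhead figure is a
# simplifying assumption, not a measurement.
def estimate_vram_gb(params_billion: float, bytes_per_param: float,
                     overhead_gb: float = 2.0) -> float:
    """Model weights + rough allowance for KV cache and buffers, in GB."""
    return params_billion * bytes_per_param + overhead_gb

# 8-bit quantization: 27e9 params * 1 byte ~= 27 GB of weights
print(estimate_vram_gb(27, 1.0))  # -> 29.0, in line with the observed ~28 GB
```

This is also why the article recommends multi-GPU configurations or a single 48 GB card: a single 16 GB accelerator cannot hold the model at this precision, so the weights must be split across cards.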
Getting Started After Deploying Gemma-3-27B¶
After payment, an email will be sent to the registered address indicating that the server is ready for work. It will include the VPS IP address, the login and password for accessing the server, and a link to the Open WebUI panel. Clients of our company manage their equipment through Invapi, the server management panel and API.
- Authentication data for accessing the server's operating system (e.g., via SSH) will be sent to you in the received email.
- Link for accessing the Ollama control panel with the Open WebUI web interface: located in the webpanel tag on the Info >> Tags tab of the Invapi control panel. The exact link, in the form https://gemma<Server_ID_from_Invapi>.hostkey.in, is sent in the email when the server is handed over.
After clicking the link from the webpanel tag, a Get started with Open WebUI login window will open, where you need to enter a name, email, and password for your chatbot's admin account, then press the Create Admin Account button.
Attention
After registering the first user, the system automatically assigns them an administrator role. To ensure security and control over the registration process, all subsequent registration requests must be approved in OpenWebUI from the administrator account.
Note
Detailed information about features of working with Ollama control panel with Open WebUI can be found in the article AI Chatbot on Your Own Server.
Note
For optimal performance, it is recommended to use a GPU with more than the minimum required 16 GB of video memory. This provides a buffer for processing large contexts and parallel requests. Detailed information about Ollama's main settings and Open WebUI can be found in Ollama developers' documentation and Open WebUI developers' documentation.
Order a Server with Gemma-3-27B Using API¶
To install this software using the API, follow these instructions.
Some of the content on this page was created or translated using AI.