How did we create our LLM benchmark for GPU servers using Ollama? We wrote a benchmarking script, tested it with DeepSeek R1, and configured the context sizes it runs with. Along the way we identified some patterns and compared the performance of different GPUs; everything is now available on GitHub.
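To make the core measurement concrete, here is a minimal Python sketch (not our actual script) that derives prompt-processing and generation throughput from the nanosecond timing fields Ollama's `/api/generate` endpoint returns. The endpoint URL, the `deepseek-r1:7b` model tag, and the `benchmark` helper name are assumptions for the example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

def benchmark(model: str, prompt: str, num_ctx: int = 4096) -> dict:
    """Run one non-streaming generation and compute throughput from the
    timing fields in Ollama's response (all durations are in nanoseconds)."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},  # context window size under test
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # Prompt-eval fields can be absent when the prompt is served from cache,
    # so guard against division by zero.
    pe_count = data.get("prompt_eval_count", 0)
    pe_dur = data.get("prompt_eval_duration", 0)
    return {
        "prompt_tok_s": pe_count / pe_dur * 1e9 if pe_dur else None,
        "gen_tok_s": data["eval_count"] / data["eval_duration"] * 1e9,
        "total_s": data["total_duration"] / 1e9,
    }

if __name__ == "__main__":
    # "deepseek-r1:7b" is illustrative; substitute whichever DeepSeek R1
    # variant fits the GPU being benchmarked.
    print(benchmark("deepseek-r1:7b", "Explain CUDA streams in one paragraph."))
```

Running the same call across several `num_ctx` values and averaging a few repetitions is enough to surface the kinds of per-GPU differences discussed below.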