
How Local LLMs Performed Offline: 10-Hour Flight Results
LLM, AI Agents & AI Infrastructure Specialist

A 10-hour flight experiment tested the offline performance of Gemma 4 and Qwen 4.6 on a MacBook Pro M5 Max. While both models excelled in coding tasks, they struggled with complex requests and consumed significant energy, cutting battery life from 12 to 4 hours. The test highlights both the potential and limitations of running large language models locally.
Large Language Models (LLMs) are indispensable tools in areas like coding, customer support, and content creation. However, their dependency on cloud-based systems and constant internet connectivity limits their usability in remote or offline settings such as long-haul flights or rural areas. This raises a critical question: can LLMs perform effectively offline on high-end consumer hardware?
To address this, an experiment ran two leading-edge LLMs, Google’s Gemma 4 and Qwen 4.6, on a MacBook Pro M5 Max during a 10-hour international flight. The findings shed light on their performance, energy efficiency, and practicality in offline environments.
The test environment comprised:
The laptop operated completely offline. Tasks included:
The experiment underscores the need for advancements in both hardware and model optimization to make local LLMs a viable alternative to cloud-based solutions:
While still in their infancy for offline use, local LLMs like Gemma 4 and Qwen 4.6 demonstrate the potential to shift AI from the cloud to personal devices. However, significant work is needed to address issues like power consumption and computational demands. For developers and businesses, these advancements could unlock new opportunities in data privacy, cost savings, and offline productivity.
Yes, large language models like Gemma 4 and Qwen 4.6 can run offline on high-end laptops with powerful GPUs and sufficient memory, but they consume significant battery power and are limited in handling complex tasks.
Local LLMs provide offline functionality and improved data privacy by processing data directly on the device without sending it to external servers.
The main challenges include high energy consumption, reduced battery life, and limited ability to handle complex computational tasks without cloud support.
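To put the reported battery drop (from 12 hours to 4 hours) in perspective, the short sketch below works out the implied change in average power draw. The battery capacity used here is a hypothetical round number, not a figure from the experiment. The ratio between the two draws does not depend on it.

```python
def average_draw_watts(battery_wh: float, runtime_hours: float) -> float:
    """Average power draw implied by draining a battery over a given runtime."""
    return battery_wh / runtime_hours


# Assumption: a hypothetical 100 Wh battery, chosen only to make the numbers concrete.
BATTERY_WH = 100.0

baseline_watts = average_draw_watts(BATTERY_WH, 12)  # reported runtime without inference
inference_watts = average_draw_watts(BATTERY_WH, 4)  # reported runtime while running the models
draw_ratio = inference_watts / baseline_watts

print(f"Baseline draw:  {baseline_watts:.1f} W")
print(f"Inference draw: {inference_watts:.1f} W")
print(f"Ratio:          {draw_ratio:.1f}x")
```

Whatever the true capacity, going from 12 hours to 4 hours means the laptop drew roughly three times as much power on average while running the models.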
💡 Pro Tip: Use quantized versions of LLMs to reduce computational load and energy consumption. For instance, 8-bit or 4-bit quantization can significantly improve battery life without major performance losses in most tasks.
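As a rough illustration of why quantization helps, the sketch below estimates the memory needed for model weights alone at different bit widths. The parameter count is a hypothetical placeholder, not the size of Gemma 4 or Qwen 4.6. The estimate ignores the KV cache, activations, and the per-block scale factors that real quantization formats add.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: a hypothetical 30-billion-parameter model, used only for illustration.
HYPOTHETICAL_PARAMS = 30e9

for bits in (16, 8, 4):
    size_gb = weight_memory_gb(HYPOTHETICAL_PARAMS, bits)
    print(f"{bits:>2}-bit weights: {size_gb:.1f} GB")
```

Halving the bit width halves the weight footprint, so less data has to move through memory for each generated token. How much of that turns into battery savings depends on the hardware and runtime, and was not measured here.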