Module LLM is an integrated offline Large Language Model (LLM) inference module designed for terminal devices that require efficient, intelligent interaction. Whether for smart homes, voice assistants, or industrial control, Module LLM delivers a smooth, natural AI experience without relying on the cloud, ensuring privacy and stability. Integrated with the StackFlow framework and the Arduino/UiFlow libraries, smart features can be implemented with just a few lines of code.

Powered by the advanced AX630C SoC, it integrates a 3.2 TOPS high-efficiency NPU with native support for Transformer models, handling complex AI tasks with ease. It is equipped with 4GB of LPDDR4 memory (1GB available to user applications, 3GB dedicated to hardware acceleration) and 32GB of eMMC storage, and supports parallel loading and sequential inference of multiple models for smooth multitasking. The main chip is fabricated on TSMC's 12nm process and draws approximately 1.5W at runtime, making it highly efficient and suitable for long-term operation.

It features a built-in microphone, speaker, TF card slot, USB OTG, and RGB status LED, meeting diverse application needs with support for voice interaction and data transfer. The module offers flexible expansion: the onboard SD card slot supports cold/hot firmware upgrades, and the UART communication interface simplifies connection and debugging, so the module's functionality can be continuously optimized and extended. The USB port supports automatic host/device switching, serving both as a debugging port and as a connection for additional USB devices such as cameras. Users can purchase the LLM debugging kit to add a 100 Mbps Ethernet port and a kernel serial port, allowing the module to be used as an SBC.

The module is compatible with multiple models and comes pre-installed with the Qwen2.5-0.5B language model. It provides KWS (wake word), ASR (speech recognition), LLM (large language model), and TTS (text-to-speech) functionality, which can be called standalone or chained automatically as a pipeline for convenient development. Future support includes the Qwen2.5-1.5B, Llama3.2-1B, and InternVL2-1B models, with hot model updates to keep up with community trends and accommodate a variety of complex AI tasks. Vision recognition capabilities include CLIP and YoloWorld, with DepthAnything, SegmentAnything, and other advanced models planned in future updates to enhance intelligent recognition and analysis.

Plug and play with M5 hosts, Module LLM offers an easy-to-use AI interaction experience. Users can quickly integrate it into existing smart devices without complex configuration, adding smart functionality and improving device intelligence. This product is suitable for offline voice assistants, text-to-speech conversion, smart home control, interactive robots, and more.
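
To make the "few lines of code" claim concrete, below is a minimal sketch of driving the module directly over its UART from an ESP32-based M5 host. It uses only Arduino core serial calls plus a hand-written StackFlow-style JSON request; the pin mapping and the exact JSON fields (`request_id`, `work_id`, `action`, `object`, `data`) are assumptions for illustration. The official M5Module-LLM Arduino library wraps this protocol, and the StackFlow documentation is the authoritative reference for the schema.

```cpp
#include <Arduino.h>

// Hypothetical wiring: an ESP32-based M5 host talking to Module LLM over
// UART. Pin numbers vary by host and port; adjust RX_PIN/TX_PIN to your board.
static const int RX_PIN = 16;
static const int TX_PIN = 17;

void setup() {
  Serial.begin(115200);                               // USB console
  Serial2.begin(115200, SERIAL_8N1, RX_PIN, TX_PIN);  // link to Module LLM

  // Illustrative StackFlow-style setup request: load the pre-installed
  // Qwen2.5-0.5B model. Field names follow the StackFlow JSON convention
  // but are assumptions here; verify them against the protocol docs.
  Serial2.println(
      "{\"request_id\":\"setup_1\",\"work_id\":\"llm\","
      "\"action\":\"setup\",\"object\":\"llm.setup\","
      "\"data\":{\"model\":\"qwen2.5-0.5B\","
      "\"response_format\":\"llm.utf-8.stream\","
      "\"input\":\"llm.utf-8\",\"enoutput\":true}}");
}

void loop() {
  // Forward console input to the module as an inference request...
  if (Serial.available()) {
    String prompt = Serial.readStringUntil('\n');
    prompt.trim();
    String request =
        String("{\"request_id\":\"infer_1\",\"work_id\":\"llm\","
               "\"action\":\"inference\",\"object\":\"llm.utf-8\","
               "\"data\":\"") + prompt + "\"}";
    Serial2.println(request);
  }
  // ...and echo the module's streamed reply back to the console.
  while (Serial2.available()) {
    Serial.write(Serial2.read());
  }
}
```

Opening the USB console at 115200 baud and typing a prompt should stream the model's reply back. In practice, the M5Module-LLM Arduino library is the more convenient path, since it handles request IDs, message framing, and response parsing for you.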