Four Methods for Fine-Tuning Large Models
Technical overview of large-model fine-tuning and PEFT approaches, covering prompt/prefix tuning, P-tuning v2, AdaLoRA, adapter/LoRA methods and standard training workflow.
Technical guide to deploying PP-OCRv5 with Intel OpenVINO on a modular mini-PC: export Paddle models to ONNX, run CPU inference, and enable hardware-accelerated OCR.
Survey of LLM inference stacks covering throughput, latency and cost; explains hardware constraints, KV cache, quantization, paged/grouped attention, and practical optimizations.
Technical overview of AI servers, GPU/CPU architectures, training vs inference, compute demand and market estimates, including H100/A100 performance and the China server market.
MegaScale system design and deployment for efficient, stable LLM training on 10,000+ GPUs: algorithm, communication, network tuning, fault tolerance, MFU gains.
Analysis of semiconductor advances enabling AI scale: 3D integration, CoWoS/HBM packaging, silicon photonics and energy-efficient trends toward trillion-transistor GPUs.
Network requirements for large-model GPU training: RDMA-based bandwidth, ultra-low latency, stability, and automated deployment for scalable multi-GPU clusters.
Technical overview of AI server interconnects and components: DGX H100 architecture, PCIe switches and Retimers, and DDR5 memory interface chip trends.
Overview of AIGC and ChatGPT: technologies, industry chain, applications in text/image/video, e-commerce impact, and prompt engineering best practices.
Guide to converting and deploying the DeepSeek LLM on Rockchip RK3588 using RKLLM-Toolkit: environment setup, cross-compilation, model conversion and board deployment.