Introduction
With broad network coverage, the IoT has grown rapidly, and the number of IoT end devices continues to increase. This article examines next-generation hybrid DSP technology combined with a real-time operating system (RTOS) and explains why this combination is well suited to IoT applications.
DSP Technology Evolution
Digital signal processors (DSPs) are used to convert and process real-world analog signals. This processing is performed by complex signal-processing algorithms. Since their appearance in the 1980s, DSPs have advanced substantially in hardware capability, software development tools, and infrastructure. Early algorithms were programmed into DSPs in assembly language. As the DSP market expanded and algorithms became more complex, architectures evolved and higher-level language compilers were developed.
Embedded DSP cores typically include on-chip memory large enough to hold the full program set needed for specific tasks. Modern DSPs are applied in audio/voice processing, image processing, telecommunications signal processing, sensor data processing, and system control. The current IoT market covers many combinations of these use cases. Industry analysis firm MarketsandMarkets projects that the global IoT technology market will reach $566.4 billion by 2027. Given the scale of the IoT market, next-generation DSP technology is important.
Why DSPs Suit IoT Devices
IoT uses various sensors to collect data and enable communication and connectivity among physical objects. DSPs analyze and process continuously varying signals from sensors. Sensor-hub DSPs, such as CEVA SensPro2, are designed to process and fuse data from multiple sensors and can be used for context-aware neural-network inference. DSPs are engineered to analyze audio/video, temperature, pressure, humidity, and other real-world signals, performing precise, real-time repeated digital computations.
As IoT deployments expand and more sensors are deployed, collected data requires efficient real-time processing. There is a growing trend to process data locally on IoT devices rather than sending it to the cloud. Another trend is increasing use of AI-based algorithms for on-device processing. Neural-network models require high degrees of parallelism for efficient execution. Parallel processing capability is a key advantage DSPs have over general-purpose CPUs. Modern DSP architectures therefore favor wide-vector and SIMD features.
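To illustrate why wide-vector and SIMD hardware helps, here is a minimal sketch of a fixed-point dot product, the core operation of filters and neural-network layers. It is written in portable C so that a vectorizing compiler could map the inner loop onto SIMD multiply-accumulate lanes. The function name dot_q15 and the lane count LANES are assumptions made for this example; actual vector widths depend on the DSP core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lane count for illustration; real widths vary by core. */
#define LANES 8

/* Dot product of two 16-bit fixed-point vectors.
 * LANES independent accumulators remove the loop-carried dependency,
 * so each pass of the inner loop can become one SIMD multiply-accumulate.
 * The result is the raw sum of products (no rescaling is applied). */
int64_t dot_q15(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc[LANES] = {0};
    int64_t sum = 0;
    size_t i = 0;

    assert(a != NULL && b != NULL);

    /* Main body: process LANES elements per iteration. */
    for (; i + LANES <= n; i += LANES) {
        for (size_t l = 0; l < LANES; ++l) {
            acc[l] += (int32_t)a[i + l] * (int32_t)b[i + l];
        }
    }

    /* Reduce the per-lane accumulators to a single value. */
    for (size_t l = 0; l < LANES; ++l) {
        sum += acc[l];
    }

    /* Scalar tail for lengths that are not a multiple of LANES. */
    for (; i < n; ++i) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

On a scalar CPU the inner loop runs one multiply-accumulate at a time; on a SIMD-capable DSP the same loop can, in principle, complete all lanes in a single instruction.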
In short, DSP-based solutions can meet the high-performance computing and low-power requirements of modern IoT devices simultaneously.
Why DSP and RTOS Are a Good Match
DSPs are specialized processors, and RTOSs are specialized operating systems. DSPs focus on fast, reliable processing of real-world data while RTOSs focus on meeting timing and response requirements deterministically. DSPs tend to be compact compared with general CPUs, and RTOSs are typically leaner than conventional operating systems. These characteristics align with IoT device requirements, making DSP plus RTOS a suitable choice for IoT applications.
Historically, embedded devices often used single-purpose 8-bit or 16-bit microcontrollers and could operate without an RTOS. Modern IoT devices are more complex and commonly combine a 32-bit CPU with a DSP plus RTOS to manage control functions and run complex signal processing.
New hybrid DSP architectures that support both DSP functions and controller functions are being adopted quickly in IoT and other embedded devices. These hybrid DSPs may combine VLIW and SIMD execution, single-precision floating point, compact code size, full RTOS support, fast context switching, and dynamic branch prediction, removing the need for a separate processor to run an RTOS.
RTOS for DSP
An RTOS designed for DSPs aims to leverage DSP performance features. Such an RTOS is typically preemptive and priority-based, providing very low interrupt latency. These RTOS distributions include drivers, application programming interfaces, and chip-support libraries tailored to DSP functions. On-chip peripherals can be controlled, including caches, DMA, timers, and interrupt units. This allows IoT application developers to configure the RTOS to handle resource requests and system management efficiently.
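The preemptive, priority-based behavior described above can be sketched with a ready bitmap, a common way for small kernels to find the next task quickly. This is a simplified host-side model, not the scheduler of any particular RTOS. The names sched_t, sched_pick, and sched_should_preempt are hypothetical, and the example assumes that a lower number means a higher priority.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PRIO 32  /* assumption: 32 priority levels, 0 is highest */

/* One bit per priority level; bit p set means a task at priority p is ready. */
typedef struct {
    uint32_t ready_bitmap;
} sched_t;

void sched_set_ready(sched_t *s, unsigned prio)
{
    assert(prio < MAX_PRIO);
    s->ready_bitmap |= (UINT32_C(1) << prio);
}

void sched_clear_ready(sched_t *s, unsigned prio)
{
    assert(prio < MAX_PRIO);
    s->ready_bitmap &= ~(UINT32_C(1) << prio);
}

/* Return the highest ready priority (lowest set bit), or -1 if none. */
int sched_pick(const sched_t *s)
{
    for (unsigned p = 0; p < MAX_PRIO; ++p) {
        if (s->ready_bitmap & (UINT32_C(1) << p)) {
            return (int)p;
        }
    }
    return -1;
}

/* A running task is preempted when a strictly higher-priority task is ready. */
int sched_should_preempt(const sched_t *s, unsigned running_prio)
{
    int best = sched_pick(s);
    return best >= 0 && (unsigned)best < running_prio;
}
```

A production kernel would typically replace the loop in sched_pick with a find-first-set instruction, which is one place where DSP hardware support helps keep scheduling latency low.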
RT-Thread is an open-source RTOS for IoT devices. It supports mainstream toolchains such as GCC, Keil, and IAR, and provides POSIX and CMSIS compatibility as well as environments for C++ and scripting interfaces like MicroPython and JavaScript. RT-Thread offers support for major CPU and DSP architectures, enabling consistent inter-thread communication and synchronization mechanisms such as message passing and event flags.
RT-Thread is available in two editions: a standard edition for resource-rich IoT devices and a Nano edition for resource-constrained systems.
DSP and RT-Thread Integration
Some DSP architectures, for example CEVA DSPs, natively support RTOS features and ultra-fast context switching. Devices built with such DSPs and an RTOS can handle multiple concurrent communication tasks across resources without interrupting the RTOS. A multi-core communication interface (MCCI) mechanism supports command communication and message passing between cores. Inter-core communication can be implemented by directly accessing dedicated command registers through an AXI slave port. The DSP provides control and instruction support to track communication state via MCCI.

Message passing between cores uses MCCI_NUM dedicated 32-bit command registers, named COM_REGx. External cores write to these registers through the AXI slave port, and the receiving core reads them. On a 128-bit AXI bus, the command-generating core can write up to four registers simultaneously; on a 256-bit AXI bus, this increases to eight.
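The register counts above follow from dividing the bus width by the 32-bit register width. A small sketch of that arithmetic is shown here. The function name com_regs_per_write is hypothetical, and capping the result at the number of registers actually present (MCCI_NUM) is an assumption added for the example.

```c
#include <assert.h>

#define COM_REG_BITS 32u  /* each COM_REGx register is 32 bits wide */

/* Number of COM_REGx registers that fit in one AXI write of the given
 * bus width, capped at the number of registers present (assumption). */
unsigned com_regs_per_write(unsigned axi_bus_bits, unsigned mcci_num)
{
    unsigned by_width;

    assert(axi_bus_bits % COM_REG_BITS == 0);

    by_width = axi_bus_bits / COM_REG_BITS;
    return by_width < mcci_num ? by_width : mcci_num;
}
```

With enough registers available, a 128-bit bus gives 128 / 32 = 4 registers per write and a 256-bit bus gives 256 / 32 = 8, matching the figures in the text.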
When the command-generating core writes to a COM_REGx register, the addressed register and its related status bit in COM_STS are updated, and an interrupt (MES_INT) notifies the receiving core. When the receiving core reads a COM_REGx register, a read-indication signal is sent back to the sender over a dedicated RD_IND bus interface that is MCCI_NUM bits wide. Each bit of the RD_IND bus represents a read operation from one COM_REGx register. Using the I/O interface, the receiving core can read only one COM_REGx register at a time. This mechanism simplifies synchronization between different cores and also between tasks within the same core.
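The handshake described above can be modeled in software to make the sequence of events concrete. The following is a host-side sketch only, not CEVA's register map or driver code. The struct layout, the function names, the value chosen for MCCI_NUM, and the choice to clear the status bit and the interrupt on read are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MCCI_NUM 8  /* hypothetical register count for this model */

/* Software model of the mailbox state seen by both cores. */
typedef struct {
    uint32_t com_reg[MCCI_NUM]; /* COM_REGx command registers */
    uint32_t com_sts;           /* bit x set: COM_REGx holds an unread command */
    uint32_t rd_ind;            /* bit x set: receiver has read COM_REGx */
    int      mes_int;           /* models the MES_INT interrupt line */
} mcci_model_t;

/* Sender side: write a command, mark it pending, raise the interrupt. */
void mcci_sender_write(mcci_model_t *m, unsigned idx, uint32_t cmd)
{
    assert(idx < MCCI_NUM);
    m->com_reg[idx] = cmd;
    m->com_sts |= (UINT32_C(1) << idx);
    m->mes_int = 1;
}

/* Receiver side: read one register at a time and signal the read back
 * to the sender on the matching RD_IND bit. Clearing the status bit and
 * the interrupt here is an assumption of this model. */
uint32_t mcci_receiver_read(mcci_model_t *m, unsigned idx)
{
    uint32_t cmd;

    assert(idx < MCCI_NUM);
    cmd = m->com_reg[idx];
    m->com_sts &= ~(UINT32_C(1) << idx);
    m->rd_ind |= (UINT32_C(1) << idx);
    if (m->com_sts == 0) {
        m->mes_int = 0;
    }
    return cmd;
}
```

In this model the sender can poll rd_ind to learn that a given register is free for the next command, which is how the read indication helps synchronize cores without additional locking.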