Leveraging Market’s First Arm® Cortex®-M33-Based MCU – Part 1: Manage Power and Performance

By Stephen Evanczuk

Contributed By Digi-Key's North American Editors

Editors’ Note: Part 1 of a 2-part series, this article shows how developers can meet a broad range of requirements for performance and low power consumption using a single general-purpose microcontroller family. Part 2 discusses how extended security features integrated in NXP Semiconductors’ LPC55S6x microcontrollers support lifecycle security from provisioning and commissioning to communications, secure boot, and secure firmware updates.

Developers continually find themselves struggling to balance often-conflicting demands for higher application performance at lower power consumption across a broad array of functions and applications. These applications include the Internet of Things (IoT), industrial automation, medical systems, and consumer devices. The rapidly growing need for greater security in such applications compounds developers’ difficulty in finding a single microcontroller family able to meet a broad, diverse, and growing list of conflicting design requirements.

To address these issues, NXP Semiconductors introduced the LPC55S6x family of microcontrollers. These devices help developers tackle problems by combining a powerful general-purpose core with highly efficient specialized hardware and processing engines.

Identifying and meeting diverse requirements

Connected products have evolved rapidly from relatively simple systems where the most challenging design requirements typically revolved around some aspect of communications. Now, designers face a dramatically more challenging environment that allows little compromise across an expanding set of requirements in every application segment. Each application requires developers to shift emphasis as needed to address that application’s own unique challenges. Yet, demand for higher performance and lower power consumption runs as a common thread across most mainstream application areas.

Smart factories, for example, not only depend on low-latency, high-performance devices needed for signal processing, but also require reduced power consumption as factory engineers pack more of these devices into tight spaces. Consumer wearables not only require designs with reduced power consumption to ensure extended battery life, but also face growing demands for signal-processing capabilities able to handle more complex workloads. Across every application segment, designers find growing pressure to respond more effectively to real concerns about the vulnerability of these devices, their networks, and enterprise assets to immediate attack or advanced persistent threats from individual cyberthieves, criminal organizations, or even state-sponsored groups.

To meet these diverse requirements, developers have typically had to compromise some aspect of their design. They might sacrifice application performance to reduce power consumption by using a lower-performance processor, reducing the clock rate, or reducing the duty cycle of the processor in favor of low-power states. To meet strict performance requirements, they might take the converse approach with more powerful processors, faster clock rates, and increased duty cycle at the cost of much greater power consumption. For more computationally complex applications, they might add a dedicated digital signal processor (DSP) device to speed algorithm execution, but with increased design complexity, cost, and system power consumption. Even if they arrived at an acceptable balance of power and performance, they would typically need to accept increased design cost and complexity to satisfy security requirements.

While designers struggle over requirements, users increasingly demand fewer compromises or none at all for critical applications such as medical equipment, industrial automation systems, retail payment devices, and others. The LPC55S6x microcontroller family from NXP Semiconductors helps designers eliminate compromises with an architecture that combines the flexibility of a general-purpose processor with specialized capabilities for processing and security required in emerging applications.

Broad capabilities with dedicated processing cores

NXP Semiconductors’ single-core LPC55S66 and dual-core LPC55S69 microcontrollers, the market’s first Arm® Cortex®-M33-based general-purpose MCUs, build on the low-latency, deterministic performance of the Arm Cortex-M series architecture. Among their architectural enhancements, NXP’s LPC55S6x devices include the company’s high-performance PowerQuad DSP accelerator, its CASPER (Cryptographic Accelerator and Signal Processing Engine with RAM sharing) cryptography engine, and a comprehensive security subsystem. Along with up to 640 Kbytes of flash, up to 320 Kbytes of SRAM, and 128 Kbytes of ROM, LPC55S6x devices integrate an extensive set of functional elements typically required in any deeply embedded system design (Figure 1).

Figure 1: The LPC55S6x microcontroller architecture extends the general-purpose processing capabilities of the Arm Cortex-M33 core with specialized hardware blocks for signal processing, cryptography, secure storage, and key management, while providing a full complement of peripherals used in typical embedded designs. (Image source: NXP Semiconductors)

Among those elements, LPC55S6x devices include a comprehensive timer subsystem, multiple serial interfaces, secure direct memory access (DMA) controllers, and up to 64 general-purpose I/O (GPIO) pins. Along with these digital subsystems, the LPC55S6x devices integrate a multichannel 16-bit successive approximation register (SAR) analog-to-digital converter (ADC), analog comparator, and temperature sensor. In addition, an on-chip programmable logic unit (PLU) lets developers build custom combinational or sequential logic, including state machines, from its array of 26 five-input look-up table (LUT) elements. Developers can access PLU registers to program the PLU directly for small logic networks or use NXP tools to implement a larger network described at the register-transfer level (RTL) in Verilog.
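To make the LUT idea concrete: a five-input LUT is a 32-entry truth table with one output bit per input combination. The host-side Python sketch below packs a Boolean function into such a 32-bit word. The function names and the bit ordering (input 0 as the least-significant index bit) are assumptions for illustration only; the actual PLU register layout should be taken from NXP documentation.

```python
def lut5_truth_table(func):
    """Pack a 5-input Boolean function into a 32-bit truth-table word.

    Bit n of the result holds the function output for input combination n,
    with input i0 as the least-significant index bit. This ordering is an
    assumption for illustration, not taken from the PLU documentation.
    """
    word = 0
    for index in range(32):
        inputs = [(index >> bit) & 1 for bit in range(5)]
        if func(*inputs):
            word |= 1 << index
    return word


def majority_of_three(i0, i1, i2, i3, i4):
    """Example logic: 1 when at least two of i0, i1, i2 are 1 (i3, i4 unused)."""
    return (i0 + i1 + i2) >= 2


def and_all(i0, i1, i2, i3, i4):
    """Example logic: 5-input AND."""
    return bool(i0 and i1 and i2 and i3 and i4)


MAJORITY_WORD = lut5_truth_table(majority_of_three)
AND_WORD = lut5_truth_table(and_all)
```

Under these assumptions, each small logic function reduces to one constant per LUT element, which is why small networks can reasonably be programmed by writing registers directly.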

To avoid bottlenecks in accessing their multiple subsystems, LPC55S6x devices include a multi-layer bus matrix based on the Advanced High-performance Bus (AHB) from the Arm Advanced Microcontroller Bus Architecture (AMBA). The AHB bus matrix provides a direct connection between bus masters and peripherals or memory. This approach allows DMA transfers, for example, to proceed at full speed without compromising the performance of the processor’s access to memory. In fact, the ability to maximize processor efficiency across diverse design requirements lies at the foundation of the LPC55S6x architecture.

In the LPC55S6x architecture, the Cortex-M33 core offers multiple features designed to help designers more easily address diverse design requirements. As with other devices of its class, the LPC55S6x processor supports several low-power modes. During extended periods of inactivity, developers can put the device in power-down mode, which provides full SRAM retention while consuming only 15.4 microamps (µA), or deep power-down mode, which maintains power to a 4 Kbyte slice of SRAM while consuming about 0.59 µA. Sleep and deep-sleep modes shut down the processor while providing different levels of operation for peripherals and memory: sleep mode provides full operation while consuming about 2.7 milliamps (mA), while deep-sleep mode clock-gates peripherals to drive power consumption down to around 110 µA.
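The effect of these modes on a power budget can be estimated with simple duty-cycle arithmetic. The Python sketch below uses the low-power currents quoted above; the active-mode current passed in by the caller is a hypothetical value, not a figure from this article. The sketch also ignores wake-up time and transition energy, so it should be read as a first-order estimate only.

```python
# Low-power mode currents quoted in the article, in microamps.
LOW_POWER_CURRENT_UA = {
    "sleep": 2700.0,
    "deep_sleep": 110.0,
    "power_down": 15.4,
    "deep_power_down": 0.59,
}


def average_current_ua(active_ua, mode, active_fraction):
    """Time-weighted average current for a simple active/idle duty cycle.

    active_ua       -- current while running, in microamps (caller's assumption)
    mode            -- low-power mode used while idle (key of LOW_POWER_CURRENT_UA)
    active_fraction -- fraction of time spent active, between 0.0 and 1.0
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0.0 and 1.0")
    idle_ua = LOW_POWER_CURRENT_UA[mode]
    return active_fraction * active_ua + (1.0 - active_fraction) * idle_ua


def battery_life_hours(capacity_mah, average_ua):
    """Idealized battery life: capacity divided by average current."""
    return capacity_mah * 1000.0 / average_ua
```

For instance, with a hypothetical 10 mA active current and a 1% active duty cycle, calling average_current_ua(10000.0, "deep_sleep", 0.01) weights 10 mA by 0.01 and 110 µA by 0.99, which shows how the idle mode comes to dominate the budget as the duty cycle shrinks.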

Augmented capabilities

Besides its low-power modes, the LPC55S6x architecture extends support for diverse design requirements with integrated features to enhance performance and security. Built into the primary Cortex-M33 core, these integrated capabilities include Arm TrustZone security extensions (SECEXT), memory protection unit (MPU), IEEE 754 standard floating-point unit (FPU), and embedded trace macrocell (ETM). In addition, the primary core includes the CASPER cryptography engine and PowerQuad accelerator for DSP and single-instruction, multiple-data (SIMD) operations.

Note: these additional capabilities are not included in the second Cortex-M33 core provided in the dual-core LPC55S69 microcontroller.

Each of these integrated subsystems and architectural features provides an extensive set of capabilities with detailed descriptions that lie well beyond the scope of this article. For example, the PowerQuad DSP accelerator is a sophisticated coprocessor in its own right, able to independently compute signal-processing functions while accessing memory as a bus master.

Internally, the PowerQuad accelerator combines multiple registers and interfaces with a battery of hardware engines for key signal-processing functions including fast Fourier transform (FFT), discrete cosine transform (DCT), infinite impulse response (IIR), finite impulse response (FIR), and the COordinate Rotation DIgital Computer (CORDIC) algorithm used to efficiently compute trigonometric functions (Figure 2).

Figure 2: The NXP Semiconductors LPC55S6x microcontroller family integrates the company’s PowerQuad coprocessor, which uses specialized engines to speed execution of algorithms typically required in signal-processing applications. (Image source: NXP Semiconductors)
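As background on the CORDIC algorithm mentioned above, the following Python sketch shows rotation-mode CORDIC computing sine and cosine from a fixed sequence of micro-rotations. It is a floating-point illustration of the general technique, not a model of the PowerQuad hardware: a hardware engine would typically use fixed-point shifts and adds, and the function name and iteration count here are assumptions.

```python
import math


def cordic_sin_cos(angle_rad, iterations=24):
    """Return (sin, cos) of angle_rad using rotation-mode CORDIC.

    Valid for |angle_rad| <= pi/2. Each iteration rotates the vector by
    +/- atan(2**-i), which needs only a shift and an add in fixed point.
    """
    if abs(angle_rad) > math.pi / 2:
        raise ValueError("angle must be within [-pi/2, pi/2]")

    # Elementary rotation angles and the accumulated gain correction.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    # Start on the x axis, pre-scaled so the final vector has unit length.
    x, y, z = gain, 0.0, angle_rad
    for i in range(iterations):
        direction = 1.0 if z >= 0.0 else -1.0
        x, y = (x - direction * y * 2.0 ** -i,
                y + direction * x * 2.0 ** -i)
        z -= direction * angles[i]
    return y, x
```

Each additional iteration contributes roughly one more bit of precision, which is why the algorithm maps well onto compact dedicated hardware.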

Using the PowerQuad accelerator, developers can execute complex signal-processing operations without compromising the host processor’s ability to respond to real-time events or to complete an extended series of operations. The host processor simply sets PowerQuad registers with the required signal-processing function and specifies the memory addresses for source, destination, and working memory regions. Once invoked, the PowerQuad accelerator operates as a true coprocessor, using the AHB matrix to perform 128-bit memory transfers in its role as a bus master. Meanwhile, the host processor can immediately return to its main processing tasks, periodically polling a PowerQuad busy bit or simply responding to a PowerQuad completion interrupt to access the results.

For developers, however, PowerQuad operations are largely transparent. Developers use the standard application programming interface (API) for the Arm Cortex Microcontroller Software Interface Standard (CMSIS) DSP library. NXP’s PowerQuad-supporting version of the library included in the NXP Semiconductors MCUXpresso software development kit (SDK) replaces low-level math functions implemented in software with calls to the PowerQuad API.

For example, to compute a complex FFT, developers use the standard CMSIS-DSP function, arm_cfft_q31(), with data in Q31 format, which represents a 32-bit fixed-point number using one bit for the sign and 31 bits for the fraction. In a pure software implementation, a call to arm_cfft_q31() in turn invokes the CMSIS-DSP FFT butterfly function, arm_radix4_butterfly_q31(), and the end function, arm_cfft_radix4by2_q31(), or their inverse versions for complex inverse FFTs.
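For readers unfamiliar with the fixed-point format used by arm_cfft_q31(), the host-side Python sketch below converts between floating-point values and Q31, in which a signed 32-bit integer carries one sign bit and 31 fractional bits and so covers the range from -1.0 up to just under 1.0. The helper names are hypothetical and the saturation behavior is an illustrative choice.

```python
Q31_SCALE = 1 << 31          # 2**31 steps per unit
Q31_MAX = Q31_SCALE - 1      # largest representable value, just under +1.0
Q31_MIN = -Q31_SCALE         # smallest representable value, exactly -1.0


def float_to_q31(value):
    """Convert a float in [-1.0, 1.0) to a Q31 integer, saturating at the limits."""
    scaled = int(round(value * Q31_SCALE))
    return max(Q31_MIN, min(Q31_MAX, scaled))


def q31_to_float(raw):
    """Convert a Q31 integer back to a float."""
    return raw / Q31_SCALE
```

Because +1.0 is not representable, float_to_q31(1.0) saturates to Q31_MAX; input samples are usually scaled into range before a fixed-point FFT to avoid this kind of clipping.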

However, when using the NXP DSP library and PowerQuad, the normal call to arm_cfft_q31() instead invokes PQ_TransformCFFT(), which handles the same calculations in hardware. The end result is not only reduced processing load on the Cortex-M33 core, but also faster execution of DSP functions (Figure 3).

Figure 3: The MCUXpresso software development kit significantly speeds execution of common DSP algorithms while maintaining compatibility with high-level calls to the standard Arm CMSIS DSP library by transparently replacing low-level CMSIS-DSP functions with calls to the PowerQuad accelerator. (Image source: NXP Semiconductors)

Another coprocessor, the CASPER crypto engine, similarly unburdens the main processor from the heavy computational load associated with asymmetric cryptography algorithms. The CASPER engine executes Rivest–Shamir–Adleman (RSA), Diffie-Hellman, elliptic-curve cryptography (ECC), and Elliptic Curve Digital Signature Algorithm (ECDSA) operations up to eight times faster than equivalent crypto software running on the Cortex-M33 core.

To speed execution of symmetric algorithms, the LPC55S6x also integrates hardware blocks for Advanced Encryption Standard 256-bit (AES-256) and Secure Hash Algorithm 2 (SHA-2).

The combination of these hardware blocks and the CASPER engine provides developers with hardware-based support for the crypto algorithms commonly used for authentication and data encryption required to protect data exchange in connected products.
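When bringing up hardware hashing, it can help to compare device output against a software reference computed on a development host. The Python sketch below uses the standard hashlib module to produce SHA-256 reference digests; the helper names are hypothetical, and how the digest is read back from the device is outside the scope of the sketch.

```python
import hashlib
import hmac


def sha256_hex(message: bytes) -> str:
    """Return the SHA-256 digest of message as a hex string."""
    return hashlib.sha256(message).hexdigest()


def digest_matches(device_digest: bytes, message: bytes) -> bool:
    """Compare a digest reported by the device against the host reference.

    hmac.compare_digest performs a constant-time comparison.
    """
    reference = hashlib.sha256(message).digest()
    return hmac.compare_digest(device_digest, reference)
```

A mismatch in such a check usually points to byte-ordering or padding differences in how the message was fed to the hardware block rather than to the algorithm itself.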

As discussed in Part 2 of this two-part series, the LPC55S6x family’s support for security extends well beyond support for fundamental cryptography algorithms to provide hardware-based security functionality required for full lifecycle security.

System development

Developers can quickly explore the crypto engines, DSP, and general-purpose processing capabilities of the LPC55S6x microcontrollers using the NXP LPC55S69 EVK. Designed to speed development with these devices, the LPC55S69 EVK includes a dual-core LPC55S69 microcontroller, NXP’s MMA8652FCR1 accelerometer, LEDs, buttons, debug interface, and support for multiple expansion options including Arduino UNO, MikroElektronika Click, and Digilent Pmod add-on hardware.

Multiple jumpers and headers let developers easily set different hardware configurations and closely examine performance details (Figure 4). For example, developers concerned about power consumption can measure LPC55S69 supply current by simply using a voltmeter to measure the voltage drop at header P12.

Figure 4: Built around a dual-core NXP Semiconductors LPC55S69 microcontroller, the NXP Semiconductors LPC55S69 EVK provides multiple jumpers and headers that let developers easily set configurations and examine performance details such as microcontroller current consumption. (Image source: NXP Semiconductors)
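Converting a measured voltage drop into supply current is a direct application of Ohm's law. The Python sketch below assumes the drop is measured across a known series sense resistor; the resistor value is a parameter the developer must take from the board schematic, and the function name is hypothetical.

```python
def supply_current_ma(voltage_drop_mv, sense_resistor_ohm):
    """Return supply current in milliamps from a measured voltage drop.

    voltage_drop_mv    -- voltmeter reading across the sense resistor, in millivolts
    sense_resistor_ohm -- sense resistor value in ohms (from the board schematic)
    """
    if sense_resistor_ohm <= 0:
        raise ValueError("sense resistor value must be positive")
    # Millivolts divided by ohms gives milliamps directly.
    return voltage_drop_mv / sense_resistor_ohm
```

Because low-power modes draw only microamps, the resulting voltage drop may fall below the resolution of a basic voltmeter, so the meter's range should be checked before relying on such readings.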

For development, designers would use the board with the MCUXpresso integrated development environment (IDE) and SDK, which takes advantage of LPC55S6x specialized hardware such as the PowerQuad functionality mentioned earlier. The LPC55S69 EVK is also supported by IAR and Keil IDEs. Additionally, NXP provides free software packages with sample code demonstrating key software design patterns for using LPC55S6x features.


Conclusion

Developers are looking to balance performance, low power, and security across a broad array of applications including the IoT, industrial automation, medical systems, and consumer devices. As described here, the LPC55S6x family of microcontrollers combines a powerful general-purpose core with specialized hardware and processing engines, giving these developers a path to more easily meet demands for high-performance specialized functions while containing power consumption.

Part 2 of this series shows how security throughout a device’s life cycle can be managed using the LPC55S6x family.

About this author

Stephen Evanczuk

Stephen Evanczuk has more than 20 years of experience writing for and about the electronics industry on a wide range of topics including hardware, software, systems, and applications including the IoT. He received his Ph.D. in neuroscience on neuronal networks and worked in the aerospace industry on massively distributed secure systems and algorithm acceleration methods. Currently, when he's not writing articles on technology and engineering, he's working on applications of deep learning to recognition and recommendation systems.

About this publisher

Digi-Key's North American Editors