Domain-Specific Floating-Point Formats, Dynamic Scaling, and MAC Acceleration Architectures for Efficient & Cost-Effective Machine Learning Inference

TO-213 • PT 1.3164/65/66 • As of 07/2025
Peter Grünberg Institute (PGI)
Energy-efficient information technology (PGI-10)

Technology

We introduce a comprehensive platform designed to enhance the efficiency and accuracy of machine learning (ML) processing. It comprises a novel floating-point number format tailored for ML workloads, which uses an asymmetrically biased exponent to concentrate numerical resolution where it is most needed. Complementing this, our solution integrates a dynamic scaling method that adjusts the numerical range of data in real time as it moves through the stages of a model, keeping computation stable regardless of input data variability. The technology also encompasses an optimised multiply-accumulate (MAC) method and hardware unit that reduce computational complexity and boost processing speed. These interlinked sub-technologies (number format, scaling, and MAC operations) form a unified system applicable across diverse ML models – including deep neural networks and large language models – enabling improved performance for both training and inference tasks. Provided as a software-hardware co-designed integrated circuit, our technology is ready for adaptation to specific ML applications.
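As an illustration only: this description does not disclose the bit layout of the format, so the Python sketch below assumes a generic 8-bit layout (1 sign bit, 4 exponent bits, 3 mantissa bits, no infinity or NaN encodings) and treats the exponent bias as a free parameter. The names decode_minifloat and positive_range are hypothetical. It shows how moving the bias away from the conventional midpoint shifts the representable range, which is the general idea behind a biased exponent, not the patented format itself.

```python
def decode_minifloat(bits: int, exp_bits: int, man_bits: int, bias: int) -> float:
    """Decode a sign/exponent/mantissa bit pattern using an arbitrary exponent bias.

    Assumed layout (hypothetical): exponent field 0 encodes subnormals,
    every other exponent value encodes a normal number.
    """
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exp_field = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man_field = bits & ((1 << man_bits) - 1)
    fraction = man_field / (1 << man_bits)
    if exp_field == 0:
        # Subnormal: no implicit leading one, fixed exponent of 1 - bias.
        return sign * fraction * 2.0 ** (1 - bias)
    return sign * (1.0 + fraction) * 2.0 ** (exp_field - bias)


def positive_range(exp_bits: int, man_bits: int, bias: int) -> tuple[float, float]:
    """Return (smallest positive value, largest positive value) for a given bias."""
    smallest = decode_minifloat(1, exp_bits, man_bits, bias)
    largest = decode_minifloat((1 << (exp_bits + man_bits)) - 1, exp_bits, man_bits, bias)
    return smallest, largest


# Conventional midpoint bias for 4 exponent bits is 2**(4-1) - 1 = 7.
print(positive_range(4, 3, bias=7))   # (0.001953125, 480.0)
# A larger bias moves the whole range toward small magnitudes.
print(positive_range(4, 3, bias=11))  # (0.0001220703125, 30.0)
```

With the same eight bits, the larger bias trades headroom at the top of the range for finer coverage of small values, which is where weights and activations of many neural networks tend to cluster.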

Problem addressed

Conventional digital number formats and ML processing architectures face significant performance bottlenecks. Standard floating-point representations, such as the IEEE 754 32-bit and 16-bit formats, are not adapted to the statistical properties of neural models, resulting in wasted storage, excess energy consumption, and unnecessary hardware overhead. Attempts to save resources with lower bit-widths often lead to a severe loss of precision, making them unsuitable for high-quality training. Static scaling strategies fail to accommodate fluctuating data distributions, frequently causing overflow, underflow, or numerical instability, especially when post-training input data differs from the training data. Multiply-accumulate operations in these architectures require repeated normalisation and alignment steps, slowing computation and increasing complexity. These inefficiencies create barriers to deploying advanced, cost-effective, and energy-efficient ML solutions on edge devices and in large-scale data centres alike.
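The static-scaling failure mode can be made concrete with a small Python sketch. Everything here is an assumption for illustration: FMT_MAX stands for the largest finite value of a hypothetical 8-bit float, the format is assumed to saturate on overflow, and the sample values are invented. The scale is fixed from calibration data and then applied to an input with a larger peak.

```python
FMT_MAX = 480.0  # assumed largest finite value of a hypothetical 8-bit float


def saturate(x: float) -> float:
    """Clamp to the representable range, as a saturating format would."""
    return max(-FMT_MAX, min(FMT_MAX, x))


def apply_static_scale(values: list[float], scale: float) -> list[float]:
    """Multiply by a scale fixed at calibration time, then saturate."""
    return [saturate(v * scale) for v in values]


calibration = [0.5, -1.25, 3.0]                        # invented calibration activations
scale = FMT_MAX / max(abs(v) for v in calibration)     # 160.0, fixed from now on

in_range = apply_static_scale(calibration, scale)      # peak maps exactly onto FMT_MAX
shifted = apply_static_scale([0.5, -1.25, 12.0], scale)  # later input with a 4x larger peak

print(in_range)  # [80.0, -200.0, 480.0]
print(shifted)   # [80.0, -200.0, 480.0]  (12.0 * 160 = 1920 was clipped to 480)
```

After clipping, the two inputs are indistinguishable even though their peaks differ by a factor of four; that lost information is the kind of instability the paragraph above attributes to static scaling.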

Solution

Our novel floating-point format, with its asymmetrically biased exponent, accommodates the data ranges typical of neural networks, delivering higher accuracy with far fewer bits and significantly reducing memory, energy, and silicon requirements without sacrificing result quality. The dynamic scaling method continually adapts to the data flow, preventing overflows and enhancing stability without the need for predetermined scaling factors or costly global checks. This ensures robust model performance regardless of input shifts and facilitates real-time operation. The multiply-accumulate approach streamlines computational pathways by minimising the need for normalisation and alignment during arithmetic operations, decreasing latency and hardware complexity while boosting throughput. Combined, these features yield a compact, reliable, and highly efficient ML platform that translates directly into tangible reductions in cost, time-to-market, and power consumption for AI workflows and devices.
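The two arithmetic ideas in this section can be sketched in Python under stated assumptions; neither function is the patented method, whose details are not given here. dynamic_scale (hypothetical name) picks a power-of-two scale from the current peak of a block of values, a simplification that still inspects the whole block. mac_deferred (hypothetical name) treats operands as integer mantissa and exponent pairs, accumulates exact products in a wide integer, and leaves normalisation to a single final step. FMT_MAX is an assumed format limit.

```python
import math

FMT_MAX = 480.0  # assumed largest finite value of a hypothetical 8-bit float


def dynamic_scale(values: list[float]) -> tuple[list[float], int]:
    """Scale a block by 2**shift so its peak lands in (FMT_MAX / 2, FMT_MAX].

    The shift is recomputed for every block instead of being fixed in advance.
    """
    peak = max((abs(v) for v in values), default=0.0)
    if peak == 0.0:
        return list(values), 0
    # frexp returns (m, e) with 0.5 <= m < 1, so e - 1 == floor(log2(x)).
    _, e = math.frexp(FMT_MAX / peak)
    shift = e - 1
    return [math.ldexp(v, shift) for v in values], shift


def mac_deferred(pairs):
    """Multiply-accumulate on (mantissa, exponent) integer pairs, value = m * 2**e.

    Products are aligned to the smallest product exponent by exact left shifts
    and summed in a wide integer; no rounding or normalisation happens per step.
    Returns (accumulator, exponent) with result = accumulator * 2**exponent.
    """
    products = [(ma * mb, ea + eb) for (ma, ea), (mb, eb) in pairs]
    base = min(e for _, e in products)
    acc = 0
    for m, e in products:
        acc += m << (e - base)  # exact alignment, e - base is never negative
    return acc, base


# A block with a large peak and one with a small peak both fit after rescaling.
print(dynamic_scale([0.5, -1.25, 12.0]))  # ([16.0, -40.0, 384.0], 5)
print(dynamic_scale([0.5, -1.25, 3.0]))   # ([64.0, -160.0, 384.0], 7)

# 0.75 * 10 + (-7) * 0.25 + 16 * 0.0625 = 7.5 - 1.75 + 1.0 = 6.75
acc, exp = mac_deferred([((3, -2), (5, 1)), ((-7, 0), (2, -3)), ((1, 4), (1, -4))])
print(acc, exp, math.ldexp(acc, exp))     # 54 -3 6.75
```

Restricting the scale to powers of two keeps rescaling a pure exponent adjustment, and deferring normalisation means the only rounding in a long dot product would occur once, when the wide accumulator is converted back to the storage format.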

Benefits and Potential Use

Our technology stands ready for integration into a wide array of machine learning environments, from cloud data centres seeking energy efficiency to resource-constrained edge devices such as smartphones, wearables, and autonomous systems. It is suited for both training and inference of deep neural architectures, including convolutional networks and large language models, where precision, speed, and efficiency are paramount. The fully unified solution – including software libraries, firmware, hardware components, and IP cores – slots into existing AI acceleration pipelines and can be customised for semiconductor manufacturers, device OEMs, and AI infrastructure providers. Its backward compatibility and straightforward programmability result in rapid deployment. By licensing this technology, partners gain a direct competitive advantage: achieving breakthrough performance and efficiency while retaining full flexibility to innovate on top of a proven, standards-supporting foundation.

Development Status and Next Steps

Forschungszentrum Jülich (FZJ) has extensive expertise in this field and holds several patents. The technology described above is being continuously enhanced. Our Peter Grünberg Institute (PGI-10) – Energy-efficient information technology – already cooperates with numerous national and international companies and scientific partners. Forschungszentrum Jülich focuses on energy- and cost-efficient devices suitable for application in various emerging technologies. We are therefore constantly seeking cooperation partners and/or licensees in this field and in adjacent areas of research and application.

TRL

5

IP

EP 25169997.1

EP 25169998.9

EP 25169999.7

Keywords

Energy-efficient AI, Neural network accelerator, Deep learning model precision, Floating-point number format, Asymmetric exponent bias, Dynamic scaling method, Multiply-accumulate operation (MAC), Low-bitwidth computation, Optimised machine learning inference

Last Modified: 02.04.2026

Cloud Computing; Semiconductors, Micro- and Nanoelectronics; High-Performance Computing (Supercomputing); Information Technologies; Artificial Intelligence (AI); Neuromorphic Computing (NC)