CERN Embeds AI in Silicon to Filter LHC Data at Nanosecond Speeds

CERN is embedding artificial intelligence directly into silicon to manage petabyte-scale data streams from the Large Hadron Collider. Thea Aarrestad presented the strategy at the Monster Scale Summit to address an unprecedented data deluge. This hardware-focused approach contrasts with the trend toward larger language models.

La Era


At the virtual Monster Scale Summit earlier this month, Thea Aarrestad presented CERN's strategy for managing petabyte-scale data streams. The European Organization for Nuclear Research is embedding artificial intelligence directly into silicon to filter particle collision records. This approach addresses an unprecedented data deluge generated by the Large Hadron Collider. The presentation highlighted the extreme engineering constraints faced by modern physics.

Each year the collider produces approximately 40,000 exabytes of unfiltered sensor data. Aarrestad estimated this volume at roughly 25% of the size of the entire Internet. Such quantities exceed the storage capacity of even the largest commercial data centers, forcing the team to discard most information before it leaves the detector chamber.

Consequently, the facility must reduce the data in real time to a volume it can afford to retain. Processing rates reach hundreds of terabytes per second during operation, requirements far beyond anything demanded by consumer streaming services like Netflix. Standard cloud infrastructure cannot handle the raw throughput generated by the machine.

Decisions to save or discard data must be burned into the chip design itself. Specialized magnets squeeze proton bunches, separated by 25-nanosecond intervals, just before detection. Only about 60 proton pairs actually collide out of the billions of protons in each bunch. The detectors must identify these rare events before the information disappears forever.
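The quoted numbers imply staggering event rates. A back-of-the-envelope calculation from the article's own figures:

```python
# Rates implied by the article's numbers (illustrative arithmetic only).
BUNCH_SPACING_NS = 25          # interval between proton bunch crossings
COLLISIONS_PER_CROSSING = 60   # ~60 proton pairs collide per crossing

crossings_per_second = 1e9 / BUNCH_SPACING_NS          # 40 million crossings/s
collisions_per_second = crossings_per_second * COLLISIONS_PER_CROSSING

print(f"{crossings_per_second:.0e} crossings/s")
print(f"{collisions_per_second:.1e} collisions/s")
```

A 25 ns spacing means 40 million bunch crossings per second, each a fresh decision the trigger hardware must make.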

The Level-1 Trigger system aggregates roughly 1,000 field-programmable gate arrays to make these choices. It reconstructs event information from detector data arriving over fiber-optic lines at 10 terabytes per second. The system outputs a single accept-or-reject decision within 50 nanoseconds, a speed that is critical because the data buffer holds events for only 4 microseconds.

An algorithm named AXOL1TL handles the anomaly detection required for this selection. It flags events that fall outside standard collision topologies in order to hunt for rare physics. Less than 0.02% of all collision data makes the cut for transfer to ground-level storage. Researchers rely on the system to distinguish meaningful signals from background noise.
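The selection principle can be sketched in a few lines. This is a minimal illustration of threshold-based anomaly selection, assuming a score that measures distance from typical events; the stand-in model below is not the actual AXOL1TL implementation, and the 0.02% acceptance is the only figure taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def anomaly_score(events, mean):
    # Stand-in for an anomaly score: distance from the "typical" event.
    # Real trigger models compute a learned score in hardware instead.
    return np.linalg.norm(events - mean, axis=1)

background = rng.normal(0.0, 1.0, size=(100_000, 8))   # standard topologies
mean = background.mean(axis=0)
scores = anomaly_score(background, mean)

# Choose a threshold so only ~0.02% of ordinary events pass, matching
# the acceptance rate quoted in the article.
threshold = np.quantile(scores, 1 - 0.0002)
kept = (scores > threshold).mean()
print(f"fraction kept: {kept:.4%}")
```

The hard part, of course, is not the threshold but computing a meaningful score within the nanosecond budget.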

Once on the surface, the data goes through a second round of filtering called the High Level Trigger. This system utilizes 25,600 central processing units and 400 graphics processing units. It identifies roughly 1,000 interesting collisions from the 100,000 events per second flowing through the pipeline. The process produces about a petabyte of useful information every single day.
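The High Level Trigger figures can be cross-checked against each other. The per-event size below is an inference, assuming the petabyte-per-day figure refers to the trigger's output alone:

```python
# Cross-checking the High Level Trigger numbers quoted in the article.
EVENTS_IN_PER_S = 100_000
EVENTS_OUT_PER_S = 1_000
PETABYTE = 1e15
SECONDS_PER_DAY = 86_400

acceptance = EVENTS_OUT_PER_S / EVENTS_IN_PER_S   # 1% of events survive
bytes_per_event = PETABYTE / (EVENTS_OUT_PER_S * SECONDS_PER_DAY)

print(f"acceptance: {acceptance:.0%}")
print(f"~{bytes_per_event / 1e6:.0f} MB per stored event")
```

A 1% acceptance on top of the Level-1 Trigger's sub-0.02% cut shows how aggressively the pipeline compounds its filtering.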

CERN engineers developed a transpiler called HLS4ML to target specific platforms like custom FPGAs. Every operation on the hardware is quantized, with a unique bit width for each parameter, which lets the model be optimized with gradient descent despite the extreme constraints. The team pruned and distilled the models down to essential knowledge to fit the silicon.
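The core idea behind per-parameter bit widths is fixed-point quantization. The helper below is a hand-rolled sketch of that technique, not the hls4ml API; function name and values are illustrative.

```python
import numpy as np

def quantize_fixed_point(weights, total_bits, frac_bits):
    """Round weights onto a signed fixed-point grid with `frac_bits`
    fractional bits, saturating at the representable range."""
    scale = 2 ** frac_bits
    max_val = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(weights * scale), -max_val - 1, max_val)
    return q / scale

w = np.array([0.731, -0.052, 1.9, -2.4])
# Each layer (or parameter group) can get its own (total_bits, frac_bits)
# pair, trading precision for silicon area where the model can afford it.
print(quantize_fixed_point(w, total_bits=8, frac_bits=4))
```

Because the rounding grid is fixed and known, quantization-aware training can push gradients through it and recover most of the accuracy lost to the reduced precision.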

The architecture breaks from the traditional Von Neumann model of memory, processor, and input/output. Processing is driven by the availability of data rather than a sequential instruction stream. Pre-calculations fill chunks of on-chip silicon to save processing time. Decisions take place at design time because nothing can be handed off to memory.
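The "pre-calculation" idea is essentially a lookup table baked into the chip. A minimal software analogue, with an assumed table size and input range chosen for illustration:

```python
import math

# Tabulate a function once at "design time" so run time is just indexing.
# On an FPGA the table would be baked into on-chip memory or logic.
TABLE_BITS = 10
TABLE_SIZE = 2 ** TABLE_BITS

# Example: a sigmoid activation tabulated over the range [-8, 8).
sigmoid_table = [
    1.0 / (1.0 + math.exp(-(-8.0 + 16.0 * i / TABLE_SIZE)))
    for i in range(TABLE_SIZE)
]

def sigmoid_lut(x):
    # Quantize the input to a table index; no exp() at run time.
    i = int((x + 8.0) * TABLE_SIZE / 16.0)
    return sigmoid_table[max(0, min(TABLE_SIZE - 1, i))]

print(sigmoid_lut(0.5))
```

The trade is the same one the article describes: spend silicon area up front to eliminate computation, and therefore latency, from the critical path.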

The collider will shut down this year to prepare for the High-Luminosity LHC upgrade, due to become operational in 2031 with a 10-fold increase in data. Event complexity will jump significantly as detectors trace particle pairings back to their original collision points within microseconds. Engineers must decide what to discard even as they build an understanding of the universe.
