Optical chips, is it too early?

Latest update time:2024-09-30


At this year's Hot Chips, several speakers presented technologies for optical chip interconnect, among them Tesla, Broadcom, OpenAI, and Intel. Judging by how actively these vendors are positioning themselves, optical chip interconnect appears to be on the eve of an explosion. In the eyes of many people, however, it is still too early.


Figure 1: This is a “chip-to-chip” connection, not an “intra-chip” connection. Intel appears to have changed its mind, saying it’s still a long way from using light for intra-chip connections.


Changing requirements for optical communications


The above diagram shows Intel's view of the evolution of optical communications. It is something of an exaggeration: optical fiber was in use for data communications earlier than the diagram suggests, and systems that built LANs from fiber-based token rings were being delivered to customers in the 1990s, so the telecommunications era and the data communications era overlap considerably. Still, it is an exaggeration rather than a lie, given that fiber-related technologies were originally developed for long-distance communications and only later put to other uses.


For long-distance applications, the primary considerations are stable transmission over long distances and a wide frequency band; cost and power consumption are secondary. As for the DSP, these links are likely to serve as backbones, so reliability is critical.


However, as copper-based Ethernet in data centers is replaced by fiber-based Ethernet, new requirements emerge. Bandwidth is necessary here, of course, but reducing costs and power consumption also becomes important.


Large numbers of servers sit in large numbers of racks and connect to network switches placed at the TOR (Top of Rack) or BOR (Bottom of Rack). Because these switches are in turn connected to each other and to large-scale back-end switches, there is an urgent need to cut the power consumed by each network port, which also helps lower data center installation costs. As a result, the market has developed the following needs:


  1. Reduce power consumption by using silicon photonics;

  2. Increase output power and/or receiver sensitivity, thereby eliminating optical amplifiers (which reduces cost and power consumption);

  3. Reduce the functionality of the DSP, or remove it altogether (because the DSP's power consumption is far from low, and its relatively complex processing is one of the causes of added latency).

This change in requirements is the biggest factor behind the introduction of CPO.


Incidentally, the pluggable Ethernet transceiver business that Intel sold to Jabil last November is a solution for this era of data communications. Note also that "silicon photonics" and "silicon optics" are both in use and mean the same thing.


This brings us to our current topic: the era of artificial intelligence.


Chip-to-chip optical communication


In short, for chip-to-chip connections the reach is limited to within a rack or between racks (more precisely, the scope has to be limited to that range, or there would be no end to it). As bandwidth increases, power consumption must be reduced further. The natural approach is therefore not to raise the speed of each wavelength but to keep the per-wavelength speed low and multiplex many wavelengths. Because many wavelengths must be supported, DWDM is a better fit than CWDM.
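To make the DWDM-versus-CWDM point concrete, here is a minimal sketch comparing how much spectrum a set of wavelengths occupies on each grid. The 20 nm CWDM spacing is the standard ITU-T G.694.2 grid; the roughly 1.2 nm DWDM spacing and the count of 8 wavelengths are the figures quoted later in this article for Intel's prototype. The function name is illustrative only.

```python
def spectral_span_nm(num_wavelengths: int, spacing_nm: float) -> float:
    """Band occupied by evenly spaced wavelengths, first channel to last."""
    if num_wavelengths < 1:
        raise ValueError("need at least one wavelength")
    return (num_wavelengths - 1) * spacing_nm


CWDM_SPACING_NM = 20.0  # standard CWDM grid spacing (ITU-T G.694.2)
DWDM_SPACING_NM = 1.2   # approximate spacing quoted for Intel's prototype
NUM_WAVELENGTHS = 8     # wavelengths per fiber quoted for Intel's prototype

cwdm_span = spectral_span_nm(NUM_WAVELENGTHS, CWDM_SPACING_NM)
dwdm_span = spectral_span_nm(NUM_WAVELENGTHS, DWDM_SPACING_NM)

print(f"CWDM span: {cwdm_span:.1f} nm")
print(f"DWDM span: {dwdm_span:.1f} nm")
```

Eight wavelengths spread over about 140 nm on the CWDM grid but only about 8.4 nm at 1.2 nm spacing, which is why packing many slow wavelengths around a single center wavelength points toward DWDM.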


The optical components needed for this (e.g. MUX/DEMUX) had long been under development within Intel, so it was easy to implement. So instead of developing a “serial chip-to-chip interconnect using high-speed optical signals,” they developed a prototype of a “parallel chip-to-chip interconnect” that bundles lower-speed optical signals into a high-bandwidth link.


The term "CPO" appeared above. It stands for "Co-Packaged Optics", a term that has recently come into common use. The applications shown for it so far are Ethernet switches first, followed by computing fabrics. Here we look at the chip-to-chip case (Figure 2).


Figure 2: If Intel continues to develop Barefoot's Tofino, future products may include products using Ethernet CPO


In fact, Broadcom is following the same trend. For pluggable Ethernet transceivers, the company will first replace traditional III-V optical components with silicon photonics (Figure 3), then apply the technology to switches and finally to chip-to-chip connections (Figure 4).


Figure 3: This is the story of pluggable Ethernet transceivers. The III-V group mentioned here probably refers to the laser source of the VCSEL structure combining GaAs with InP, Sb, etc.


Figure 4: The switch on the left is equipped with 16 CPOs (4 on each side), each with 16 ports, and can be configured as a fiber-optic Ethernet switch with a total of 256 channels


The same is true for TSMC, which, at a technical seminar held in June this year, proposed a roadmap to first apply its COUPE (COmpact Universal Photonic Engine) to pluggable Ethernet transceivers and then to switches.


Figure 5: TSMC’s optical chip roadmap


Marvell and GlobalFoundries are also involved in silicon photonics and optical Ethernet, and their roadmaps are likely similar. Intel does not make switches (strictly speaking, there is a non-zero chance that Intel Foundry will manufacture them, so it is possible in the future, but I do not see it happening soon), so it is skipping that step and moving straight to XPU chip-to-chip technology.


Intel's configuration is shown below (Figure 6). The XPU is the processor, and it connects to the CPO chiplet over UCIe. At the bottom of the CPO sits an EIC (Electrical Integrated Circuit), into which the UCIe interface and, if necessary, a DSP can be integrated. Electrical/optical conversion is performed by the PIC (Photonic Integrated Circuit) on top of the EIC, and this PIC is implemented using silicon photonics.


Figure 6: Foveros may be used to stack the PIC and EIC. It seems that in this implementation, the DSP is not implemented in the EIC


This CPO chiplet enables a 4Tbps interconnect. Although the link is short reach (SR), the wavelength is around 1,310 nm, a band normally used over SMF (single-mode fiber) by xBASE-LR and similar standards, so it cannot communicate over MMF (multi-mode fiber).


I think the reason they don't use wavelengths around 850nm is output power and attenuation. Each wavelength carries 32Gbps, and 8 wavelengths centered on 1,310nm are multiplexed with DWDM at about 1.2nm intervals onto a single fiber. There are 8 fibers in each direction, so the total bandwidth per direction is 32 x 8 x 8 = 2,048 Gbps.
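The arithmetic above can be checked with a short sketch. The optical figures (32 Gbps per wavelength, 8 wavelengths per fiber, 8 fibers per direction) come from the text; the electrical side uses the Figure 7 caption's "four 16-bit wide 32Gbps UCIe", read here as 4 modules of 16 lanes at 32 Gbps per lane, which is an interpretation rather than a confirmed specification. Function names are illustrative.

```python
GBPS_PER_WAVELENGTH = 32   # per-wavelength rate quoted in the article
WAVELENGTHS_PER_FIBER = 8  # DWDM wavelengths multiplexed onto one fiber
FIBERS_PER_DIRECTION = 8   # fibers in each direction


def optical_bandwidth_gbps(rate: int, wavelengths: int, fibers: int) -> int:
    """Aggregate one-direction optical bandwidth in Gbps."""
    return rate * wavelengths * fibers


def ucie_bandwidth_gbps(modules: int, lanes: int, gbps_per_lane: int) -> int:
    """Aggregate one-direction electrical bandwidth in Gbps (assumed layout)."""
    return modules * lanes * gbps_per_lane


per_fiber = optical_bandwidth_gbps(GBPS_PER_WAVELENGTH, WAVELENGTHS_PER_FIBER, 1)
per_direction = optical_bandwidth_gbps(
    GBPS_PER_WAVELENGTH, WAVELENGTHS_PER_FIBER, FIBERS_PER_DIRECTION
)
bidirectional = 2 * per_direction

# Assumed reading of the Figure 7 caption: 4 modules x 16 lanes x 32 Gbps.
ucie_side = ucie_bandwidth_gbps(modules=4, lanes=16, gbps_per_lane=32)

print(f"per fiber:      {per_fiber} Gbps")
print(f"per direction:  {per_direction} Gbps")
print(f"bidirectional:  {bidirectional} Gbps")
print(f"UCIe side:      {ucie_side} Gbps")
```

Under these assumptions each fiber carries 256 Gbps, each direction carries 2,048 Gbps (about 2 Tbps), both directions together carry 4,096 Gbps (about 4 Tbps), and the assumed UCIe configuration matches the one-direction optical bandwidth exactly.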


It is expected to be applied to PCI Express 6.0, and the configuration appears to carry PCIe traffic directly over the link rather than going through Ethernet frames.


First, I think the 32Gbps transmission speed and NRZ modulation are there because a PCI Express 5.0 signal is being converted to optical as is. It is in fact labeled "un-retimed PCIe6", which suggests that the PHY currently transmits using NRZ but could also transmit using PAM4 if necessary.


Currently, the EIC seems to be compatible with UCIe 1.1, so PAM4 signals cannot be passed as is, but the next generation EIC compatible with 2.0 will pass PCIe 6 signals as is, hand them to the PIC, and convert them into optical signals for transmission. In this case, they seem to be considering using PCIe FLIT for error correction instead of FEC.


In short, it works like a PCI Express fiber extender. In this case, the XPU operates by reading and writing to a PCI Express device, which is then directly connected to another XPU via fiber. Alternatively, with PCI Express, there are restrictions on the transfer mode, so the logical layer may be CXL, but this is not a big issue. The point here is that it seems to use PCIe as the physical layer.


With fiber-optic Ethernet, FEC inevitably adds latency. To avoid this, the idea is to keep the speed of each lane low, rely on PCI Express error correction and FLIT, and expand the bandwidth while keeping the communication latency between XPUs low.


Why doesn't Intel integrate everything with silicon photonics?


Why does Intel use CPO instead of integrating everything with silicon photonics? Here’s the story.


In Figure 7, XPU is of course a silicon process. Since it is XPU, it may be Intel 7 or Intel 3 now, and may be Intel 18A in the future. EIC is of course a silicon process, and if silicon photonics is used, PIC is also a silicon process.


Figure 7: 4Tbps is the total bidirectional bandwidth, 2Tbps in one direction. By the way, for the reasons mentioned in the main text, the EIC interface may have four 16-bit wide 32Gbps UCIe


The thinking up until now has been, "Wouldn't it be easier to manufacture if we integrated everything?" However, Intel has concluded this time that it would actually be more efficient to separate them into chiplets. While the EIC and PIC processes were not shown, the EIC will likely be around 22nm or 14nm, and the PIC will be around 45nm or 65nm.


The reason is simple. The EIC needs to pass the signal to the PIC at a certain voltage, the PHY takes up a large area, and if my assumption is correct, there is no need for protocol conversion or FEC at all, so high-speed logic is unnecessary. 32Gbps PHY may be a bit difficult to use with 22nm process, but it can be manufactured without any problem with 14nm process. And whether the PHY is made with 14nm or 18A, the area is almost the same.


To put it bluntly, cutting-edge processes are poorly suited (not impossible, but inefficient) to applications that require a certain voltage, because the operating voltage decreases as the process shrinks. In this case, an older process such as 22nm or 14nm handles the higher voltage more easily and, for the same area, costs less to manufacture.


The situation is even more extreme in PICs, where silicon photonics-based circuit elements were initially developed using planar processes rather than FinFET processes, and the dimensions of these elements are even larger.


In an invited talk by Intel’s James Jaussi at the 2022 Hot Interconnects conference, it was revealed that TIA was developed using a 22nm process (Figure 8). However, given that not all components can be manufactured at 22nm, I suspect that the process is actually a bit old.


Figure 8


Coming back to the topic: the old idea of “electrical and optical in the same piece of silicon” is unfortunately not realistic. The only realistic solution is to separate the components in the form of chiplets.


Dissolution of relationship with Knights Hill


When I saw the photo of the chip released by Intel (Figure 9), I thought of Knights Hill.


Figure 9: At first glance it looks like a single pair of 2 fibers, but inside there are 8 pairs, 16 fibers in total.


Knights Hill was unveiled at SC14 in November 2014 and scheduled for release in 2016 on a 10nm process, with plans for Intel to deliver it to ALCF as part of Aurora. However, at SC17 in November 2017, a blog post briefly mentioned that Knights Hill would be canceled.


According to an article preserved in a web archive, there were products in which the external interconnect (Omni-Path Fabric) came directly out of the CPU. That generation of Omni-Path Fabric was still 100Gbps over copper, while the next generation was expected to be 200Gbps over copper or fiber.


So Knights Hill was also planning to offer a version that would connect the next generation 200Gbps with optics, and there seemed to be discussions about incorporating silicon photonics into the mix, but that all ended with the cancellation of Knights Hill and the exit of Omni-Path.


Since that story is over, I don't know what kind of architecture was planned for Knights Hill with this optical interface, but it would probably have carried an external chip combining an EIC and an OIC, as with Knights Mill, which would have been cool.


However, in reality it is quite difficult to integrate an EIC and an OIC (the old process makes it impossible to raise the speed of the interface with Xeon Phi), and I suspect this may have been one of the reasons Knights Hill was canceled (although I think the biggest problem was that Intel's 10nm was nowhere near practical use in the 2016-2017 time frame). Today it would be entirely possible to build Knights Hill, in terms of both process and interface. In that sense, Knights Hill was 10 years too early.


Let’s get back to the 4Tbps CPO. Some might wonder how useful this interface really is, but Intel actually uses 100GbE or 200GbE for external connections with Gaudi 2 (Figure 10) and Gaudi 3 (Figure 11). Replacing these with the current 4Tbps fiber would make cabling easier, increase speeds, and potentially reduce the power required for communications.


Figure 10: From the Gaudi 2 white paper. 21 100GbE links, arranged as 7 groups of 3, interconnect the Gaudi 2 devices. Three additional 100GbE ports are used for external connections.


Figure 11: From the Gaudi 3 white paper. The links have gone from 100GbE to 200GbE, but 3 links still have to be bundled into each of 7 groups to interconnect 8 Gaudi 3 devices
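As a rough illustration of why a single CPO link is attractive here, the sketch below totals the chip-to-chip Ethernet bandwidth described in Figures 10 and 11 and compares it with the roughly 2 Tbps per direction of the CPO chiplet. It assumes each Ethernet link delivers its nominal rate in each direction and ignores protocol overhead and topology, so treat the ratios as ballpark figures only. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleUpFabric:
    """Chip-to-chip Ethernet links of one accelerator (illustrative model)."""
    name: str
    peers: int           # other accelerators each device connects to
    links_per_peer: int  # Ethernet links bundled toward each peer
    gbps_per_link: int   # nominal rate of one link, per direction (assumed)

    @property
    def links(self) -> int:
        return self.peers * self.links_per_peer

    @property
    def total_gbps(self) -> int:
        return self.links * self.gbps_per_link


# Figures taken from the captions: 7 peers x 3 links, 100GbE or 200GbE.
gaudi2 = ScaleUpFabric("Gaudi 2", peers=7, links_per_peer=3, gbps_per_link=100)
gaudi3 = ScaleUpFabric("Gaudi 3", peers=7, links_per_peer=3, gbps_per_link=200)

CPO_GBPS_PER_DIRECTION = 2048  # 32 Gbps x 8 wavelengths x 8 fibers

for fabric in (gaudi2, gaudi3):
    ratio = fabric.total_gbps / CPO_GBPS_PER_DIRECTION
    print(f"{fabric.name}: {fabric.links} links, {fabric.total_gbps} Gbps, "
          f"about {ratio:.2f}x one CPO chiplet per direction")
```

Under these assumptions, the 21 links of Gaudi 2 add up to 2,100 Gbps, roughly what one CPO chiplet carries in each direction, while Gaudi 3's 4,200 Gbps would correspond to about two.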


Other AI processor vendors have adopted similar configurations, and there is a huge demand for point-to-point applications between these chips. Will it be adopted by Xeon? This may seem a bit strange, but as a solution provided by Intel Foundry, it seems promising.


In contrast, the old vision of integrating electricity and light in a single piece of silicon remains premature and technically difficult. Is it possible? It's doubtful, to be honest. No matter how you look at it, 3D stacking is more flexible, less expensive, and more reliable.


Optical computing, the next hot spot


As Yole said, in recent years, due to a variety of reasons, optical computing has become an emerging force.


But they also acknowledge that optical computing is still in its early stages. As mentioned above, some large companies have shifted their focus from optical computing to optical I/O, but new optical computing startups continue to emerge, exploring various approaches.


Optical processors are mainly targeted at AI inference tasks. Optical quantum computers based on qubits and other quantum effects, by contrast, can be used for a variety of applications such as simulation, optimization, and AI/machine learning.


Yole estimates that the first optical processors will start shipping in 2027/28. The first shipments in 2027 will likely be for custom systems implementing parts of the technology, with the majority of revenue coming from non-recurring engineering (NRE) services. By 2028, direct sales of general-purpose systems equipped with optical processors will begin. Starting in 2029, optical processors will be gradually taken up by early adopters, followed by OEMs and system integrators. By 2034, Yole estimates that the total number of optical processors will reach nearly 1 million units, representing a multi-billion dollar market value.


Yole also predicts that shipments of photon-based quantum computers will grow significantly starting in 2030, with companies such as Quandela, QUIX and Pasqal leading the way. By 2034, the market is expected to be worth several hundred million dollars at the system level. In the coming years, most of the revenue in this field will come from projects and NRE.



Optical computing is not a new concept, and there are many ways to implement optical logic gates, with photonic integrated circuits and quantum optics being the most interesting approaches today. However, despite progress, practical optical logic gates still face major challenges: to compete with electronic gates they must meet multiple criteria, such as cascadability between gates, scalability, and recovery from optical losses. Current research typically involves single gates or simple circuits, and the development of large-scale optical computers is still in its early stages.


Silicon photonics is an enabling technology for optical computing due to its scalability. One of the biggest issues in photonics has always been integration. With integrated optics rapidly advancing through different material approaches (SOI, SiN, TFLN, graphene, BTO, polymers), this could pave the way for practical optical processors based on PICs. Increased integration will also benefit the quantum optics community by enabling the development of quantum optical computers with more qubits and in a compact form factor.


There are many ways to make an optical processor. It can be analog or digital, and it can use various optical media to process data, such as PICs, free-space optics (FSO), or optical fibers. For qubit-based optical quantum computers, Yole considered three different approaches. One uses photonic qubits, while the other two use photonics to control non-photonic qubits, such as trapped ions and neutral/cold atoms.


In addition, some companies claim to be developing optical quantum computers that are not based on qubits but instead use the quantum effects and nonlinearities of light. Novel materials such as metasurfaces and SiC are also being developed for optical processors, although they are still at a very early stage.



The success of optical computing requires a multi-dimensional approach that addresses integration challenges, manufacturing complexities, and infrastructure requirements. In terms of geopolitics, especially regarding the US/China ban, when China's domestic chip production catches up, the United States will need to have already begun to conquer the next technological frontier of advanced computing, such as light-based computing or quantum computing. The optical quantum supply chain is still in its early stages, and the demand for advanced products that require extensive R&D is high, resulting in long lead times and hindering progress.


Nevertheless, the supply chain remains highly dynamic, with numerous players offering PIC foundry services, including GlobalFoundries, TSMC, Samsung, LioniX, etc. The industry is still struggling with the “low-volume problem” as the industry has not yet reached the scale and commercialization stage, and the current focus is still on development and prototyping.


Companies working on optical computing have raised nearly $3.6 billion in the past five years. The race for faster, more efficient computing is intensifying as giants like Google, Meta, and OpenAI push AI capabilities to the limit. The latest round of funding highlights investors’ confidence that photonics can provide the breakthroughs needed to sustain AI progress in the future.


However, as with quantum computers in general, it is difficult to predict when the inflection point for optical computing will occur. Optical computing platforms are expected to see some level of use in academic and private research over the next few years, but whether they will achieve widespread applicability and adoption in the short to medium term remains uncertain.


Reference Links

https://pc.watch.impress.co.jp/docs/column/tidbit/1626432.html#Photo02_l.jpg


https://www.yolegroup.com/press-release/could-optical-computing-solve-ais-power-demands/



*Disclaimer: This article is originally written by the author. The content of the article is the author's personal opinion. Semiconductor Industry Observer reprints it only to convey a different point of view. It does not mean that Semiconductor Industry Observer agrees or supports this point of view. If you have any objections, please contact Semiconductor Industry Observer.


