Spotlight

Life After NVIDIA: The Coming Crisis of Data Center Silicon

Editor’s Note: On Mondays we turn the spotlight outside of Porter & Co. to bring you exclusive access to the research, the thinking, and the investment ideas of the analysts who Porter follows.

(If you missed our announcement explaining your newest membership benefit, The Porter & Co. Spotlight, you can read it here.)

Each month we’ll focus on an analyst who isn’t part of the Porter & Co. family, whose work Porter thinks you need to be aware of – and that he believes could make a big difference to your finances.

These are analysts Porter has known and followed for years, if not decades – whose ideas and insights have withstood the test of time. This month’s Spotlight is a man who truly exemplifies this. Enjoy.

Meet the Prophet

Prophets walk among us.

I (Porter) have met them. Men and women who can intuit the future with an almost supernatural degree of accuracy. 

I’m not talking about predicting what a company’s earnings will be next quarter, or which sectors will benefit from some new policy.

I mean envisioning what the world will look like 5, 10, and even 20 years down the line… and being so close to the mark it’s uncanny.

Over the course of my life, I have maybe met a handful of these thinkers… and the greatest of them is our Spotlight today.

He’s predicted everything from the rise of the microchip to personal computers, iPhones, Netflix, cryptocurrencies and more – often years before anyone else.

I’ve been following his work for close to 30 years, and I can’t express the impact he’s had on my life.

When I was in my early 20s he helped shape my understanding of supply-side economics… the inherent morality of capitalism… and the dangers of collectivism.

He also helped lay many of the foundations for my first newsletter, as it was his research that opened my eyes to the inevitable dominance of the internet.

Without him, I likely would never have seen the rise of Amazon, Qualcomm, Adobe or any of my other early recommendations. 

All these years later, when I want to know where the world is going, George Gilder is one of the first I turn to. After speaking with George about the future of AI, Nvidia, and parallel processing, I wanted to bring his work to you too.

Below you’ll read George’s Life After Nvidia thesis – that until now was reserved for the members of his research advisory – and if you want to find out how George is investing in this next trend, go here now.

Enjoy.


INSIDE: NVIDIA is the Center of the AI Universe • Approaching a Pinnacle • The Graphene Moment

NVIDIA IS THE CENTER OF THE AI UNIVERSE

Kaboom! Or at least a stock market boom for teradollar goliaths. Watch the sky fill up with unicorns, moonshots, and transcendental sand!

Behind this boom, as everyone knows, is the explosive rise of artificial intelligence (AI). Machines aiming to augment or even to usurp human minds are the dynamo at the heart of the new silicon economy. In impact, AI is “more profound,” says Google Chief Sundar Pichai, than “fire or electricity.”

Everyone also knows that the center of the AI universe is NVIDIA, which recently reported blow-away earnings from its exponentially advancing regime of parallel processing on graphics processing units (GPUs).

Under the charismatic helm of Jensen Huang and his technical spearhead Bill Dally, NVIDIA has moved from gaming chips to generative mind machines. Affecting every industry, this coup drives computer architecture from serial processing step-by-step on individual microprocessors to massively parallel “accelerated computing” on huge arrays of GPU chips and boards, racks and datacenters.

As Huang explains it, computing has moved beyond Moore’s Law, with its density doublings every two years or so, to computational capabilities rising some 10,000-fold every 20 years.

Now with NVIDIA’s labyrinthine parallelism, Jensen hails an ascent of a million-fold in computer capabilities in a decade. Relevant to investors, his tech chieftain Dally points to a growth rate of demand for NVIDIA’s devices expanding from threefold every year to 16-fold annually, as machine learning performance tests portend an ever-increasing lead over all rivals.
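To make these growth rates concrete, here is a rough back-of-the-envelope translation, using only the figures quoted above, of what each claim implies about doubling times:

$$
\begin{aligned}
\text{Doubling every two years (Moore's Law):}\quad & 2^{20/2} = 2^{10} \approx 1{,}000\times \text{ per 20 years} \\
\text{10,000-fold per 20 years:}\quad & 10^{4} \approx 2^{13.3} \;\Rightarrow\; \text{doubling every } \tfrac{20}{13.3} \approx 1.5 \text{ years} \\
\text{Million-fold per decade:}\quad & 10^{6} \approx 2^{19.9} \;\Rightarrow\; \text{doubling every } \tfrac{10}{19.9} \approx 0.5 \text{ years}
\end{aligned}
$$

Each successive claim shortens the implied doubling time, from two years to roughly eighteen months to roughly six months.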

Under Jensen’s vision of “accelerated computing,” even this pace of advance is still compounding today, fulfilling Dally’s 30-year dream of “exaflop” computing at 10¹⁸ operations, or a quintillion steps, every second. On toward zettaflops (10²¹), NVIDIA’s supercomputer “EOS” summons “a new dawn” at its current level of 43 exaflops and then embarks on a quest toward “quantum computing.”

All these raptures of hardware invention, miniaturization, and multiplication through huge matrices of chips and boards and Brobdingnagian datacenters would amount to little without NVIDIA’s feats of software programmability. Consummating this explosion of silicon hardware is the parallelizing software NVIDIA developed in 2007: CUDA (Compute Unified Device Architecture). Converting video maps to software mindscapes, this utterly dominant software standard enables ordinary C++ programmers to take massive datasets, encompassing the Internet and beyond, and process them all but instantaneously on NVIDIA’s neural networks.
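For readers who have never seen it, here is a minimal sketch of what CUDA programming looks like in practice (illustrative toy code, not NVIDIA’s own, assuming a machine with the CUDA toolkit and a compatible GPU). You write one small C++ function, a “kernel,” and a single launch runs it across thousands of GPU threads at once:

```cuda
// Illustrative only: a toy CUDA C++ kernel, not NVIDIA production code.
// Each GPU thread computes one array element, so one launch spreads the
// work across thousands of cores in parallel.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale_add(const float* x, const float* y, float* out,
                          float alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // this thread's element
    if (i < n)
        out[i] = alpha * x[i] + y[i];                // one element per thread
}

int main() {
    const int n = 1 << 20;                           // about a million floats
    const size_t bytes = n * sizeof(float);
    float *x, *y, *out;
    cudaMallocManaged(&x, bytes);                    // unified CPU/GPU memory
    cudaMallocManaged(&y, bytes);
    cudaMallocManaged(&out, bytes);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    scale_add<<<blocks, threads>>>(x, y, out, 3.0f, n);  // run on the GPU
    cudaDeviceSynchronize();                         // wait for the GPU to finish

    printf("out[0] = %.1f\n", out[0]);               // expect 5.0
    cudaFree(x); cudaFree(y); cudaFree(out);
    return 0;
}
```

The point of the sketch is the programming model: the same pattern, scaled up from one toy kernel to enormous matrix arithmetic, is what lets CUDA turn racks of GPUs into generative machines.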

Jensen recently announced the latest CUDA revelation, named CUDA Q, integrating NVIDIA’s empires of GPUs with magical entangled qubits of “quantum computing.”

Though Jensen suggested that stupendous quantum machines are well on the way, we at Gilder’s Technology Report regard all modern computers as quantum machines. After all, my first book on chips, “Microcosm” (1989), bore a subtitle: “The Quantum Era in Economics and Technology.” But the existing fancy for computing with quantum “qubits” is chiefly an intriguing project in academic physics. It’s a way to inculcate kids with a practical sense of the quantum beyond the “quantum enigmas,” Heisenberg uncertainties, and self-contradictory Schrödinger equations and Dirac dualities of waves and particles that lend an inscrutable mystique to studies in the field.

All the same, as Carver Mead, Caltech’s and the world’s leading sage of engineering physics, asserts, “I can’t think of a single case where theoretical physicists have aided technology. They don’t understand electronic bandgaps. They haven’t even figured out what an electron is.” As I wrote in a recent Gilder’s Guideposts, the engineers have been left to contrive practical solutions on their own.

CALTECH CELEBRATION

I write these words from Pasadena, California, the home of Caltech, and the inspirational center of the continuing spiral of technological and scientific achievement.

At the venerable gothic Athenaeum here, I walked up curving marble staircases past statues of Albert Einstein and Linus Pauling. I recall Bill Dally’s avid recollections of his Caltech classes “in the physics of computation” with Nobelist Richard Feynman and with Mead. Mead’s career and 90th birthday were celebrated at Caltech back in May by the entire Caltech community, in a series of exalted events also attended by scores of Carver’s former students and associates.

From Mead’s Caltech classrooms ultimately stemmed the neural networks and parallel processing architectures behind NVIDIA’s “generative” feats and mind-blowing market cap of nearly three trillion dollars.

After a career in which he defined the field of VLSI (very-large-scale integrated circuits) and both researched and named Moore’s Law for his friend Gordon Moore of Intel, Carver Mead is still working devoutly at Caltech on the most fundamental problems of physics and engineering. Among the scores of companies led by his former students is Impinj (PI), where he was founding chairman. At the heart of the internet of things, Impinj has roughly tripled in value in recent months.

As Carver celebrated his birthday with a venerable Watson Lecture and campus reunions, he was still exploring the very nature of the electron, pondering its resistance, and its potential for superconductivity—even the possibility of it leading to quantum computers.

At the heart of quantum computing, however, is a return to analog, replacing simple bits and bytes with continuous “qubits” or quantum bits.

Why would you want to do that?

Until Claude Shannon proved the superiority of digital in 1948, the regnant computing machine was his boss Vannevar Bush’s “Analog Computor.”

Rather than translate all inputs to the simple and clearly intelligible binary on-off switches of digital mathematics and Boolean logic, analog contrives continuous multi-dimensional paradigms of reality. For all the bold assertions of quantum computing projects and companies, these so-called “qubits” and their computing outputs have nothing to do with actual bits and are inherently beyond the grasp of the very minds they aspire to model.

Essentially nanoscopic analog clouds, qubits purport to capture the paradoxical reality of quantum matter as both particles and waves in continuous flows and phases. Analog computing may be instantaneous (as MIT’s Seth Lloyd points out, the very universe is a quantum analog computer). But this means all the hard work is relegated to input-output. Relegated to God, so to speak. So don’t hold your breath for the “second coming” of analog computing.

As trumpeted by Jensen Huang, this NVIDIA venture into qubits is meant to signal that the company has all futuristic paths dauntlessly covered. But it should signal to NVIDIA’s followers and investors that the company is approaching a pinnacle, which is another way of signifying a top.

APPROACHING A PINNACLE

Over the next decade, this model is all coming to an end. To understand why, it is necessary to explain the basics of chip fabrication. Sorry!

Nearly all chips are made from wafers as large as 12 inches in diameter, sliced from “boules” of crystalline silicon. In the climax of a 700-step process, “wafer-fabs” (wafer fabrication plants) inscribe the design of the chip hundreds or even thousands of times across the wafer.

To accomplish this mass miniaturization, they use 580-ton machines that emit beams of “deep” or “extreme” ultraviolet radiation through “reticles” containing the pattern of circuitry. In those some 700 steps, the pure silicon surface becomes layered with silicon dioxide insulators, varied chemical “dopants” to render positive and negative sources, drains, and gates, and copper or aluminum wires to interconnect the transistor switches that enact the computational logic.

NVIDIA and all other chip companies rely on this wafer-fab process, in which that 12-inch wafer—once it is elaborately inscribed with circuitry—is effectively broken into silicon smithereens. That is, it is “diced” into hundreds or thousands of tiny separate rectangles each a few millimeters wide. Bare unlinked nano-patterned slivers of silicon, these “dice” do not really become “chips” until they are sealed in plastic packages in factories by thousands of white-garbed, masked women on remote Pacific islands.

Then, often in Taiwan or mainland China, the now-encapsulated silicon slivers are arduously attached to printed circuit boards with larger wires that interconnect various chips in patterns that constitute the final computing machine.

Including memories, NVIDIA’s acclaimed GPUs actually consist of hundreds of inter-connected chips. The leading-edge Hopper 200 GPU bears some 35,000 components, weighs 70 pounds, and costs $250,000 for just one device!

Since scores of them are needed at data centers, those markets ultimately mount toward trillions of dollars and render these devices by far the most important, profitable, and transformative collections of chips ever made.

Yet, at the same time that Jensen Huang was proclaiming the immense accomplishments of NVIDIA to grateful investors and customers and projecting markets for future AI engines in the trillions of dollars, the outline of a new regime was emerging.

It came in two riveting announcements. A small Silicon Valley company named Cerebras launched its radical new technology. Meanwhile, a professor at Georgia Tech and his mostly Asian students introduced an even more revolutionary new electronic material that makes the Cerebras concept the path to an almost entirely new electronics paradigm.

Rather than assembling scores of GPUs in hundreds of chips deployed across scores of printed circuit boards interconnected by fiberoptic channels in towering racks at hundreds of gigantic data centers, Cerebras achieves a vast simplification. It banishes the dice, the chips, the packages, and the racks and puts all circuitry on a single 12-inch silicon platter or wafer.

In a complementary breakthrough, the Georgia Tech announcement proposes to coat such a “platter” or wafer with a 2D film of the new miracle material, graphene.

Graphene, a single layer of carbon atoms that conducts a thousand times faster than copper, can usher in an era of “terahertz electronics.”

At both Cerebras and Georgia Tech, the new paradigm is called “wafer-scale integration.” At least in theory, it makes much of what NVIDIA does obsolete. In recent months, Cerebras issued a report detailing the superiority of its wafer-scale technology to NVIDIA’s GPU chips and systems.

Cerebras’s wafer contains 57 times as many transistors as a comparable NVIDIA Hopper 100 GPU. At 21 petabytes (10¹⁵ bytes) a second, its memory bandwidth is 7,000 times as great as NVIDIA’s H100’s. At 214 petabits a second, the overall bandwidth of its circuitry — its “fabric bandwidth” — signifies its overall potential performance. That speculative figure of “fabric bandwidth” indicates a 3,715-fold superiority over NVIDIA’s machine.
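As a rough consistency check on those figures, a back-of-the-envelope division of the wafer’s claimed memory bandwidth by the claimed ratio recovers the baseline being compared against:

$$
\frac{21\ \text{PB/s}}{7{,}000} = 3\ \text{TB/s},
$$

which is indeed the order of magnitude of the H100’s published high-bandwidth memory, a few terabytes per second.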

NVIDIA’s own figures may vary, and no one believes that Cerebras will displace NVIDIA tomorrow. And no one supposes that graphene from Georgia Tech and its onetime partners in Tianjin, China, will displace silicon anytime soon.

For one thing, as I’ve been writing in my Gilder’s Guideposts, the U.S. government regards Georgia Tech’s previous links to Tianjin as a threat to national security. In an action that suggests the real threat to national security comes from the FBI’s “China Initiative,” it has sent most of the Chinese researchers home. There they have collected the equivalent of hundreds of millions of dollars in venture capital to carry out their plans in China.

Nonetheless, the writing is on the wafer, inscribed in lines less than five nanometers wide. Wafer-scale is simply far more elegant and efficient than chip scale.

Cerebras machines are now being installed in data centers in silicon deserts such as Dubai, where they can afford to experiment. Other announced customers include the Mayo Clinic, Argonne National Laboratory, GlaxoSmithKline, AstraZeneca, the University of Edinburgh, and startup Jasper AI.

So far, these initiatives represent no short-term commercial threat to NVIDIA. Changes so radical will take time to be adopted.

Wafer scale is intrinsically simpler and better, but it is still embryonic and outside the torrential mainstream. While undergoing the challenge of wafer scale, however, the industry will also be facing an even more fundamental shift. It is moving beyond silicon as the prime computing material and beyond data centers as the prime vessels of computation.

Some 40 percent of NVIDIA’s total market is giant data centers. These are immense kludges. A finite amount of electricity enters and it must be apportioned between computing tasks and heat extraction. With the data centers’ profile dominated by air conditioning towers or “chiller units,” the thermal crisis prophesied by Richard Feynman in 1959 (see my GTR May 2022 issue) has returned.

Today, as Google’s Urs Hölzle once confessed, the bulk of the energy is used not to implement chip designs but to chill the wires and switches. Since heat is essentially noise, the data center may be said to produce more noise than signal.

In 2024, some 65 years after Feynman’s prophecy, with then 25-year-old Carver Mead taking notes, we face a world of electronics that strangely and unexpectedly repeats the predicament that Feynman depicted in his address. Just as the earlier regime was dominated by a few score five-ton mainframes at IBM and the BUNCH (Burroughs, Univac, NCR, Control Data, and Honeywell), we have a limited number of integrated computer systems. These huge contraptions perform feats of artificial intelligence in gigantic data centers run by Google, Amazon, Meta, Microsoft, and IBM.

Together these centers now consume over 1% of all the world’s power. In the AI boom, they are increasing their power use at a rate of 60% annually and are collectively on a path to use some 8% of the world’s energy.
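Taken at face value, those two numbers imply a strikingly short timetable. One percent of world power compounding at 60% a year reaches 8% in only about four and a half years:

$$
1\% \times 1.6^{t} = 8\% \;\Rightarrow\; t = \frac{\ln 8}{\ln 1.6} \approx 4.4 \text{ years}.
$$

That is the sense in which the data centers are “on a path” to such levels, assuming the growth rate itself does not change.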

They already require multiple TVAs’ worth of electricity, each a 30-gigawatt pump of power. Meanwhile, in these centers, the “planar” insight of Bob Noyce and Jean Hoerni, laying down all the wires and switches on the surface of silicon chips, is giving way to a baroque, three-dimensional architecture of “FinFETs” and other vertical structures, vias and shafts, with fins needed to shed the heat in castles of sand.

As even NVIDIA’s technical chief Bill Dally confessed early this month speaking at Georgia Tech, “Transistors are getting worse as the technology advances.” Laying down the nanometer patterns on chips entails use of those extreme ultraviolet photolithography machines. These EUV “mainframes” and their supporting infrastructure occupy basketball courts of space. Made by ASML in the Netherlands with a worldwide supply chain, they are priced at some $360 million apiece and weigh 165 tons, dwarfing even the five-ton mainframes of old.

Most of these problems of the new centralized silicon regime have their roots in the intrinsic properties of silicon chemistry. Obscured through decades of progress into the planar regime is the inexorable three-dimensional topology of the silicon molecule. With four covalent bonds needed in its stable form, sticking out in all directions, silicon cannot ultimately lie flat. As a 3D substance, silicon does not naturally form one-way “channels,” and is not a ready carrier of communications. It requires wires. Forced into an ultimate planarity inimical to its inherent chemistry, the molecule is finally too coarse for the nanocosm, becoming a blob with a topology that rebels and breaks.

At the nanometer scale, devices are increasingly prone to a phenomenon called “electromigration,” in which the momentum of flowing electrons—an “electron wind”—gradually displaces atoms in the ever-thinner metal interconnects. Damaging the atomic structure of the material, this effect undermines the very regularity that assures reliable digital electronics.

Moore’s Law’s planar revolution increased the number of transistors 25-fold every 10 years. But now the industry is replacing its planar circuitry with stacks of orthogonal chip layers and chiplets piling up like “si-scrapers” with “copper heat pipes” to prevent melting. The iconic system on a chip becomes a server in a baroque data center.

The rule in technological enterprise is to waste what is abundant to save what is scarce. Planar electronics prospered by wasting transistor switches. Concentrating computation in space economized on wires. The defining abundance was switches and the scarcity was three-dimensional connectors.

The rise of fiber optics, as Eric Schmidt’s Law ordained in the 1990s, meant that the “wires,” now photonic channels of light, became “faster than the backplane of the computer.” This, according to Schmidt in a call to me past midnight from his lab at Sun Microsystems, made bandwidth the defining abundance. “It hollowed out the computer and spread it across the network.” In Sun’s famous phrase, “the network became the computer.”

It soon turned out, however, that the process of converting the computational information to light waves and back again resulted in concentrating the heart of the “network computer” in centralized hubs. In 2020, NVIDIA bought the Israeli InfiniBand and optics supplier Mellanox, whose optical top-of-rack switches suffused these massive data centers with fiber-optic infrastructure. In these data centers, the flaws of silicon are becoming evident.

As the chief silicon supplier to the data center, NVIDIA is abandoning planarity and integrating software and hardware, wires and switches, memories, and parallel processors in its 3D CUDA system to escape long slow passages to remote memories.

All these developments near the end of Moore’s Law engender and reflect increasing centralization. The reason silicon triumphed was that it united knowledge with power, distributing computing tools as widely as human minds. The essence of the Moore’s Law regime was fast switches compensating for slow wires. Focusing on service of individual human minds, with their serial thought processes and intuitions, wielding PCs, pods, and smart phones, microchip companies could extend the planar revolution. They could follow the logic of distributed processing down to the individual human user.

As any technology approaches its climax of obsolescence, its incumbent prophets often adopt a rhetoric of completion or consummation. Proclaiming ourselves on a mountain top—the pinnacle of human achievement, envisioning apocalyptic epiphanies of manmade new mind—we are becoming delusionary apostles of a suboptimal cult. We are mistaking a “local minimum” for a summit of global singularity. We are engulfed in machines that obscure our real future behind their telecom clouds.

In its final stages, a prevailing technology is enshrined in a monotheistic temple. Silicon chips gain a nimbus of artificial mind that obsoletes their creator. Aluminum seems intrinsic to air flight rather than a phase in the advance of ever lighter and stronger materials. From the pinnacle of their own attainment, in the silicon castles and complexities at the end of Moore’s Law, in the Googleplex in Mountain View or Apple’s “Infinite Loop” or in NVIDIA’s Santa Clara launchpad, even in the “Azure” skies of Microsoft on the Columbia River, the prophets of incumbency sometimes seem to believe they have achieved an eschaton—a final thing. They believe they are exhausting both the nanocosm and human supremacy in it.

THE GRAPHENE MOMENT

At hand in our 21st Century predicament is a new age of two-dimensional sheets of carbon: “the graphene moment”.

The titan of science fiction Neal Stephenson all but dubbed it, in 1995, The Diamond Age: “If nanotech can manipulate atoms at will, it can assemble diamond structures from carbon atoms. In diamond, a dense network of strong bonds creates a strong light and stiff material.”

Graphene is a two-dimensional form of diamond. It is a scintillating jewel one atom deep, so thin as to be almost invisible. But it is so strong you could stand upon it and so flexible you could bounce on it like a trampoline. There, suspended above the world, as if dancing on air in a spaceship, you might see graphene as a substance to signify a new technological epoch.

Diamond, the three-dimensional form of carbon, is by contrast a complex miracle, a configuration less likely and less thermodynamically stable than graphene. This two-dimensional phase of carbon turns out to be more natural and more stable than diamond.

Graphene is so well hidden in graphite that it took centuries to identify it as a separate substance. You encounter graphene routinely as a single layer of your graphite pencil lead, but you don’t really see it. At less than three nanometers thick—that’s three billionths of a meter—it is essentially invisible. If you see a line on your paper, it is not graphene but rather thousands or even millions of layers of graphite.

Beyond pencil lead, graphite has a host of uses, such as lubricants, brushes for electric motors, arc lamps, anodes for lithium-ion and other batteries, and many electronic and electrical applications. As part of the Manhattan Project, Enrico Fermi, after fleeing fascist Italy with his Jewish wife in 1938, used graphite as a neutron moderator and reflector. This Fermi insight enabled the first nuclear reactor, the “Chicago Pile,” in 1942, with his X-10 Graphite Reactor following at Oak Ridge in 1943.

A residue of rocks mined from the ground, graphite is the most stable form of carbon within the earth under usual conditions. Holding the layers together is the elusive static attraction of molecular “van der Waals forces.”

Bearing a nimbus of mind in artificial intelligence (AI), silicon seems a summit of human achievement. But compare it to graphene. In the face of the new epoch, silicon is a slow conductor, prone to distortion and buildup of heat, altogether a transitory element. In its form as the glass silica of fiber optics, it entails constant conversion from light to electricity and back to play its role in the data center. A maturing and even passing phase, silicon chips now preen for enshrinement by national subsidies, to be immured by mandates in the laws and worship of increasingly socialist governments.

Complexity, mutability, mandates, and industrial policy regimes, however, signal a material in the obsolescent phase of its epochal reign. By contrast, graphene is a material so supremely simple, so akin to carbon life, so inevitable in its purity and power that it promises to sweep its rivals away. With features measured in nanometers, it nonetheless prompts dreams of space elevators. NASA wants it “in every part of the space program” and Ford wants it in all its future cars, after using it in the motors and insulators of its current models, largely as a noise shield on the engine.

It is a material that, in parts per million, strengthens concrete by 35% and reduces the thermal threshold or buckling point of asphalt by some 20 degrees F. It is a bionic material so ductile and exquisite in the bearing of signals that it can repair a severed spinal cord, and a material so reliable and robust in its precise regularity that it inspires realistic notions of “terahertz” electronics. At trillions of cycles a second, like light, terahertz signals are a thousand times faster than the billions of cycles in our currently seething gigahertz radiations of infrared heat.

Georgia Tech’s paragon of the new paradigm of graphene electronics is Walter de Heer. Like many figures in chemistry and physics, de Heer was in the thrall of Linus Pauling’s work on the “Nature of the Chemical Bond” (1939) and his Nobel Prize lecture in Stockholm in 1954. “The miracles of graphene all began with Linus Pauling,” says de Heer. Reconciling quantum theory and chemistry, Pauling expounded the details of the resonance theory of molecular bonds, focusing on carbon. He envisaged hybrid bonds joined by “resonance” effects from quantum mechanics. In particular, as de Heer noted, Pauling’s resonance theory explained the need for planarity in some forms of carbon.

From the early 1990s at Lausanne, de Heer was intrigued by the findings of Richard Smalley’s laboratory at Rice University in Houston. Smalley had contrived the zero-dimensional C-60 particles called buckyballs or fullerenes and had also advanced the cause of one-dimensional carbon nanotubes. A Nobel laureate for his breakthroughs in these carbon “allotropes” (or different forms), Smalley died of leukemia at age 62 before he could extend his sway to graphitic monolayers.

That baton in his lab at Rice fatefully passed to the inventive prodigy James Tour, who, taking a different path from de Heer, would pursue the study of graphene into unique new directions and possibilities, yielding some 18 pioneering new companies, mostly in Israel. We write about them frequently in our newsletters.

In my upcoming series that is being graciously shared with you, dear reader, by our host Porter Stansberry and the fine team at Porter & Co., I will delve deeper into exactly how to participate in this Graphene Moment and ways to capitalize on these monumental breakthroughs.

Until Then!


George Gilder
Editor, Gilder’s Technology Report & Gilder’s Moonshots

Special Note from George Gilder: A Big AI Lie is spreading through the media and Silicon Valley boardrooms today. It’s attempting to cover up a harsh reality… As our laptops have become thinner and our screens wider, these silicon-based computers are essentially the same machines that took five minutes to load a basic webpage in 1999.

Now Silicon Valley is telling the world that modern computers are powerful enough to think and reason like a human brain? But people ignorant of this Big AI Lie could end up holding some of the market’s most expensive stocks, right before they crash.

I say don’t fall for it.

To find out the truth behind what’s really happening with AI in 2024, my research uncovered a company at the center of a new breakthrough that’s currently trading under $10. Discover the details Right Here.