Comments for The Next Platform https://www.nextplatform.com/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Tue, 28 Jan 2025 00:50:57 +0000

Comment on How Did DeepSeek Train Its AI Model On A Lot Less – And Crippled – Hardware? by itellu3times https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246851 Tue, 28 Jan 2025 00:50:57 +0000 https://www.nextplatform.com/?p=145225#comment-246851 So it’s all mechanics, and really not a touch of actual theory.
But then, what is the theory behind LLMs? Doh!

Comment on How Did DeepSeek Train Its AI Model On A Lot Less – And Crippled – Hardware? by Carl Schumacher https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246845 Tue, 28 Jan 2025 00:03:51 +0000 https://www.nextplatform.com/?p=145225#comment-246845 If this is indeed “DeepFake”, then it’s one of the best engineered shorts (both in an energy and IT capital sense) in decades.

Comment on How Did DeepSeek Train Its AI Model On A Lot Less – And Crippled – Hardware? by Tapa Ghosh https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246833 Mon, 27 Jan 2025 22:33:03 +0000 https://www.nextplatform.com/?p=145225#comment-246833 “And here is another side effect: The V3 model uses pipeline parallelism and data parallelism, but because the memory is managed so tightly, and overlaps forward and backward propagations as the model is being built, V3 does not have to use tensor parallelism at all. Weird, right?”

This is mostly because of the small number of GPUs used. They can also use expert parallelism to eliminate the need for TP; if you used more GPUs, you’d need TP.

Comment on GenAI Boom: Datacenter Spending Forecast Raised Again by Kris https://www.nextplatform.com/2025/01/23/genai-boom-datacenter-spending-forecast-raised-again/#comment-246632 Sat, 25 Jan 2025 21:04:31 +0000 https://www.nextplatform.com/?p=145212#comment-246632 Given Gartner’s report, which companies will benefit the most and are worth investing in as growth stocks?

Comment on OpenAI Declares Its Hardware Independence (Sort Of) With Stargate Project by HuMo https://www.nextplatform.com/2025/01/22/openai-declares-its-hardware-independence-sort-of-with-stargate-project/#comment-246419 Thu, 23 Jan 2025 18:54:41 +0000 https://www.nextplatform.com/?p=145210#comment-246419 Spot on! And yet rather soap-opera-esque … if I understand well, OpenAI found itself a wealthier new “suitor” who promised it the Stargate, but it also wants to remain friends with its ex, Microsoft. And meanwhile, its ex-ex, or former ex, now of xAI, is building its own Colossus dream castle, while suing OpenAI for lying about being open during their past romance, and it is also stating that its new suitor is not even that rich anyways (appearing obsessively jealous in the process)! Keeps me at the edge of my seat, wondering who will end up with who, when, where, and for how much? Particularly seeing how, as far as openness is concerned, a new potential romantic partner might just have hit town, looking to exotically challenge OpenAI’s o1, on the large language catwalk: https://api-docs.deepseek.com/news/news250120 8^b

Comment on HLRS Takes First Steps To Exascale by Slim Albert https://www.nextplatform.com/2025/01/21/hlrs-takes-first-steps-to-exascale/#comment-246406 Thu, 23 Jan 2025 15:44:09 +0000 https://www.nextplatform.com/?p=145208#comment-246406 Nice to see Hunter embracing the MI300A, for serious matrix-vector oomph at 1/10ᵗʰ the power consumption of an all-CPU machine (for example Top500’s #38 Shaheen III – CPU); granted, though, that all-CPU would be the way to go for graphs (e.g. Fugaku vs Frontier and Aurora in Graph500).

Comment on TSMC Can’t Be Caught Or Bought, Only Sought Or Stolen by Timothy Prickett Morgan https://www.nextplatform.com/2025/01/16/tsmc-cant-be-caught-or-bought-only-sought-or-stolen/#comment-246104 Mon, 20 Jan 2025 16:11:21 +0000 https://www.nextplatform.com/?p=145197#comment-246104 In reply to Eric Olson.

Correct. Happy fingers hit a zero unintentionally.

Comment on TSMC Can’t Be Caught Or Bought, Only Sought Or Stolen by Anonymouse https://www.nextplatform.com/2025/01/16/tsmc-cant-be-caught-or-bought-only-sought-or-stolen/#comment-246101 Mon, 20 Jan 2025 15:09:10 +0000 https://www.nextplatform.com/?p=145197#comment-246101 5 times faster is 300%?

Comment on TSMC Can’t Be Caught Or Bought, Only Sought Or Stolen by Eric Olson https://www.nextplatform.com/2025/01/16/tsmc-cant-be-caught-or-bought-only-sought-or-stolen/#comment-245889 Sat, 18 Jan 2025 01:29:40 +0000 https://www.nextplatform.com/?p=145197#comment-245889 By my calculation 6 times 5 is 30.

Comment on The Bespoke Supercomputing Architecture That Stood the Test of Time by Ray McConnell https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-245852 Fri, 17 Jan 2025 19:21:24 +0000 https://www.nextplatform.com/?p=143347#comment-245852 It would be interesting to understand where you are today given the recent advances in AlphaFold and related protein design tooling coming from DeepMind. BTW are they now your competitors in drug design going forward?
