
No "The AI bubble has finally burst" thread?

It's already "used" the data in training (and it'll be the same data as everyone else's, i.e. the entire internet). What they've released is the end result: the billions of weights for the nodes in the network. That is static and unchanging for every LLM, simply because deriving those numbers is so astronomically expensive and time-consuming.

PS: This is why LLMs are a complete dead end for AGI. It's like taking a backup copy of a person, telling them a story and asking them to think of the next word, which you write down. Then you kill them and reboot a new version of that old backup and tell them the slightly longer story, get the next word, kill, rinse, repeat. The person is unchanging and only ever knows their upbringing and the story you just told them. They don't learn.
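(For the curious, here's a toy Python sketch of that point. It's entirely my own illustration, not real LLM code: the "weights" are a frozen lookup table, and each turn just re-runs the same static function on a longer context. Nothing carries over between calls.)

```python
# Toy illustration: frozen "weights" plus a growing context.
# The model itself never changes between calls.
FROZEN_WEIGHTS = {"hello": "world", "world": "again"}  # stands in for billions of parameters

def next_word(context):
    # One "forward pass": read-only access to the frozen weights.
    return FROZEN_WEIGHTS.get(context[-1], "...")

story = ["hello"]
for _ in range(3):
    # "Kill, reboot, retell the longer story": no state survives between calls.
    story.append(next_word(story))

print(story)  # ['hello', 'world', 'again', '...']
```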
I understand bits of that :)

Ah ok so you're running the program offline and using the web as data? I was confused by "off-line" but that makes sense.
 
And we also have:


DeepSeek, the viral AI company, has released a new set of multimodal AI models that it claims can outperform OpenAI’s DALL-E 3.

The models, which are available for download from the AI dev platform Hugging Face, are part of a new model family that DeepSeek is calling Janus-Pro. They range in size from 1 billion to 7 billion parameters. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.
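A rough back-of-envelope on those parameter counts (my own sums, not from the article, and assuming 16-bit weights at 2 bytes per parameter): the memory needed just to hold a model's weights is roughly parameter count times bytes per parameter.

```python
def approx_weight_memory_gb(n_params, bytes_per_param=2):
    # 2 bytes per parameter assumes fp16/bf16 weights; 4 for fp32, 1 for int8.
    return n_params * bytes_per_param / 1e9

print(approx_weight_memory_gb(1e9))  # 1B-parameter model: ~2 GB
print(approx_weight_memory_gb(7e9))  # 7B-parameter model: ~14 GB
```

That's weights only; actually running the model needs extra memory on top for activations and context.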
Presumably it's going to need to be open source for people to be able to examine and so trust it - anything closed could open back doors and things.
 
I understand bits of that :)

Ah ok so you're running the program offline and using the web as data? I was confused by "off-line" but that makes sense.

The data is contained in the model once it is trained. No need for the internet; no need to use a cloud service. Need a pretty hefty computer for the high end models though.

GPT4All is the easiest way to go about running one locally, and they have a tonne of lower end models that work well for those with a regular computer.
 


-------
Think of it in relation to the dotcom bubble: it's like around '96 when Microsoft, having missed the boat on the Internet, said everyone now has the tools to web publish. The actual bubble pop was a while later, but factors of quality and discernment started to mingle with the buzzwords.
Much like we lost a load of real-world presence off the dotcom bust, a substantial proportion of business is going to go cheap and end up with an AI as fit for purpose as our #1 for...systems
 
Surely though if I ask a specialist question then there's no way it'll have enough data on that subject stored on my computer?

That was my intuition to begin with too, and I won't pretend to know how it works exactly, but it's something to do with the way data is stored as a vast neural net.

It's not storing data as regular files, like text files with facts in it; it somehow keeps the data in the same way the human brain does, as activation patterns within the neural net.

I'm basically as clueless as the next guy in regards to the details though, so take my explanation with a pinch of salt. I'm sure someone else will correct me
 
That was my intuition to begin with too, and I won't pretend to know how it works exactly, but it's something to do with the way data is stored as a vast neural net.

It's not storing data as regular files, like text files with facts in it; it somehow keeps the data in the same way the human brain does, as activation patterns within the neural net.

I'm basically as clueless as the next guy in regards to the details though, so take my explanation with a pinch of salt. I'm sure someone else will correct me
You're mostly right. It's basically compression.

Imagine telling someone from 1525 that you could store all the world's books in a device that fits in your hand. They wouldn't think it was possible, but of course it is now, due to how we store it. It's the same with LLMs; they're just more efficient at compressing and storing the information.
One of the major innovations is that they've split the model into "experts" and only use the ones needed to answer the query, rather than trying to use the whole thing for everything.
OpenAI and others have been doing MoE (Mixture of Experts) for a while. It's not that new. Here's an Nvidia article on it from last year:
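For anyone wondering what "experts" means in practice, here's a toy sketch of the routing idea. It's entirely my own illustration, nothing to do with DeepSeek's or Nvidia's actual implementations: a cheap router scores the experts, and only the top-scoring ones actually run.

```python
def make_expert(name):
    # In a real MoE, each expert is a large neural sub-network; here it's a stub.
    return lambda query: f"{name} expert handled: {query}"

experts = {name: make_expert(name) for name in ("maths", "code", "poetry")}

def route(query, k=1):
    # Toy router: score experts by whether their name appears in the query,
    # then run only the top k. The other experts stay idle, saving compute.
    scores = {name: int(name in query) for name in experts}
    chosen = sorted(experts, key=scores.get, reverse=True)[:k]
    return [experts[name](query) for name in chosen]

print(route("help me with my maths homework"))
```

The real versions learn the router jointly with the experts, but the shape is the same: per-query compute scales with the experts used, not the total parameter count.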


No, training is something only the big boys can do. It takes a long time and costs millions in GPU time.
What this does is make it much cheaper to do the end-user bit, the "fancy autocomplete" bit.

I think training is the thing that's actually been cracked here. A research lab at Berkeley claims that they've replicated the training methods from Deepseek for under $60 and can get a 3B model that's "comparable to larger systems".

Deepseek's full model is 685B, so quite a bit bigger, but the costs are now within reach of many more organizations.

I have no idea how this scales, though. And of course, have no idea if any of Deepseek's or other claims are true.
 
Actually I guess those times are in GMT in that chart. So must have been flagged all weekend and traders were hammering away at their keyboards at 8.55 EST. Not seen something like that since Truss got to her feet in the Commons to launch her mini-budget.
 
Surely though if I ask a specialist question then there's no way it'll have enough data on that subject stored on my computer?
Sure it will. The entire text of English Wikipedia is about 25GB, and that contains loads of redundant information. LLMs are kind of like lossy compression for text. If wikipedia.zip is like a PNG, then an LLM is like a JPEG; close enough to the real thing to fool the eye (or rather, fool the brain). So the "weights" data is even smaller than all of Wikipedia yet contains approximately all the same knowledge. Kinda. If you don't mind it being less confident about stuff that isn't common knowledge.
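The redundancy half of that analogy is easy to demo with nothing but the standard library (the "lossy" LLM half has no neat one-liner, so take this as the lossless baseline only):

```python
import zlib

# Highly redundant text, like much of the web, compresses dramatically.
redundant = ("The cat sat on the mat. " * 1000).encode()
compressed = zlib.compress(redundant, level=9)

print(len(redundant), "->", len(compressed), "bytes")
```

An LLM goes further by also throwing away the exact wording, keeping only the gist, which is why it fits so much more in so much less space.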
 
This is fucking insane... surely the markets must have had some kind of warning here for such a massive bubble burst? Trump's reaction is even weirder, he's actually kind of welcoming it.

obviously no warning hence the cliff!
and what's Trump going to say? More tariffs on an open source platform?

here's some perspective on the Nvidia share price....still way up on last year
 

Analysis from Pivot to AI:

Chinese company DeepSeek announced its new R1 model on January 20. They released a paper on how R1 was trained on January 22. Over the weekend, the DeepSeek app became the number-one free download on the Apple App Store, surpassing ChatGPT. DeepSeek was the tech talk of the weekend.


On Monday, the markets reacted.


The European markets opened first — six hours ahead of New York — and tech and energy stocks crashed. So US traders knew it was time to dump. [FT, archive]


Markets may recover. But the vibes of the US AI market, and its sense of superiority, have taken a beating.





Number go down


On Monday, the tech-heavy Nasdaq composite fell 3.1%, while the S&P ended the day 1.46% lower, after hitting a record last week. It was the worst trading day in five years, since the COVID crash of January 27, 2020.


Nvidia took the biggest tumble. The GPU maker saw $598 billion in value disappear from its market cap — the largest single-day market decline in US stock market history.


(Though, to be fair, several of the biggest single-day declines in the past year were also Nvidia, as its stock traced its usual rollercoaster trajectory through the AI bubble.)


A bunch of other techs in hock to AI went down, too. Oracle fell 14%. Super Micro Computer, which makes AI servers, slid 13%. Broadcom fell 17%. Microsoft and Alphabet (Google) also dropped.


Not all big tech stocks took a whooping. Apple was up 3.3%, overtaking Nvidia as the top company by market value. Meta was up 1.9%. Two tiny tech stocks surged after saying they would add DeepSeek to their platforms: Aurora Mobile ADRs soared 229% intraday and MicroCloud Hologram went up 67%. [Bloomberg, archive; Bloomberg, archive]


Energy stocks took a pounding. Oklo, the Sam Altman-backed nuclear fission reactor company, fell 26%, after gaining more than 60% last week.


DeepSeek’s killer advantage: it’s cheaper


DeepSeek is specifically training the weights — the bits in front of the actual model, which is either Meta’s Llama or Ali’s Qwen model. You’ll frequently see new models that have trained their weights differently, to beat whatever the hot benchmark is. That’s the work that DeepSeek claim to have done on a shoestring. [Stratechery, archive]


Is the R1 model better than all existing models? Well, it benchmarks well. But everyone trains their models to the benchmarks hard. The benchmarks exist to create headlines about model improvements while everyone using the model still sees lying slop machines. No, no, sir, this is much finer slop, with a bouquet from the rotting carcass side of the garbage heap. Goodhart’s Law as a service.


What matters here is that DeepSeek’s API is GPT-compatible and a lot cheaper than GPT-4o via OpenAI. Also, R1 is about as good as any of the current LLMs, and you can run it yourself.


DeepSeek R1 does what LLMs do. But it was reportedly created with a fraction of the resources. That’s the killer advantage.


We remain skeptical of DeepSeek’s claims to have trained on the cheap. The much-touted figure of $5 million to $6 million is just the final training run, which cost $5,576,000. [arXiv]


But that doesn’t matter while the model is cheaper via the API and you can run it at home — that’s what’s making customers leave OpenAI. And that’s what people discovered over the week since R1’s announcement.


DeepSeek’s Janus Pro-7B, released January 27, does images as well as text — and again, you can run it at home (albeit slowly) in 24GB of video RAM. It may not be as good as Midjourney or OpenAI’s Dall-E — but it doesn’t have to be. [Hugging Face; TechCrunch]


How’s OpenAI doing?


OpenAI was already deeply unprofitable. They priced ChatGPT Pro, with the o1 model, at $200 per month, and then a Chinese company comes out of nowhere and undercuts them.


OpenAI has been hemorrhaging key research staff and executives for a while now. In September, chief technology officer Mira Murati, chief research officer Bob McGrew, and VP of research Barret Zoph all announced they were quitting.


What’s really popped is OpenAI’s credibility. They marketed themselves as needing to burn all those billions of dollars on model training. Along comes DeepSeek with a model that works just about as well but for much cheaper, and saying it was trained for ridiculously less.


Sam Altman is good at raising money and good at making promises about the future — but OpenAI is a machine for spending money as fast as possible, not a machine for working efficiently.


OpenAI doesn’t need a massive AI infrastructure initiative like Stargate — they need to work out how to train more cheaply.


SoftBank, one of the big backers behind Stargate, is notorious for WeWork-level funding disasters — but five days from floating the idea to the crash might be a new record.


What happens next?


The reaction to the crash includes a lot of people explaining why the tech market panic over DeepSeek validates their previous position, whatever it was.


This crash doesn’t mean AI sucks now or that it’s good now. It just means OpenAI, and everyone else whose stock dipped, was just throwing money into a fire. But we knew that.


Slop generators are inexpensive now, and that’s a sea change — but the output is still terrible slop, just more of it.


We think the market is reacting to the shock. We expect the stock prices to go back up again, at least in the short run. Businesses that have products and customers should be okay. If cheap LLMs are deployed everywhere, they’ll run on Nvidia.


Venture capital firms who put money into startups training AI foundational models — and it’s quite a lot of money — are likely to be unhappy (now rather than later). [Axios]


The AI bubble and its VC funders are deeply vibes-based. This is because generative AI promoters still haven’t found a convincing use case. The vibes have taken a serious hit.


We don’t know of anyone who predicted that tech stocks would crash just from someone coming up with a cheaper model. But Silicon Valley AI is Theranos on the GPU all the way down, so we should expect that to come out at some point.
 
I mistakenly posted this on the new developments thread. It more properly belongs here:

Concerning Deepseek, I read this last night from Ryan Grim of Dropsite. Really interesting. Because of US protectionism with particular reference to China, Deepseek has been developed on a much cheaper, more efficient, open source platform. The big US tech corporations may consequently be in big trouble - pass the box of tissues lol:


also there's this article this morning from Richard Murphy:

 
Last edited:
obviously no warning hence the cliff!
and what's Trump going to say? More tariffs on an open source platform?

here's some perspective on the Nvidia share price....still way up on last year

Well yeh they're no longer the most valuable company in the world. Apple is king again, for a day or two maybe. But yeh $600bn knocked off their valuation in a day. I think that's the biggest loss ever in a single day.

And yeh I love that Trump can't throw tariffs at an open source model. That must really really irk. And must be very confusing for him.
 
Well yeh they're no longer the most valuable company in the world. Apple is king again, for a day or two maybe. But yeh $600bn knocked off their valuation in a day. I think that's the biggest loss ever in a single day.

And yeh I love that Trump can't throw tariffs at an open source model. That must really really irk. And must be very confusing for him.
The thought of Donald Trump being told about how Open Source works is very very funny.
 
Can someone better versed in this explain how it might affect Starmer's great plans for AI transforming the UK economy?

Am I right in thinking these developments make it even more of a pile of shite than it already is, i.e. any 'growth' in AI infrastructure, power generation, even less likely now?
 
Can someone better versed in this explain how it might affect Starmer's great plans for AI transforming the UK economy?

Am I right in thinking these developments make it even more of a pile of shite than it already is, i.e. any 'growth' in AI infrastructure, power generation, even less likely now?
If they had any really flexible innovative people on the team they would view this as a great opportunity..... but..... perhaps not.
 
I mistakenly posted this on the new developments thread. It more properly belongs here:

Concerning Deepseek, I read this last night from Ryan Grim of Dropsite. Really interesting. Because of US protectionism with particular reference to China, Deepseek has been developed on a much cheaper, more efficient, open source platform. The big US tech corporations may consequently be in big trouble - pass the box of tissues lol:


also there's this article this morning from Richard Murphy:

The linked substack which tests Deepseek is pretty impressive:
 
This is fucking insane... surely the markets must have had some kind of warning here for such a massive bubble burst? Trump's reaction is even weirder, he's actually kind of welcoming it.

Trump’s full of shit. He’s hardly going to say that the Chinese have just humiliated the US tech industry.

He’s probably just lost a couple of billion himself.
 
obviously no warning hence the cliff!
and what's Trump going to say? More tariffs on an open source platform?

here's some perspective on the Nvidia share price....still way up on last year
Yeh plus its earnings are in 29 days, which are likely to be impressive; they've been on a massive upswing. I imagine there will also be a lot of buying in on this basis, with effectively the discount from today. Mostly panic selling due to sentiment, as their cards were a near monopoly for running the existing models, which was priced in. Deepseek is not dependent on them; however, the performance is still far better using the Nvidia cards from what I gather.
 
Can't see that the tech bros have any better a record on data privacy than the CCP to be honest.
Some of them are arguably considerably worse.
Wonder if Trump is going to try to ban DeepSeek; it would probably be easy enough with the TikTok precedent - the AI's probably got a "flatter Trump" mode that can be activated if a ban appears likely

There is a 100% chance it will be. The tech bros are at the wheel in government. The problem the tech sector has though is multi-part. The model itself can be reused, which is why Nvidia is the one taking a hit atm, and the model is free, meaning there's a hell of an incentive to use it, and a hell of an incentive to just VPN your way to finding it.
 
I see people having difficulty getting a Deepseek account. I got one this morning after the 3rd attempt at sending the code.

It's not any better than ChatGPT. The fuss over it is just that it's gonna be way cheaper and better for the environment. And at least you can ask ChatGPT about Hong Kong, Taiwan etc
 
It's not any better than ChatGPT. The fuss over it is just that it's gonna be way cheaper and better for the environment. And at least you can ask ChatGPT about Hong Kong, Taiwan etc
yes but it is your duty to download it and spook the markets further
i have done so and immediately deleted it
i might download it again - #feeltheagency
 
The Guardian looked into DeepSeek censorship and found some workarounds




I'm glad to see bloated US tech giants get a bloody nose but not sure if a cheap alternative from a country controlled by a totalitarian regime is a great development
 
The Guardian looked into DeepSeek censorship and found some workarounds




I'm glad to see bloated US tech giants get a bloody nose but not sure if a cheap alternative from a country controlled by a totalitarian regime is a great development
Presumably if others use the open source, the Tiananmen Square bit isn't hardcoded in
 
Lol, just watching Sky news - their tech journo is saying this is all because of Biden :D

In response to US controls, allies could seek alternatives to US tech, creating unparalleled long-term incentives for Chinese competitors such as Huawei, Baidu, Alibaba, and Tencent. US tech groups and Congressional leaders worry the White House is rushing to avoid outside scrutiny and beat the buzzer on President Biden’s term. US officials have declared that the new controls do not represent a “significant regulatory action” — allowing the rule to skirt normally required stakeholder review.

 