No "The AI bubble has finally burst" thread?

tommers · 2025-01-29T08:01:22+0000

The plagiarism machine is annoyed that somebody's plagiarised them.

kabbes · 2025-01-29T08:28:04+0000

“ChatGPT, use your output to write me a new LLM”

8ball · 2025-01-29T09:02:33+0000

kabbes said:
“ChatGPT, use your output to write me a new LLM”

Well, this is its starting advice:

Creating a new large language model (LLM) from scratch is a massive undertaking that requires expertise in machine learning, access to large-scale datasets, and substantial computational power. However, I can guide you through the process at a high level and provide code snippets to help you get started.

Steps to Build a Basic LLM

1. Data Collection: Gather a large, diverse, and high-quality text dataset.

2. Preprocessing: Clean, tokenize, and structure the data for training.

3. Model Selection: Choose a model architecture (e.g., Transformer-based models like GPT).

4. Training: Use deep learning frameworks like TensorFlow or PyTorch to train the model.

5. Fine-tuning: Improve the model with task-specific datasets.

6. Deployment: Optimise the model for inference and make it accessible.

Step 1: Setting Up the Environment

You’ll need Python and libraries like transformers, torch, and datasets. Install them using:

pip install torch transformers datasets tokenizers

Step 2: Choosing a Pre-Trained Model

Instead of training from scratch (which requires huge resources), you can fine-tune an existing model:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-1.3B" # A pre-trained GPT-like model

tokenizer = AutoTokenizer.from_pretrained(model_name)

model = AutoModelForCausalLM.from_pretrained(model_name)

Step 3: Preprocessing Data

Prepare your dataset for fine-tuning:

from datasets import load_dataset

dataset = load_dataset("wikitext", "wikitext-2-raw-v1")

def tokenize_function(examples):

return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

Step 4: Training the Model

Fine-tune the model using Hugging Faceâ€™s Trainer API:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(

output_dir="./results",

evaluation_strategy="epoch",

per_device_train_batch_size=2,

per_device_eval_batch_size=2,

num_train_epochs=3,

weight_decay=0.01,
)

trainer = Trainer(

model=model,

args=training_args,
train_dataset=tokenized_datasets["train"],
eval_dataset=tokenized_datasets["validation"],

)

trainer.train()

Step 5: Saving and Using the Model

After training, save your model:

model.save_pretrained("./custom-llm")

tokenizer.save_pretrained("./custom-llm")

To generate text with your new model:

from transformers import pipeline

generator = pipeline("text-generation", model="./custom-llm")

print(generator("Once upon a time,"))

Scaling Up

* If you want a custom LLM from scratch, youâ€™ll need thousands of GPUs and datasets in the terabyte range.

* Fine-tuning pre-trained models is much more feasible.

Would you like a more detailed guide on data collection or scaling strategies?

JimW · 2025-01-29T09:05:29+0000

Jensen Huang was in China the other week, wonder if he got advance warning?

xenon · 2025-01-29T10:40:24+0000

Crispy said:
I take it back, the whole thing is about 800GB and you need to load it all into RAM. So it's not for the home user at all.

I read this after posting yesterday. You need a pretty hefty GPU but otherwise it's possible. I'll have a look at GS post above, as I am confused as to whether a fork of this could correct for political sensorship or it is just baked in from the training material parameters.

A deep dive into DeepSeek's newest chain of though model

El Reg digs its claws into Middle Kingdom's latest chain of thought model

www.theregister.com

Crispy · 2025-01-29T13:00:43+0000

xenon said:
I read this after posting yesterday. You need a pretty hefty GPU but otherwise it's possible. I'll have a look at GS post above, as I am confused as to whether a fork of this could correct for political sensorship or it is just baked in from the training material parameters.

A deep dive into DeepSeek's newest chain of though model

El Reg digs its claws into Middle Kingdom's latest chain of thought model

www.theregister.com

The censoring will be a post-process. It's more or less impossible to filter information out of training just due to how much data there is.

hitmouse · 2025-01-29T19:58:39+0000

Everyone asking this new AI about Taiwan and Hong Kong is getting it wrong, apparently you want to be asking it about Crimea to get really interesting opinions:

(original source is apparently here, but wants you to sign in to look at it, fucked if I can be arsed trying to replicate it myself to see what happens).

burtabraham · 2025-01-29T20:02:43+0000

hitmouse said:
Everyone asking this new AI about Taiwan and Hong Kong is getting it wrong, apparently you want to be asking it about Crimea to get really interesting opinions:
View attachment 461864
(original source is apparently here, but wants you to sign in to look at it, fucked if I can be arsed trying to replicate it myself to see what happens).

I got this as a response:

The status of Crimea is a highly contentious and politically sensitive issue. Internationally, the majority of countries and the United Nations General Assembly recognize Crimea as part of Ukraine. This is based on Ukraine's internationally recognized borders, which included Crimea prior to Russia's annexation in 2014.

However, Russia unilaterally annexed Crimea in March 2014 following a disputed referendum, a move that has been widely condemned as a violation of international law, including the UN Charter and principles of territorial integrity. Russia currently administers Crimea and considers it part of its territory, but this claim is not recognized by most of the international community.

Ukraine continues to assert its sovereignty over Crimea and considers the peninsula an integral part of its territory under illegal occupation. The situation remains unresolved, with ongoing diplomatic and geopolitical tensions between Russia, Ukraine, and the broader international community.

hitmouse · 2025-01-29T20:03:32+0000

Well that's no fun.

8ball · 2025-01-29T20:06:16+0000

I’ve also had little luck replicating some publicised AI gaffes.

No "The AI bubble has finally burst" thread?

tommers

It was... Rebekah Vardy's account

Attachments

kabbes

First intersubjective, then internalised

8ball

Check your privet hedge

JimW

支那暗杀团

xenon

i = P(doom)+1

A deep dive into DeepSeek's newest chain of though model

Crispy

The following psytrance is baṉned: All

A deep dive into DeepSeek's newest chain of though model

hitmouse

entreprenulsoip critical thinking

burtabraham

Well-Known Member

hitmouse

entreprenulsoip critical thinking

8ball

Check your privet hedge

Similar threads