ChatGPT is WORSE now than before | ChatGPT’s declining accuracy is concerning

Coding with Dee

2 months ago

5,325 views

Comments:

@gregorybolin4672 - 04.07.2024 09:09

Nice editing and flow 😊

@x2phat2cmytoes - 02.07.2024 20:08

Your videos are entertaining, informative and funny! Happy to be a new subscriber! Keep up the good work.

@rickharms1 - 02.07.2024 17:43

Thank you, I thought it was me. I am a retired system/network engineer. I did support for a computer sales team. Programming was not a part of my duties, but I could kind of wade my way through some simple issues. Fast forward to today: my hobby is microcontrollers, e.g., Arduino with its simplified C++. I have ChatGPT help me. Sometimes it has been of great assistance, especially when exploring new concepts. But it then gets bogged down, creating questionable and even wrong code. I will show it how it is wrong. At least it apologizes. However, it is stubborn, and will ignore some of the issues which it created.

@Hcakdot - 01.07.2024 17:01

The reason for GPT and others getting 'stupid' is their security training (aka 'censoring'). One of the projects I've been working on used LLMs and similar models for identification of 'bad things', and one of the tools I use for testing this is a series of photos. These photos are pictures of explosives of various types. On release, GPT-4 could correctly identify various pictures of Semtex in official packaging with warning logos etc. By June 2023 it thought the same pictures were Play-Doh. I was testing this monthly, and roughly the middle of March is the point it started to turn bad... It turns out that the 'security' features they impose on the model prevent it from correctly identifying it, and because of the reinforcement learning applied to the model over time, this corrupts the model...

@sanjaybhatikar - 01.07.2024 11:09

Some of the frameworks around LLMs, like LangChain and LlamaIndex, cause dependency conflicts from the very first line of code (i.e. `pip install`). The hype is fueling a frenzy of poorly written and sloppily tested code with many a cut corner. A Tower of Babel is arising.

@sanjaybhatikar - 01.07.2024 11:02

It is like Giggle search, it only gets worse.

@D7460N - 26.06.2024 19:03

This is exactly right! GPT-4o is TERRIBLE!

@Unimatrix69 - 25.06.2024 16:15

ChatGPT is a LANGUAGE probability model NOT A TRUTH ENGINE!

@natgenesis5038 - 24.06.2024 09:39

3/10 accuracy on code, and I must ask it multiple times just to get code that works.

@daviddivas9443 - 23.06.2024 21:04

It's also a problem with RLHF: take a model that surpasses human levels on various things, then ask humans to "align" it. It ends up more "rounded", especially when the humans doing the grunt work are from Mechanical Turk or similar. Dumbing it down to the lowest common denominator...

@demokratifestmariestad6638 - 22.06.2024 00:23

Bard (now Gemini) has also got worse and really starts gaslighting after a while

@charlesd4572 - 20.06.2024 22:36

Inference is pretty cheap, but I guess at scale it still makes sense.

@LukeAvedon - 18.06.2024 20:43

Interesting analysis. I think AI drift is also an issue.

@luke2937 - 17.06.2024 10:56

Came here after I got this from GPT


const updateState = (data, setState) => {
if (data) {
setState(data);
}
};

// Effect hooks
useEffect(() => {
updateState(servicesData, setAllServicesOptions);
}, [servicesoneDataservicesData]);

useEffect(() => {
update.
update was interesting to read, but that was years withdifferent component frameworkshipp. ShippingData, setstoneAllucectorcopesalShipping citylybeautyByOptions);
}, [flipping other's cards whilst recycling conversationsmippingData]);

useEffect(() => {
add wonders to their literature, or little hinges to locationsData, conversationsstatistical houseworkoneAllxcludedLocation started new epochs liceptaOptions);
}, [leaning into the cold, dancing everywhere,cationData]);

useEffect(() => {
updateoceans, rains, and fires were folklore millsh.societyData]

@rhettr4923 - 16.06.2024 16:36

Yep, that's been my experience

@IStMl - 16.06.2024 13:29

They should just give us X true GPT-4 queries and let us pick the model when we have a complex prompt

@xd-qi6ry - 29.05.2024 15:18

I have made a custom GPT. It has superior reasoning and so much more.

It is 5x+ smarter than the base model, it understands the complex

It's called Smarter Vision Multimodal image/text analysis.

It's unlike any custom GPTs before and is ready for the new vision features for 4o.

Also, an example I've been using is uploading an image of a cloud that looks like multiple things but can be interpreted. The one I have made recognised it was a rabbit every time, now on the first shot, so it knows when something is unusual about an image even if you don't say anything is. It can also do IQ-test image-reasoning pattern questions.

It kind of even understands real logic games when given good instructions.

You just have to follow the instructions given to get the right seed. It's a 1 in 2 chance or so; I have absolutely no idea why it needs that.

@DaysOfFunder - 28.05.2024 03:56

Absolutely. I noticed ChatGPT 4's personality has nosedived. It's now very bland and gives hyper-long answers. Previously it would give short, concise answers when logical. Last night I asked it to NOT present code yet, just confirm it understands my request. It completely ignored this, powered through, and gave me a script.
I was shocked, because that's why I upgraded my plan back to GPT premium, and subsequently went down from GPT-4o to GPT-4, because they were all doing the same thing.

This to me suggests they have started to force 3.5 or similar into certain GPT-4 scenarios OR changed the models to leverage longer-form answers to try to reduce load. Or something.

It was heartbreaking tbh. I felt like I'd lost a friend.

@arkimphiri - 27.05.2024 11:14

Great analysis, Dee. My approach has been to use 3 LLMs at once: I ask ChatGPT, Gemini, and Claude at the same time, in one UI, using Semaj AI, which I developed solely for this purpose. I can indeed confirm that Claude usually gives the best code.

@DanandNato - 27.05.2024 01:21

Why did Sam Altman say that? We know it's pretty dumb in many areas and it's dumber now, but does it mean ChatGPT gets worse in the future?

@TheTrainstation - 27.05.2024 00:02

Claude will give you the full code length; GPT-4 was super lazy. GPT-4o gives you the complete code but it glitches out.

@h.c4898 - 26.05.2024 22:10

They probably put guardrails in place due to user data privacy and safety measures. These companies have come to realize that wherever they extract their information from, mostly the internet, the data belongs to somebody else. They might get sued for it, unless the datasets belong to them. Then they can use them freely. Otherwise it leaves those models depleted.

That's what they are doing with Gemini. Bard performed well and was generous with its responses. It went through a lot of websites to give me summaries. Gemini can't anymore, especially if the website has a subscription model and puts up paywalls.

We'll see where this is headed.

@jspencer89yt - 26.05.2024 19:40

I gave it a Word document pre-filled with questions and answers and asked it to remove any identifying factors. It gave me back the document and it only said "questions and answers"; literally everything else was gone 😂

@Septumsempra8818 - 26.05.2024 18:52

The context window is much shorter than Claude's and Gemini's. Copilot was stubborn 2 months ago, but now it's back to working well. The 4o models are really good. I clocked 1000 lines of code and it did it well.
Honestly, just use all of them at the same time.

@olabassey3142 - 26.05.2024 09:07

lmao, I started coding for the first time in 7 years last week and was using ChatGPT. After a lot of stress I used Claude and got my code working. Claude is definitely better. I experimented with GPT, Bing/Copilot and Claude: Claude is the best, ChatGPT is questionable and Bing is brain damaged. Bing was even hallucinating without actually returning code. 😂😂😂

@AsadKhan-lu8kx - 26.05.2024 08:39

Ahh, you are too pretty to talk about mind-numbing codes and tedious tech stuff. Try poetry or singing; you belong in a colorful garden, not a lifeless cubicle.

@akagordon - 25.05.2024 00:43

My background is chemistry and data science with a little bit in distributed computing. That has been niche enough to have a few contacts in high places. One of these was at DeepMind. He had been working on projects in biology, including AlphaFold, when the LLM wars kicked off and he got temporarily reassigned. He told me a year ago they were already worried about compute and were looking for ways to make the models more efficient.

@mind_of_a_darkhorse - 23.05.2024 22:23

I also find it humorous that Scarlett Johansson threatened to sue them over using her voice as the model's voice and how fast they changed it!

@mind_of_a_darkhorse - 23.05.2024 22:22

Well-explained details on why ChatGPT is starting to get mediocre! I've noticed that most of the easily available AI models seem to be horrible at coding. It makes me wonder if the coders writing the code for the models are attempting to maintain their necessity. But your reasoning makes sense as well!

@KingHenrySB - 23.05.2024 19:45

Great video, the explanation you provided makes a lot of sense.

@KingHenrySB - 23.05.2024 19:37

Ever since they rolled out 4o, it's been buggier than ever before, and 3.5's output has gotten so much worse. It's as if they're intentionally trying to force people into paying for subscriptions.
