We are simply experiencing the limitations of the current model’s capabilities. Each model can only do so much within its boundaries, which is why OpenAI has kept evolving its models from GPT-1 to GPT-4. Because each model is a black box, nobody fully understands what it is capable of, which naturally leads to fear and uncertainty among people.

I have been using OpenAI’s models since their release and I have noticed that they can sometimes be repetitive. They rely heavily on prompt engineering, meaning that the quality of the output depends on the quality of the input. If you provide a good prompt, you’ll get good output; if you provide a bad prompt, you’ll get bad output. This concept is not new; it’s similar to the idea of “Garbage In, Garbage Out” in software engineering, but on a much larger scale.

Photo by Erick Butler on Unsplash

Our brains struggle with large numbers

If you’ve heard the phrase “the house always wins” and wondered why people still go to casinos, it’s because our brains are not naturally wired to think about big numbers.

Let’s consider a thought experiment.

Imagine you’re in Las Vegas and someone asks you how much you would bet that a coin will land on heads 20 times in a row. Would you believe that it’s remotely possible? Let’s do some math.

https://sciencenotes.org/coin-toss-probability-formula-and-examples/

If we calculate the chance of getting 20 heads in a row from 20 flips, we find that it is (1/2)^20, or about 1 in 1,048,576 (roughly 0.0001%). If you actually saw 20 heads in a row, would you think that the coin was rigged? Surprisingly, this outcome is entirely possible; the question is how many times we need to flip the coin to make such a streak likely. If you were to flip the coin a million times, there would be about a 38% chance of seeing 20 heads in a row somewhere in the sequence. If your bet was not time-bound and the coin kept flipping for an extended period, it’s likely that you would eventually see 20 consecutive heads. If the number of flips were increased to one trillion, a streak of 20 heads would be all but certain. Depending on the bet and circumstances, you could easily end up on the losing side if there was a machine continuously flipping coins or a group of people doing so.
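
For anyone who wants to check streak odds like these, here is a minimal Python sketch. It assumes a fair coin and independent flips. The function names `prob_streak_exact` and `prob_streak_approx` are made up for this illustration; the first uses a simple recurrence, and the second is only a rough estimate for flip counts too large to loop over.

```python
import math


def prob_streak_exact(n_flips: int, streak: int = 20) -> float:
    """Chance of at least one run of `streak` heads within `n_flips` fair flips."""
    if n_flips < streak:
        return 0.0
    # no_streak[i] = probability that the first i flips contain no such run.
    no_streak = [1.0] * (n_flips + 1)
    no_streak[streak] = 1.0 - 0.5 ** streak
    # Weight of "one tail, then `streak` heads": the first run ends exactly here.
    first_hit = 0.5 ** (streak + 1)
    for i in range(streak + 1, n_flips + 1):
        no_streak[i] = no_streak[i - 1] - first_hit * no_streak[i - streak - 1]
    return 1.0 - no_streak[n_flips]


def prob_streak_approx(n_flips: int, streak: int = 20) -> float:
    """Rough Poisson-style estimate (an assumption, not an exact formula)."""
    expected_streaks = n_flips * 0.5 ** (streak + 1)
    return 1.0 - math.exp(-expected_streaks)


if __name__ == "__main__":
    print(0.5 ** 20)                     # a single attempt of 20 flips
    print(prob_streak_exact(1_000_000))  # roughly 0.38
    print(prob_streak_approx(10 ** 12))  # so close to 1 that it prints 1.0
```

The exact version keeps one number per flip, so it is fine for a million flips but not for a trillion, which is why the estimate is there as a fallback.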

Why are we talking about coins and math?

I mentioned coins to illustrate that something we may consider rigged can actually occur with a relatively high probability depending on the size of our data set.

Now back to AI. The people building these models keep increasing the size of the data so that the models can predict outputs based on certain inputs. Of course, this is an oversimplification as training is involved, but it’s the general idea. GPT-4 reportedly has around 1.7 trillion parameters (OpenAI has not confirmed the figure), and with millions of people consistently asking questions, some were bound to reach the limits of that particular model and its data set.

This is also why small companies cannot easily start their own AI ventures. They would need to accumulate more data than OpenAI’s current models were trained on. Only large conglomerates like Google have been able to compete in this space because they have amassed massive amounts of data, which allowed them to develop new AI engines relatively quickly. While flipping a coin will almost always result in either heads or tails (with a small chance of landing on its edge after a number of attempts), training an AI engine with massive amounts of data requires time and resources that Google had at its disposal.

So what?

Regardless of the sheer size of the data, there will always be boundaries. Even if we toss the coin a trillion times, it will never land as a fruit. It could land on its edge, but it will never turn into a completely different thing.

AI can become smarter as more and more data is added, as is happening now, and it will inevitably replace jobs that don’t require much thinking. With enough data, it could even replace the most sophisticated white-collar jobs.

We still don’t know the final boundaries, as we have only reached the one petabyte data set mark. What will happen when the data set is a googol (10 to the power of 100) times larger? The fear is that once AI processes all possible data, it will start encroaching on all our jobs, regardless of their nature. What do we do then? How will the economy react?

To be honest, I don’t know.

© Melchor Tatlonghari. All rights reserved.