What is GPT-4 (and When?)

Mandar Karhade, MD, PhD
Published in Towards AI
Nov 14, 2022 · 4 min read


Comparing GPT-4 to GPT-3 and the human brain (source: Lex Fridman @ YouTube)

It has been some time since Robert Scoble wrote this about GPT-4, which suggested to me that OpenAI might be giving a certain closed group of individuals access to GPT-4. I am not sure whether it is a conventional alpha build or a beta, but given the timelines, enough time has passed since August 2022 to suggest a beta or even an early release candidate.

Source (Twitter)

What is GPT-4?

We all know that GPT-3 was a huge leap in itself: a refined model that could produce fluent paragraphs, a clear step up from GPT-2. Since the release of GPT-3, discussion of the “next big thing” has been fairly quiet and muted. Now we have more information about GPT-4.

The concrete specifications of GPT-4 are still in flux due to NDAs; however, GPT-4 is likely to use 100 trillion parameters (source). It would be the first large-scale model with sparsity at the core of its design. What does it mean to have sparsity? It means that even at the 100T-parameter scale, the compute cost is likely to be lower, because only a fraction of the model’s neurons are active for any given input. In layman’s terms, it is a model that can keep many more choices of “next word”, “next sentence”, or “next emotion” available based on the context. In essence, this makes it more similar to actual human thinking than its predecessor.
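To make the idea of sparsity concrete, here is a minimal, hypothetical sketch of a mixture-of-experts-style layer with top-k routing, where only a couple of “expert” sub-networks run for each input. This is only an illustration of the general technique; OpenAI has not disclosed GPT-4’s actual architecture.

```python
# Illustrative only: a tiny mixture-of-experts layer with top-k routing.
# Sparsity here means each input activates only k of the n experts,
# so compute per token grows much more slowly than total parameter count.
import torch
import torch.nn as nn


class TopKSparseLayer(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.gate = nn.Linear(d_model, n_experts)  # the "router"
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, d_model)
        scores = self.gate(x)                            # (batch, n_experts)
        top_vals, top_idx = scores.topk(self.k, dim=-1)  # pick k experts per input
        weights = torch.softmax(top_vals, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            chosen = top_idx[:, slot]
            w = weights[:, slot].unsqueeze(-1)
            for e in chosen.unique():                    # run only the selected experts
                mask = chosen == e
                out[mask] += w[mask] * self.experts[int(e)](x[mask])
        return out


layer = TopKSparseLayer()
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```

Even in this toy version, the total parameter count scales with the number of experts, while each forward pass only pays for k of them.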

Wait, but what is GPT?

A Generative Pre-trained Transformer (GPT) is a text-generation deep learning model trained on data available on the internet. It is used for question answering, text summarization, machine translation, classification, code generation, and conversational AI.
The applications of GPT models are endless. Furthermore, you can fine-tune them on specific data to get even better results (transfer learning). By using the “sauce” from GPT models, building NLP projects becomes a heck of a lot easier. Easier means you save time, money, and resources, and you get to start from the model’s generalization (built on a giant sample of text) without having to reinvent the wheel for the general aspects of language.
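As a small illustration of how little code it takes to build on a pre-trained GPT-style model, here is a sketch using the openly available GPT-2 checkpoint through the Hugging Face transformers library (GPT-3 itself is only accessible via OpenAI’s API, and GPT-4 is not public at all):

```python
# pip install transformers torch
from transformers import pipeline

# Load the open GPT-2 checkpoint as a stand-in for larger, API-only GPT models.
generator = pipeline("text-generation", model="gpt2")

prompt = "Transfer learning lets us reuse a pre-trained language model to"
outputs = generator(prompt, max_length=40, num_return_sequences=1)
print(outputs[0]["generated_text"])
```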

Source: Xataka

GPT-1 to GPT-3

Since GPT-1 was first published in 2018 (link), the GPT family has made giant progress. GPT-1 had “only” 117 million parameters. GPT-2 raised the bar to 1.5 billion parameters (publication), and GPT-3 raised it even further to 175 billion parameters (publication). For reference, DeepMind’s Gopher model had 280 billion parameters (publication) and the Megatron-Turing NLG model had 530 billion parameters (publication).

At the same time, Microsoft’s efforts with OpenAI led to the conclusion that optimal hyperparameter tuning has great utility in fine-tuning models at this scale; generally, the larger the model, the more costly it is to fine-tune. DeepMind’s Chinchilla experiment (publication) concluded that the size of the training corpus is as important as the number of parameters.
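As a rough, back-of-the-envelope illustration of the Chinchilla result, the paper’s compute-optimal analysis works out to roughly 20 training tokens per parameter (the exact ratio depends on the compute budget). A quick sketch of what that implies for a few model sizes:

```python
# Rough rule of thumb from the Chinchilla paper: compute-optimal training
# uses about 20 tokens per parameter (the exact ratio varies with budget).
TOKENS_PER_PARAM = 20

models = {"GPT-3": 175e9, "Gopher": 280e9, "Chinchilla": 70e9}
for name, params in models.items():
    tokens = params * TOKENS_PER_PARAM
    print(f"{name}: {params / 1e9:.0f}B params -> ~{tokens / 1e12:.1f}T training tokens")
```

For Chinchilla itself (70B parameters), this recovers the roughly 1.4 trillion tokens it was actually trained on.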

Final remarks

GPT-4 is expected to be a text-only model that takes NLP one giant, promising step ahead. GPT-4 is likely to be released early next year! Given the abilities of GPT-4 extrapolated from GPT-3, we might soon need a new Turing test standard. The role of AI in deepfakes is another stream of work, but with every leap in model capability, we get closer to it.

Since the release of earlier general-purpose models, a plethora of AI-generated text has already been produced, and the general populace’s understanding of this is unmistakably lower than it needs to be. I am optimistic, but I would love to see a similar concerted effort from the same for-profit entities toward models that can distinguish AI-generated text from human-generated text.

Undoubtedly, these are exciting times; fall is upon us, so let us enjoy the progress. I will be eagerly waiting for more information as and when it comes!

Credits: https://unsplash.com/@sajadnori
