All About The GPT-3 Hype

Language models in NLP (Natural Language Processing) systems are machine learning models trained to understand natural language much the way humans do. A simple example: a trained language model predicts the next word in a sentence.

Some common language model use cases are:

  • Language Translation
  • Text Classification
  • Sentiment Extraction
  • Reading Comprehension
  • Named Entity Recognition
  • Question Answer Systems
  • News Article Generation, etc

Traditionally there have been two major families of language models: statistical and neural network based.

Figure 2 Language Model Techniques

Statistical language models predict a word from the probability distribution of words given the preceding ones, using techniques such as N-grams and Hidden Markov Models.
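The "predict the next word" behaviour of a statistical model can be sketched with a toy bigram counter (purely illustrative; real N-gram models add smoothing and train on far larger corpora):

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count how often each word follows another in the corpus."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def predict_next(model, word):
    """Return the word most frequently observed after `word`."""
    followers = model[word.lower()]
    return followers.most_common(1)[0][0] if followers else None

corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

The prediction is only as good as the counts: any word pair never seen in training gets zero probability, which is exactly the brittleness the article discusses next.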

Neural network based models are a little more sophisticated than statistical ones, as they use neural networks to model the language.

The challenges

Both of these traditional approaches share common challenges:

Lack of agility: The time and effort required to collect vast amounts of data, pre-process it, create and encode sequences, split the data for training and validation, deploy the model, and run inference is enormous.
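As a rough illustration, the pre-processing, encoding, and splitting stages can be sketched in a few lines (toy helper names of my own, not any particular framework):

```python
def preprocess(text):
    """Lowercase and tokenize on whitespace."""
    return text.lower().split()

def build_vocab(tokens):
    """Map each unique token to an integer id."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}

def encode(tokens, vocab):
    """Turn tokens into integer ids for model consumption."""
    return [vocab[t] for t in tokens]

def train_val_split(sequence, val_ratio=0.2):
    """Hold out the tail of the sequence for validation."""
    cut = int(len(sequence) * (1 - val_ratio))
    return sequence[:cut], sequence[cut:]

tokens = preprocess("The model learns the language from the data")
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)
train_ids, val_ids = train_val_split(ids)
```

Every one of these steps has to be repeated per dataset and per task, which is where the agility cost comes from.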

Figure 3 Stages of Language Model training

Domain specific: Models trained on data from one domain cannot predict data from another domain. For example, cloud-based Q&A and chatbot applications are trained to answer questions from a set of pre-defined documents; rephrase the question and you might get a different answer. If you train a language model on the Reuters financial news feed, the prediction of the next word looks as follows:

Figure 4 Model trained against finance news feeds data

Transfer Learning:

Transfer learning is a machine learning technique that stores knowledge gained while solving one problem and applies it to a different but related problem.

In 2018, pre-trained transformer models became popular after Google's BERT (Bidirectional Encoder Representations from Transformers) paper. BERT applies the transfer learning technique mentioned above: pre-train the model on a vast amount of data, then transfer its learning to the relevant downstream tasks.
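The idea can be illustrated even with a toy counting model: "pre-train" it on a general corpus, then continue training on a small domain corpus and watch its predictions shift (a sketch of the concept only, not of BERT itself):

```python
from collections import Counter, defaultdict

def train_counts(corpus, model=None):
    """Accumulate next-word counts; pass an existing model to fine-tune it."""
    model = model if model is not None else defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

# "Pre-train" on general text: after "is", "open" is the most common word.
general = ["the market is open", "the market is large", "the door is open"]
pretrained = train_counts(general)
general_top = pretrained["is"].most_common(1)[0][0]  # "open"

# "Fine-tune" on a tiny finance corpus: the domain data shifts the prediction.
finance = ["the market is volatile", "the stock is volatile", "the index is volatile"]
finetuned = train_counts(finance, model=pretrained)
print(finetuned["is"].most_common(1)[0][0])  # "volatile"
```

The pre-trained counts are kept and only adjusted, which is the essence of transfer learning: a small amount of task data rides on knowledge accumulated elsewhere.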

Figure 5 Evolution of Language Models

Google’s BERT model was originally trained with 340 million parameters on Wikipedia and the BooksCorpus book collection; applied to a simple Q&A task, its accuracy was by far the best at the time. Facebook and Microsoft also created BERT-based models, such as RoBERTa and CodeBERT (natural language to programming language conversion) respectively. Following the trend that larger natural language models lead to better results, Microsoft's Project Turing introduced Turing Natural Language Generation (T-NLG), at 17 billion parameters the largest model ever trained as of January 2020. NLP tasks such as writing news articles and generating code became a lot simpler with these transformer models, without NLP engineers needing to shoulder much of the processing work.

Around June 2020, OpenAI released the beta version of its GPT-3 model. With 175 billion parameters, trained on a corpus covering much of the public internet, it is the most sophisticated language model ever built at such scale. Broadly, the more data and parameters a model is trained with, the better its predictions.

OpenAI: GPT-3 (Generative Pre-trained Transformer, 3rd version) was released by OpenAI, a San Francisco based AI research company founded in 2015 by Sam Altman, Elon Musk, and others. Microsoft invested $1bn in 2019 and became OpenAI's exclusive cloud provider; the GPT-3 models are trained on Microsoft's AI supercomputer.

Figure 6 Stats about the data used to train GPT-3

Working with GPT-3

Before we start working with the GPT-3 API, let's first understand some key concepts.

Prompt: The text input given to the API

Completion: The resultant text that the API generates as a result of processing text input prompt

Token: A piece of text. The API chops input into tokens (roughly words or word fragments) before processing it.
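As a rough illustration of tokenization (GPT-3 actually uses byte-pair encoding, which also splits rare words into sub-word pieces), a naive tokenizer might look like this:

```python
import re

def naive_tokenize(text):
    """Toy tokenizer: split into word runs and individual punctuation marks.
    Real GPT-3 tokenization is byte-pair encoding, not simple splitting."""
    return re.findall(r"\w+|[^\w\s]", text)

prompt = "Translate English to SQL: count employees in HR."
tokens = naive_tokenize(prompt)
print(len(tokens))  # 10 tokens, punctuation counted separately
```

Token count matters in practice because both the prompt and the completion consume tokens against the API's limits.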

Let’s look at how powerful the model is at some NLP tasks.

Example 1: SQL Generator – the prompt is a simple English sentence asking for the count of total employees in the HR department.

Figure 7 GPT-3 model converting natural language to SQL

With a couple of examples to prime the model, GPT-3 is able to produce the SQL statements accurately.
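A sketch of how such a few-shot prompt might be assembled for the completion API (the example questions and the Q/SQL labels are my own assumptions, not the exact prompt shown in the figure):

```python
# Hypothetical few-shot examples used to prime the model.
EXAMPLES = [
    ("Show all employees", "SELECT * FROM employees;"),
    ("How many departments are there?", "SELECT COUNT(*) FROM departments;"),
]

def build_prompt(question):
    """Assemble a few-shot English-to-SQL prompt for a completion endpoint."""
    parts = [f"Q: {q}\nSQL: {sql}" for q, sql in EXAMPLES]
    parts.append(f"Q: {question}\nSQL:")  # the model completes after "SQL:"
    return "\n\n".join(parts)

prompt = build_prompt("Get the count of total employees in HR department")
print(prompt)

# Sending the prompt requires beta API access; the original OpenAI Python
# client's completion call looked roughly like:
#
#   import openai
#   openai.api_key = "YOUR_KEY"
#   resp = openai.Completion.create(engine="davinci", prompt=prompt,
#                                   max_tokens=64, temperature=0,
#                                   stop=["\n\n"])
#   print(resp["choices"][0]["text"])
```

The two primer examples are what "a couple of examples to prime the model" means in practice: the model infers the Q-to-SQL pattern and continues it.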

Example 2: Email message generator – I prompted the model to generate an email message for a typical hotel booking.

Figure 8 GPT-3 model writing an email message on my behalf 🙂

Sample code for both the above examples can be found here

There are a lot of examples provided by OpenAI in their playground.

Playground – OpenAI API

Though the model performs impressively, researchers fear it poses a serious disinformation threat: bad actors could use it to create an endless amount of fake news and spread misinformation. Here is a tweet by Sam Altman, the CEO of OpenAI:

Based on my years of experience with Financial Services Industry customers, I believe there are some valuable use cases well suited to the GPT-3 model.

FSI Use Cases

  • Automated Named Entity Extraction 
  • Summarization of sales trader–client meeting notes
  • Financial statement summarization
  • Financial sentiment analyzer
  • Domain specific speech to text translators, robotic form filling based on user voice inputs

Disclaimer:

These are entirely my personal views on the GPT-3 model. The opinions expressed here represent my own and not those of my employer. In addition, my thoughts and opinions change from time to time; I consider this a necessary consequence of having an open mind.
