Few parameters, lots of data

VietNamNet - 18/05/2023


PaLM 2, Google's latest large language model (LLM) announced last week, uses nearly five times as much training data as its 2022 predecessor, allowing it to handle more advanced programming, math, and content-creation tasks.

At the Google I/O developer conference, the search giant introduced PaLM 2, a language model trained on 3.6 trillion tokens. Tokens are short strings of text - words or pieces of words - and they are the building blocks used to teach an LLM to predict the next word in a sequence.
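To make that concrete, here is a minimal toy sketch - not Google's actual tokenizer, which splits text into sub-word pieces - showing how tokenized text turns into the (context, next token) training examples a model learns from:

```python
# Toy illustration only -- not Google's tokenizer. Real LLMs split text
# into sub-word pieces; whitespace splitting is used here for clarity.
text = "the model learns to predict the next word"
tokens = text.split()

# Training boils down to many (context -> next token) examples:
for i in range(1, len(tokens)):
    context, target = tokens[:i], tokens[i]
    print(context, "->", target)
```

Each printed pair is one instance of the prediction task described above: given everything seen so far, guess the next token.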

The previous version of PaLM, released in 2022, was trained on 780 billion tokens.

Google CEO Sundar Pichai introduced the company's latest large language model, PaLM 2, at last week's Google I/O event.

While Google has been touting its AI prowess in search, email, word processing, and spreadsheets, the company has been reluctant to disclose the size or other details of its training datasets. OpenAI likewise keeps secret the training details of GPT-4, its latest LLM.

Both companies attribute the secrecy to the competitive nature of the business: Google and OpenAI are racing to win users with chatbots rather than traditional search engines.

Compact, powerful, cost-effective

Google says PaLM 2 is smaller than its predecessor, with 340 billion parameters compared to 540 billion for the previous version - a sign that the company's technology is becoming more efficient at performing complex tasks.

To achieve this, PaLM 2 uses a technique called "compute-optimal scaling," which balances model size against the volume of training data to deliver better overall performance, including faster inference with fewer parameters and lower serving overhead.
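A rough back-of-the-envelope check, using only the figures quoted in this article plus the roughly 20-tokens-per-parameter rule of thumb from DeepMind's 2022 Chinchilla scaling study (an outside reference, not something Google has confirmed for PaLM 2), shows how far the new model shifts toward "few parameters, lots of data":

```python
# Tokens seen per parameter, computed from the figures in this article.
# The ~20:1 "compute-optimal" benchmark comes from DeepMind's 2022
# Chinchilla paper and is an assumption, not a Google disclosure.
models = {
    "PaLM (2022)": {"params": 540e9, "tokens": 780e9},
    "PaLM 2":      {"params": 340e9, "tokens": 3.6e12},
}

for name, m in models.items():
    print(f"{name}: {m['tokens'] / m['params']:.1f} tokens per parameter")

# PaLM (2022): 1.4 tokens per parameter
# PaLM 2: 10.6 tokens per parameter
```

By this crude measure, PaLM 2 sees roughly seven times more training data per parameter than its predecessor, consistent with the trade-off the article describes.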

Google's latest language model, trained in more than 100 languages, already powers 25 Google features and products, including the experimental chatbot Bard. PaLM 2 comes in four sizes, from smallest to largest: Gecko, Otter, Bison, and Unicorn.

According to public disclosures, PaLM 2 is more powerful than any existing model. Meta's LLaMA, released in February, was trained on 1.4 trillion tokens. Meanwhile, OpenAI last publicly disclosed a training-data size for GPT-3, a predecessor of the model behind ChatGPT, at 300 billion tokens.

The explosion of AI applications has created controversy around the technology. Earlier this year, El Mahdi El Mhamdi, a senior scientist at Google Research, resigned in protest at the search giant’s lack of transparency.

This week, OpenAI CEO Sam Altman testified before the US Senate Judiciary subcommittee on privacy and technology as AI becomes more widespread, and the "father" of ChatGPT agreed with lawmakers that new regulations are needed to govern AI.

(According to CNBC)


