
ChatGPT Is Improved By Microsoft’s Kosmos-1 With Voice And Visual Commands

March 13, 2023

Kosmos-1, a multimodal large language model (MLLM) from Microsoft, can handle both linguistic and visual data. It can be used for a variety of tasks, including image captioning, visual question answering, and more.

ChatGPT has popularized large language models (LLMs) such as the GPT model, along with the idea of transforming a text prompt or input into an output.

In a paper titled “Language Is Not All You Need: Aligning Perception with Language Models,” Microsoft’s AI researchers argue that while users are impressed by these conversational abilities, LLMs still have trouble handling multimodal inputs such as image and audio prompts. The paper suggests that multimodal perception, or knowledge acquisition and “grounding” in the real world, is required to advance from ChatGPT-like abilities to artificial general intelligence (AGI).

Last year, Alphabet-owned robotics firm Everyday Robots and Google’s Brain Team showed the value of grounding when using LLMs to get robots to follow human descriptions of physical tasks. Their approach grounded the language model in tasks that are feasible in a specific real-world environment. Microsoft used grounding in a similar way in its Prometheus AI model, which integrates OpenAI’s GPT models with real-world feedback from Bing’s search ranking and search results.

According to Microsoft, the Kosmos-1 MLLM can perceive general modalities, follow instructions (zero-shot learning), and learn in context (few-shot learning). The goal, according to the paper, is to “align perception with LLMs, so that the models are able to see and talk.”
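
The difference between the zero-shot and few-shot settings can be sketched as prompt construction. The following is a minimal illustration, not Microsoft's code: the `<image:...>` placeholder, the function names, and the example file names and captions are all hypothetical.

```python
from typing import List, Tuple


def image_slot(name: str) -> str:
    """Hypothetical placeholder marking where an image would be fed to the model."""
    return f"<image:{name}>"


def build_zero_shot_prompt(instruction: str, image_name: str) -> str:
    """Zero-shot: only an instruction and the query image, with no solved examples."""
    return f"{image_slot(image_name)} {instruction}"


def build_few_shot_prompt(
    examples: List[Tuple[str, str]], instruction: str, image_name: str
) -> str:
    """Few-shot: solved (image, answer) examples come before the query."""
    parts = [f"{image_slot(img)} {instruction} {answer}" for img, answer in examples]
    parts.append(build_zero_shot_prompt(instruction, image_name))
    return "\n".join(parts)


# Made-up inputs, for illustration only.
zero_shot = build_zero_shot_prompt("Describe this image:", "query.jpg")
few_shot = build_few_shot_prompt(
    [("cat.jpg", "A cat on a sofa."), ("dog.jpg", "A dog in a park.")],
    "Describe this image:",
    "query.jpg",
)
print(zero_shot)
print(few_shot)
```

In both cases the model's weights stay fixed; the few-shot prompt simply gives it worked examples to imitate in context.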

The examples in the paper each show how MLLMs like Kosmos-1 could automate a task in various scenarios. The model could, for instance, tell a Windows 10 user how to restart their computer (or carry out any other task from a visual prompt), read a web page to launch a web search, interpret health data from a device, caption images, and so forth. The model cannot, however, analyze video.

The researchers also evaluated Kosmos-1 on the Raven IQ test. The results showed a large performance gap between the current model and the average level of adults. The model’s accuracy nevertheless suggested that MLLMs might be able to “perceive abstract conceptual patterns in a nonverbal context” by aligning perception with language models.
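
One way a multiple-choice test like this might be scored is sketched below. This is an assumption for illustration and not a description of the paper's exact procedure: the per-candidate scores stand in for whatever a model would assign to each completion, and the numbers, the six-candidate layout, and the function names are made up.

```python
from typing import Sequence


def pick_candidate(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring candidate completion."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(predictions: Sequence[int], answers: Sequence[int]) -> float:
    """Fraction of puzzles where the predicted candidate is the correct one."""
    if not answers:
        raise ValueError("answers must not be empty")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Made-up scores for three puzzles with six candidates each (assumed layout).
puzzle_scores = [
    [0.10, 0.20, 0.60, 0.05, 0.03, 0.02],
    [0.10, 0.30, 0.20, 0.20, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10, 0.10, 0.50],
]
answers = [2, 0, 5]

predictions = [pick_candidate(s) for s in puzzle_scores]
model_accuracy = accuracy(predictions, answers)
chance_level = 1 / len(puzzle_scores[0])  # accuracy expected from random guessing
print(predictions, model_accuracy, chance_level)
```

Comparing the model's accuracy with the chance level and with human results is what allows a statement about the gap between the model and adults.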

The research into “web page question answering” is notable given Microsoft’s plan to use Transformer-based language models to make Bing a stronger rival to Google Search.

Conclusion

In this study, the researchers introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot). Specifically, they train Kosmos-1 from scratch on web-scale multimodal corpora, including interleaved text and images, image-caption pairs, and text data.
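
To make the idea of mixed text-and-image training data concrete, here is a minimal sketch of flattening such a document into a single sequence. It is illustrative only: the `Text` and `Image` classes, the `flatten` function, the marker tokens, and the image references are assumptions, and a real model would feed image embeddings rather than string references.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Text:
    content: str


@dataclass
class Image:
    ref: str  # hypothetical image reference; a real model would use embeddings


Segment = Union[Text, Image]


def flatten(segments: List[Segment]) -> List[str]:
    """Flatten text and image segments into one token sequence.

    The boundary markers are illustrative; they tell the model where the
    sequence and each image start and end.
    """
    tokens = ["<s>"]
    for seg in segments:
        if isinstance(seg, Text):
            tokens.extend(seg.content.split())  # naive whitespace tokenization
        else:
            tokens.extend(["<image>", seg.ref, "</image>"])
    tokens.append("</s>")
    return tokens


# An image-caption pair and an interleaved document, both made up.
caption_pair = flatten([Image("photo_001"), Text("a kitten on a chair")])
interleaved = flatten(
    [Text("The trail begins here"), Image("map_01"), Text("and ends at the lake")]
)
print(caption_pair)
print(interleaved)
```

Representing all three data types as one sequence format is what lets a single language model be trained on them together.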

Without any gradient updates or fine-tuning, the researchers evaluate several settings, including zero-shot, few-shot, and multimodal chain-of-thought prompting, on a variety of tasks. According to the experimental results, Kosmos-1 performs well on:

(i) language understanding, generation, and OCR-free NLP (directly fed with document images),

(ii) perception-language tasks such as multimodal dialogue, image captioning, and visual question answering, and

(iii) vision tasks such as image recognition with descriptions (specifying classification via text instructions).

The researchers also show that MLLMs can benefit from cross-modal transfer, that is, the transfer of knowledge from one modality to another. Finally, they introduce a Raven IQ test dataset, which gauges how well MLLMs can reason abstractly in a nonverbal setting.
