Helping Computer Vision and Language Models Understand What They See

MIT researchers created a new annotated synthetic dataset of images that depict a wide range of scenarios, which can be used to help machine-learning models understand the concepts in a scene. Credit: Khaled Shehada et al.

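Researchers at the Massachusetts Institute of Technology (MIT) have developed a technique that uses computer-generated data to help vision and language models better understand concepts.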

The researchers used an annotated synthetic dataset to fine-tune popular vision and language models, improving their accuracy in concept understanding by up to 10%.

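They produced close to 800,000 photorealistic images from computer-generated synthetic videos of diverse three-dimensional environments and objects, with human avatars added to interact with them.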

Each image was paired with a detailed caption covering object attributes, positional relationships, and human-object interactions.

Synthetic data made it possible to create more diverse images at lower cost than generating real data, and the use of avatars helped protect privacy. See the MIT News article.

Abstracts Copyright 2023 SmithBucklin, Washington, D.C., USA

