AI Targets Hollywood: AI Models That Understand Video Are Coming

Jung Han

December 2, 2024

Decades-old content studios and media companies hold enormous video archives. But the older the company, the less likely it is that those videos have been properly indexed or labeled. As a result, these companies often rely on the memories of longtime video editors.

"Sometimes these companies have so much content that they don't know what they have," Jae Lee, founder of video-understanding AI startup Twelve Labs, said of media companies in an interview with The Information.

Having started with text, AI developers that are now building large language models (LLMs) that also understand video have begun to target Hollywood in earnest. These models pinpoint what is happening in a video and where, so they can find a specific shot or, with more sophisticated capabilities, act as a production assistant.

Amazon and OpenAI are two of the leading AI companies targeting Hollywood and content companies. The goal of the generative AI companies pursuing Hollywood is AI that "understands video."

[Amazon begins developing an LLM that understands video]

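Amazon is developing a large language model (LLM) that understands video more deeply. The model is optimized for media companies with large video archives, such as Hollywood studios, and understands video well enough to accurately find footage that carries a specific context. As video AI becomes more sophisticated, it could be useful not only to media companies with vast footage archives but also to oil and gas drilling companies that work in the deep sea.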