Multimodāla meklēšana video un attēlu arhīvā [Multimodal search in a video and image archive]

Mētra, Oskars

Open

302-109386-Metra_Oskars_om21018.pdf (638.5Kb)

Author

Mētra, Oskars

Co-author

Latvijas Universitāte. Eksakto zinātņu un tehnoloģiju fakultāte

Advisor

Sproģis, Artūrs

Date

2025

Abstract

The bachelor's thesis consists of an introduction, three parts, and conclusions. The thesis compares 3 multimodal and 7 text embedding models for image search. For the multimodal models, three variants were compared: embedding the image description, embedding the image itself, and creating a combined embedding of the description and the image. The multilingual capabilities of all 10 embedding models were compared using the results they returned for queries in English, German, and Latvian, and the benefit of translating queries into English was also studied. Embedding models from Google and OpenAI were compared, and the Google Cloud Translate API was used to translate queries. Weaviate was used as the vector database for the embeddings.
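As a rough illustration of the three multimodal variants named in the abstract (description embedding, image embedding, combined embedding), the following minimal sketch ranks items by cosine similarity against a query vector. Everything in it is an assumption made for illustration: the toy 3-dimensional vectors stand in for real model output, the function names `normalize`, `cosine`, `combine`, and `rank` are hypothetical, and the combined variant is shown as a renormalized average of the two unit vectors, which may differ from how the thesis builds its combined embedding. No Google, OpenAI, or Weaviate calls are made.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def combine(description_vec, image_vec):
    # Assumption: the combined embedding is the renormalized average of the
    # two unit vectors. The abstract does not state the combination method.
    d, i = normalize(description_vec), normalize(image_vec)
    return normalize([(x + y) / 2 for x, y in zip(d, i)])


def rank(query_vec, items, variant):
    """Return item ids ordered from most to least similar to the query.

    items maps an id to {"description": vector, "image": vector}.
    variant is one of "description", "image", "combined".
    """
    scored = []
    for item_id, vecs in items.items():
        if variant == "description":
            vec = vecs["description"]
        elif variant == "image":
            vec = vecs["image"]
        elif variant == "combined":
            vec = combine(vecs["description"], vecs["image"])
        else:
            raise ValueError(f"unknown variant: {variant}")
        scored.append((cosine(query_vec, vec), item_id))
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored]


# Toy data (hypothetical): real embeddings would come from an embedding model.
items = {
    "cat.jpg": {"description": [1.0, 0.0, 0.0], "image": [0.9, 0.1, 0.0]},
    "dog.jpg": {"description": [0.0, 1.0, 0.0], "image": [0.1, 0.9, 0.0]},
}
query = [1.0, 0.1, 0.0]

for variant in ("description", "image", "combined"):
    print(variant, rank(query, items, variant))
```

In a setup like the one the abstract describes, the stored vectors would live in a vector database and the query vector would be produced by the same embedding model, optionally after translating the query into English; the sketch only shows the ranking step that the three variants share.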

URI

https://dspace.lu.lv/dspace/handle/7/71498

Collections

Bakalaura un maģistra darbi (EZTF) / Bachelor's and Master's theses [6168]