Lielo valodas modeļu izmantošana dokumentu vienkāršošanai

Straume, Katrīna Tīna

dc.contributor.advisor	Ivanovs, Maksims
dc.contributor.author	Straume, Katrīna Tīna
dc.contributor.other	Latvijas Universitāte. Datorikas fakultāte
dc.date.accessioned	2024-06-20T01:04:31Z
dc.date.available	2024-06-20T01:04:31Z
dc.date.issued	2024
dc.identifier.other	102959
dc.identifier.uri	https://dspace.lu.lv/dspace/handle/7/66141
dc.description.abstract	Šajā darbā izstrādāts pētījums par lielo valodas modeļu izmantošanu teksta vienkāršošanai jeb saīsināšanai. Teksta saīsināšana ir garu teksta dokumentu pārveidošana īsākā, saprotamā tekstā, saglabājot oriģinālā teksta kontekstu un svarīgākās detaļas. Darbā apskatīti pieci lielie valodas modeļi un to teksta saīsināšanas spējas, izmantojot četras dažādas datu kopas – trīs specifiski teksta saīsināšanas pārbaudei sagatavotas datu kopas un viena autores atlasīta datu kopa. Katrā datu kopā aprēķināti modeļu saīsināšanas rezultāti ar ROUGE metriku. Pētījuma rezultāti norāda, ka modeļu veiktspēja atšķiras no konkrētās datu kopas, proti, no teksta sarežģītības un struktūras. Balstoties uz pētījuma secinājumiem, iespējams izvēlēties personīgi piemērotāko lielo valodas modeli attiecīgajam pielietojumam un avota tekstam. Darbs ietver teorētisko pārskatu par dabiskās valodas apstrādi, valodas modeļu attīstību, pētījuma metodoloģiju, eksperimenta gaitu, rezultātus un secinājumus.
dc.description.abstract	This thesis studies the use of large language models for text simplification, understood here as summarization. Text summarization is the transformation of long text documents into shorter, understandable text while preserving the context and the most important details of the original. The thesis examines five large language models and their summarization abilities on four data sets: three prepared specifically for evaluating text summarization and one selected by the author. For each data set, the models' summarization results are scored with the ROUGE metric. The results indicate that model performance varies with the data set, namely with the complexity and structure of the source text. Based on these findings, it is possible to choose the most suitable large language model for a given application and source text. The work includes a theoretical overview of natural language processing and the development of language models, the research methodology, the course of the experiment, the results, and the conclusions.
dc.language.iso	lav
dc.publisher	Latvijas Universitāte
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Datorzinātne
dc.subject	Teksta saīsināšana
dc.subject	Lielie valodas modeļi
dc.subject	ROUGE metrika
dc.subject	Datu kopas
dc.title	Lielo valodas modeļu izmantošana dokumentu vienkāršošanai
dc.title.alternative	Using large language models for document simplification
dc.type	info:eu-repo/semantics/bachelorThesis

Files in this item

Name: 302-102959-Straume_Katrina.Tin ...
Size: 627.4Kb
Format: PDF

This item appears in the following Collection(s)

Bakalaura un maģistra darbi (EZTF) / Bachelor's and Master's theses [5688]