Stimulētās mācīšanās efektivizācija izmantojot redzes, valodas sasaisti un mikrobioloģijas procesu modelēšanu

Mauriņš, Roberts

dc.contributor.advisor	Bārzdiņš, Guntis
dc.contributor.author	Mauriņš, Roberts
dc.contributor.other	Latvijas Universitāte. Eksakto zinātņu un tehnoloģiju fakultāte
dc.date.accessioned	2025-06-28T01:06:32Z
dc.date.available	2025-06-28T01:06:32Z
dc.date.issued	2025
dc.identifier.other	107674
dc.identifier.uri	https://dspace.lu.lv/dspace/handle/7/71062
dc.description.abstract	Atšķirībā no citām mašīnmācīšanās metodēm stimulētās mācīšanās apmācības procesā kļūda nav atšķirība starp modeļa izrēķināto un marķēto vērtību (angliski – “ground truth”), bet vairāku secīgu notikumu rezultātu ieguvuma maksimizācija. Ņemot vērā, ka kopējās notikumu secības garums līdz jebkuram ieguvumam var svārstīties no pāris līdz tūkstošiem soļu, kā arī to, ka ieguvumam ir stohastisks raksturs, šis mašīnmācīšanās paveids ir ievērojami sarežģītāks par jebkuru uzraudzītās mašīnmācīšanās (angliski – “supervised learning”) procesu. Vairumā gadījumu, lai sasniegtu apmācītā aģenta darbības līmeni, kas būtu tuvs cilvēka līmenim, joprojām ir nepieciešami miljoni iterāciju. Līdz ar lielo valodu modeļu parādīšanos un pieejamību arvien lielāku popularitāti gūst stimulētās mācīšanās efektivizācija, izmantojot to padomus vai veikto darbību un situācijas novērtējumus. Darba mērķis ir izpētīt lielo valodu modeļu ietekmi uz vairāku aģentu sistēmas stimulētās mācīšanās efektivizāciju, kā arī veidot pašu modeļu arhitektūru, iedvesmojoties no mikrobioloģijas procesiem.
dc.description.abstract	Unlike other machine learning methods, reinforcement learning does not train on an error defined as the difference between the model’s computed value and a labeled (“ground truth”) value; instead, training maximizes the reward accumulated over a sequence of consecutive events. Because the event sequence leading to any reward can range from a few steps to thousands of steps, and because the reward is stochastic, this branch of machine learning is considerably more complex than any supervised learning process. In most cases, millions of iterations are still required for a trained agent to reach a performance level close to that of humans. With the emergence and availability of large language models, making reinforcement learning more efficient by using these models’ advice, or their evaluations of the actions taken and of the situation, is becoming increasingly popular. The aim of this work is to explore the impact of large language models on the efficiency of reinforcement learning in a multi-agent system, and to design the model architectures themselves with inspiration from microbiological processes.
dc.language.iso	lav
dc.publisher	Latvijas Universitāte
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Datorzinātne
dc.subject	neironu tīkli
dc.subject	stimulētā mācīšanās
dc.subject	lielie valodu modeļi
dc.subject	neural networks
dc.subject	reinforcement learning
dc.title	Stimulētās mācīšanās efektivizācija izmantojot redzes, valodas sasaisti un mikrobioloģijas procesu modelēšanu
dc.title.alternative	Enhancing reinforcement learning efficiency through vision, language connections and microbiological process modeling
dc.type	info:eu-repo/semantics/masterThesis

Files in this item

Name: 302-107674-Maurins_Roberts_rm1 ...
Size: 4.332Mb
Format: PDF

This item appears in the following Collection(s)

Bakalaura un maģistra darbi (EZTF) / Bachelor's and Master's theses [6025]
