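Improving the efficiency of reinforcement learning using vision-language grounding and microbiological process modeling (Stimulētās mācīšanās efektivizācija izmantojot redzes, valodas sasaisti un mikrobioloģijas procesu modelēšanu)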

Mauriņš, Roberts

Author

Mauriņš, Roberts

Co-author

Latvijas Universitāte. Eksakto zinātņu un tehnoloģiju fakultāte

Advisor

Bārzdiņš, Guntis

Date

2025

Abstract

Unlike other machine learning methods, reinforcement learning does not train on an error defined as the difference between the model's computed value and a labeled ("ground truth") value; instead, it maximizes the reward obtained from a sequence of consecutive events. Because the sequence leading to any reward can range from a few steps to thousands of steps, and because the reward is stochastic, this branch of machine learning is considerably more complex than any supervised learning process. In most cases, millions of iterations are still required for a trained agent to reach a performance level close to that of a human. With the emergence and availability of large language models, using their advice, or their evaluations of actions and situations, to make reinforcement learning more efficient is becoming increasingly popular. The aim of this work is to study the impact of large language models on the efficiency of reinforcement learning in a multi-agent system, and to design the model architectures themselves with inspiration from microbiological processes.

URI

https://dspace.lu.lv/dspace/handle/7/71062

Collections

Bakalaura un maģistra darbi (EZTF) / Bachelor's and Master's theses [6025]