Optimizing Project Management for Machine Learning in Spam and Email Classification

Email (electronic mail) is the world's most widely used platform for communication between users across different devices, because it is very easy to use and faster than other communication platforms. Spam email activity is increasing day by day, with many new cases appearing daily. In the current Covid-19 situation, more than 18 million scam email activities have been recorded, and these spam emails also increase the risk of personal information theft and phishing. In addition, a sender has no guarantee that an email will land in the receiver's normal (Primary) inbox rather than the spam folder, which can slow down the whole communication process or sometimes leave messages unattended. The aim of this project is to apply and optimize the principles of project management in the machine learning field, and to discuss and describe how a machine learning algorithm can help solve this spam email classification problem. The project uses different datasets to train different machine learning algorithms and selects the one that works best for this type of text classification. After training, the algorithm uses binary classifiers to sort emails into two categories (spam and non-spam). The finalized algorithm will predict the percentage of spam text in an email and whether the email will land in the spam folder or in the receiver's Primary inbox. For an email already classified as spam, the algorithm will also identify and report the faulty text it contains.

Keywords

Economics and Business, Machine Learning, Project Management, Artificial Intelligence, Project Planning

URI

https://dspace.lu.lv/handle/7/70380

Collections

Bakalaura un maģistra darbi (ESZF) / Bachelor's and Master's theses
