A systematic review of SQL-on-Hadoop by using compact data formats

Plase, Daiga

dc.contributor.author	Plase, Daiga
dc.date.accessioned	2016-10-31T07:15:12Z
dc.date.available	2016-10-31T07:15:12Z
dc.date.issued	2016-10-30
dc.identifier.uri	https://dspace.lu.lv/dspace/handle/7/34452
dc.description	Article also submitted for publication in Baltic J. Modern Computing (BJMC) on October 5, 2016.
dc.description.abstract	Huge volumes of raw data are generated every day. The question is how to store these data so that they can be accessed faster. Research on Big Data projects that use Hadoop technology, MapReduce-style frameworks, and compact data formats shows that two data formats (Avro and Parquet) support schema evolution and compression in order to use less storage space. In this paper, a systematic review of SQL-on-Hadoop using Avro and Parquet has been performed, covering six years (2010–2015) of conference proceedings and journal publications from IEEE Xplore, ACM Digital Library, and ScienceDirect. Following the search strategy, 94 research papers were identified, of which 17 were analyzed as relevant. The review concludes that a direct comparison of Avro and Parquet by compactness and speed does not exist in data science.	en_US
dc.language.iso	eng	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Systematic review
dc.subject	Big Data
dc.subject	Hadoop
dc.subject	HDFS
dc.subject	Avro
dc.subject	Parquet
dc.title	A systematic review of SQL-on-Hadoop by using compact data formats	en_US
dc.type	info:eu-repo/semantics/preprint	en_US

Files in this item

Name: Daiga_Plase_A1_lit_review_BJMC ...
Size: 606.3Kb
Format: PDF
Description: Text

Name: Results_v6.xlsm
Size: 536.9Kb
Format: Unknown
Description: Results

This item appears in the following collections

Preprinti (MII) / Preprints [12]