dc.contributor.author: Plase, Daiga
dc.date.accessioned: 2016-10-31T07:15:12Z
dc.date.available: 2016-10-31T07:15:12Z
dc.date.issued: 2016-10-30
dc.identifier.uri: https://dspace.lu.lv/dspace/handle/7/34452
dc.description: Article also submitted for publication in Baltic J. Modern Computing (BJMC) on October 5, 2016.
dc.description.abstract: Huge volumes of raw data are generated every day. The question is how to store these data so that they can be accessed faster. Research on Big Data projects that use Hadoop technology, MapReduce-style frameworks, and compact data formats shows that two data formats, Avro and Parquet, support schema evolution and compression in order to use less storage space. This paper presents a systematic review of SQL-on-Hadoop using Avro and Parquet, covering six years (2010–2015) of conference proceedings and journal publications in IEEE Xplore, the ACM Digital Library, and ScienceDirect. The search strategy identified 94 research papers, of which 17 were analyzed as relevant. The review concludes that no direct comparison of Avro and Parquet in terms of compactness and speed exists in data science. [en_US]
dc.language.iso: eng [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Systematic review
dc.subject: Big Data
dc.subject: Hadoop
dc.subject: HDFS
dc.subject: Avro
dc.subject: Parquet
dc.title: A systematic review of SQL-on-Hadoop by using compact data formats [en_US]
dc.type: info:eu-repo/semantics/preprint [en_US]

