Category Archives: R&D
One of the tasks of Database Forensics is to detect events (or categories of events) that have become too frequent. From a programming perspective, this means that for each date we should search for the categories appearing more than a certain number of times … Continue reading
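To make the per-date frequency check concrete, here is a minimal sketch in Python. It is only an illustration, not the method from the full post: the function name `frequent_categories`, the `(date, category)` tuple layout, and the sample events are all assumptions made for this example.

```python
from collections import Counter


def frequent_categories(events, threshold):
    """Return {date: {category: count}} for every category that occurs
    more than `threshold` times on a given date.

    `events` is an iterable of (date, category) pairs (hypothetical layout).
    """
    # Count occurrences of each (date, category) combination.
    counts = Counter((date, category) for date, category in events)

    result = {}
    for (date, category), n in counts.items():
        if n > threshold:  # strictly more than the allowed number of times
            result.setdefault(date, {})[category] = n
    return result


# Hypothetical sample data, invented for illustration only.
events = [
    ("2016-05-01", "login_failure"),
    ("2016-05-01", "login_failure"),
    ("2016-05-01", "login_failure"),
    ("2016-05-01", "schema_change"),
    ("2016-05-02", "login_failure"),
]

print(frequent_categories(events, threshold=2))
# → {'2016-05-01': {'login_failure': 3}}
```

In a database the same idea would usually be a GROUP BY on date and category with a HAVING filter on the count; the dictionary version above just shows the logic in isolation.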
Two competing cloud storage products by Microsoft are defined as follows: Azure Blob Storage is a general purpose, scalable object store that is designed for a wide variety of storage scenarios. Azure Data Lake Store is a hyper-scale repository that … Continue reading
Slides from a presentation by Hortonworks co-founder Alan Gates describing, at a high level, the new features of Hive 2.0 tailored to the kind of queries typical of data warehousing: Apache Hive 2.0: SQL, Speed, Scale from Hadoop Summit Hive with LLAP is available … Continue reading
History of RDBMS: Every time a new technology emerged, its evolution ended in its realisation as a relational system (RDBMS). In other words, before adopting the technology, business always demanded atomicity, consistency, isolation, and durability (ACID).
Recently I was literally stunned when loading bulk data into PDW finished an order of magnitude faster than expected. Loading a 20 GB file into Azure Blob Storage from a local machine takes 15 minutes; copying from Blob Storage into Data Lake (i.e. from one … Continue reading
The idea behind Hadoop is brilliant and revolutionary: an algorithm, MapReduce, was invented onto which all major data processing tasks (grouping, statistical, graph, etc.) can be decomposed. However, its use of input files and lack of schema support prevented the performance improvements … Continue reading
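As a rough illustration of how a grouping task decomposes into MapReduce, here is a single-process Python sketch. It is not Hadoop's API: the helper names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the word-count example are assumptions chosen for this illustration, and a real cluster would run the map and reduce steps in parallel over files.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def map_reduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Hypothetical grouping task: count words across input lines.
def word_mapper(line):
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    return sum(values)


lines = ["Hive on Hadoop", "Hadoop MapReduce", "hive"]
word_counts = map_reduce(lines, word_mapper, sum_reducer)
print(word_counts)
# → {'hive': 2, 'on': 1, 'hadoop': 2, 'mapreduce': 1}
```

Only the mapper and reducer are task-specific; the shuffle step stays the same, which is what lets many different kinds of task reuse the same framework.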
Why might a company consider using Azure Data Lake Store over Azure Blob Storage? There are several options for data storage in Azure, each with a specific goal. In particular, Data Lake Store is designed specifically for Azure Data Lake Analytics. While Blob Storage can … Continue reading