Category Archives: R&D

Problem unsolvable for Data Lake Analytics

One of the tasks of database forensics is to detect events (or categories of events) that have become too frequent. From a programming perspective, this means that for each date there should be a search for categories appearing more than a certain number of times … Continue reading

Posted in Big Data, Business Capability, R&D, Uncategorized

Lake and Blob – how far from each other?

Microsoft defines its two competing cloud storage products as follows: Azure Blob Storage is a general-purpose, scalable object store that is designed for a wide variety of storage scenarios. Azure Data Lake Store is a hyper-scale repository that … Continue reading

Posted in Big Data, Business Capability, R&D

Hive 2.0 – The Solution For Data Warehousing

Slides from a presentation by Hortonworks co-founder Alan Gates describing, at a high level, the new features of Hive 2.0 tailored to the kind of queries typical of data warehousing: "Apache Hive 2.0: SQL, Speed, Scale" from Hadoop Summit. Hive with LLAP is available … Continue reading

Posted in Big Data, R&D

From SQL to NoSQL to NewSQL

History of RDBMS. Every time a new technology emerged, its evolution ended in its realisation as a relational system (RDBMS). In other words, before adopting the technology, businesses always demanded atomicity, consistency, isolation, and durability (ACID).

Posted in Big Data, Business Capability, Business Delivery, R&D, Uncategorized | 1 Comment

PolyBase: A Superior Alternative To Process And Query Data

Recently I was stunned when loading bulk data into PDW turned out to be an order of magnitude faster than expected. Loading a 20 GB file into Azure Blob Storage from a local machine takes 15 minutes; copying from Blob Storage into Data Lake (i.e. from one … Continue reading

Posted in Big Data, Business Capability, R&D

Hadoop Data Processing: Battle For Speed

The idea behind Hadoop is brilliant and revolutionary: an algorithm, MapReduce, onto which all major data processing tasks (grouping, statistical, graph, etc.) can be decomposed. However, its use of input files and lack of schema support prevented the performance improvements … Continue reading

Posted in Big Data, Business Capability, R&D, Uncategorized

Data Lake Store vs Blob Storage

Why might a company consider using Azure Data Lake Store over Azure Blob Storage? There are several options for data storage in Azure, each with a specific goal. In particular, Data Lake Store is designed specifically for Azure Data Lake Analytics. While Blob Storage can … Continue reading

Posted in Big Data, Business Capability, R&D