Every year, about a million Australians experience a blackout, according to statistics provided by Ausgrid.

The average number of customers left without power at home each month is 101K:

The unluckiest suburb appears to be Gosford, followed by Sutherland, Wyong, and Lake Macquarie.


Posted in Data to Knowledge, R&D, Visualisation


A list of advice collected from the Web.

Recommendations present in all sources

1a. No bright light before going to bed; reading a book under a lamp, or on an iPad until drowsy, is also an option, but falling asleep will then take longer (30 minutes vs 10 minutes);
1b. LEDs glowing all night long are highly undesirable (cover them with something);
2. It’s better to cool the bedroom so that you can sleep under a blanket only (an expensive piece of advice, given current power rates).

Recommendations appearing in the lists from time to time

Contradictory advice found in different sources: “drink a glass of warm water or milk before bed, but do not eat anything” vs “eat a few cashews or other nuts/fruits and drink nothing, or very little”. Both look unreliable. They probably both try to comply with the recommendation “don’t eat before sleep, but don’t go to bed hungry”, which means a reasonably substantial meal should be eaten 2–4 hours (depending on the source) before sleep.

Good advice, and not only for falling asleep: “write down three good things from your day”. (Sometimes writing down three tasks for the next day is advised instead, though not in articles about falling asleep.) Basic meditation can also calm you down.

Deep breathing is recommended (a friend of mine told me he falls asleep after 10 deep inhale/exhale cycles).

A hot shower is recommended as well.

Bedroom preparation also contributes to good sleep:

Some basic auto-suggestion commands might help (depending on your type of mind):

What works for me personally is listening to a calming sound for 15 minutes. There are plenty of mobile applications for that, and a suitable sound can also be found on YouTube:

Night train in the rain – 2 hours – soundscape – relaxing sound – sleep sound – Train sound

Posted in Health

Compatibility issue in SSIS

Just discovered a difference in how parent-child package variable values interact in the newest version of SSIS. This can seriously impact SSIS project upgrades: if packages exchange information via the configuration mechanism rather than parameter binding, the project stops working after the upgrade!


Open each package and alter every parent-child package variable as follows:

Old version (wrong) variant:

Another wrong variant:

Correct variant:

Posted in Administration, programming

5 questions for a time-limited BI developer interview that touch almost all areas of the modern Microsoft BI stack

1. Suppose users must be allowed to enter a worker’s salary, but it has to fall within the range defined for the worker’s category. The ranges for the categories are stored in a separate table, and data is inserted using T-SQL. Where is the best place to put the check?

2. A Power BI report contains two slicers, “Product Category” and “Customer”, and a grid with sales data. The business asks for cross-filtered slicers: after a customer is chosen, the first slicer must show only the product categories that customer has ever purchased. How do you enable cross-filtering in Power BI?

3. Two VMs are allocated for a BI solution. How would you distribute the four components of the MS BI stack (SQL Server, SSAS, SSIS, SSRS) across them to balance the load evenly?

4. A source system contains a 1 GB fact table that must be transferred to the DWH with a slight transformation. What combination of SSIS and T-SQL would you use, and why?

5. A company has been using Power BI reports with the datasets embedded in them. It now wants to analyse more data, with the total size set to exceed 2 GB; the business logic is complicated, and many DAX queries will be involved. What is the next step the company has to take to handle such large amounts of data in its Power BI reports?

Posted in Business Capability, Data to Knowledge, programming, Uncategorized

Problem unsolvable for Data Lake Analytics

One of the tasks of database forensics is to detect events (or categories of events) that have become too frequent. From a programming perspective, this means that for each date we should find the categories appearing more than a certain number of times during the next certain number of days.

The next example searches for categories appearing more than 3 times within 20 subsequent days.
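Outside of U-SQL, the detection logic itself is not hard to express. Here is a minimal Python sketch of the idea (the `frequent_categories` helper and the sample events are hypothetical, assuming events arrive as (date, category) pairs):

```python
from collections import defaultdict
from datetime import date, timedelta

def frequent_categories(events, threshold=3, window_days=20):
    """Return (window_start_date, category) pairs for every category that
    appears more than `threshold` times within `window_days` days."""
    by_category = defaultdict(list)
    for day, category in events:
        by_category[category].append(day)

    hits = set()
    for category, days in by_category.items():
        days.sort()
        left = 0  # left edge of the sliding date window
        for right in range(len(days)):
            # Shrink the window until it spans at most `window_days` days.
            while days[right] - days[left] > timedelta(days=window_days):
                left += 1
            if right - left + 1 > threshold:
                hits.add((days[left], category))
    return hits

# Four "A" events in four days trip the 3-per-20-days rule; "B" does not.
events = [(date(2017, 1, d), "A") for d in (1, 2, 3, 4)]
events += [(date(2017, 1, 1), "B"), (date(2017, 2, 20), "B")]
print(frequent_categories(events))  # {(datetime.date(2017, 1, 1), 'A')}
```

The difficulty in Data Lake Analytics is not the algorithm but expressing this per-date windowed self-join declaratively at scale.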


Posted in Big Data, Business Capability, R&D, Uncategorized

Lake and Blob – how far from each other?

The two competing cloud storage products by Microsoft are defined as follows:

  • Azure Blob Storage is a general purpose, scalable object store that is designed for a wide variety of storage scenarios.
  • Azure Data Lake Store is a hyper-scale repository that is optimised for big data analytics workloads.

Data Lake

Let’s go deeper and list the major differences between them:

  • Purpose — Data Lake Store: optimised storage for big data analytics workloads. Blob Storage: general-purpose object store for a wide variety of storage scenarios.
  • Use cases — Data Lake Store: batch, interactive, and streaming analytics, and machine learning data such as log files, IoT data, click streams, and large datasets. Blob Storage: any type of text or binary data, such as application back ends, backup data, media storage for streaming, and general-purpose data.
  • Structure — Data Lake Store: hierarchical file system; an account contains folders, which in turn contain data stored as files. Blob Storage: object store with a flat namespace; there is a single layer of containers, and although you can create a virtual “file-system”-like layered structure, in reality everything sits in that one layer of its container.
  • Server-side API — Data Lake Store: WebHDFS-compatible REST API. Blob Storage: Azure Blob Storage REST API.
  • Hadoop file system client — available for both.
  • Data operations, authentication — Data Lake Store: based on Azure Active Directory identities. Blob Storage: based on shared secrets (account access keys and shared access signature keys).
  • Data operations, authentication protocol — Data Lake Store: OAuth 2.0; calls must contain a valid JWT (JSON Web Token) issued by Azure Active Directory. Blob Storage: Hash-based Message Authentication Code (HMAC); calls must contain a Base64-encoded SHA-256 hash over a part of the HTTP request.
  • Data operations, authorization — Data Lake Store: POSIX access control lists (ACLs); ACLs based on Azure Active Directory identities can be set at file and folder level. Blob Storage: account-level authorization via account access keys; account-, container-, or blob-level authorization via shared access signature keys.
  • Data operations, auditing — available for both.
  • Encryption of data at rest — Data Lake Store: transparent, server side, with service-managed keys or with customer-managed keys in Azure KeyVault. Blob Storage: transparent, server side, with service-managed keys or with customer-managed keys in Azure KeyVault (coming soon); client-side encryption is also available.
  • Developer SDKs — Data Lake Store: .NET, Java, Python, Node.js. Blob Storage: .NET, Java, Python, Node.js, C++, Ruby.
  • Analytics workload performance — Data Lake Store: optimised for parallel analytics workloads, with high throughput and IOPS. Blob Storage: not optimised for analytics workloads.
  • Geo-redundancy — Data Lake Store: locally redundant only (multiple copies of the data in one Azure region). Blob Storage: locally redundant (LRS), globally redundant (GRS), or read-access globally redundant (RA-GRS).
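The flat-namespace point deserves a concrete illustration: in a blob-style store, a “directory listing” is just prefix filtering over full object names. A minimal Python sketch with made-up object names (no Azure SDK involved):

```python
# In a flat namespace there are no real folders: every object lives
# directly in its container under a full name that may contain "/".
blobs = [
    "logs/2017/01/app.log",
    "logs/2017/02/app.log",
    "images/cat.png",
]

def list_virtual_folder(names, prefix):
    """Simulate a directory listing by filtering full names on a prefix,
    the way blob APIs expose 'virtual directories'."""
    return [n for n in names if n.startswith(prefix)]

print(list_virtual_folder(blobs, "logs/2017/"))
# ['logs/2017/01/app.log', 'logs/2017/02/app.log']

# A hierarchical store, by contrast, could rename the folder "logs" as a
# single metadata operation; here a "rename" would mean rewriting the
# full name of every object under the prefix.
```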

What is not mentioned here is that the U-SQL engine generates different query plans for Data Lake Store and Blob Storage. That means that for some types of solutions it would be more reasonable to make the choice based not on optimisation for load but on optimisation for read.


Posted in Big Data, Business Capability, R&D

Hive 2.0 – The Solution For Data Warehousing

Slides from a presentation by Hortonworks co-founder Alan Gates describing, at a high level, the new features of Hive 2.0 tailored to the kinds of queries typical of data warehousing:

Hive with LLAP is available in Azure:


Posted in Big Data, R&D