Data Warehouse

We build a data warehouse from the General Query Log

The design of a data warehouse differs from relational design. Data warehouses are often designed according to the concept of the star schema.

When building a data warehouse, you usually work backwards, starting from the questions:

  • What questions should my data warehouse be able to answer?
  • How do I have to design my model so that my questions can be answered easily?
  • Where do I get the data to populate the model?
  • How do I fill my model with the data?

For training purposes, we have investigated an issue that arises from …

Data Warehouse Design

This is my cheat sheet for dimensional modelling design techniques of a data warehouse (DWH) according to Kimball/Ross.

Dimensional Design Process (p. 38 ff.)

  • Select the business process.
  • Declare the grain (what a single fact table row represents).
  • Identify the dimensions.
  • Identify the facts.
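
The four steps can be sketched in SQL roughly as follows. This is only a minimal illustration, assuming a hypothetical retail sales process; all table and column names (dim_date, dim_product, dim_store, fact_sales and their columns) are invented for this example and are not taken from Kimball/Ross.

```sql
-- Step 1: business process = retail sales (assumed for illustration).
-- Step 2: grain = one fact row per product, per store, per day.

-- Step 3: the dimensions (hypothetical names and columns).
CREATE TABLE dim_date (
  date_id   INT      NOT NULL PRIMARY KEY,
  full_date DATE     NOT NULL,
  year_no   SMALLINT NOT NULL,
  month_no  TINYINT  NOT NULL
);

CREATE TABLE dim_product (
  product_id   INT         NOT NULL PRIMARY KEY,
  product_name VARCHAR(64) NOT NULL,
  category     VARCHAR(32) NOT NULL
);

CREATE TABLE dim_store (
  store_id   INT         NOT NULL PRIMARY KEY,
  store_name VARCHAR(64) NOT NULL,
  city       VARCHAR(64) NOT NULL
);

-- Step 4: the facts (numeric measures at the declared grain).
CREATE TABLE fact_sales (
  date_id      INT           NOT NULL,
  product_id   INT           NOT NULL,
  store_id     INT           NOT NULL,
  units_sold   INT           NOT NULL,
  sales_amount DECIMAL(12,2) NOT NULL,
  PRIMARY KEY (date_id, product_id, store_id)
);

-- A typical question: revenue per year and product category.
SELECT d.year_no
     , p.category
     , SUM(f.sales_amount) AS revenue
  FROM fact_sales AS f
  JOIN dim_date    AS d ON d.date_id    = f.date_id
  JOIN dim_product AS p ON p.product_id = f.product_id
 GROUP BY d.year_no, p.category
;
```

The composite primary key on the fact table simply restates the assumed grain; a different grain (for example one row per sales receipt line) would lead to a different key.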

Dimension Tables (p. 46 ff., p. 62 ff.)

who, what, where, when, why and how

  • Dimension tables are the entry points to the fact tables.
  • Every dimension table has a single primary key (PK) column.
  • Dimension tables are …

Query performance comparison between MariaDB ColumnStore and other Storage Engines

Storage Engines like InnoDB, Aria and MyISAM are Row Stores. They store rows one after the other in blocks or even directly in a single file (MyISAM). A Column Store like MariaDB ColumnStore, on the other hand, stores the values of the same attribute (column) of all rows together in chunks.
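
To make the difference more concrete, here is a small sketch. The tables demo_sales_row and demo_sales_col and their columns are hypothetical, and ENGINE=ColumnStore assumes that the MariaDB ColumnStore engine is installed on the server.

```sql
-- Row store: all columns of a row are stored together.
CREATE TABLE demo_sales_row (
  store_id   INT           NOT NULL,
  product_id INT           NOT NULL,
  amount     DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

-- Column store: each column is stored separately from the others.
CREATE TABLE demo_sales_col (
  store_id   INT           NOT NULL,
  product_id INT           NOT NULL,
  amount     DECIMAL(10,2) NOT NULL
) ENGINE=ColumnStore;

-- This aggregate needs only store_id and amount. A column store can
-- skip the product_id column entirely, whereas a row store reads
-- whole rows even though one column is never used.
SELECT store_id
     , SUM(amount) AS total
  FROM demo_sales_col
 GROUP BY store_id
;
```

Whether the Column Store variant is actually faster depends on the data volume and the query, so it is worth running the same statement against both tables with your own data.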

This is what the table sales_fact looks like:

CREATE TABLE `sales_fact` (
  `product_id` int(11) NOT NULL,
  `time_id` int(11) NOT NULL,
  `customer_id` int(11) NOT NULL,
  `promotion_id` int(11) NOT NULL,
  `store_id` int(11) NOT …

Do not trust other people's benchmarks!

Because they do NOT reflect your problems.

One of our customers upgraded last month from MariaDB 10.2 to MariaDB 10.5. In the same change he also converted all his data warehouse (DWH)/BI tables from MyISAM to Aria. And all this, naturally, without testing. And it failed miserably! And then we were under heavy time pressure to get things working again…

What has changed:

  • MariaDB version: The MariaDB optimizer changed a lot across these 4 major release series (10.2, 10.3, 10.4 and 10.5)!
  • Storage …

MariaDB 10.2 Window Function Examples

MariaDB 10.2 has introduced some Window Functions for analytical queries.

See also: Window Functions, Window Functions, Window function and Rows and Range, Preceding and Following

Function ROW_NUMBER()

Simulate a row number (sequence) top 3

SELECT ROW_NUMBER() OVER (PARTITION BY NULL ORDER BY category_id) AS num
     , category.category_id
  FROM category
 LIMIT 3
;

or

SELECT ROW_NUMBER() OVER (ORDER BY category_id) AS num
     , category.category_id
  FROM category
 LIMIT 3
;
+-----+-------------+
| num | …

MySQL pluggable Storage Engines (SE)

One of the big advantages of MySQL is its concept of Pluggable Storage Engines. This means you can choose the most suitable Storage Engine for your needs. This also has a disadvantage: You have to know what you are doing…

If you need some help deciding which is the right Storage Engine or even strategy for you, or if you want us to make an independent and vendor-neutral comparison of those Storage Engines, consider our FromDual Consulting Services!

MySQL provides the following storage engines:

mysql> …

The battle against Oracle is probably over but has the real war begun yet?

According to different sources on the web, the Oracle-Sun merger will be approved by the European Commission soon. So at least in the West it is clear what is going on. Let us see what the East decides… [ 1 ], [ 2 ].

Oracle's arch-enemy Microsoft has already brought its weapons into position against the target: "Microsoft offers Oracle-phobes MySQL migration tool" [ 3 ], [ 4 ]. So far so good. Nothing new, nothing special.

What made me a bit edgy was the following …
