Power BI is fast becoming a BI platform of choice for small to large enterprises. This is because of its capability to connect to large number of data sources, powerful analytical capabilities, and striking visualizations.
The stages involved in a Power BI project are invariably as:
We start with business and technical requirements gathering then move to data ingestion, data cleaning & transformation. The next stage is Data Modelling. Efforts required in the later stage of data analysis and the quality of insights gathered depend upon the quality of data model to a large extent. Thus it is important to understand what goes in a data model and how can we create a good data model.
So let us dive into that.
What is Data Modelling
Data Modelling is establishing relationships between different tables and creating custom calculations which will be required for data analysis
There are certain best practices in data modelling which will cover future pitfalls to a large extent in every project
- Filter unused rows in query editor before loading data to data model. Larger the data, slower will be the report.
- Always have a Date Table in your data model. Believe me, you will need it sooner than you think. There are plenty of scripts available for date tables
- Always review the relationship which Power BI has created automatically. Delete redundant relationships. You can always create these when you need these in future.
- Avoid many-to-many relationships as far as possible. I will explain in my next blog on what to do if you must use a many-to-many relationship.
- Organize the tables layout in relationship view in a logical manner. Lookup / dim tables surrounding the cluster of Fact tables in the middle. Reducing crossover of relationships. You may not be able to achieve perfect Star schema but try to be close.
- Store all your measures in a separate table. This helps in searching your measures at a later stage when business user comes rushing to you to fix a problem.
- Name your measure in a meaningful way. This is especially important if the data model has tens of measures.