How do I do Version Control in Power BI?

This question has been difficult to answer in the past, but is getting easier with new Power BI (and Fabric) features. Power BI was designed from birth to focus on empowering the individual analyst with an all-in-one power tool; team and traditional enterprise concerns like version control have only recently received attention on the product roadmap.

In this article, I'll share your options for Power BI version control and also give some decision points to guide which option could work best for you. But first let's define "version control".

Defined: Version Control

The more time and effort you put into your Power BI assets and the more people work on them, the more important version control becomes. There are many tools and approaches to controlling the versions of software products. At core, any approach to version control includes two things:

  1. Version tracking: The ability to pull up the past version of your Power BI file at any given point in time.
  2. Branching and merging: The ability to have more than one concurrent version of a Power BI report or dataset. For example, one person working on the reports and another on the dataset of the same PBIX. Independent work is not the hard part. The value of version control is a structured and tooling-assisted approach to merging changes from these concurrent branches of development into a single, deployable artifact.

Branching and merging is the more demanding of the two elements, so we will address this first.

Version control is also often referred to as source control. An application or tool that implements version control workflows is referred to as a version control system or VCS.

Decision Point #1: Do you want to use Git?

If your answer is, "What is Git?", then you can jump down to the next Decision Point. However, I'd also recommend you read the "Why Git and how do I get started with it?" section, no matter how you answer these questions.

But if your answer is "Yes please Git!" you've got some options.

Prior to preview features announced at Microsoft Build in May 2023, you only had the choice between PBIX in Git and PBIT in Git. Now you can also choose to use the new Power BI Projects as PBIP in Git. Of these options, I like the PBIP in Git as a best practice; however, this feature is in public preview as of August 2023. So learn up on it but be aware features could change before it is generally available (GA).

PBIX in Git

This option pretty much stinks. That's because a PBIX is the following things all bundled together:

  1. Report - A Power BI report made of pages and visuals and filters.
  2. Data model - A tabular data model made of tables and relationships, specifically the metadata definition of these things.
  3. Queries - Power Query (M) queries made of scripts that connect to data sources, transform data, and output tables.
  4. Dataset - The fully loaded tabular model dataset - so the actual data you are analyzing, which may be very large in size, even after compression.

Git stores a compressed copy of every version you commit, and its speed is predicated on the versions it is controlling being relatively small, not megabytes for each version. PBIX uses a common compression format (ZIP) to bundle all these files together. That means it has two things that make using Git painful:

Binary format + Large size = Sad Git ☹️🤖

The large size can be helped with the Git-LFS extension, giving you back the version tracking feature of Git, but still not empowering branching and merging. This is because there is no native tooling to work with the opaque binary object that is a PBIX. Any viable workflow requires breaking apart the PBIX in some way, to avoid having to handle large binaries. Next option.
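To make the "opaque binary" point concrete, here is a minimal Python sketch (not anything Power BI ships). It builds a small stand-in ZIP archive in memory and lists what is inside. The entry names `report-part.json` and `model-part.bin` are made up for illustration and are not the real internals of a PBIX. Git does not do this unpacking for you; it only sees the raw bytes, which start with the ZIP signature.

```python
import io
import zipfile


def build_demo_archive() -> bytes:
    """Build a small stand-in archive. Entry names are hypothetical, not real PBIX internals."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("report-part.json", '{"text": "5"}')
        archive.writestr("model-part.bin", bytes(range(256)) * 4)
    return buffer.getvalue()


def list_zip_entries(data: bytes) -> list[str]:
    """Return the entry names inside a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    demo = build_demo_archive()
    # A human (or a script) has to unpack the archive to see the readable parts.
    print(list_zip_entries(demo))
    # Git only sees one blob of bytes beginning with the ZIP signature.
    print(demo[:4])
```

If you want to peek inside one of your own files, you could pass its bytes to `list_zip_entries`, assuming the file really is a plain ZIP archive as described above.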

PBIT in Git

This was the best option before those Build announcements. A PBIT file (short for Power BI template) is the same as the PBIX Power BI Report, but without the included dataset (the actual, loaded data that may be very large in size). It holds only the metadata definition of report pages with their layouts, the code of your M queries, and the model structure of your data model. You create a PBIT in Power BI Desktop simply by doing a "File"->"Save As" command and choosing PBIT as the file type.

That fixes the large size concerns, but you still don't get the branching and merging you need: a PBIT is still stored as a ZIP archive and is opaque to the tools available in Git. Only the PBIX and PBIT options are officially GA, and of the two, this is the better one. But let's check out the preview features announced in May 2023 to see what's coming in the near future.

PBIP in Git

Thanks to hard work by the inimitable Zoe Douglas and the Power BI team, these features in Power BI Desktop look extremely promising. You can save your Power BI Desktop file as a Power BI Project, which removes the dataset, but also breaks the file out into text-serialized files. That gives us both of what we are looking for: version tracking and branching and merging. This is because all version control systems can trivially handle text and code, and there is a wealth of tooling to provide structured diffs and controlled merges of two source files which have differences.

When you save your Power BI artifacts as a PBIP, they are saved on disk in a directory structure like this:

F1.PBIX saved as PBIP, as shown in Git source repo

This F1.pbix file contained a page with a single textbox on it that contained the character 5. After committing these files to Git, we changed the character to 6 and committed again. Here's what the resulting commit diff looks like (shown here in a Git client tool that will wrap the lines so you can see them):

text diff of two versions of 'report.json' file, showing a single change on one line

This is amazing news! Not only can Git now recognize that a single change has occurred - so could a human being!
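If you'd like to see the same idea without opening a Git client, here is a small Python sketch using the standard `difflib` module. The JSON structure produced by `make_report` is a hypothetical stand-in that I made up for illustration; it is not the real `report.json` schema. The point is only that when a file is text, a one-character edit shows up as a one-line change.

```python
import difflib
import json


def make_report(text: str) -> str:
    """Serialize a tiny, hypothetical report definition (not the real PBIP schema)."""
    return json.dumps({"page": "Page 1", "textbox": {"text": text}}, indent=2)


def diff_text(old: str, new: str, name: str = "report.json") -> list[str]:
    """Return a unified diff of two text versions, one diff line per list item."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Version 1 has the character 5 in the textbox, version 2 has 6.
    for line in diff_text(make_report("5"), make_report("6")):
        print(line)
```

Running it prints a short diff in which only the line holding the textbox value is marked as removed and re-added.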

There are some limitations to this approach currently. For example, we can see that the JSON generated by PBIP projects can be lengthy considering the fact that an extremely small change was made. Merging two long-running branches could involve some squinting to figure out exactly what was changed. Moving objects, even slightly, also tends to cause these lengthy config JSON objects to change, which can be difficult to visually parse. To be clear, though, a piece of gnarly JSON that is difficult to visually parse is still miles better than an opaque binary file format that cannot be parsed, visually or with common devops tools.
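One way to cope with lengthy JSON is to compare the parsed documents instead of the raw lines. The sketch below is my own illustration, not a Power BI or Git feature: it walks two parsed JSON values and reports the paths whose values differ. The sample documents and their keys (`visuals`, `x`, `y`, `text`) are hypothetical.

```python
import json
from typing import Any


def changed_paths(old: Any, new: Any, path: str = "$") -> list[str]:
    """Recursively list the JSON paths whose values differ between two parsed documents."""
    if isinstance(old, dict) and isinstance(new, dict):
        found = []
        for key in sorted(set(old) | set(new)):
            child = f"{path}.{key}"
            if key not in old or key not in new:
                found.append(child)  # key was added or removed
            else:
                found.extend(changed_paths(old[key], new[key], child))
        return found
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        found = []
        for index, (left, right) in enumerate(zip(old, new)):
            found.extend(changed_paths(left, right, f"{path}[{index}]"))
        return found
    # Scalars, mismatched types, or lists of different length: compare as a whole.
    return [] if old == new else [path]


# Hypothetical before/after documents: one visual nudged two units to the right.
OLD = '{"visuals": [{"x": 10, "y": 20, "text": "5"}]}'
NEW = '{"visuals": [{"x": 12, "y": 20, "text": "5"}]}'

if __name__ == "__main__":
    print(changed_paths(json.loads(OLD), json.loads(NEW)))
```

A report like this will not merge anything for you, but it can cut down on the squinting when you review a long diff.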

However, this opens an entirely new door for Power BI development teams to collaborate and work on Power BI solutions in parallel. And they can bring their features together when they are completed using branching and merging. No longer will you have to carefully coordinate who is allowed to make changes and when. By performing a diff between two branches and pulling changes from both into a single new artifact, Power BI has finally caught up with other programming environments!

You should check out this feature and give feedback to the Microsoft team on the feature overall and on great ideas like this one: the ability to save PBIPs from the command line.

Backing your Power BI Workspace with Git

In a related, but not fully integrated, feature you can now back your Power BI workspaces with a Git repo hosted in Azure DevOps. You configure these in your Workspace settings like this:

configuration of Fabric workspace under 'Workspace settings' > 'Git integration', with connection details to an Azure DevOps repo

Once configured, you start to see the option to commit changes for files that have been changed in the Power BI Service, such as this report changed directly in the Power BI Service using the web editor. This will sync the live version from Power BI back to the Git repo. Changes can also flow from the Git repo to the workspace, e.g., if a user edited the file in Power BI Desktop and pushed those changes to the Git repo.

listing of workspace artifacts, showing varying sync states to the Git repo, including 'Uncommitted', 'Unsupported', and 'Synced'

This screenshot also highlights that not all objects are supported by this feature, as you can see by the message next to the data pipeline (v1), the dashboard, and the Excel file. However, the report below them is already committed to the Azure DevOps Git repo, and matches the committed version.

To commit this change to the Git repo, I can press the Source Control button at top to see this box appear.

'Source control' dialogue in workspace, showing a commit dialogue before committing and syncing changes to the repo

I check the report I want to commit, enter a commit message, and click "Commit". A new commit has been added to my Azure DevOps Git repo:

Azure DevOps 'Commits' view, showing the commit initiated from Fabric workspace

You can read detailed instructions on configuring your workspaces for Git repos here. Keep in mind that this only works with Git repos based on Azure DevOps (no GitHub, BitBucket, or others).

So should I back up my workspace in a repo, or use PBIPs?

My suggestion is both, just use two different repos.

Not every person who will be creating or editing Power BI content will know how to be a Git user. The whole promise of Power BI is to empower the intelligent analyst who knows their data, but may not be a full-blown developer. That means not everyone can be forced into a path of using PBIPs, even if it is a best practice for critical datasets.

The way I see this playing out is that your central Power BI team, which tends to develop the certified, enterprise-level datasets and reports, will absolutely use PBIPs to track every feature and change to their mission-critical Power BI assets. For all the other changes happening in the Power BI Service (and Fabric workspaces) – a business analyst publishing a new version of a departmental report, or an executive saving a filtered copy of a report online – these can at least be captured and backed up into Git using the workspace-level repos. If there is ever a need to promote an end-user maintained solution, then the source code is all there, waiting in a Git repo.

This gives you redundancy at all levels only at the expense of a little more storage.

This feature, along with PBIPs, is currently in preview. Keep an eye on announcements for when they reach GA. We suspect there are more great features to support full-blown automation still to come. Comments by the Power BI product team in various fora have indicated a much richer API surface area in the works as well, to enable automation workflows.

That wraps up our section that focuses on Git if you said "yes" to Decision Point #1. From here we deal with the non-Git based solutions.

Decision Point #2: Do multiple people need to work on the same file at the same time?

Say you have Anna and Bob who both need to work on the same PBIX file. Do they need to work on the same file at the same time? In other words, to borrow from electrical circuits, must Anna and Bob work on a single PBIX file in parallel or is it OK for them to work on the same PBIX in series?

Working on the same PBIX in parallel

If Anna and Bob work in parallel, their changes look like this:

flowchart of two parallel sets of changes made to a single original source,

Working on the same PBIX in series

If Anna and Bob work in series, then they coordinate their work to look like this:

flowchart of two sets of changes made to a single original source, with coordination to make them one at a time

Here's the thing: without Git or another version control system, in parallel isn't really possible. When you get to that "combined version" step in the drawing, there is no way to merge Anna and Bob's changes together, except by hand. So, if the parallel pattern is absolutely necessary, then please return to Decision Point #1. If parallel development is a must-have, then your answer to Decision Point #1 is "Yes."

‼️
Without Git or another version control system, developing in parallel isn't really possible
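To show what that "combined version" step asks of a tool, here is a toy three-way merge in Python. It is a sketch under simplifying assumptions: Git itself merges text files line by line, while this version works on named settings to keep it short. The setting names (`title`, `measure`) and values are hypothetical.

```python
_MISSING = object()  # marks a setting that does not exist in a given version


def three_way_merge(base: dict, ours: dict, theirs: dict) -> tuple[dict, list[str]]:
    """Merge two edited copies of `base`. Returns (merged settings, conflicting keys)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:
            value = o  # both sides agree (or neither changed it)
        elif o == b:
            value = t  # only their side changed it
        elif t == b:
            value = o  # only our side changed it
        else:
            conflicts.append(key)  # both changed it differently: a human must decide
            value = o              # keep ours as a placeholder
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts


if __name__ == "__main__":
    original = {"title": "Sales", "measure": "SUM(x)"}
    anna = {"title": "Sales 2023", "measure": "SUM(x)"}  # Anna edits the report
    bob = {"title": "Sales", "measure": "SUM(y)"}        # Bob edits the dataset
    print(three_way_merge(original, anna, bob))
```

Notice that the merge needs the original version as well as both edited copies. That shared history is exactly what a version control system keeps for you, and what two loose PBIX files on a file share do not have.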

If Git is new to you, please check out "Why Git and how do I get started with it?"

However, there are many instances where development in series can work fine. This is the default option and, probably, what you have already been doing (or else Anna and Bob never share PBIX files in your organization!)

In series development is perfectly supportable, and you can still get great version tracking benefits in this pattern even without Git; we just lose the ability to branch and merge. We've reached our next Decision Point.

Decision Point #3: Do you need check-in comments?

From this point, we are going to rely on the version control of Teams, backed by SharePoint. If your company doesn't use Teams or SharePoint, please jump down to "What if we don't use Teams or SharePoint?"

Getting version comments using SharePoint's check-in and check-out feature

For a long time, SharePoint has had a feature called "Check In" and "Check Out". This acted more like centralized source control – or, getting books from your local library. In order for a person to edit a file, they first needed to "check out" the file and declare their intention to change it. This put a lock on the file, preventing others from changing it. Then that same person must check the file back in to create a new version. Otherwise, their change must be discarded.
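The rules in that paragraph can be sketched as a small Python model. This is only an illustration of the locking logic; it does not call SharePoint, and the class and method names (`CheckedFile`, `check_out`, `check_in`, `discard_check_out`) are my own invention.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CheckedFile:
    """Toy model of a library file that uses check-out and check-in."""
    name: str
    content: str = ""
    checked_out_by: Optional[str] = None
    history: list = field(default_factory=list)  # (version, editor, comment) tuples

    def check_out(self, user: str) -> None:
        """Lock the file for one editor; anyone else is refused."""
        if self.checked_out_by is not None:
            raise PermissionError(f"{self.name} is checked out by {self.checked_out_by}")
        self.checked_out_by = user

    def check_in(self, user: str, new_content: str, comment: str) -> int:
        """Save a new version with a comment and release the lock."""
        if self.checked_out_by != user:
            raise PermissionError(f"{user} has not checked out {self.name}")
        self.content = new_content
        self.history.append((len(self.history) + 1, user, comment))
        self.checked_out_by = None
        return len(self.history)

    def discard_check_out(self, user: str) -> None:
        """Release the lock without creating a version; the edits are thrown away."""
        if self.checked_out_by != user:
            raise PermissionError(f"{user} has not checked out {self.name}")
        self.checked_out_by = None


if __name__ == "__main__":
    report = CheckedFile("F1.pbix")
    report.check_out("Anna")
    print(report.check_in("Anna", "new content", "Added a sales page"))
    print(report.history)
```

The model makes the trade-off visible: only one person can hold the file at a time, so nothing ever needs merging, but everyone else waits until the lock is released.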

This feature is rarely used in SharePoint 365 these days, but it is still present if you turn it on in the Document Library settings. When turned on, you then see options like this in SharePoint and Teams:

SharePoint file ellipsis menu: 'More' submenu showing 'Check out' action

This provides a useful signal to my team members that I'm working on the file. They will know that, if they need to change this file, they should talk to me first.

file in SharePoint checked out for editing, indicated by a red 'check out' icon and tooltip

When I check this file back in, I get the opportunity to add a comment.

SharePoint 'Check in' dialogue, showing comment text box

This comment is displayed in the SharePoint version history.

SharePoint version history of a file, showing version metadata and a check in comment

And the latest check-in comment can even be shown right in the Team folder.

Teams 'Files' section with 'Check In Comment' column for file view

If your team can follow this process faithfully, then you've got pretty solid version tracking while working in series on Power BI files. And the check-in/check-out features also help to address the coordination problem of working in series.

Drawbacks of the Check-Out, Check-In pattern, and how to handle them

This approach has the following possible drawbacks, however:

  • You have to remember the check out step before you start making significant changes. It's easy to forget this step, then suddenly, you find that you and a teammate have both been editing the same file and must merge your changes by hand.
  • Editors of your PBIX must remember to check their files back in! There are few feelings worse than when you need to make a quick bug fix on a file, only to find that it is checked out by your teammate out on a 3-week European vacation. You have no idea why the file is checked out to him or whether or not you can safely discard his changes.

It still relies on communication and human beings, but it's certainly a solid process. It brings a lot of benefits without extreme overhead.

Making the Check-In and Check-Out feature easier with a PowerApp solution from PowerBI.tips

The great folks at PowerBI.tips have put this solution together to make using these built-in SharePoint features even easier. You can check this out at the link below. It wraps the check-in and check-out process in a much more intuitive PowerApp.

Power BI Version Control - Ready to use solution and free download
A full, easy to install Power BI Version Control solution allowing source control, check ins and local editing. Download for free now!

The Default Pattern: Automatic versioning in a Teams library

Not everyone realizes this, but there is a SharePoint site created for every Microsoft 365 Team. Also, SharePoint sites use the same OneDrive app to synchronize their files to your computer. In a Teams folder, you can sync those files to your computer just by clicking this button:

Teams 'Files' section for a team, showing the 'Sync' option

SharePoint's automatic version history

SharePoint gives you version control out of the box. It's on by default. Let's find it.

Starting from Teams, open a file in SharePoint:

Teams 'Files' section. A single file's ellipsis menu, showing 'Open in SharePoint' action

Now that you are in SharePoint, hover over a file and click on the three dots:

the same file as in Teams, shown in SharePoint. The SharePoint ellipsis menu for the file with the 'Version history' option

You will see all versions of the file listed, including the ability to restore back to any point in time.

three versions of the file shown with modification metadata: time, editor, and size

So how could this benefit your team right away? Simply save your PBIX files into a Teams folder, and you are automatically getting versions created. You can even synchronize this file locally so it feels the same as saving your PBIX anywhere else, and that empowers you to work offline. You get at least the partial benefits of version tracking with no additional work.

💡
If you do nothing else for Power BI Version Control, at least synchronize a Teams folder locally and save your PBIX files in that folder while you work on them. This will at least give you a set-it-and-forget-it method to return to a point in time.

Without a text serialization, diffing two versions is difficult. And merging would consist of manual edits to a file. This is the difference between a version history and a proper version control system.

This setup is far from perfect, but it's also way better than nothing!

What if we don't use Teams or SharePoint?


If your company doesn't use Teams or Microsoft 365 at all, then we would suggest you explore the "Default" pattern above, but using your cloud storage provider of choice. A similar pattern should be possible with Google Drive or AWS storage or Dropbox. You will want to keep in mind, however, that PBIX files can be large, potentially consuming a large amount of cloud storage.
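For a rough feel of the storage cost, here is a back-of-the-envelope sketch in Python. It assumes the worst case, where every retained version is stored as a full copy; your provider may store versions more efficiently, so treat the result as an upper bound. The function name, the `keep_last` parameter, and the file sizes are all hypothetical.

```python
from typing import Optional, Sequence


def estimate_history_storage_mb(
    version_sizes_mb: Sequence[float], keep_last: Optional[int] = None
) -> float:
    """Sum the size of every retained version, assuming each is stored in full."""
    if keep_last is None:
        retained = list(version_sizes_mb)
    else:
        if keep_last < 0:
            raise ValueError("keep_last must be zero or greater")
        start = max(0, len(version_sizes_mb) - keep_last)
        retained = list(version_sizes_mb[start:])
    return float(sum(retained))


if __name__ == "__main__":
    sizes = [150.0] * 20  # twenty saved versions of a 150 MB PBIX
    print(estimate_history_storage_mb(sizes))               # every version kept
    print(estimate_history_storage_mb(sizes, keep_last=5))  # only the latest five kept
```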

But check back with us often – we will keep you up to date on your best operational options for Power BI.

Why Git and how do I get started with it?

This is a sidebar for those who are new to Git. Skip over this if you already know about Git and how to use it.

Git is the industry standard version control system for code and programming projects – primarily collections of many small (kilobytes large, not megabytes large) files – and arguably the best tool of its kind that humans have yet invented. Git controls versions, and utilizes decades-old, battle-tested applications for diffing and merging text files. Git can use third-party tools to provide diffing and merging capabilities, but no such tool exists for Power BI solutions. This is why the PBIP text serialization is critical for version control and modern development workflows.

Git is ideal for any person or team who needs to do version control, branching, and merging with text files, whether that team is working as a centralized group in an organization or as a loosely organized, decentralized team who might not even know each other. Git was written by Linus Torvalds, creator of the Linux kernel, and is one of the key technologies that enables the open source movement. It is also a linchpin to the rapid technological innovation in the 21st century.

That said, learning Git is a safe investment. It's not going anywhere. The technology is completely free and can be used in most operating systems including Windows and macOS. Like many free technologies, you can optionally pay for online services to host it for you and provide value-added features; that's where services like GitHub and Azure DevOps enter the picture. But Git itself costs nothing.

While originally made with software developers in mind, it is a power tool for anyone with text files that need to be carefully managed and kept in harmony with each other. (Think writers, attorneys, contractors, designers, etc.)

Learning the basics about Git will have immediate payoffs for your organization and your own career, so we do recommend starting with just these topics:

  1. What's a Git repo?
  2. What's a Git commit and how do I make one?
  3. What's a Git branch and how do I make one?
  4. What's a Git merge and how do I make one?

After you learn these concepts, you can then create your own repos locally and play with Git in a safe sandbox where you make branches, commits and merges until you feel comfortable with them. And you don't need to worry about breaking anything - the whole point of Git is never to lose past versions so just go have fun with it and you'll master it in no time!
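Here is one possible sandbox session, written as a short Python script so you can run it over and over. It is a sketch, not the only way to practice: it assumes Git is installed and on your PATH, it works in a throwaway temporary folder, and the file name `notes.txt`, the branch names `main` and `feature`, and the demo identity are all made up. It walks through the four topics above in order.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its standard output."""
    command = [
        "git",
        "-c", "user.name=Sandbox User",          # throwaway identity for the demo
        "-c", "user.email=sandbox@example.com",
        "-c", "commit.gpgsign=false",            # avoid needing a signing key
        *args,
    ]
    result = subprocess.run(command, cwd=repo, check=True, capture_output=True, text=True)
    return result.stdout


def sandbox_walkthrough() -> Optional[str]:
    """Create a repo, commit, branch, and merge. Returns the merged file content."""
    if shutil.which("git") is None:
        return None  # Git is not installed, so there is nothing to demonstrate
    with tempfile.TemporaryDirectory() as folder:
        repo = Path(folder)
        notes = repo / "notes.txt"
        git(repo, "init")                               # 1. create a repo
        notes.write_text("version 1\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-m", "First commit")       # 2. make a commit
        git(repo, "branch", "-M", "main")               # use a predictable branch name
        git(repo, "checkout", "-b", "feature")          # 3. make a branch
        notes.write_text("version 2\n")
        git(repo, "commit", "-am", "Change on feature branch")
        git(repo, "checkout", "main")
        git(repo, "merge", "feature")                   # 4. merge it back into main
        return notes.read_text()


if __name__ == "__main__":
    print(sandbox_walkthrough())
```

Because everything happens in a temporary folder that is deleted at the end, you can change the steps and rerun it without any risk to your real files.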

You can Git smart! Git has a lot of features, but don't be overwhelmed by everything you can do with it, just focus on the basics.

This article isn't really about Git (though based on how much I wrote here, seems like we need one!). The resources below are great for a beginner.

Conclusion: Take action on Power BI Version Control

Here's our final decision flow diagram summarized:

Our options keep expanding, so there's no excuse for having no version control at all on your Power BI files. If they are worth building, they are worth protecting!

If you have no interest in Git, then start by simply saving your PBIX files in a synchronized folder on SharePoint.
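The three decision points in this article can also be written down as a small Python function. This is my paraphrase of the flow, not an official tool; the function name, the parameter names, and the wording of each recommendation are made up for the sketch.

```python
def recommend(
    wants_git: bool,
    needs_parallel_work: bool,
    needs_check_in_comments: bool,
    uses_teams_or_sharepoint: bool,
) -> str:
    """Walk the article's decision points in order and return the suggested option."""
    # Decision Points #1 and #2: parallel work sends you back to Git.
    if wants_git or needs_parallel_work:
        return "PBIP in Git (preview), plus a Git-backed workspace in a second repo"
    # From here on, the options rely on Teams backed by SharePoint.
    if not uses_teams_or_sharepoint:
        return "Default pattern on your cloud storage provider of choice"
    # Decision Point #3: do you need check-in comments?
    if needs_check_in_comments:
        return "SharePoint check-out and check-in with comments"
    return "Automatic version history in a synced Teams folder"


if __name__ == "__main__":
    # A team that works in series, wants comments, and already uses Teams.
    print(recommend(False, False, True, True))
```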

I recently presented on this topic at our local user group, and you can find the recording of that here.

Not sure if you are aware of everything going on in your Power BI tenant? Check out whole-tenant monitoring provided by Argus PBI.

Do you need to talk to someone about your Power BI strategy? You can contact me at brent@firstlightanalytics.com.
