Note: the Dataverse metadata is now available out of the box and so this approach is no longer relevant. I have an updated post here
On November 23 Microsft announced that DES (Data Export Services) will be deprecated and will reach end of life in November 2022. Microsoft is recommending that customers use Azure Synapse Link for Dataverse instead. DES has been around for a number of years and generally works well pushing Dataverse data to SQL Azure or to on-premise SQL Server using a data gateway. Azure Synapse is a collection of Azure cloud-based data services typically used by larger organizations for Data Lake and warehousing purposes. Azure Synapse is massive and I just starting to learn about it. As I learn more I will post any tips I think are useful. This first post covers option sets an approach I took to get their values into Azure Synapse.
The options sets values are actually available in the Data Lake. They are stored in a Json file which can be queried, covered in a later post, however, this approach makes them available as an external table along with the other Dataverse tables. You can then run SQL queries via your serverless SQL endpoint joining to the options set table as required. I took this approach as you cannot directly create a view or external table on a Serverless SQL pool.
To configure Azure Synapse Link for Dataverse you will need to have an Azure subscription and have created a Synapse Workspace with Azure Data Lake Gen 2 for storage. Once this is in place you can follow these instructions to configure Azure Synapse Link. You can then configure which Dataverse tables you want to synchronize to your Synapse data lake.
The Dataverse tables are synched as CSV files into your Synapse data lake. If you navigate to the containers in your data lake storage account you will see a container with a folder for each of the entities you are synching.
Within the folder, you will see a series of CSV files each containing a set of records that have been synchronised across.
One of the services a Synapse workspace comes with is a SQL Serverless pool. A serverless pool allows you to query your data lake files using SQL even though the underlying data is stored in CSV files. You can get your Serverless SQL endpoint from the Synapse Workspaceoverview page.
You can then this endpoint in SQL Server Management Studio to connect to your serverless pool and run your queries.
I found the synchronization to be remarkably fast and almost real-time. I wrote some code to create 1,000 accounts records and within a few seconds after they were all visible in Synapse.
What I did notice with Azure Synapse Link is that the stringmap table cannot be synched to Synapse. The stringmap table is a system table in Dataverse that contains the details of the option set fields. Without this you can only return the integer value of any option set value rather than the associated display text.
To overcome this I decided to create my own table to hold the option set values. Once I populate this with the option set data I can sync this to Synapse along with my other tables. I called my table Option Set Export and added custom fields to hold the options set values.
You need to enable Track Changes for this entity otherwise it is cannot be added to the tables to sync to Synapse
Initially, I thought about using a Dataflow to push the stringmap data into my new table however the stringmap table is not available in data flows. I used a Power Automate Flow instead. I used FetchXml in the flow to get the stringmap records otherwise the objecttypecodename field is not returned. The objecttypecodename contains the optionset entity name.
When the option set export table is synched to Synapse you can join to this table to get the text values of your option sets.
I hope you found this post interesting. As I mentioned I am new to Synapse so there may be better ways of doing this. It is worth noting that any tools that can query SQL Server can query the Synapse Serverless SQL endpoint including Power BI.