Dynamic Partitioning in Azure Analysis Services (Tabular)

2023-10-18

Objective

The real-life requirement

Disclaimer: I assume, dear Reader, that you are more than familiar with the general concepts of partitioning and star schema modeling. The intended audience is people who used to be called BI developers in the past (with a good amount of experience), but who have all sorts of different titles nowadays that I can’t keep up with… I won’t provide a full Visual Studio solution that you can download and just run without any changes or configuration, but I will give you code that can be used after parameterizing it according to your own environment.

So, with that out of the way, let’s start with some nostalgia: who wouldn’t remember all the nice and challenging partitioning exercises for OLAP cubes? 🙂 If you had a huge fact table with hundreds of millions of rows, a full process of the measure group every time was inefficient at best and, more often, out of the question.

In this example, I have a fact table with 500M+ rows that is updated hourly, and I created monthly partitions. It is a neat solution and the actual processing takes about 3-4 minutes every hour, mostly because of some big degenerate dimensions that I couldn’t push out of scope. The actual measure group processing usually takes 1-2 minutes and mostly involves 1-3 partitions.

I know OLAP is not dead (so it is said) but it is not really alive either. One thing is for sure: it is not available as PaaS (Platform as a Service) in Azure. So, if you want SSAS in the cloud, that’s tabular. I assume migration/redesign from on-premises OLAP cubes to Azure tabular models is not uncommon. In the case of a huge table with an implemented partitioning solution, the partitioning should be ported as well.

Where Visual Studio provided a decent GUI for partitioning in the OLAP world, that’s not the case for tabular. It feels like a beta development environment that has been mostly abandoned because the focus has shifted to other products (guesses are welcome; I’d say it’s Power BI, but I often find the Microsoft roadmap confusing, especially with how intensely Azure is expanding and gaining an ever-growing chunk of Microsoft’s income).

In short: let’s move that dynamic partitioning solution from OLAP into Azure Tabular!

Goal

The partitioning solution should accommodate the following requirements:

  • Handle different granularities for partitions (monthly, annual, …) based on a configuration table
  • Identify currently existing partitions
  • Create a list of required partitions (this is mostly used at initialization)
  • Compare existing and required partitions: create / delete if needed
  • Based on the new set of data (e.g. in a staging table), update/process the relevant partitions
  • Keep logs of what should happen and what actually happens

The process of Dynamic Partitioning

Used technology

My solution consists of the below components:

  • SQL Server tables and stored procedures
  • SSIS to orchestrate the process
    1. C# scripts inside SSIS utilizing TOM (Tabular Object Model) – used in this solution

       No, the second one is not Jerry 🙂 I am not sure the two methods would get on well in that cat-mouse relationship…

    2. PowerShell with Tabular Model Scripting Language (TMSL), where the objects are defined using JSON format
  • Azure Analysis Services (secure connections to it)

Let’s get to it, going through the steps from the diagram one-by-one!

Dynamic Tabular Partitioning

Overview

The below objects are used in the solution.

Object name | Type | Functionality
ETL_Tabular_Partition_Config | Table | Stores metadata for partitions that is used when defining the new ones
ETL_Tabular_Partition_Grain_Mapping | Table | A simple mapping table between conceptual partition periods (e.g. Fiscal Month) and the corresponding Dim_Date column (e.g. Fiscal_Month_Code); this allows partitioning periods to be tuned dynamically
Dim_Date | Table | A fairly standard, pre-populated date table
ETL_Tabular_Partitions_Required | Table | The master list of changes for partitions, including everything that needs to be created / deleted / processed (updated)
pr_InsertTabularPartitionsRequired | Stored procedure | The heart of the SQL side of dynamic partitioning (details below)
ETL_Tabular_Partitions_Existing | Table | A simple list of partitions that currently exist in the deployed database
pr_InsertTabularPartitionsExisting | Stored procedure | A simple procedure that inserts a row into ETL_Tabular_Partitions_Existing; it is called from a C# enumerator that loops through the existing partitions of the tabular database
Tabular_Partition.dtsx | SSIS package | Orchestrates the different components of the project. In this 1st step the pr_InsertTabularPartitionsRequired stored procedure is called

Date configuration

For the date configuration, I use the ETL_Tabular_Partition_Config, the ETL_Tabular_Partition_Grain_Mapping and the Dim_Date tables. A simplified version for demo purposes:

  • ETL_Tabular_Partition_Config
    • This contains one row for each table in a tabular database, assuming that only one partitioning period is required (e.g. monthly); this can be further enhanced if needed
    • Tabular_Database_Name and Table_Name are used to identify the objects on the server
    • Partition_Name_Prefix is used in the naming of the partitions (e.g. Internet Sales – FY)
    • Source_Object – fact table/view (used as the source of the tabular table)
    • The column in the source object that is the basis of partitioning (e.g. Transaction_Date_SK)
    • Stage_Table / Stage_Column – these are needed to identify the date range of the incremental dataset that is waiting to be pushed into the fact object from staging
    • Partition_Grain – the key piece in this exercise, defining the periodicity of the partitioning process (e.g. Fiscal Month)
    • Partition_Start_Date_SK / Partition_End_Date_SK – used as parameters to calculate the boundaries for the list of partitions
    • Partition_Start_Date / Partition_End_Date – calculated columns in the table; when it comes to dates, it often helps to have them both as surrogate keys and in date format (the SK integer values are always unambiguous, whereas the date values can be used in T-SQL date functions such as EOMONTH or DATEADD if needed)
  • ETL_Tabular_Partition_Grain_Mapping
    • Defines how your conceptual partition grain (e.g. Fiscal Month) is mapped to Dim_Date, i.e. which column in Dim_Date represents it
    • Partition_Grain is often a user-friendly name, whereas Partitioning_Dim_Date_Column is more technical
    • E.g. Fiscal Month and Fiscal_Month_of_Year_Code
  • Dim_Date
    • If you have a data warehouse of any type, most likely it already contains a date table, so why not use what’s already there? Especially since it can contain a lot of logic around your own company’s fiscal structure.
    • You could create a temp table on the fly or use built-in T-SQL date functions, but that adds unnecessary complexity to the procedure without any benefit
    • I included some sample values in the table diagram (usually such examples don’t belong in an entity-relationship diagram) to show what I mean by them
    • The relationship to Dim_Date is defined using dynamic T-SQL in the stored procedure (see next section)

Check Existing Partitions – script task

TOM – Tabular Object Model

I chose C# as this script’s language, and the TOM (Tabular Object Model) objects are required to interact with tabular servers and their objects. To use them, some additional references are needed on the server; these are part of the SQL Server 2016 Feature Pack (if you use ADF and SSIS IR in the cloud, they are already available according to the Microsoft ADF team). You can find more info about how to install them here:

Install, distribute, and reference the Tabular Object Model

The official Microsoft reference documentation for TOM can also be very handy:

Understanding Tabular Object Model (TOM) in Analysis Services AMO

The part that is related specifically to partitions:

Create Tables, Partitions, and Columns in a Tabular model

Variables

The below variables need to be passed from the package to the script:

  • Audit_Key (I use this everywhere, so each execution is logged separately)
  • Tabular_Database_Name
  • Tabular_Table_Name
  • ConnStr_Configuration_Database – needed for the stored procedure that writes the results of the TOM script into a SQL table
  • The connection string for the tabular server (it should work with an on-prem tabular server as well); multiple forms of authentication can be used (service principal with username/password in a connection string that is obtained from a key vault at runtime, Managed Service Identity, …); this is a quickly evolving area of Azure

Make sure you include the above variables, so they can be used in the script later on:

The syntax for referencing them (as it’s not that obvious) is documented here:

Using Variables in the Script Task

Main functionality

The script itself does nothing but loop through all existing partitions and call a stored procedure row-by-row that inserts the details of each partition into a SQL table.

  • Identify existing partitions.TOM.cs (I removed all comments that SSIS puts there by default; please make sure you don’t just copy and paste it, as the namespace ST_… (line 12) is different for your script task!)
  • pr_InsertTabularPartitionsExisting.sql
    • ETL_Tabular_Partitions_Existing.sql
    • ETL_Tabular_Partitions_Planned.sql
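
As an illustration of the loop described above, here is a minimal sketch. It is not the author’s script. It assumes the TOM client library (Microsoft.AnalysisServices.Tabular) and System.Data.SqlClient are referenced. The `dbo` schema, the stored procedure parameter names (`@Audit_Key`, `@Tabular_Database_Name`, `@Table_Name`, `@Partition_Name`) and the class and method names are hypothetical and must be matched to your own procedure definition.

```csharp
using System.Data;
using System.Data.SqlClient;
using Microsoft.AnalysisServices.Tabular;

public static class ExistingPartitionLogger
{
    // Calls the stored procedure once for every partition that currently
    // exists on the given table of the deployed tabular database.
    public static void LogExistingPartitions(
        string tabularConnStr, string configDbConnStr,
        string databaseName, string tableName, int auditKey)
    {
        var server = new Server();
        server.Connect(tabularConnStr);
        try
        {
            Database db = server.Databases.GetByName(databaseName);
            Table table = db.Model.Tables[tableName];

            using (var conn = new SqlConnection(configDbConnStr))
            {
                conn.Open();
                foreach (Partition partition in table.Partitions)
                {
                    using (var cmd = new SqlCommand("dbo.pr_InsertTabularPartitionsExisting", conn))
                    {
                        cmd.CommandType = CommandType.StoredProcedure;
                        // Hypothetical parameter names: adjust to your procedure.
                        cmd.Parameters.AddWithValue("@Audit_Key", auditKey);
                        cmd.Parameters.AddWithValue("@Tabular_Database_Name", databaseName);
                        cmd.Parameters.AddWithValue("@Table_Name", tableName);
                        cmd.Parameters.AddWithValue("@Partition_Name", partition.Name);
                        cmd.ExecuteNonQuery();
                    }
                }
            }
        }
        finally
        {
            server.Disconnect();
        }
    }
}
```

Inside an SSIS Script Task, the arguments would come from the package variables listed earlier rather than from method parameters.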

SQL to sort out what to do with partitions

All this logic is coded into pr_InsertTabularPartitionsRequired (feel free to use a better name if you dislike this one) and at a high level it does the following:

Grey means T-SQL, white is C# (see the previous section), dark grey is putting everything together.

Here is the code of my procedure. It works assuming you have the three tables defined previously and you have configured the values according to your databases / tables / columns.

pr_InsertTabularPartitionsRequired.sql

It is mostly self-explanatory, and the inline comments can guide you as well. Some additional comments:

  • Dynamic SQL must be used because the Dim_Date column cannot be hardcoded: it is part of the query that extracts the list of date periods from Dim_Date
  • The CREATE TABLE is defined outside the dynamic SQL; otherwise the temp table’s scope would be limited to the execution of that dynamic code, after which it is dropped and so not usable later in the procedure’s session
  • I use EXECUTE sp_executesql @sql_string instead of EXEC (@sql_string) as best practice. In this case both perform satisfactorily due to the low volume of data, but while there are good reasons to use EXECUTE sp_executesql, EXEC doesn’t really have any advantage apart from being quicker to type
  • cte_partition_config – simply extracts the config data for the required tabular database and table
  • cte_partition_required – the cross join is used to create as many rows with the config values as there are partitions needed, using the actual date-related information in the partition names. A simple example: for monthly partitions in 2018, all the metadata for the tabular table needs to be read only once but the month values / descriptions are needed 12 times
  • Populate ETL_Tabular_Partitions_Required – a straightforward comparison between existing (see section 2) and required (or planned) partitions using some set theory
    • If a partition doesn’t exist but is required => CREATE it
    • If it is not required but exists => DELETE it
    • If it’s both required and existing then (guess what?) it EXISTS 🙂 so at this point it can be left alone

Additionally, a WHERE clause for each partition is defined which can be used later when it is time to actually create them.
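
The comparison and the per-partition WHERE clause can be sketched outside of T-SQL as well. The snippet below is only an illustration of the set logic, not the stored procedure itself. It assumes calendar-month partitions, a partition naming format of "prefix - yyyy-MM" and integer date keys in yyyyMMdd form. All three are assumptions for the example, as are the class and method names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionPlanner
{
    // Builds the required partition names for every month of one year,
    // e.g. "Internet Sales - 2018-01" ... "Internet Sales - 2018-12".
    // The naming format is an assumption for this sketch.
    public static List<string> RequiredMonthlyPartitions(string prefix, int year)
    {
        return Enumerable.Range(1, 12)
            .Select(month => $"{prefix} - {year}-{month:D2}")
            .ToList();
    }

    // Compares existing and required partition names and assigns an action flag.
    public static Dictionary<string, string> PlanActions(
        IEnumerable<string> existing, IEnumerable<string> required)
    {
        var existingSet = new HashSet<string>(existing);
        var requiredSet = new HashSet<string>(required);
        var plan = new Dictionary<string, string>();

        foreach (string name in requiredSet.Except(existingSet)) plan[name] = "CREATE";
        foreach (string name in existingSet.Except(requiredSet)) plan[name] = "DELETE";
        foreach (string name in existingSet.Intersect(requiredSet)) plan[name] = "EXISTS";
        return plan;
    }

    // Builds the WHERE clause for one calendar month, assuming yyyyMMdd integer keys.
    public static string MonthlyWhereClause(string dateColumn, int year, int month)
    {
        int first = year * 10000 + month * 100 + 1;
        int last = year * 10000 + month * 100 + DateTime.DaysInMonth(year, month);
        return $"WHERE {dateColumn} BETWEEN {first} AND {last}";
    }
}
```

Note that with this naive comparison any partition that is not in the required list is flagged for deletion, so a helper partition that must always stay in the model would have to be excluded from the existing list first.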


  • List_Staging_Periods – this last step uses the config data to check which dates (usually that is the lowest level) exist in the data that was staged and loaded into the final fact table/view, to know which partitions need an update. E.g. if your incremental dataset has data only for the last 2 days and you are in the middle of the month, you only need to process the partition for the current month and can leave the others as they are
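
To make the idea of this last step concrete, here is a small sketch that reduces the date keys found in staging to the distinct periods whose partitions need processing. It assumes yyyyMMdd integer keys and calendar-month partitions identified by a yyyyMM code. In the actual solution the mapping goes through Dim_Date and the configured grain, so treat this only as a simplified stand-in, with hypothetical class and method names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class StagingPeriods
{
    // Maps yyyyMMdd date keys found in the staged data to the distinct
    // yyyyMM period codes of the partitions that need to be processed.
    public static List<int> PeriodsToProcess(IEnumerable<int> stagedDateKeys)
    {
        return stagedDateKeys
            .Select(key => key / 100)   // drop the day part: 20180614 -> 201806
            .Distinct()
            .OrderBy(period => period)
            .ToList();
    }
}
```

With two days of mid-month data, the result contains a single period. Data that spans a month boundary yields two periods, so two partitions would be processed.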

Create new partitions / drop the ones not needed / Process

Again, back to the C# realm.

Code Confusion

One particular inconsistency caught me out: I had to spend half an hour figuring out why removing a partition has a different syntax than processing one. It might be totally straightforward for people with a .NET background, but it is different from how T-SQL conceptually works.

Tabular_Table.Partitions.Remove(Convert.ToString(Partition["Partition_Name"]));
Tabular_Table.Partitions[Convert.ToString(Partition["Partition_Name"])].RequestRefresh(RefreshType.Full);

Conceptually

  • Deletion – Collection.Action(Member of Collection)
  • Process – Collection(Member of Collection).Action(ActionType)

Source query for new partitions

How do we assign the right query to each partition? Yes, we have the WHERE conditions in the ETL_Tabular_Partitions_Required table, which hold the date filtering that ensures there are no overlapping partitions, but the other part of the query is missing. For that I use a trick (I am sure you can think of other ways, but I found this one easy to implement and maintain): I have a pattern partition in the solution itself, under source control. It has to be in line with the up-to-date view/table definitions, otherwise the solution can’t be deployed as the query would be incorrect. I just need to make sure it always stays empty. For that a WHERE condition like 1=2 is sufficient (as long as the basic laws of arithmetic don’t change). Its naming is “table name – pattern”.

Then I look for that partition (see the details in the code at the end of the section), extract its source query and strip off the WHERE condition. When looping through the new partitions, I just append the WHERE clause from the ETL_Tabular_Partitions_Required table.

using Microsoft.AnalysisServices.Tabular;

string Tabular_Database_Name = "your database name";
string Tabular_Table_Name = "your table name";
string Tabular_Partition_Pattern_Name = Tabular_Table_Name + " - pattern";
// connect to the tabular model
var Tabular_Server = new Server();
string Tabular_ConnStr = "your connection string";
Tabular_Server.Connect(Tabular_ConnStr);
Database Tabular_Db = Tabular_Server.Databases[Tabular_Database_Name];
Model Tabular_Model = Tabular_Db.Model;
Table Tabular_Table = Tabular_Model.Tables[Tabular_Table_Name];
// Find returns null if no partition with this name exists
Partition Pattern_Partition = Tabular_Table.Partitions.Find(Tabular_Partition_Pattern_Name);

Note: I use SQL queries, not M ones, in my source, but here’s the code that helps you get both types from the tabular database’s partition using .NET once you have identified the proper partition that contains the pattern:

For SQL

// SQL partitions expose their query text through QueryPartitionSource.Query
string Partition_Pattern_Query_SQL =
((Microsoft.AnalysisServices.Tabular.QueryPartitionSource)
Pattern_Partition.Source).Query;

For M

// M partitions use MPartitionSource, whose text is in the Expression property
string Partition_Pattern_Query_M =
((Microsoft.AnalysisServices.Tabular.MPartitionSource)
Pattern_Partition.Source).Expression;
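
Once the pattern query text is available, stripping its WHERE condition and appending the stored one is plain string handling. A minimal sketch follows, assuming the pattern query ends with a single always-false WHERE clause (such as WHERE 1=2) and that the stored clause already starts with the WHERE keyword. The class name, method name and sample table name are hypothetical.

```csharp
using System;

public static class PartitionQueryBuilder
{
    // Replaces the trailing WHERE clause of the pattern query with the
    // partition-specific WHERE clause.
    public static string Build(string patternQuery, string whereClause)
    {
        int whereIndex = patternQuery.LastIndexOf("WHERE", StringComparison.OrdinalIgnoreCase);
        if (whereIndex < 0)
        {
            throw new ArgumentException("Pattern query has no WHERE clause to strip.", nameof(patternQuery));
        }

        // Keep everything before the last WHERE, then append the new clause.
        string selectPart = patternQuery.Substring(0, whereIndex).TrimEnd();
        return selectPart + " " + whereClause.Trim();
    }
}
```

Searching for the last WHERE keeps the sketch simple. If your pattern query has anything after its WHERE clause (ORDER BY, query hints), this approach would need refining.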

Script steps

Now that I have the first half of the SQL query, I have all the building blocks for this last step of the partitioning process:

  1. Read the contents of the ETL_Tabular_Partitions_Required table to extract the information about all the partitions that need a change. It stores the action flag that identifies what needs to be done with each of the partitions, too.
  2. Loop through all of these partitions and switch between these three options:
    1. Create AND process any partition (with the proper SQL query behind it) that’s needed but does not exist yet
    2. Delete partitions that are not needed anymore
    3. Process the ones that already exist but have incoming new data

Don’t forget that after the loop the tabular model must be saved, and that is when all the previously issued commands are actually executed at the same time:

Tabular_Model.SaveChanges();
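
The three options and the final save can be sketched as follows. This is an illustration under assumptions, not the author’s script. It requires the TOM client library and a connected model. The action flag values ("CREATE", "DELETE", "PROCESS") and the PartitionChange type are hypothetical, and the new partitions are assumed to reuse the data source of the SQL-based pattern partition.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.AnalysisServices.Tabular;

public static class PartitionActions
{
    // One row of the change list (hypothetical shape).
    public sealed class PartitionChange
    {
        public string Name { get; set; }
        public string Action { get; set; }  // "CREATE", "DELETE" or "PROCESS"
        public string Query { get; set; }   // full source query, needed for CREATE only
    }

    public static void Apply(Model model, Table table, Partition patternPartition,
                             IEnumerable<PartitionChange> changes)
    {
        // Assumes the pattern partition is SQL-based (QueryPartitionSource).
        var patternSource = (QueryPartitionSource)patternPartition.Source;

        foreach (PartitionChange change in changes)
        {
            switch (change.Action)
            {
                case "CREATE":
                    var partition = new Partition
                    {
                        Name = change.Name,
                        Source = new QueryPartitionSource
                        {
                            DataSource = patternSource.DataSource,
                            Query = change.Query
                        }
                    };
                    table.Partitions.Add(partition);
                    partition.RequestRefresh(RefreshType.Full);
                    break;
                case "DELETE":
                    table.Partitions.Remove(change.Name);
                    break;
                case "PROCESS":
                    table.Partitions[change.Name].RequestRefresh(RefreshType.Full);
                    break;
                default:
                    throw new InvalidOperationException("Unknown action: " + change.Action);
            }
        }

        // Nothing is sent to the server until the model is saved.
        model.SaveChanges();
    }
}
```

Throwing on an unknown flag is a design choice for the sketch: it surfaces configuration mistakes before anything is saved to the server.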

The code bits that you can customize to use in your own environment:

PartitionActions.TOM.cs

Wrap up

So, by now you should have an understanding of how partitioning works in tabular Azure Analysis Services: not just how the processing can be automated, but also how the creation / removal of partitions can be driven by configuration data (instead of just defining all the partitions beforehand, e.g. for all months until 2030).

The scripts – as I said at the beginning – cannot be used just as they are, due to the complexity of the Azure environment and because the solution includes more than just a bunch of SQL tables and queries: .NET scripts and Azure Analysis Services as well.

I aimed to use generic and descriptive variable and column names, but it could easily happen that I missed the explanation of something that became obvious to me during the development of this solution. In that case please feel free to get in touch with me using the comments section or by sending an email to mi_technical@vivaldi.net

Thanks for reading!

Source: https://www.sqlshack.com/dynamic-partitioning-in-azure-analysis-services-tabular/
