Getting Started with Azure Data Catalog

2023-10-18

This article talks about Azure Data Catalog and how data professionals can use it to locate, understand and consume data sources.


As the name suggests, Azure Data Catalog is a service in Azure that helps users register, organize and discover data sources. This fully managed cloud service acts as a central, shared place in an organization where developers, analysts, data scientists and other users can contribute their knowledge and locate, understand and consume data.

Data Catalog in Azure does not move your data; the data remains in its existing location. A copy of its structural and descriptive metadata is added to the Data Catalog, along with a reference to the data-source location. This metadata is indexed, which makes each data source easy to search for.
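To make the "metadata only" idea concrete, here is a minimal sketch of what a catalog entry could hold. It is a toy model, not the service's actual schema: the class names, the `register` helper, the server name and the two columns are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str


@dataclass
class CatalogEntry:
    """What a catalog keeps for one registered asset: metadata only, no rows."""
    name: str                      # structural metadata: object name
    source_type: str               # kind of asset, e.g. "SQL Server Table"
    location: str                  # reference back to where the data lives
    columns: list[Column] = field(default_factory=list)  # structural metadata
    description: str = ""          # descriptive metadata, added by users


def register(table_name, server, database, schema, columns):
    """Build an entry from a table's schema; the table's rows are never read."""
    return CatalogEntry(
        name=table_name,
        source_type="SQL Server Table",
        location=f"{server}/{database}/{schema}.{table_name}",
        columns=[Column(n, t) for n, t in columns],
    )


entry = register(
    "SalesOrderHeader",
    server="example-server.database.windows.net",  # hypothetical server name
    database="mysqldb",
    schema="SalesLT",
    columns=[("SalesOrderID", "int"), ("OrderDate", "datetime")],  # illustrative
)
print(entry.location)
```

The entry carries a pointer to the source plus a description of its shape, so deleting or searching the entry never touches the underlying table.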

Why do we need an Azure Data Catalog?

  • Companies generate and store large volumes of data every day, and as this data grows, discovering data sources becomes challenging for both data producers and data consumers
  • Creating and maintaining documentation for large data sources becomes highly complex and time-consuming
  • Every organization holds a certain amount of tribal knowledge (information that is known only within the company), and it is challenging for a newcomer to track all of it down. Azure Data Catalog addresses this by providing a platform for capturing information about the data, which makes data sources easier to discover and understand
  • With Data Catalog, developers no longer have to spend time hunting for data using complex queries

Azure Data Catalog process involves:

Below are the steps that are usually followed when working with Data Catalog:

  1. Create a data catalog: provisioning a Data Catalog is the first step
  2. Register and annotate assets: users can register their data sources and annotate them with tags, documentation and understandable descriptions
  3. Discover and consume assets: users can easily search and filter assets using the indexed metadata
  4. Connect to data: this lets you connect to the data and pull it into tools such as Excel, Power BI and SSDT

Important points to remember while working with Azure Data Catalog

To set up a Data Catalog, you must be the owner or co-owner of an Azure subscription.

Only one Data Catalog is supported per organization (i.e. per tenant); you cannot have additional catalogs even if you have multiple subscriptions.

Data Catalog supports only work or school accounts, so you need one of these to create a data catalog in Azure.

Without further delay, let’s see Azure Data Catalog in action.

This article assumes that you have basic knowledge of Azure, are familiar with Azure SQL Database and have an Azure subscription.

How to create an Azure Data Catalog?

You can create a Data Catalog like any other Azure resource through the Azure portal. Go to the portal, search for Data Catalog, and enter a name for your data catalog. You will also have to specify the subscription name, the location for the catalog, and the pricing tier (Free or Standard edition). Then select Create. Finally, go to the Azure Data Catalog home page and select Publish Data.

Create a Data Catalog

Alternatively, you can go to the Azure Data Catalog provisioning page and enter the Data Catalog Name, the subscription you want to use, and the location for the catalog, as shown below.

Provisioning Data Catalog in Azure.

Scroll down a little to select the Pricing option; the service is offered in two editions. For this demo, I am selecting the FREE EDITION.

Pricing in Data Catalog in Azure.

I am keeping the defaults for the categories below; your ID is automatically added as a catalog user and a catalog administrator. You can add further catalog users and catalog administrators to the catalog. Finally, click Create Catalog to create a Data Catalog named OurSalesData in Azure.

Data Catalog settings in Azure.

The Data Catalog is successfully created, and you can view it in the Azure portal as shown below. The resource group DataCatalogs-EastUS is created automatically, and the catalog resides in it. You may also notice that I already have SQL Server and SQL database resources in my account.

Data Catalog created in the Portal.

Click on the Data Catalog to view its properties, which you can also edit.

Data Catalog properties and overview.

Launch the desktop application to register your data sources in Azure Data Catalog

Back on the Data Catalog page, after you click the Create Catalog button above, you are taken to the screen below.

Data Catalog home page.

There are two options for registering (publishing) your data sources in the Data Catalog: Launch Application and Create Manual Entry. I personally do not prefer the “Create Manual Entry” option, as it would be challenging and time-consuming for larger data sources. It is better to go with the “Launch Application” option, as it is just a ClickOnce application.

Install this application:


Installing the application.

Once this application is successfully installed, you are taken to the Sign in page. Sign in using the same credentials that you used to access the catalog in the portal.

Data Catalog Sign in page.

Selecting a data source

Let’s head over to select a data source in order to register it in your Data Catalog.


You can register many kinds of data sources in the Data Catalog, such as SQL Server, Reporting Services, HDFS, Hive, HANA database and Azure Data Lake Analytics, as shown below. Since I already have a SQL database in my account, I will go with SQL Server as the data source. Click on SQL Server and select NEXT.

Selecting data source in the Data Catalog.

Provide the SQL Server name, the authentication type, and the database that you want to register (mysqldb, in this case), then click CONNECT.

SQL Server details.
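The three inputs on this screen (server, authentication type, database) are the same pieces that any SQL Server client needs. As a rough illustration only, the sketch below assembles them into an ODBC-style connection string. The function name, the driver name and the placeholder server and credentials are assumptions, and the registration tool may well connect differently under the hood.

```python
def build_connection_string(server, database, user=None, password=None):
    """Assemble an ODBC-style connection string for SQL Server.

    With a user and password, SQL Server authentication is used;
    without them, Windows (integrated) authentication is requested.
    """
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",  # assumed driver name
        f"Server={server}",
        f"Database={database}",
    ]
    if user is not None:
        parts += [f"Uid={user}", f"Pwd={password}"]
    else:
        parts.append("Trusted_Connection=yes")
    return ";".join(parts) + ";"


# Hypothetical server and credentials; mysqldb is the database used in this demo.
conn_str = build_connection_string(
    "example-server.database.windows.net", "mysqldb", "demo_user", "demo_password"
)
print(conn_str)
```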

Register a data source in Azure Data Catalog

Expand your database and select SalesLT; all the objects that you can register in your data catalog are listed under Available objects. I have selected all of them using the double right arrow (>>). Also, check the Include Preview option so that you can preview sample data later.

Registering data sources in the Data Catalog.

The objects are now registered, and you can register more using the ‘register more objects’ option. For now, let’s click on VIEW PORTAL to discover our data.

Selected objects are registered and can be viewed on the portal.

How to discover and annotate data sources in an Azure Data Catalog

Suppose that we want to look for information related to orders in the database. Type ‘order’ in the search bar, and you will find two SQL Server tables related to orders.

Discovering data in the data catalog.
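The search works because registered metadata is indexed. The sketch below shows one simple way such a lookup could behave: a tiny inverted index over table names. It is not how the service is implemented. The table list assumes the SalesLT schema comes from the AdventureWorksLT sample database, and the tokenizer is a deliberately simple assumption.

```python
import re
from collections import defaultdict

# Table names assumed to match the AdventureWorksLT sample schema (SalesLT).
TABLES = [
    "Address", "Customer", "CustomerAddress", "Product", "ProductCategory",
    "ProductDescription", "ProductModel", "ProductModelProductDescription",
    "SalesOrderDetail", "SalesOrderHeader",
]


def tokens(name):
    """Split a CamelCase name into lowercase words (SalesOrderHeader -> sales, order, header)."""
    return [w.lower() for w in re.findall(r"[A-Z][a-z]+|[a-z]+|\d+", name)]


def build_index(names):
    """Map every word to the set of asset names that contain it."""
    index = defaultdict(set)
    for name in names:
        for token in tokens(name):
            index[token].add(name)
    return index


def search(index, term):
    """Case-insensitive exact-word lookup; returns matching names in sorted order."""
    return sorted(index.get(term.lower(), set()))


index = build_index(TABLES)
print(search(index, "order"))  # ['SalesOrderDetail', 'SalesOrderHeader']
```

Under these assumptions, the term ‘order’ matches exactly two tables, which lines up with the two results described above.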

You can further annotate this data asset in the Properties tab by providing a friendly name (I have typed in OrdersIn2020), a description, the expert to contact, and so on, as shown below.

Annotate data sources.

Click on the Preview icon to view a sample of the data it contains.


Preview data tab.

We can also add meaningful descriptions and tags to each column of the table in the Columns tab. This not only helps us know where an attribute is located but also describes what the attribute is about.

Annotate columns in the data catalog.

At times, tags and descriptions are not enough to provide a clear understanding of the data asset. To make it more understandable for data consumers, you can add documentation for the asset in the Documentation tab, as shown below. This gives data consumers a complete and detailed explanation of the asset.

Documentation in the data catalog.
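The annotations described in this section (friendly name, description, experts, tags, column descriptions and documentation) sit alongside the metadata that registration captured; they do not replace it. A minimal sketch of that layering follows. The class and field names are assumptions for illustration, as is the choice of SalesOrderHeader as the annotated table.

```python
from dataclasses import dataclass, field


@dataclass
class Annotations:
    """User-contributed knowledge about an asset; every field is optional."""
    friendly_name: str = ""
    description: str = ""
    experts: list[str] = field(default_factory=list)
    tags: set[str] = field(default_factory=set)
    column_descriptions: dict[str, str] = field(default_factory=dict)
    documentation: str = ""


@dataclass
class Asset:
    name: str  # name as registered from the source; annotations never change it
    annotations: Annotations = field(default_factory=Annotations)

    def display_name(self):
        """Prefer the friendly name when one has been provided."""
        return self.annotations.friendly_name or self.name


asset = Asset("SalesOrderHeader")  # hypothetical choice of table
asset.annotations.friendly_name = "OrdersIn2020"
asset.annotations.tags.add("sales")
asset.annotations.column_descriptions["OrderDate"] = "Date the order was placed"

print(asset.display_name())  # OrdersIn2020
print(asset.name)            # SalesOrderHeader
```

Keeping annotations in a separate structure means the original object name stays available for connecting to the source, while consumers see the friendlier label.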

How to connect to data sources in an Azure Data Catalog

Once we are done registering, locating and annotating data, we can also connect to the data source through the Data Catalog service, which offers several options for doing so. Click the ‘Open In …’ icon in the horizontal tile, and you will find that the data source can be opened in Excel, SSDT and Power BI.

Connect to a data source in the Data Catalog.

To connect this data source in Power BI Desktop (provided Power BI Desktop is installed on the client computer), click the Power BI Desktop option from the contextual menu.


Data users can now view, analyze and visualize their data in the Power BI Desktop app as shown below.


Data connected in the Power BI Desktop.

To learn more about the Data Catalog service in Azure, you can also go over the Microsoft documentation.

Conclusion

We discussed important facts about Azure Data Catalog in this short article. Along the way, we also saw how this tool makes users’ lives easier by helping them discover, understand and consume data sources. If you have any questions, please feel free to ask in the comments section below.

Translated from: https://www.sqlshack.com/getting-started-with-azure-data-catalog/
