Which Azure Data Factory components should you recommend using together to import the daily inventory data from SQL to Data Lake Storage? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Correct Answer:
Explanation Box 1: Self-hosted integration runtime A self-hosted IR is capable of nunning copy activity between a cloud data stores and a data store in private network. Scenario: Daily inventory data comes from a Microsoft SQL server located on a private network. Box 2: Schedule trigger Daily schedule Box 3: Copy activity Scenario: Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.
Question 157
You are designing a new application that uses Azure Cosmos DB. The application will support a variety of data patterns including log records and social media mentions. You need to recommend which Cosmos DB API to use for each data pattern. The solution must minimize resource utilization. Which API should you recommend for each data pattern? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Correct Answer:
Explanation Log records: SQL Social media mentions: Gremlin You can store the actual graph of followers using Azure Cosmos DB Gremlin API to create vertexes for each user and edges that maintain the "A-follows-B" relationships. With the Gremlin API, you can get the followers of a certain user and create more complex queries to suggest people in common. If you add to the graph the Content Categories that people like or enjoy, you can start weaving experiences that include smart content discovery, suggesting content that those people you follow like, or finding people that you might have much in common with. References: https://docs.microsoft.com/en-us/azure/cosmos-db/social-media-apps
Question 158
You need to implement an Azure Storage account that will use a Blob service endpoint that uses zone- redundant storage (ZRS). The storage account must only accept connections from a virtual network over Azure Private Link. What should you include in the implementation?
Correct Answer: A
You can use private endpoints for your Azure Storage accounts to allow clients on a virtual network (VNet) to securely access data over a Private Link. When creating the private endpoint, you must specify the storage account and the storage service to which it connects. You need a separate private endpoint for each storage service in a storage account that you need to access, namely Blobs, Data Lake Storage Gen2, Files, Queues, Tables, or Static Websites. Note: The private endpoint uses an IP address from the VNet address space for your storage account service. Network traffic between the clients on the VNet and the storage account traverses over the VNet and a private link on the Microsoft backbone network, eliminating exposure from the public internet. Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-private-endpoints Design for high availability and disaster recovery Testlet 1 Case study This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided. To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study. At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section. To start the case study To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question. Overview Litware, Inc. owns and operates 300 convenience stores across the US. The company sells a variety of packaged foods and drinks, as well as a variety of prepared foods, such as sandwiches and pizzas. Litware has a loyalty club whereby members can get daily discounts on specific items by providing their membership number at checkout. Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks. Requirements. Business Goals Litware wants to create a new analytics environment in Azure to meet the following requirements: * See inventory levels across the stores. Data must be updated as close to real time as possible. * Execute ad hoc analytical queries on historical data to identify whether the loyalty club discounts increase sales of the discounted products. * Every four hours, notify store employees about how many prepared food items to produce based on historical demand from the sales data. Requirements. Technical Requirements Litware identifies the following technical requirements: * Minimize the number of different Azure services needed to achieve the business goals. * Use platform as a service (PaaS) offerings whenever possible and avoid having to provision virtual machines that must be managed by Litware. * Ensure that the analytical data store is accessible only to the company's on-premises network and Azure services. * Use Azure Active Directory (Azure AD) authentication whenever possible. * Use the principle of least privilege when designing security. * Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed. * Limit the business analysts' access to customer contact information, such as phone numbers, because this type of data is not analytically relevant. * Ensure that you can quickly restore a copy of the analytical data store within one hour in the event of corruption or accidental deletion. Requirements. Planned Environment Litware plans to implement the following environment: * The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure. * Customer data, including name, contact information, and loyalty number, comes from Salesforce, a SaaS application, and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table. * Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table. * Daily inventory data comes from a Microsoft SQL server located on a private network. * Litware currently has 5 TB of historical sales data and 100 GB of customer data. The company expects approximately 100 GB of new data per month for the next year. * Litware will build a custom application named FoodPrep to provide store employees with the calculation results of how many prepared food items to produce every four hours. * Litware does not plan to implement Azure ExpressRoute or a VPN between the on-premises network and Azure.
Question 159
What should you do to improve high availability of the real-time data processing solution?
Correct Answer: A
Guarantee Stream Analytics job reliability during service updates Part of being a fully managed service is the capability to introduce new service functionality and improvements at a rapid pace. As a result, Stream Analytics can have a service update deploy on a weekly (or more frequent) basis. No matter how much testing is done there is still a risk that an existing, running job may break due to the introduction of a bug. If you are running mission critical jobs, these risks need to be avoided. You can reduce this risk by following Azure's paired region model. Scenario: The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure Reference: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-job-reliability Design for high availability and disaster recovery Testlet 2 Case study This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided. To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study. At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section. To start the case study To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question. Overview You develop data engineering solutions for Graphics Design Institute, a global media company with offices in New York City, Manchester, Singapore, and Melbourne. The New York office hosts SQL Server databases that stores massive amounts of customer data. The company also stores millions of images on a physical server located in the New York office. More than 2 TB of image data is added each day. The images are transferred from customer devices to the server in New York. Many images have been placed on this server in an unorganized manner, making it difficult for editors to search images. Images should automatically have object and color tags generated. The tags must be stored in a document database, and be queried by SQL. You are hired to design a solution that can store, transform, and visualize customer data. Requirements Business The company identifies the following business requirements: * You must transfer all images and customer data to cloud storage and remove on-premises servers. * You must develop an analytical processing solution for transforming customer data. * You must develop an image object and color tagging solution. * Capital expenditures must be minimized. * Cloud resource costs must be minimized. Technical The solution has the following technical requirements: * Tagging data must be uploaded to the cloud from the New York office location. * Tagging data must be replicated to regions that are geographically close to company office locations. * Image data must be stored in a single data store at minimum cost. * Customer data must be analyzed using managed Spark clusters. * Power BI must be used to visualize transformed customer data. * All data must be backed up in case disaster recovery is required. Security and optimization All cloud data must be encrypted at rest and in transit. The solution must support: * parallel processing of customer data * hyper-scale storage of images * global region data replication of processed image data
Question 160
Which Azure service should you recommend for the analytical data store so that the business analysts and data scientists can execute ad hoc queries as quickly as possible?
Correct Answer: A
There are several differences between a data lake and a data warehouse. Data structure, ideal users, processing methods, and the overall purpose of the data are the key differentiators. Scenario: Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks. Note: Azure Synapse Analytics formerly known as Azure SQL Data Warehouse. Design Azure data storage solutions Question Set 7