Instant Access Amazon.DAS-C01.v2024-02-15.q109 Actual Practice Test Engine for Free (Page 14)

Question 61

A company wants to improve the data load time of a sales data dashboard. Data has been collected as .csv files and stored within an Amazon S3 bucket that is partitioned by date. The data is then loaded to an Amazon Redshift data warehouse for frequent analysis. The data volume is up to 500 GB per day.
Which solution will improve the data loading performance?

A.Compress .csv files and use an INSERT statement to ingest data into Amazon Redshift.
B.Split large .csv files, then use a COPY command to load data into Amazon Redshift.
C.Use Amazon Kinesis Data Firehose to ingest data into Amazon Redshift.
D.Load the .csv files in an unsorted key order and vacuum the table in Amazon Redshift.
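
As an illustration of the file-splitting approach mentioned in option B, the sketch below divides a list of CSV rows into roughly equal chunks and builds a COPY statement that targets a shared S3 key prefix. The function names, table name, bucket path, and IAM role ARN are hypothetical placeholders, and the rows are assumed to have no header line; this is a sketch, not a definitive loader.

```python
def split_rows(rows, num_parts):
    """Split rows into num_parts contiguous chunks of nearly equal size."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    base, extra = divmod(len(rows), num_parts)
    chunks = []
    start = 0
    for i in range(num_parts):
        # The first `extra` chunks each receive one additional row.
        size = base + (1 if i < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def build_copy_statement(table, s3_prefix, iam_role):
    """Build a COPY statement that loads every object under s3_prefix.

    All argument values are supplied by the caller; the ones used in the
    example below are hypothetical.
    """
    return (
        f"COPY {table} FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV;"
    )


# Hypothetical usage: 10 rows split into 4 parts.
example_rows = [f"{i},item{i}" for i in range(10)]
example_chunks = split_rows(example_rows, 4)
example_sql = build_copy_statement(
    "sales",                                           # hypothetical table
    "s3://example-bucket/sales/2024-02-15/part_",      # hypothetical prefix
    "arn:aws:iam::123456789012:role/ExampleCopyRole",  # hypothetical role
)
```

Each chunk would be written to its own object under the shared prefix so that a single COPY command can read the parts in parallel.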

Question 62

A company has collected more than 100 TB of log files in the last 24 months. The files are stored as raw text in a dedicated Amazon S3 bucket. Each object has a key of the form year-month-day_log_HHmmss.txt where HHmmss represents the time the log file was initially created. A table was created in Amazon Athena that points to the S3 bucket. One-time queries are run against a subset of columns in the table several times an hour.
A data analyst must make changes to reduce the cost of running these queries. Management wants a solution with minimal maintenance overhead.
Which combination of steps should the data analyst take to meet these requirements? (Choose three.)

A.Convert the log files to Apache Avro format.
B.Add a key prefix of the form date=year-month-day/ to the S3 objects to partition the data.
C.Convert the log files to Apache Parquet format.
D.Add a key prefix of the form year-month-day/ to the S3 objects to partition the data.
E.Drop and recreate the table with the PARTITIONED BY clause. Run the ALTER TABLE ADD PARTITION statement.
F.Drop and recreate the table with the PARTITIONED BY clause. Run the MSCK REPAIR TABLE statement.
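
To make the key layouts in options B and D concrete, the sketch below derives a Hive-style `date=year-month-day/` prefix from an object key of the form described in the question. It assumes a four-digit year and two-digit month and day, which the question does not state explicitly; the function name and regular expression are illustrative only.

```python
import re

# Assumed key shape: YYYY-MM-DD_log_HHmmss.txt (digit widths are an assumption).
KEY_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})_log_(\d{6})\.txt$")


def partitioned_key(key):
    """Return the key with a Hive-style date=YYYY-MM-DD/ prefix added."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"unexpected key format: {key!r}")
    day = match.group(1)
    return f"date={day}/{key}"


# Hypothetical object key used only for illustration.
example = partitioned_key("2023-01-15_log_120000.txt")
```

Dropping the `date=` part of the returned string gives the plain `year-month-day/` layout from option D.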

Question 63

A data analytics specialist is setting up workload management in manual mode for an Amazon Redshift environment. The data analytics specialist is defining query monitoring rules to manage system performance and user experience of an Amazon Redshift cluster.
Which elements must each query monitoring rule include?

A.A workload name, a unique rule name, and a query runtime-based condition
B.A queue name, a unique rule name, and a predicate-based stop condition
C.A unique rule name, a query runtime condition, and an AWS Lambda function to resubmit any failed queries in off hours
D.A unique rule name, one to three predicates, and an action

Question 64

A healthcare company uses AWS data and analytics tools to collect, ingest, and store electronic health record (EHR) data about its patients. The raw EHR data is stored in Amazon S3 in JSON format partitioned by hour, day, and year and is updated every hour. The company wants to maintain the data catalog and metadata in an AWS Glue Data Catalog to be able to access the data using Amazon Athena or Amazon Redshift Spectrum for analytics.
When defining tables in the Data Catalog, the company has the following requirements:
Choose the catalog table name and do not rely on the catalog table naming algorithm.
Keep the table updated with new partitions loaded in the respective S3 bucket prefixes.
Which solution meets these requirements with minimal effort?

A.Run an AWS Glue crawler that connects to one or more data stores, determines the data structures, and writes tables in the Data Catalog.
B.Use the AWS Glue console to manually create a table in the Data Catalog and schedule an AWS Lambda function to update the table partitions hourly.
C.Use the AWS Glue API CreateTable operation to create a table in the Data Catalog. Create an AWS Glue crawler and specify the table as the source.
D.Create an Apache Hive catalog in Amazon EMR with the table schema definition in Amazon S3, and update the table partition with a scheduled job. Migrate the Hive catalog to the Data Catalog.

Question 65

A marketing company is using Amazon EMR clusters for its workloads. The company manually installs third-party libraries on the clusters by logging in to the master nodes. A data analyst needs to create an automated solution to replace the manual process.
Which options can fulfill these requirements? (Choose two.)

A.Place the required installation scripts in Amazon S3 and execute them using custom bootstrap actions.
B.Place the required installation scripts in Amazon S3 and execute them through Apache Spark in Amazon EMR.
C.Install the required third-party libraries in the existing EMR master node. Create an AMI out of that master node and use that custom AMI to re-create the EMR cluster.
D.Use an Amazon DynamoDB table to store the list of required applications. Trigger an AWS Lambda function with DynamoDB Streams to install the software.
E.Launch an Amazon EC2 instance with Amazon Linux and install the required third-party libraries on the instance. Create an AMI and use that AMI to create the EMR cluster.
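
For readers unfamiliar with bootstrap actions (option A), the sketch below builds one entry in the shape that the `BootstrapActions` parameter of the boto3 EMR `run_job_flow` call accepts. It only constructs the dictionary and does not call AWS; the helper name, script path, and arguments are hypothetical.

```python
def bootstrap_action(name, script_s3_path, args=None):
    """Build one bootstrap action entry pointing at a script stored in S3."""
    if not script_s3_path.startswith("s3://"):
        raise ValueError("script_s3_path must be an s3:// URI")
    return {
        "Name": name,
        "ScriptBootstrapAction": {
            "Path": script_s3_path,
            # Arguments are passed to the script in order.
            "Args": list(args or []),
        },
    }


# Hypothetical script location and arguments, for illustration only.
example_action = bootstrap_action(
    "install-libraries",
    "s3://example-bucket/bootstrap/install_libs.sh",
    ["--package", "example-lib"],
)
```

A list of such entries would be supplied when the cluster is created, so the script runs on the cluster nodes at launch without anyone logging in.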
