Unit 5: Exam Version B – Solutions
1. Which of the following is LEAST likely to be an important consideration for a company when
using crowdsourcing to analyze a large data set?
A. How to assure the quality of work received from multiple sources.
B. Whether any data being shared contains private information or if there are any
intellectual property rights issues.
C. Whether the participants will be located geographically close enough to each other to
share information.
D. Which tasks within the analysis require human intervention, and which just need
computing power.
The correct choice is C. In most models used for crowdsourcing data analysis, participants
do not need to share data with one another directly; it is more likely that data would only
be shared directly with the company. Nor would participants usually need to be
geographically close in order to share data, since they would most likely be able to share
the relevant information over the internet.
2. A police department keeps a database containing every crime report made, dating back to
2010. The following information is collected and stored for each report made:
● The date the crime was reported
● The date on which the reported crime was committed
● The type of crime reported
● The name of the person reporting the crime (if given)
A sample portion of the database is shown below.
Date of Report   Date of Crime   Crime Reported   Name of Reporter
18/10/2017       17/10/2017      Theft            May Robertson
18/10/2017       14/10/2017      Fraud            Amir Solanki
18/10/2017       18/10/2017      Assault          -
18/10/2017       18/10/2017      Theft            Peter Jones
19/10/2017       19/10/2017      Theft            Alex Martinez
19/10/2017       19/10/2017      Vandalism        Candice Hall
Which of the following CANNOT be determined using only the information in the database?
A. How many more murders were committed in summer months than winter months.
B. The total number of reports of vandalism or fraud in 2017.
Unit 5 AP Computer Science Principles
C. How effective police action was in reducing total thefts between 2015 and 2016.
D. The average number of crimes reported per week in 2015.
Queries could be constructed to find the information for all answer choices except C.
While it would be possible to determine the reduction in thefts between these years
(assuming there was one), it would not be possible to determine the impact that police
action had on this reduction, because the database records nothing about police action
and many other factors may also have been responsible for any reduction.
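As a rough illustration of the kind of queries meant here, the sketch below runs them in Python against only the six sample rows shown in the question. The function names, the tuple layout, and the choice of June to August as "summer" and December to February as "winter" are assumptions made for this example; they are not part of the exam or of any real police database.

```python
from datetime import date

# The six sample rows from the question:
# (date of report, date of crime, crime reported, name of reporter or None)
REPORTS = [
    (date(2017, 10, 18), date(2017, 10, 17), "Theft", "May Robertson"),
    (date(2017, 10, 18), date(2017, 10, 14), "Fraud", "Amir Solanki"),
    (date(2017, 10, 18), date(2017, 10, 18), "Assault", None),
    (date(2017, 10, 18), date(2017, 10, 18), "Theft", "Peter Jones"),
    (date(2017, 10, 19), date(2017, 10, 19), "Theft", "Alex Martinez"),
    (date(2017, 10, 19), date(2017, 10, 19), "Vandalism", "Candice Hall"),
]

# Assumption for this sketch: northern-hemisphere meteorological seasons.
SUMMER_MONTHS = {6, 7, 8}
WINTER_MONTHS = {12, 1, 2}


def count_reports(reports, crime_types, year):
    """Choice B: number of reports of the given crime types filed in `year`."""
    return sum(
        1 for reported, _committed, crime, _name in reports
        if crime in crime_types and reported.year == year
    )


def seasonal_difference(reports, crime_type):
    """Choice A: crimes of one type committed in summer minus those in winter."""
    summer = winter = 0
    for _reported, committed, crime, _name in reports:
        if crime != crime_type:
            continue
        if committed.month in SUMMER_MONTHS:
            summer += 1
        elif committed.month in WINTER_MONTHS:
            winter += 1
    return summer - winter


def average_reports_per_week(reports, year):
    """Choice D: reports filed in `year` divided by the number of weeks in it."""
    total = sum(1 for reported, *_rest in reports if reported.year == year)
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    return total / (days_in_year / 7)


def change_in_crimes(reports, crime_type, first_year, second_year):
    """Part of choice C: change in crimes of one type between two years.

    This gives the size of any change only. No field in the rows records
    police action, so its effect on the change cannot be computed.
    """
    def committed_in(year):
        return sum(
            1 for _reported, committed, crime, _name in reports
            if crime == crime_type and committed.year == year
        )
    return committed_in(second_year) - committed_in(first_year)
```

Each function reads only the four stored fields. The last one shows where choice C breaks down: the difference in theft counts can be calculated, but its cause cannot.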
3. When we wish to analyze unstructured data, the first step is often to develop a framework
for the data and organize the data according to it (for example, extracting information from
an image by using text recognition). Which of the following statements about this process is
true?
A. The structured data is more useable than the unstructured data.
B. The structured data is more useful than the unstructured data.
C. All the data from the unstructured format will still be present in the structured format.
D. Each source of unstructured data can only be structured according to a single
framework.
When unstructured data is organ