Data 100, Fall 2020
Homework 1
Due Date: Thursday, September 3, 11:59PM
Total Points: 24

Submission Instructions

You must submit this assignment to Gradescope by Thursday, September 3rd, at 11:59 PM. While Gradescope accepts late submissions, you will not receive any credit for a late submission if you do not have prior accommodations (e.g. DSP).

You can work on this assignment in any way you like.
• One way is to download this PDF, print it out, and write directly on these pages (we’ve provided enough space for you to do so). Alternatively, if you have a tablet, you could save this PDF and write directly on it.
• Another way is to use some form of LaTeX. Overleaf is a great tool.
• You could also write your answers on a blank sheet of paper.

Regardless of what method you choose, the end result needs to end up on Gradescope, as a PDF. If you wrote something on physical paper (like options 1 and 3 above), you will need to use a scanning application (e.g. CamScanner) in order to submit your work.

When submitting on Gradescope, you must assign pages to each question correctly (it prompts you to do this after submitting your work). This significantly streamlines the grading process for our tutors. Failure to do this may result in a score of 0 for any questions that you didn’t correctly assign pages to. If you have any questions about the submission process, please don’t hesitate to ask on Piazza.

Collaborators

Data science is a collaborative activity. While you may talk with others about the homework, we ask that you write your solutions individually. If you do discuss the assignments with others please include their names at the top of your submission.

Preliminary: Sums

Here’s a recap of some basic algebra written in sigma notation. The facts are all just applications of the ordinary associative and distributive properties of addition and multiplication, written compactly and without the possibly ambiguous ”...”.
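But if you are ever unsure of whether you’re working correctly with a sum, you can always try writing $\sum_{i=1}^{n} a_i$ as $a_1 + a_2 + \cdots + a_n$ and see if that helps.

• You can use any reasonable notation for the index over which you are summing, just as in Python you can use any reasonable name in `for name in list`. Thus $\sum_{i=1}^{n} a_i = \sum_{k=1}^{n} a_k$.
• $\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i$
• $\sum_{i=1}^{n} d = nd$
• $\sum_{i=1}^{n} (c a_i + d) = c \sum_{i=1}^{n} a_i + nd$

We commonly use sigma notation to compactly write the definition of the arithmetic mean (commonly known as the average): $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \frac{1}{n} \sum_{i=1}^{n} x_i$.

Summations

1. (6 points) For each of the statements below, either prove that it is true by using the definitions above, or show that it is false by providing a counterexample. For our purposes, each $a_i$ and $x_i$ is a real number.
Hint: One way to prove something is to start with one side of the equation, and manipulate it through a valid series of steps until it looks like the other side of the equation.

(a) $\frac{\sum_{i=1}^{n} a_i x_i}{\sum_{i=1}^{n} a_i} = \sum_{i=1}^{n} x_i$ (Assume $\sum_{i=1}^{n} a_i \neq 0$)
Answer: False. Counterexample: take $n = 2$, $a_1 = a_2 = 1$, $x_1 = x_2 = 1$. The left side is $\frac{1 + 1}{1 + 1} = 1$, while the right side is $1 + 1 = 2$.

(b) $\sum_{i=1}^{n} a_3 x_i = n a_3 \bar{x}$
Answer: True. Since $a_3$ does not depend on $i$, $\sum_{i=1}^{n} a_3 x_i = a_3 \sum_{i=1}^{n} x_i = a_3 \cdot n \cdot \frac{1}{n} \sum_{i=1}^{n} x_i = n a_3 \bar{x}$.

(c) $\sum_{i=1}^{n} a_i x_i = n \bar{a} \bar{x}$
Answer: False. The right side equals $n \cdot \frac{1}{n} \sum_{i=1}^{n} a_i \cdot \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \left(\sum_{i=1}^{n} a_i\right)\left(\sum_{i=1}^{n} x_i\right)$, which is not the sum of products in general. Counterexample: take $n = 2$, $a = (1, 0)$, $x = (1, 0)$. The left side is $1$, while the right side is $2 \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{2}$.

Calculus

2. (4 points) Let $\sigma(x) = \frac{1}{1 + e^{-x}}$.

(a) Show that $\sigma(-x) = 1 - \sigma(x)$.
Answer: $\sigma(-x) = \frac{1}{1 + e^{x}} = \frac{e^{-x}}{e^{-x} + 1} = \frac{(1 + e^{-x}) - 1}{1 + e^{-x}} = 1 - \frac{1}{1 + e^{-x}} = 1 - \sigma(x)$.

(b) Show that the derivative can be written as: $\frac{d}{dx} \sigma(x) = \sigma(x)(1 - \sigma(x))$
Answer: $\frac{d}{dx} (1 + e^{-x})^{-1} = -(1 + e^{-x})^{-2} \cdot (-e^{-x}) = \frac{e^{-x}}{(1 + e^{-x})^{2}} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = \sigma(x)(1 - \sigma(x))$, because $\frac{e^{-x}}{1 + e^{-x}} = 1 - \frac{1}{1 + e^{-x}} = 1 - \sigma(x)$.

Minimization

3. (3 points) Consider the function $f(c) = \frac{1}{n} \sum_{i=1}^{n} (x_i - c)^2$. In this scenario, suppose that our data points $x_1, x_2, \ldots, x_n$ are fixed, and that $c$ is the only variable. Using calculus, determine the value of $c$ that minimizes $f(c)$. You must justify that this is indeed a minimum, and not a maximum.
Answer: $f'(c) = \frac{1}{n} \sum_{i=1}^{n} -2(x_i - c) = -\frac{2}{n} \left(\sum_{i=1}^{n} x_i - nc\right)$. Setting $f'(c) = 0$ gives $\sum_{i=1}^{n} x_i = nc$, so $c = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x}$. The second derivative is $f''(c) = \frac{1}{n} \sum_{i=1}^{n} 2 = 2 > 0$, so $f$ is convex and $c = \bar{x}$ is a minimum.

Probability and Statistics

4.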
(4 points) Much of data analysis involves interpreting proportions – lots and lots of related proportions. So let’s recall the basics. It might help to start by reviewing the main rules from Data 8, with particular attention to what’s being multiplied in the multiplication rule.

(a) The Pew Research Foundation publishes the results of numerous surveys, one of which is about the trust that Americans have in groups such as the military, scientists, and elected officials to act in the public interest. A table in the article summarizes the results. Pick one of the options (1) or (2) to answer the question below; if you pick (1), tell us what p is. Then, explain your choice.
The percent of surveyed U.S. adults who had a great deal of confidence in both scientists and religious leaders
1. is equal to p%.
2. cannot be found with the information in the article.
Answer: Option 2. The percent who have a great deal of confidence in scientists was 39%, versus 17% for religious leaders. Without more information, we cannot determine the overlap between the two groups.

(b) Toyota is one of the most commonly owned makes of cars in our county (Alameda). A car heading from Berkeley to San Francisco is pulled over on the freeway for speeding. Suppose I tell you that the car is either a Toyota or a Lamborghini, and you have to guess which of the two is more likely. What would you guess, and why? Make some reasonable assumptions and explain them (data scientists often have to do this), and justify your answer.
Answer: The Toyota is more likely. As mentioned above, the Toyota is one of the most common cars, whereas the Lamborghini is uncommon. Therefore, out of the many cars on the freeway, a car that was pulled over is more likely to be a Toyota.

5. (3 points) Consider the following scenario: Only 1% of 40-year-old women who participate in a routine mammography test have breast cancer. 80% of women who have breast cancer will test positive, but 9.6% of women who don’t have breast cancer will also get positive tests.
Suppose we know that a woman of this age tested positive in a routine screening. What is the probability that she actually has breast cancer? (Note: You must show all of your work, and also simplify your final answer to 3 decimal places.)
Answer: By Bayes' rule,
$P(\text{cancer} \mid \text{pos}) = \frac{P(\text{pos} \mid \text{cancer}) P(\text{cancer})}{P(\text{pos})} = \frac{P(\text{pos} \mid \text{cancer}) P(\text{cancer})}{P(\text{pos} \mid \text{cancer}) P(\text{cancer}) + P(\text{pos} \mid \text{no cancer}) P(\text{no cancer})} = \frac{0.8(0.01)}{0.8(0.01) + 0.096(0.99)} \approx 0.078$,
or about 7.8%.

6. (2 points) Suppose we collected a sample of 200 students at UC Berkeley, and 150 of them happened to be Canadian (so, if we were to select a student uniformly at random from our sample, there is a 0.75 chance that they are Canadian). For inferential purposes, we choose to bootstrap this sample 500,000 times. That is, we simulate the act of re-sampling (with replacement) 200 students from our observed sample, and each time we record the number of Canadians in our re-sample. We provide a histogram of the sampling distribution below.

[Figure: histogram of the bootstrap sampling distribution of the number of Canadians in each re-sample]

What is the standard deviation of the sampling distribution shown above? Select the closest option below, and explain your answer.
A. 1.5
B. 6.1
C. 12.4
D. 10.1
Hint: While it is possible to calculate the answer, the histogram has all of the information you need.
Answer: Looking at the histogram, we can assume normality and therefore apply the 68-95-99.7 rule, which gives us the ...

Welcome Survey

7. (2 points) In order for the teaching staff to best ensure you have a stellar Data 100 experience, we’ve put together a short welcome survey for us to get to know more about you. When you have finished the survey, you will receive a codeword. Please write this codeword as your answer to question 7.
Answer: Central beef overview
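The statements in Questions 1, 2, 3, 5, and 6 can be sanity-checked numerically. The following is a minimal Python sketch and not part of the assignment: the sample lists `a` and `x`, the test point `t`, and all function names are hypothetical choices made for illustration. A single numeric example can refute an identity (Question 1) but cannot prove one, so the `True` results only suggest that the written derivations are on the right track. For Question 6 the sketch assumes the count of Canadians in a re-sample is Binomial(200, 0.75) and uses the formula sqrt(np(1 - p)) instead of reading the spread off the histogram.

```python
import math


def sigmoid(x):
    """Logistic function from Question 2: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def mean(values):
    """Arithmetic mean, as defined in the sums preliminary."""
    return sum(values) / len(values)


def mean_squared_deviation(xs, c):
    """f(c) from Question 3: the average of (x_i - c)^2."""
    return sum((xi - c) ** 2 for xi in xs) / len(xs)


def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Bayes' rule for Question 5: P(cancer | positive test)."""
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


def binomial_sd(n, p):
    """Standard deviation of a Binomial(n, p) count (assumed model for Question 6)."""
    return math.sqrt(n * p * (1.0 - p))


# Hypothetical sample data (not from the assignment) for Questions 1 and 3.
a = [1.0, 2.0, 4.0]
x = [3.0, 5.0, 7.0]
n = len(x)

# Question 1: evaluate both sides of each claimed identity on the sample data.
weighted_sum = sum(ai * xi for ai, xi in zip(a, x))
q1a_holds = math.isclose(weighted_sum / sum(a), sum(x))
q1b_holds = math.isclose(sum(a[2] * xi for xi in x), n * a[2] * mean(x))
q1c_holds = math.isclose(weighted_sum, n * mean(a) * mean(x))

# Question 2: check the symmetry identity at an arbitrary point t, and compare
# a central-difference slope against sigma(t) * (1 - sigma(t)).
t, h = 0.7, 1e-6
q2a_holds = math.isclose(sigmoid(-t), 1.0 - sigmoid(t))
numeric_slope = (sigmoid(t + h) - sigmoid(t - h)) / (2 * h)
q2b_holds = math.isclose(numeric_slope, sigmoid(t) * (1.0 - sigmoid(t)), abs_tol=1e-6)

# Question 3: f at the mean should not exceed f at nearby values of c.
c_star = mean(x)
q3_holds = all(
    mean_squared_deviation(x, c_star) <= mean_squared_deviation(x, c_star + d)
    for d in (-0.5, -0.1, 0.1, 0.5)
)

# Question 5: the numbers come directly from the problem statement.
q5 = posterior_given_positive(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)

# Question 6: compare the binomial standard deviation with the listed options.
q6 = binomial_sd(n=200, p=0.75)
options = {"A": 1.5, "B": 6.1, "C": 12.4, "D": 10.1}
closest = min(options, key=lambda label: abs(options[label] - q6))

print(q1a_holds, q1b_holds, q1c_holds)  # False True False
print(q2a_holds, q2b_holds, q3_holds)   # True True True
print(round(q5, 3), round(q6, 3), closest)  # 0.078 6.124 B
```

Under these assumptions the sample data refutes statements 1(a) and 1(c) and is consistent with 1(b), the posterior in Question 5 rounds to 0.078, and the binomial standard deviation in Question 6 is closest to option B.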