Ecology Research Report Frequency Data using R-studio, Assignments of Ecology and Environment

Description: An introduction to frequency data and how to analyze it in RStudio (includes code).

Typology: Assignments

2020/2021

Available from 03/24/2022

Frequency Data
Tarika Arjune
Dr. Wei Fang
Ecology
November 12th, 2021

Section 1: Introduction (10 pts)

The frequency of a specific measurement in a sample is the number of observations having that value. The frequency distribution shows how often each value of a variable occurs in the sample. We use the frequency distribution of a sample to learn about the distribution of the variable in the population from which the sample was drawn. It also shows the shape of the data, which allows us to detect patterns. The relative frequency is the proportion of observations having a given value, calculated as the frequency divided by the total number of observations. The relative frequency distribution gives this proportion for each value in the data set.

One learning outcome is to be able to calculate a confidence interval for a proportion. Other learning outcomes include performing a hypothesis test about proportions, fitting frequency data to a model, and testing for a fit to a Poisson distribution.

Section 2: Questions 1-4 (R scripts, graphs, and answers to individual questions) (Q1 10 pts, Q2 20 pts, Q3 20 pts, Q4 30 pts)
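To make the definitions of frequency and relative frequency from the introduction concrete, here is a minimal base R sketch. The vector `clutch_size` and its values are made up for illustration and are not part of the lab data.

```r
# Hypothetical measurements (made-up values, for illustration only)
clutch_size <- c(2, 3, 3, 4, 4, 4, 5, 5, 3, 4)

# Frequency: number of observations having each value
freq <- table(clutch_size)

# Relative frequency: frequency divided by the total number of observations
rel_freq <- freq / length(clutch_size)

print(freq)
print(rel_freq)
```

The relative frequencies always sum to 1, which is a quick way to check the calculation.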

  1. Many hospitals still have signs posted everywhere banning cell phone use. These bans originated from studies on earlier versions of cell phones. In one such experiment, out of 510 tests with cell phones operating at near-maximum power, six disrupted a piece of medical equipment enough to hinder interpretation of data or cause equipment to malfunction. A more recent study found zero instances of disruption of medical equipment out of 300 tests.

     a. For the older data, use binom.confint() with the Agresti-Coull method to calculate the estimated proportion of equipment disruption. What is the 95% confidence interval for this proportion?

        library(binom)
        # 95% confidence interval (Agresti-Coull) for 6 disruptions in 510 tests
        binom.confint(x = 6, n = 510, method = "ac")
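As a cross-check on the call above, the Agresti-Coull interval can also be computed by hand in base R. This is a sketch of the textbook formula (add z^2/2 successes and z^2/2 failures, then apply the usual normal-approximation interval). The variable names are illustrative, and the bounds should agree closely with what binom.confint() reports.

```r
# Agresti-Coull interval computed by hand (base R only)
x <- 6            # tests with a disruption (from the question)
n <- 510          # total number of tests (from the question)
conf_level <- 0.95

z <- qnorm(1 - (1 - conf_level) / 2)   # about 1.96 for a 95% interval

# Adjusted sample size and proportion
n_adj <- n + z^2
p_adj <- (x + z^2 / 2) / n_adj

half_width <- z * sqrt(p_adj * (1 - p_adj) / n_adj)
ac_interval <- c(lower = p_adj - half_width, upper = p_adj + half_width)

p_hat <- x / n    # simple sample proportion
print(p_hat)
print(ac_interval)
```

Setting `x <- 0` and `n <- 300` gives the corresponding interval for the more recent study.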

b. Use chisq.test() to test the null hypothesis that the selection of the stockings was independent of position.

The X^2 test statistic is given by

    X^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}

where O_i is the observed frequency and E_i is the expected frequency at the ith position. Under the null hypothesis, each of the four positions is expected to be chosen by 52 x 0.25 = 13 subjects. The following table gives the values:

    Position       Subjects (O)   Expected (E)   (O-E)   (O-E)^2   Chi-sq
    Far-Left             6             13          -7       49      3.769
    Left-Middle          9             13          -4       16      1.231
    Right-Middle        16             13           3        9      0.692
    Far-Right           21             13           8       64      4.923
    Total               52             52                          10.615

The critical value of chi-square at 3 df (alpha = 0.05) is 7.8147. Since the calculated value of X^2 is greater than the critical value, we reject the null hypothesis and conclude that the selection of stockings is not independent of position.

c. (Optional) The function chisq.test() can take the data either as a data frame, as above, or as a vector of the observed counts, passed as a parameter called x:

    chisq.test(x = c(6,9,16,21), p = c(0.25,0.25,0.25,0.25))

Try it using the specification of the counts, to see that you get the same answer as in (b).

    chisq.test(x = c(6,9,16,21), p = c(0.25,0.25,0.25,0.25))

    Chi-squared test for given probabilities
    data:  c(6, 9, 16, 21)
    X-squared = 10.615, df = 3, p-value = 0.
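The hand calculation in (b) can be reproduced step by step in base R. The sketch below uses the observed counts from the table. The variable names are illustrative, and the 0.05 significance level is an assumption consistent with the critical value quoted above.

```r
# Observed choices by stocking position (counts from the table in part b)
observed <- c("Far-Left" = 6, "Left-Middle" = 9,
              "Right-Middle" = 16, "Far-Right" = 21)

# Null hypothesis: every position is equally likely
null_p <- rep(0.25, 4)
expected <- sum(observed) * null_p          # 13 per position

# Contribution of each position to the test statistic, and their sum
contribution <- (observed - expected)^2 / expected
chi_sq <- sum(contribution)

# Degrees of freedom, critical value at alpha = 0.05, and P-value
df <- length(observed) - 1
critical <- qchisq(0.95, df = df)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

print(round(contribution, 3))
print(c(chi_sq = chi_sq, critical = critical, p_value = p_value))
```

Comparing `chi_sq` with `critical` or `p_value` with 0.05 leads to the same decision.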

  1. Many people believe that the month in which a person is born predicts significant attributes of that person in later life. Such astrological beliefs have little scientific support, but are there circumstances in which birth month can have a strong effect on later life? One prediction is that elite athletes will disproportionately have been born in the months just after the age cutoff used to separate levels for young players of the sport. The prediction is that the athletes who are oldest within an age group will do better by being relatively older, and therefore will gain more confidence and attract more coaching attention than the relatively younger players in the same group. As a result, they may be more likely to dedicate themselves to the sport and do well later. In the case of soccer, the cutoff for different age groups is generally August.

     a. The birth months (by three-month interval) of soccer players competing in the Under-20's World Tournament are recorded in the data file "soccer_birth_quarter.csv" (from Barnsley et al. 1992). Plot these data. Do you see a pattern?

        library(readr)
        library(ggplot2)

        # Read the birth-quarter data (path is relative to the working directory)
        soccer_birth_quarter <- read_csv("DataForLabs/soccer_birth_quarter.csv")

        # Bar chart of the number of players born in each quarter
        ggplot(data = soccer_birth_quarter, aes(x = birth_quarter)) +
          geom_bar()

    MMlist <- read.csv("DataForLabs/cardiac arrests out of hospital.csv", stringsAsFactors = TRUE)
    Cardiactable

     0  1  2  3  4  5  6
    36 79 60 41 28 10  7

b. What is the mean number of heart attacks per week?

    lambda <- mean(cardiac_arrests_out_of_hospital$out_of_hospital_cardiac_arrests)
    lambda
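The same mean can be recovered directly from the frequency table printed above, without the raw data file. The sketch below assumes the table columns 0 to 6 are the number of events per week and the second row is the number of weeks. The variable names are illustrative.

```r
# Frequency table from the output above
arrests_per_week <- 0:6
weeks <- c(36, 79, 60, 41, 28, 10, 7)

n_weeks <- sum(weeks)                            # number of data points
total_arrests <- sum(arrests_per_week * weeks)   # total events over all weeks

# Weighted mean: total events divided by number of weeks
lambda_hat <- total_arrests / n_weeks

print(c(n_weeks = n_weeks, total_arrests = total_arrests, lambda_hat = lambda_hat))
```

This agrees with the value 2.015326 used as lambda in the later parts.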

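c. For the mean you just calculated, use dpois() to calculate the probability of 0 heart attacks in a week assuming a Poisson distribution. Multiply that probability by the number of data points to calculate the expected frequency of 0 in these data under the null hypothesis of a Poisson distribution.

    # Probability of 0 heart attacks in a week under a Poisson distribution
    dpois(x = 0, lambda = 2.015326, log = FALSE)

    # Total number of heart attacks over all weeks
    sum(cardiac_arrests_out_of_hospital)
    526

    # Expected frequency of 0: probability times the number of data points (261)
    261 * dpois(x = 0, lambda = 2.015326)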
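For a count of zero the Poisson probability reduces to exp(-lambda), so the dpois() result in (c) can be checked against the formula directly. This sketch uses the totals reported above (526 events in 261 weeks). The variable names are illustrative.

```r
# Estimated mean from the totals reported above
n_weeks <- 261
lambda_hat <- 526 / n_weeks

# P(X = 0) for a Poisson distribution: exp(-lambda) * lambda^0 / 0! = exp(-lambda)
p_zero_formula <- exp(-lambda_hat)
p_zero_dpois <- dpois(0, lambda = lambda_hat)

# Expected number of weeks with zero events under the null hypothesis
expected_zero <- n_weeks * p_zero_dpois

print(c(p_zero_formula = p_zero_formula,
        p_zero_dpois = p_zero_dpois,
        expected_zero = expected_zero))
```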

d. Here is a table of the expected frequencies under the null hypothesis. (The expected frequency for zero heart attacks should match your calculation above.) Are these frequencies acceptable for use in a χ^2 goodness of fit test?

    Number of heart attacks   Expected
    0                         34.8
    1                         70.1
    2                         70.6
    3                         47.5
    4                         23.9
    5                          9.6
    6 or more

    expected_frequency <- 261 * dpois(x = 0:6, lambda = 2.015326, log = FALSE)
    expected_frequency

    34.785283 70.103686 70.640891 47.454808 23.909227 9.636977 3.
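One way to approach the question in (d) is sketched below. It treats the last category as "6 or more" by using the upper tail of the Poisson distribution, then applies a commonly taught rule of thumb for the χ^2 approximation: no expected frequency below 1 and no more than 20% of them below 5. Both the tail treatment and the rule of thumb are assumptions here, not something stated in the report.

```r
n_weeks <- 261
lambda_hat <- 526 / n_weeks

# Probabilities for 0 to 5 events, plus the upper tail P(X >= 6)
p_0_to_5 <- dpois(0:5, lambda = lambda_hat)
p_6_plus <- ppois(5, lambda = lambda_hat, lower.tail = FALSE)

expected <- n_weeks * c(p_0_to_5, p_6_plus)
names(expected) <- c(0:5, "6 or more")

# Rule-of-thumb checks (assumed criteria, see the note above)
none_below_1 <- all(expected >= 1)
share_below_5 <- mean(expected < 5)

print(round(expected, 1))
print(c(none_below_1 = none_below_1, share_below_5 = share_below_5))
```

With the tail included, the expected frequencies add up to the number of weeks, which is not the case when only dpois(0:6) is used.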

e. Create vectors in R for the observed and expected frequencies.

    observed_frequency <- Cardiactable
    observed_frequency

     0  1  2  3  4  5  6
    36 79 60 41 28 10  7

f. Calculate the χ^2 for this hypothesis test, using chisq.test()$statistic.

    chisq.test(observed_frequency, p = expected_frequency, rescale.p = TRUE)$statistic

    X-squared
     8.693435
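To see what rescale.p = TRUE does in the call in (f), the statistic can be rebuilt by hand: the supplied values are divided by their sum so that they add to 1, and the expected counts are the total number of observations times those probabilities. The sketch below follows the same setup as the report (Poisson probabilities for 0 to 6). The variable names are illustrative.

```r
observed <- c(36, 79, 60, 41, 28, 10, 7)          # weeks with 0..6 events
lambda_hat <- sum((0:6) * observed) / sum(observed)

# Poisson probabilities for 0..6; these do not sum to 1 on their own
raw_p <- dpois(0:6, lambda = lambda_hat)

# What rescale.p = TRUE does: normalise so the probabilities sum to 1
null_p <- raw_p / sum(raw_p)
expected <- sum(observed) * null_p

chi_sq_manual <- sum((observed - expected)^2 / expected)

# Same statistic from chisq.test(); warnings about small expected counts are muted
chi_sq_builtin <- unname(
  suppressWarnings(chisq.test(observed, p = raw_p, rescale.p = TRUE))$statistic
)

print(c(manual = chi_sq_manual, builtin = chi_sq_builtin))
```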

g. How many degrees of freedom should this χ^2 goodness of fit test have?

Degrees of freedom = 7 categories - 1 - 1 parameter estimated from the data (the mean) = 5

h. Calculate the P-value for this test, using pchisq().

    pchisq(q = 8.693435, df = 5, lower.tail = FALSE)
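The degrees of freedom and the test decision can be laid out explicitly as follows. This is a sketch that assumes a significance level of 0.05, which the report does not state for this question. The variable names are illustrative.

```r
chi_sq <- 8.693435        # test statistic used in part h

# Degrees of freedom: categories - 1 - number of parameters estimated from the data
n_categories <- 7
n_estimated <- 1          # the Poisson mean was estimated from the sample
df <- n_categories - 1 - n_estimated

p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Equivalent decision using the critical value at alpha = 0.05 (assumed level)
critical <- qchisq(0.95, df = df)
reject_null <- chi_sq > critical

print(c(df = df, p_value = p_value, critical = critical))
print(reject_null)
```

If `reject_null` is FALSE, the data give no evidence against a Poisson distribution at the assumed level.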