Statistics Project, Statistics Homework
Your Name
Instructor
Subject
Date of Submission
Question 11.
a.) > project <- read.table("c:/xyz/data.csv", sep=",", header=TRUE)
> attach(project)
b.) > hist(physician, col="blue")
c.) > sd(physician)
d.) [1] 1591.87
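The steps in parts a to d can be tried without the original file. The following is a minimal sketch that assumes a stand-in data frame with a numeric physician column. The simulated values are made up for illustration and are not the course data, so the standard deviation it prints will differ from 1591.87.

```r
# Stand-in for the CSV file: simulated, purely illustrative values.
set.seed(1)
project <- data.frame(physician = round(rexp(100, rate = 1 / 1000)))

# Refer to the column explicitly instead of using attach(); this avoids
# name clashes when several data frames share column names.
hist(project$physician, col = "blue",
     main = "Histogram of physician", xlab = "physician")

# Sample standard deviation (n - 1 in the denominator).
s <- sd(project$physician)
s
```

Using project$physician rather than attach(project) is a design choice for the sketch only; both give the same histogram and standard deviation.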
e.) Regression
(X, Y) ~ N(mx, my, sx^2, sy^2, r), where r is the correlation (not the covariance), mx and my are the respective means of x and y, and sx^2 and sy^2 are the respective variances.
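Because r is a correlation rather than a covariance, the covariance between X and Y is r * sx * sy. The sketch below assembles the mean vector and covariance matrix from the five parameters. The numeric values are arbitrary placeholders, not values from the assignment.

```r
# Arbitrary illustrative parameters (assumed, not from the assignment).
mx <- 10; my <- 20      # means of x and y
sx <- 2;  sy <- 5       # standard deviations of x and y
r  <- 0.6               # correlation between x and y

# Covariance is the correlation scaled by both standard deviations.
cov_xy <- r * sx * sy

mu    <- c(mx, my)
Sigma <- matrix(c(sx^2,   cov_xy,
                  cov_xy, sy^2), nrow = 2, byrow = TRUE)
Sigma

# Converting back recovers the correlation on the off-diagonal.
cov2cor(Sigma)
```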
Question 36. a and b)
mvrnorm(n = 1, mu, Sigma, tol = 1e-6, empirical = FALSE, EISPACK = FALSE)
n: the number of samples required.
mu: a vector giving the means of the variables.
Sigma: a positive-definite symmetric matrix specifying the covariance matrix of the variables.
tol: tolerance (relative to the largest variance) for numerical lack of positive-definiteness in Sigma.
empirical: logical. If TRUE, mu and Sigma specify the empirical, not the population, mean and covariance matrix.
EISPACK: logical. Set to TRUE to reproduce results from MASS versions prior to 3.1-21.
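library(MASS)  # provides mvrnorm()
Sigma <- matrix(c(1,1,1,1), 2, 2)
Sigma
# The mean vector must have one entry per row of Sigma (2 here).
var(mvrnorm(n=500, rep(0, 2), Sigma))
var(mvrnorm(n=500, rep(0, 2), Sigma, empirical = TRUE))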
mvrnorm(500, rep(0, 2), Sigma)
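The effect of the empirical argument is easier to see with a covariance matrix that is strictly positive-definite. The sketch below is illustrative only: the matrix values and the seed are assumptions chosen for the example, and the MASS package is assumed to be installed.

```r
library(MASS)  # provides mvrnorm()

set.seed(42)
# Illustrative positive-definite covariance matrix (an assumption).
Sigma <- matrix(c(10, 3,
                   3, 2), nrow = 2, byrow = TRUE)
mu <- c(0, 0)   # length of mu must match the dimension of Sigma

# empirical = FALSE: the sample covariance only approximates Sigma.
x_pop <- mvrnorm(n = 500, mu = mu, Sigma = Sigma)
var(x_pop)

# empirical = TRUE: the sample mean and covariance equal mu and Sigma
# up to floating-point error.
x_emp <- mvrnorm(n = 500, mu = mu, Sigma = Sigma, empirical = TRUE)
var(x_emp)
colMeans(x_emp)
```

With empirical = FALSE the printed covariance changes from run to run unless a seed is set; with empirical = TRUE it reproduces Sigma regardless of the seed.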
1.) No. These estimates are not valid, since they do not take into account the total number of voters in the survey.
2.)
a.) Cluster sampling is a sampling technique in which the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. In this case, the clusters selected are a few clinics in the Chicago area, as opposed to all the clinics in Chicago. All observations in the selected clinics are included in the sample. This choice may be the result of limited capital to conduct the survey, limited time, and other factors.
First-stage sampling unit (FSU): Chicago
Second-stage sampling unit (SSU): Selected clinics
Once the data from the questionnaire have been compiled, the households that have access to a gun are noted, as are the households that do not. The number of households that have access to a gun is divided by the total number of questionnaires returned to give the proportion of households with access to a gun; multiplying this proportion by 100 expresses it as a percentage.
The standard error of the proportion of children whose household has access to a gun is estimated from how much the proportions in the individual selected clinics vary around the overall proportion.
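The calculation can be sketched in R as follows. The clinic counts are hypothetical, since the survey data are not reproduced here. The standard error uses the ratio estimator for one-stage cluster sampling and ignores the finite population correction, because the total number of clinics in Chicago is not given; this is one common choice and not necessarily the method the question intends.

```r
# Hypothetical questionnaire counts for five sampled clinics.
households <- c(40, 55, 30, 62, 48)   # questionnaires returned per clinic
with_gun   <- c(10, 12,  9, 20, 11)   # households reporting gun access

n       <- length(households)         # number of sampled clinics
p_hat   <- sum(with_gun) / sum(households)
percent <- 100 * p_hat

# Ratio-estimator standard error for one-stage cluster sampling,
# ignoring the finite population correction (assumption).
dev <- with_gun - p_hat * households
se  <- sqrt(sum(dev^2) / (n - 1) / n) / mean(households)

c(proportion = p_hat, percent = percent, se = se)
```

The standard error here depends on how much the clinics differ from one another, not only on the pooled proportion, which is why it is computed from clinic-level deviations.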
b.) The sampling population consists of all parents who attend the selected clinics in Chicago. This sampling procedure does not result in a representative sample of households with children, for the following reasons:
More testing is required.
It is not as accurate as a simple random sample, especially if the sample size is the same.
This is two-stage cluster sampling.
4a.) Cluster sampling is a sampling technique in which the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. All observations in the selected clusters are included in the sample. Jacoby and Handlin chose 26 journals from a list of 1285 scholarly journals. Cluster sampling is typically used when the researcher cannot get a complete list of the individual units they wish to study but can get a complete list of groups, or clusters, of those units. It is also used when a random sample would produce a list of subjects so widely scattered that surveying them would be far too expensive; examining all 1285 scholarly journals is one example.
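Selecting the clusters can be illustrated with a short R sketch. It assumes the 26 journals are drawn as a simple random sample, without replacement, from the numbered list of 1285; the text does not state how Jacoby and Handlin actually made their selection, and the seed is arbitrary.

```r
set.seed(2024)
n_journals <- 1285   # size of the journal list (from the text)
n_sampled  <- 26     # journals chosen (from the text)

# Draw journal indices without replacement; each journal is a cluster,
# and every article in a selected journal would then be examined.
chosen <- sort(sample(n_journals, n_sampled))
chosen

# Sampling fraction at the cluster level.
n_sampled / n_journals
```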
4b.) > Data <- read.table("c:/xyz/jay.csv", sep=",", header=TRUE)
> attach(Data)
> sum(nonprob)
[1] 137
> sum(Data)
[1] 288
> sum(prob)
[1] 3
> sum(numemp)
[1] 148
> mean(nonprob)
[1] 5.269231
4c) Proportion that used non-probability method = 137/288
> sum(nonprob)/sum(Data)
[1] 0.4756944
> sd(nonprob)
[1] 10.09775
From the above results, it seems that experts have confidence in non-probability sampling. This is seen in the number who used the non-probability method (137) being overwhelmingly larger than the number who used the probability method (3).