Comparing Two Proportions

In the Module Overview, you’ll have noticed the textbook assignment for this module. Videos are optional and are there for your reference. Watch the video if you so choose, then complete the practice problems from your textbook.
You’ll notice I’ve assigned only odd-numbered exercises. This is because I want you to be able to check your work as you go along. Please use good judgment and academic integrity when you complete the assignments; don’t merely copy or paraphrase the answers (that will earn you a 0), but use them to guide your own. I recommend you complete the assignment, then check the answers, and then make corrections in a different colored pen, pencil, or font. This way you really have the opportunity to learn the nuances in each problem, and there are nuances. Please make sure to show all your work, step by step, if applicable.
If you have any questions, especially concerning my definition of copying or paraphrasing, please feel free to email me.
Feel free to do the problems by hand and then scan them to upload. I really like Genius Scan. You can also take a picture of your work instead, but I ask that you make sure everything is legible before you upload the picture. You may feel more comfortable typing your answers, which is fine; just be sure that ALL STATISTICAL DISPLAYS are included (no matter the format you choose).
Complete bookwork – pg 618; #21*, 23, 27, 29*, 33, 35*, 37*

Measures of Central Tendency Paper

The mean salary is often used to describe the salaries of employees of a company. However, the median salary may be a better measure of the salaries in comparison to the mean. Research a career you are interested in and calculate the mean and median salaries using at least ten data points. Include the calculations and the data source(s). Which is the better measure of central tendency? Why? Review and respond to the comments posted by your peers and offer your insight on this topic. Do you agree or disagree with their selection? Why or why not?
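As a rough illustration of why the two measures can disagree, here is a minimal Python sketch. The ten salary figures are invented for the example (they are not real data for any career), with one deliberately large value to show how an outlier pulls the mean but not the median.

```python
from statistics import mean, median

# Hypothetical annual salaries (USD) for ten people in one career.
# These values are made up for illustration only.
salaries = [42_000, 45_000, 47_500, 48_000, 50_000,
            51_000, 52_500, 55_000, 58_000, 250_000]

avg = mean(salaries)      # sum of the values divided by their count
mid = median(salaries)    # average of the 5th and 6th sorted values

print(f"mean   = {avg:,.0f}")
print(f"median = {mid:,.0f}")
```

With these made-up numbers the single 250,000 salary raises the mean well above what nine of the ten people earn, while the median stays near the middle of the group.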

This is a good, simple, easy-to-use resource for making box plots: http://www.shodor.org/interactivate/activities/BoxPlot/
This is a good place for an example: http://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Descriptive.htm
If you have an even number of data points, average the two middle elements (with ten data points placed in numerical order, the 5th and 6th) to obtain the median.
The median divides the original data set into equal-sized halves; repeat the procedure with these new, smaller data sets to find the first and third quartiles.

Pay attention to whether these new smaller sets have an even or odd number of elements!
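The median-of-halves procedure described above can be sketched in Python as follows. This is one common convention, assumed here: when the data set has an odd number of elements, the middle value is left out of both halves. Other quartile conventions exist and can give slightly different answers. The function name and the sample values are hypothetical.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the median position
    upper = data[half + n % 2:]     # skips the middle value when n is odd
    return median(lower), median(data), median(upper)

# Ten made-up data points: each half has five elements (odd),
# so each quartile is a single middle value.
ten_points = [42, 45, 47.5, 48, 50, 51, 52.5, 55, 58, 250]
q1, q2, q3 = quartiles_by_halves(ten_points)

# Seven made-up data points: the middle value (4) belongs to neither half.
odd_q1, odd_q2, odd_q3 = quartiles_by_halves([1, 2, 3, 4, 5, 6, 7])
```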
This post seems to focus on the mean and median, but there is another measure of central tendency: the mode.
In what situations, and for what variables, would the mode be the best choice as the measure of central tendency?
This side question has nothing to do with salaries, which are continuous quantitative variables.
(Needs to be at least 150 words)

Tools for Data Analysis

To complete the Assignment, compose a cohesive document that addresses the following:
Create a table outlining practical applications for each tool discussed in “The Seven Quality Tools” (Stauffer, 2013). Include the following within your table:
Strengths: Why that tool works well for those applications
Tips for use
Cautions relevant to the tool
Choose an online example, or an example from your experience, in which the tool was used. Provide a link to your example. Analyze how it was used within the organization. For each tool listed, find an online example where the tool was used properly and provide the link to your example and a brief description of how it was used and your analysis of its effectiveness or whether there was a better tool and why.

Types of Reliability and Validity

Investigate an individual, standardized cognitive or academic assessment like the WISC, WJ, KTEA or WIAT and discuss the concepts listed below that you are able to find in the technical manual of the assessment:

• Test-Retest Reliability
• Interrater Reliability
• Internal Consistency
• Confidence Intervals
• Standard Error of Measurement
• Face Validity
• Construct Validity
• Criterion-Related Validity
• Content Validity
• External Validity

Solved Statistics Questions

Hint: For exercise 1-39, you go to:

Data Analysis

Random Number Generation

Number of Variables: 5

Number of Random Numbers: 8

Parameters:

• 578

Output Range: $A$1

Here, I am choosing 5 columns and 8 rows to get 40 random numbers. You may choose other numbers as long as they multiply to 40.
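For readers working outside Excel, the same layout can be sketched in Python. This is only an illustration: the distribution (uniform integers), the bounds 1 to 100, and the seed are assumptions made for the example, since the hint does not fully specify the dialog's parameters.

```python
import random

ROWS, COLS = 8, 5        # 8 rows x 5 columns = 40 values, as in the hint
LOW, HIGH = 1, 100       # assumed bounds, not taken from the exercise
rng = random.Random(0)   # assumed seed, only so the run is repeatable

# Build the grid row by row, one random integer per cell.
grid = [[rng.randint(LOW, HIGH) for _ in range(COLS)] for _ in range(ROWS)]

for row in grid:
    print(*row, sep="\t")
```

Any other pair of dimensions that multiplies to 40 (for example 10 rows by 4 columns) works the same way.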

Attachment: Sample Exercises:

Chapter 1: Solution to Sample Exercises

Page 5 – Number 1-3

A bar chart is used whenever you want to display data that has already been categorized while a histogram is used to display data over a range of values for the factor under consideration.  Another fundamental difference is that there typically are gaps between the bars on a bar chart but there are no gaps between the bars of a histogram.

Page 5 – Number 1-7 (This is also a homework exercise; do solve it yourself)

The appropriate chart in this case is a histogram where the horizontal axis contains the number of missed days and the area of each bar represents the number of employees who missed that number of days.

Note: Leave no gap between the bars.

Page 18 – Number 1-26

To determine the range of employee numbers for the first part selected in a systematic random sample use the following:

Part range =

Thus, the first employee selected will come from parts 1-180.  Once that employee number is randomly selected, the second employee will be the one numbered 100 higher than the first, and so on.
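The general idea of systematic random sampling can be sketched in Python. The population size (1,000) and sample size (10) below are hypothetical and are not the figures from the exercise; the function name is also invented for illustration.

```python
import random

def systematic_sample(population_size, sample_size, rng=random):
    """Return 1-based item numbers chosen by systematic random sampling."""
    k = population_size // sample_size   # part range (sampling interval)
    start = rng.randint(1, k)            # random start within the first part
    # Every later pick is exactly k positions after the previous one.
    return [start + i * k for i in range(sample_size)]

# Hypothetical example: 1,000 employees, sample of 10, so k = 100.
picked = systematic_sample(1000, 10, random.Random(1))
print(picked)
```

Only the first pick is random; the rest follow from it, which is why the method needs a list with no periodic pattern that lines up with the interval.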

Page 19 – Number 1-40

1. The population should be all users of cross-country ski lots in Colorado.

2. Several sampling techniques could be selected. Be sure that some method of ensuring randomness is discussed. In addition, some students might give greater weight to frequent users of the lots, in which case the population would really be user days rather than individual users.

3. Students using Excel should use Data > Data Analysis > Random Number Generation. Students’ answers will differ since Excel generates different streams of random numbers each time it is used. Since the application requires integer numbers, the Decrease Decimal option should be used.

Page 23 – Number 1-49

1. Cross-sectional
2. Time-series
3. Cross-sectional
4. Cross-sectional
5. Time-series

Solved Statistics Exam Questions

Section A

1. Let $\theta$ be an unknown parameter and $\hat{\theta}$ an estimator of $\theta$ based on data with sample size $n$.

(a) Define the bias of the estimator $\hat{\theta}$.

(b) Define the mean squared error of the estimator $\hat{\theta}$.

(c) Define the standard error of the estimator $\hat{\theta}$.

(d) Define what it means for $\hat{\theta}$ to be a consistent estimator of $\theta$.

2. Based on a sample of data, a hypothesis test is to be performed of the null hypothesis that a parameter $\theta = \theta_0$ versus an alternative hypothesis that $\theta > \theta_0$. Let $T$ denote the test statistic for the test, with larger values of $T$ corresponding to supporting the alternative hypothesis.

(a) Define mathematically the p-value corresponding to the test.

(b) Explain in words what the p-value measures.

(c) Show that when the null hypothesis is true, considering the p-value as a random variable, its cumulative distribution function is that of the continuous uniform distribution on the unit interval.

(d) Hence explain why a test which rejects the null when the p-value is less than $\alpha$ controls the type 1 error rate at level $\alpha$.
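The uniformity claim in this question can be checked informally by simulation. The sketch below is not a proof and not the exam's intended answer; it assumes a specific setting chosen for convenience (a one-sided z-test for a normal mean with known standard deviation), and all names and constants are hypothetical.

```python
import random
from statistics import NormalDist, mean

def one_sided_p_value(sample, mu0, sigma):
    """p-value of a one-sided z-test of H0: mu = mu0 against mu > mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / (sigma / n ** 0.5)   # test statistic T
    return 1.0 - NormalDist().cdf(t)                # P(T >= t) under H0

rng = random.Random(0)
ALPHA, REPS, N = 0.05, 5000, 20

# Simulate data with the null hypothesis true (mu = 0, sigma = 1).
p_values = [
    one_sided_p_value([rng.gauss(0.0, 1.0) for _ in range(N)], 0.0, 1.0)
    for _ in range(REPS)
]

# If p-values are uniform on (0, 1), about ALPHA of them fall below ALPHA.
rejection_rate = sum(p < ALPHA for p in p_values) / REPS
print(f"share of p-values below {ALPHA}: {rejection_rate:.3f}")
```

The printed share should be close to 0.05, which is the simulated counterpart of the statement that rejecting when the p-value is below $\alpha$ gives a type 1 error rate of $\alpha$.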

3. Let $X_1, \dots, X_n$ be independent and identically distributed $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ unknown. A confidence interval $(L, U)$ for $\mu$ is to be constructed based on the data $X_1, \dots, X_n$.

(a) Define what it means for $(L, U)$ to be a 95% confidence interval for $\mu$.

(b) State the definition for a t-distribution on $n - 1$ degrees of freedom.

(c) State the distribution of the sample mean $\bar{X}$, and use this to show that $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$.

(d) State the function of the sample variance $S^2$ that is chi-squared distributed on $n - 1$ degrees of freedom.

(e) Hence derive an expression for a 95% confidence interval for $\mu$.

Section B

1. A student is performing a Monte-Carlo simulation experiment using the software package R to investigate the coverage probability of a confidence interval for a parameter $\theta$. Their program generates $N$ independent datasets using the true value of $\theta$ that they choose, and on each, the confidence interval is calculated. Let $(L_i, U_i)$ denote the confidence interval from the $i$th simulation. Let $\pi$ denote the confidence interval’s true coverage level.

(a) What is the distribution of the number of simulations for which the confidence interval includes the true parameter value $\theta$?

(b) Give an expression for an estimator $\hat{\pi}$ of $\pi$ based on the simulation experiment.

(c) Derive the approximate distribution of $\hat{\pi}$ assuming that the number of simulations $N$ is large.

(d) Assuming $N$ is large, use your answer to part (c) to derive expressions for a symmetric 95% confidence interval for $\pi$.

(e) Assuming that $\pi \approx 0.95$, derive how large a value of $N$ should be used to ensure that the 95% confidence interval for $\pi$ has width 0.05.
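A coverage experiment of this kind might look like the following. The question mentions R; this sketch uses Python instead, and it assumes a concrete setting that the question leaves open: the parameter is a normal mean, the standard deviation is known, and the interval is the usual z-interval. All constants and names are hypothetical.

```python
import random
from statistics import NormalDist, mean

Z = NormalDist().inv_cdf(0.975)   # about 1.96

def covers(sample, sigma, true_mu):
    """True if the 95% z-interval from this sample contains true_mu."""
    half_width = Z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width <= true_mu <= centre + half_width

rng = random.Random(42)
TRUE_MU, SIGMA = 10.0, 2.0        # assumed true parameter values
n, N = 25, 2000                   # sample size per dataset, number of datasets

hits = sum(
    covers([rng.gauss(TRUE_MU, SIGMA) for _ in range(n)], SIGMA, TRUE_MU)
    for _ in range(N)
)

pi_hat = hits / N                              # estimated coverage
se = (pi_hat * (1 - pi_hat) / N) ** 0.5        # large-N standard error
ci = (pi_hat - Z * se, pi_hat + Z * se)        # symmetric 95% interval

print(f"estimated coverage: {pi_hat:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval for the coverage relies on the normal approximation to a binomial proportion, which is reasonable only when $N$ is large.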

2. Let $X_1, \dots, X_n$ be independent and identically distributed continuous random variables with common probability density function $f(x; \theta) = \theta x^{\theta - 1}$ for $0 < x < 1$ and $\theta > 0$ an unknown parameter. It can be shown that

$$E(X_1) = \frac{\theta}{1 + \theta}, \qquad \operatorname{Var}(X_1) = \frac{\theta}{(\theta + 1)^2 (\theta + 2)},$$

$$E(\log(X_1)) = -\frac{1}{\theta}, \qquad \operatorname{Var}(\log(X_1)) = \frac{1}{\theta^2}.$$

(a) Prove that $f(x; \theta)$ is indeed a valid probability density function.

(b) Derive the maximum likelihood estimator of $\theta$ given $X_1, \dots, X_n$.

(c) Prove that the maximum likelihood estimator is consistent for $\theta$.

(d) Derive the form of the critical region of the most powerful test of the null hypothesis that $\theta = \theta_0$ versus the alternative hypothesis that $\theta = \theta_1$, for $\theta_1 > \theta_0$.
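A numerical sanity check can be useful alongside the derivation. The sketch below assumes the estimator obtained by setting the derivative of the log-likelihood $n \log\theta + (\theta - 1)\sum_i \log x_i$ to zero, namely $\hat{\theta} = -n / \sum_i \log x_i$, and checks on simulated data that it lands near a chosen true value. The true value 3.0, the sample size, and the function name are assumptions for illustration.

```python
import math
import random

def theta_mle(xs):
    """Candidate MLE for f(x; theta) = theta * x**(theta - 1) on (0, 1)."""
    return -len(xs) / sum(math.log(x) for x in xs)

rng = random.Random(7)
TRUE_THETA = 3.0   # assumed value, used only to generate test data

# Inverse-CDF sampling: F(x) = x**theta, so X = U**(1/theta).
# 1 - rng.random() lies in (0, 1], which avoids taking log(0).
sample = [(1.0 - rng.random()) ** (1.0 / TRUE_THETA) for _ in range(20_000)]

theta_hat = theta_mle(sample)
print(f"theta_hat = {theta_hat:.3f}")
```

With a large simulated sample the estimate should sit close to the true value, which is what consistency in part (c) predicts.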

3. Let $X_1, \dots, X_n$ be independent and identically distributed random variables each drawn from the binomial distribution with $k > 1$ trials and success probability $0 \le \pi \le 1$. Thus the probability mass function of $X_i$, $i = 1, \dots, n$, is

$$P(X_i = x) = \binom{k}{x} \pi^x (1 - \pi)^{k - x}$$

for $x \in \{0, 1, 2, \dots, k\}$.

(a) Derive an expression for the maximum likelihood estimator of $\pi$ given the data $X_1, \dots, X_n$.

In an 1889 study of the human sex ratio, conducted using hospital records in Germany, the number of boys among 6,115 families, each of which had 12 children, was recorded. The following table shows the distribution of the number of boys from the study.

No. of boys          0     1     2     3     4     5     6     7     8     9    10    11    12
No. of families      3    24   104   286   670  1033  1343  1112   829   478   181    45     7

(b) Assuming that the number of boys in each family is an independent and identically distributed draw from a binomial distribution, calculate the maximum likelihood estimate of the probability $\pi$ that each birth is a boy.

(c) Estimate the standard error of your estimate of $\pi$.

(d) Calculate Pearson’s goodness of fit test statistic using the data.

(e) Use your answer to (d) to judge whether the binomial model fits the data well and what implications your finding has for inference about $\pi$. To help answer the question it may be useful to know the following quantiles for the chi-squared distributions on 10, 11, and 12 degrees of freedom.

$$\chi^2_{10,\,0.05} = 3.94, \qquad \chi^2_{10,\,0.95} = 18.31$$

$$\chi^2_{11,\,0.05} = 4.57, \qquad \chi^2_{11,\,0.95} = 19.68$$

$$\chi^2_{12,\,0.05} = 5.23, \qquad \chi^2_{12,\,0.95} = 21.03$$

(f) Suggest a reason for why you think the binomial model either fits well or does not fit well, according to what you found in part (e).
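The arithmetic for parts (b) to (d) can be organised as in the following Python sketch, using the family counts from the table. It makes several assumptions that should be checked against your own derivation: the estimate is taken as total boys divided by total children, the standard error uses the binomial-model formula, and the Pearson statistic uses all 13 cells without pooling those with small expected counts. Variable names are hypothetical, and the choice of degrees of freedom is left to the reader.

```python
from math import comb, sqrt

K = 12  # children per family
# Number of families with 0, 1, ..., 12 boys (from the table).
families = [3, 24, 104, 286, 670, 1033, 1343, 1112, 829, 478, 181, 45, 7]

n_families = sum(families)
total_boys = sum(x * f for x, f in enumerate(families))
total_children = n_families * K

# Estimate of pi: proportion of all births that are boys.
pi_hat = total_boys / total_children

# Standard error under the binomial model (an assumption of this sketch).
se_hat = sqrt(pi_hat * (1 - pi_hat) / total_children)

# Expected family counts under Binomial(K, pi_hat).
expected = [
    n_families * comb(K, x) * pi_hat ** x * (1 - pi_hat) ** (K - x)
    for x in range(K + 1)
]

# Pearson statistic: sum over cells of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(families, expected))

print(f"pi_hat = {pi_hat:.4f}, se = {se_hat:.5f}, chi2 = {chi2:.1f}")
```

Comparing the printed statistic with the quantiles listed in part (e) is then a matter of deciding how many parameters were estimated and how many cells were used.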

Section C

1. A clinical trial is to be conducted to compare a new treatment with an existing treatment for patients recently infected with human immunodeficiency virus (HIV). The outcome of interest is CD4 count, which is a measure of activity of the immune system. The aim of the new treatment is to increase CD4 count compared to the existing treatment.

In 500 words or less, describe how you would design the trial, what variables you would measure on the patients, and how you would perform the statistical analysis. You should give reasons for the choices of design and analysis that you make.

Descriptive Statistics Data Analysis Plan

Written Assignments Data Set – This is the data set that you will use for all three written
assignments.
Assignment #1_ Descriptive Statistics Data Analysis Plan Instructions – which contains complete
instructions for this assignment.
Assignment #1_ Descriptive Statistics Data Analysis Plan Template – which contains the template
to be completed and turned in for this assignment.
You must include Income and two other SE variables. You will choose those two SE variables
from Marital Status, Age of Head of Household and Family Size. You are required to choose a
qualitative variable. I highly recommend Marital Status. It is the only qualitative variable in our
dataset. It also will make Written Assignment #3 easier if you have this variable for setting up
your two independent groups for a hypothesis test (t test). Then you will choose two Exp
variables from Food, Annual Expenditures, Entertainment, and Education. These are the only
variables we can use over the 8-week course. Be sure to include all subjects (people) for the
variables you use, not just the people who match your scenario.

SPSS Assignment Help. SPSS Term Papers. SPSS Analysis Help. SPSS Graduate Papers. SPSS Data Analysis Help

Index Construction and Use SPSS

Homework 5: Index Construction & Use
This assignment continues a series of labs and homeworks in which you utilize statistical skills for basic
research. For this assignment, you will again manipulate variables and construct a basic index, as you have
done in several earlier assignments. However, for this assignment, you will take an additional step, using the
index that you create for simple bivariate descriptions of the sample. The index will be the “dependent variable”:
Specifically, you will use ordinal measures to compare support for possible explanations of variation in the index.
Instructions
You will be using the data file hw5.sav to examine variation in respondents’ satisfaction with four areas of their
lives (family, friends, finance, and job). You will then create a summary measure of overall satisfaction, and will
explore how (and whether) that summary measure varies in two ways: across educational levels and with
frequency of sexual activity. Finally, you will briefly explore interactions among these possible influences on
satisfaction. (Note that most of the recoding has been done for you – this is not always the case.)
Requirements & Questions
You must submit your output file (complete but cleaned) and typed answers to these questions. Typed. Probably
with a computer, maybe with some other device, possibly a typewriter. But not a pen, pencil, or crayon. Typed.
1. Univariate analyses of component and independent variables:
• Perform a univariate analysis of SATFAM, SATFIN, SATJOB, and SATFRND – For each, you should
look at and briefly summarize the frequency distribution, as well as basic summary statistics for central
tendency and dispersion. Go beyond just reporting the data and say something interesting (here and
below). For example, about which issues are the respondents the most/least happy?
• Look briefly at the distributions of EDUC and SEXFREQ. (Note, in particular, the percent of the sample
who refused to answer or otherwise did not have an answer for SEXFREQ.)
2. Construct and assess index:
• Construct an index (including variable labels and value labels, at least for the extremes), called
HAPPY, as the summation of values for the four components listed above.
• Perform a univariate analysis of HAPPY – look at and briefly summarize the frequency distribution, as
well as basic summary statistics for central tendency and dispersion.
• What is this variable conceptually? What does it measure, and what does it mean? What does it tell us
that the individual components do not?
• Interpret the “alpha” for your index – is the index reliable? is it a good one? why or why not?
3. Bivariate analyses – what makes people happy?
• Using correlations and chi-square, what can you say about the relationship between educational
attainment and overall satisfaction (i.e. between HAPPY and EDUC)? (You will need to request a
crosstab to get chi-square, but ignore the table itself, for now.) Is it strong? Statistically significant?
• Using correlations and chi-square, what can you say about the relationship between frequency of
sexual activity and overall satisfaction (i.e. between HAPPY and SEXFREQ)? (You will need to request
a crosstab to get chi-square, but ignore the table itself, for now.) Is it strong? Statistically significant?
4. Discussion/conclusions
• What can you infer from these findings about what makes people happy? (Hint: Did either of the two
independent variables (EDUC and SEXFREQ) have a statistically significant effect on the dependent
variable (HAPPY)?)
• Bonus: Put that at a conceptual level, thinking about what broader concepts these variables might
operationalize. Of what larger concept might education be a specific instance, indicator, or aspect?
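For readers who want to see what the index construction and the “alpha” in step 2 compute, here is a minimal Python sketch. The assignment itself is done in SPSS with hw5.sav; this example does not read that file. The six respondents and their scores are invented, and the scale on which the items are coded is an assumption. Only the variable names SATFAM, SATFRND, SATFIN, SATJOB, and HAPPY come from the assignment.

```python
from statistics import variance

# Hypothetical scores for six respondents on the four components.
satfam  = [6, 5, 3, 7, 2, 4]
satfrnd = [5, 5, 2, 6, 3, 4]
satfin  = [4, 6, 3, 5, 1, 4]
satjob  = [6, 4, 2, 7, 2, 5]
items = [satfam, satfrnd, satfin, satjob]

# HAPPY: the index, formed as the sum of the four components per respondent.
happy = [sum(scores) for scores in zip(*items)]

def cronbach_alpha(columns):
    """Cronbach's alpha from a list of item columns (one list per item)."""
    k = len(columns)
    totals = [sum(scores) for scores in zip(*columns)]
    item_variance_sum = sum(variance(col) for col in columns)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

alpha = cronbach_alpha(items)
print(f"HAPPY = {happy}")
print(f"alpha = {alpha:.2f}")
```

Alpha rises when the components move together across respondents, which is why it is read as a rough indicator of whether summing them into one index is reasonable.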