Sampling

From Practical Statistics for Educators

Sampling refers to drawing a sample (a subset) from a population (the full set).

• The usual goal in sampling is to produce a representative sample (i.e., a sample that is similar to the population on all characteristics, except that it includes fewer people because it is a sample rather than the complete population).

• Metaphorically, a perfect representative sample would be a "mirror image" of the population from which it was selected (again, except that it would include fewer people).

contributed by Karen Burke, EdD


Terminology Used in Sampling

Here are some important terms used in sampling:

• A sample is a set of elements taken from a larger population.

• The population is the full set of elements (people, or whatever else you are sampling); the sample is a subset of that population.

• A statistic is a numerical characteristic of a sample, but a parameter is a numerical characteristic of a population.

• Sampling error refers to the difference between the value of a sample statistic, such as the sample mean, and the true value of the population parameter, such as the population mean. Note: some error is always present in sampling. With random sampling methods, the error is random rather than systematic.

• The response rate is the percentage of people in the sample selected for the study who actually participate in the study.

• A sampling frame is just a list of all the people that are in the population. Here is an example of a sampling frame (a list of all the names in my population, and they are numbered). Note that the following sampling frame also has information on age and gender included in case you want to draw some samples and do some calculations.
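As a rough illustration of the terms above, here is a small Python sketch. The frame is made up for this example (200 numbered people with randomly generated ages), and the assumption that 15 of the 20 selected people participate is also hypothetical.

```python
import random
from statistics import mean

# Hypothetical sampling frame: a numbered list of people with an age recorded.
rng = random.Random(0)
frame = [{"id": i, "age": rng.randint(18, 65)} for i in range(1, 201)]

parameter = mean(p["age"] for p in frame)      # population mean (a parameter)
selected = rng.sample(frame, 20)               # people selected for the study
statistic = mean(p["age"] for p in selected)   # sample mean (a statistic)
sampling_error = statistic - parameter         # statistic minus parameter

# Assumption for illustration: only 15 of the 20 selected people take part.
participants = selected[:15]
response_rate = 100 * len(participants) / len(selected)

print(f"parameter={parameter:.2f} statistic={statistic:.2f}")
print(f"sampling error={sampling_error:.2f} response rate={response_rate:.0f}%")
```

If you change the seed, the sample mean changes and so does the sampling error, which is what "random rather than systematic" error looks like in practice.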



Random Sampling Techniques

The two major types of sampling in quantitative research are random sampling and nonrandom sampling.

• The former produces representative samples.

• The latter does not produce representative samples.


Simple Random Sampling

The first type of random sampling is called simple random sampling.

• It's the most basic type of random sampling.

• It is an equal probability sampling method (abbreviated EPSEM).

• Remember that EPSEM means "everyone in the sampling frame has an equal chance of being in the final sample."

• You should understand that using an EPSEM is important because that is what produces "representative" samples (i.e., samples that represent the populations from which they were selected)!


You will see below that simple random sampling is not the only equal probability sampling method (EPSEM). It is the most basic and best known, however.

Sampling experts recommend random sampling "without replacement" rather than random sampling "with replacement" because the former is a little more efficient in producing representative samples (i.e., it requires slightly fewer people and is therefore a little cheaper).


"How do you draw a simple random sample?"

• One way is to put all the names from your population into a hat and then select a subset (e.g., pull out 100 names from the hat) or use a table of random numbers.

• These days, researchers often use computer programs to randomly select their samples.

• To use a computer program (called a random number generator) you must make sure that you give each of the people in your population a number. Then the program will give you a list of randomly selected numbers within the range you give it. After getting the random numbers, you identify the people with those randomly selected numbers and try to get them to participate in your research study!

• If you decide to use a table of random numbers, here’s what you need to do. First, pick a place to start, and then move in one direction (e.g., move down the columns). Use the number of digits in the table that is appropriate for your population size (e.g., if there are 2500 people in the population then use 4 digits). Skip any number that falls outside your numbering (e.g., 0000 or anything above 2500). Also, if you get the same number twice, just ignore it and move on to the next number. Once you have the set of randomly selected numbers, find out who those people are and try to get them to participate in your research study.
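The random number generator approach can be sketched in a few lines of Python. This is only a minimal sketch: the function name and the population size of 2500 are chosen for illustration, and it assumes the people in your frame are numbered 1 through N.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw ID numbers without replacement from 1..population_size."""
    rng = random.Random(seed)
    # random.sample never returns the same element twice,
    # so this is sampling "without replacement".
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Example: select 100 of 2500 numbered people.
ids = simple_random_sample(2500, 100, seed=42)
print(ids[:10])
```

You would then look up the people who carry those numbers in your sampling frame.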


Systematic Sampling

Systematic sampling is the second type of random sampling.

• It is an equal probability sampling method (EPSEM).

• Remember simple random sampling was also an EPSEM.


Systematic sampling involves three steps:

• First, determine the sampling interval, which is symbolized by "k" (the population size divided by the desired sample size).

• Second, randomly select a number between 1 and k, and include that person in your sample.

• Third, also include each kth element in your sample. For example if k is 10 and your randomly selected number between 1 and 10 was 5, then you will select persons 5, 15, 25, 35, 45, etc.

• When you get to the end of your sampling frame you will have all the people to be included in your sample.

• One potential (but rarely occurring) problem is called periodicity (i.e., there is a cyclical pattern in the sampling frame). It could occur when you attach several ordered lists to one another (e.g., if you took lists from multiple teachers who had all ordered their lists on some variable such as IQ). On the other hand, stratification within one overall list is not a problem at all (e.g., if you have one list and have it ordered by gender, or by IQ). Basically, if you are attaching multiple lists to one another, there could be a problem. It would be better to reorganize the lists into one overall list (i.e., sampling frame).
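The three steps of systematic sampling might look like this in Python. It is a sketch under a few assumptions: the function name is made up, the frame is a single ordered list, and k is rounded down when the population size is not an exact multiple of the sample size (in which case you may end up with slightly more people than you asked for).

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every kth element of frame after a random start between 1 and k."""
    k = len(frame) // sample_size          # step 1: sampling interval
    if k < 1:
        raise ValueError("sample_size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randint(1, k)              # step 2: random start, 1..k inclusive
    # Step 3: take positions start, start + k, start + 2k, ... (1-based).
    return [frame[pos - 1] for pos in range(start, len(frame) + 1, k)]

# Example: 100 numbered people, desired sample of 10, so k = 10.
people = list(range(1, 101))
chosen = systematic_sample(people, 10, seed=1)
print(chosen)
```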


Stratified Random Sampling

The third type of random sampling is called stratified random sampling.

• First, stratify your sampling frame (e.g., divide it into the males and the females if you are using gender as your stratification variable).

• Second, take a random sample from each group (i.e., take a random sample of males and a random sample of females). Put these two sets of people together and you now have your final sample. (Note that you could also take a systematic sample from the joined lists if that’s easier.)


There are actually two different types of stratified sampling.

The first type of stratified sampling, and most common, is called proportional stratified sampling.

• In proportional stratified sampling you must make sure the subsamples (e.g., the samples of males and females) are proportional to their sizes in the population.

• Note that proportional stratified sampling is an equal probability sampling method (i.e., it is EPSEM), which is good!


The second type of stratified sampling is called disproportional stratified sampling.

• In disproportional stratified sampling, the subsamples are not proportional to their sizes in the population.

Here is an example showing the difference between proportional and disproportional stratified sampling:

• Assume that your population is 75% female and 25% male. Assume also that you want a sample of size 100 and you want to stratify on the variable called gender.

• For proportional stratified sampling, you would randomly select 75 females and 25 males from the population.

• For disproportional stratified sampling, you might randomly select 50 females and 50 males from the population.
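Here is the same example as a Python sketch. The population of 750 females and 250 males is hypothetical (it simply matches the 75%/25% split), and the function names are made up. Note that rounding in the proportional allocation can leave the total one or two off the target when the percentages do not divide evenly.

```python
import random

def proportional_allocation(strata, sample_size):
    """Give each stratum a share of the sample equal to its share of the population."""
    total = sum(len(members) for members in strata.values())
    return {name: round(sample_size * len(members) / total)
            for name, members in strata.items()}

def stratified_sample(strata, allocation, seed=None):
    """Take a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, allocation[name]))
    return sample

# Hypothetical population: 75% female, 25% male.
strata = {
    "female": [f"F{i}" for i in range(750)],
    "male": [f"M{i}" for i in range(250)],
}

proportional = proportional_allocation(strata, 100)
disproportional = {"female": 50, "male": 50}

prop_sample = stratified_sample(strata, proportional, seed=0)
disp_sample = stratified_sample(strata, disproportional, seed=0)
print(proportional, len(prop_sample), len(disp_sample))
```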


Cluster Random Sampling

In this type of sampling you randomly select clusters, rather than individual units, in the first stage of sampling.

A cluster has more than one unit in it (e.g., a school, a classroom, a team).


The first type of cluster sampling is called one-stage cluster sampling.

• To select a one-stage cluster sample, you first select a random sample of clusters.

• Then you include in your final sample all of the individual units that are in the selected clusters.


The second type of cluster sampling is called two-stage cluster sampling.

• In the first stage you take a random sample of clusters (i.e., just like you did in one-stage cluster sampling).

• In the second stage, you take a random sample of elements from each of the clusters you selected in stage one (e.g., in stage two you might randomly select 10 students from each of the 15 classrooms you selected in stage one).
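One-stage and two-stage cluster sampling can be sketched as follows. The 40 classrooms of 25 students each are invented for the example, as are the function names; the numbers 15 and 10 come from the example in the text.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select clusters and keep every unit in the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly select clusters, then randomly select units within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)           # stage one
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))  # stage two
    return sample

# Hypothetical frame: 40 classrooms with 25 students each.
classrooms = {
    f"room{c}": [f"room{c}-student{s}" for s in range(25)] for c in range(40)
}

one_stage = one_stage_cluster_sample(classrooms, 15, seed=3)
two_stage = two_stage_cluster_sample(classrooms, 15, 10, seed=3)
print(len(one_stage), len(two_stage))
```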


Important points about cluster sampling:

• Cluster sampling is an equal probability sampling method (EPSEM) ONLY if the clusters are approximately the same size. (Remember that EPSEM is very important because that is what produces representative samples.)

• When clusters are not the same size, you must fix the problem by using the technique called "probability proportional to size" (PPS) for selecting your clusters in stage one. This will make your cluster sampling an equal probability sampling method (EPSEM), and it will, therefore, produce representative samples.
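To see why PPS restores equal probabilities, it can help to work through the arithmetic. The sketch below assumes a common form of PPS that the text does not spell out: in stage one a cluster's chance of selection is proportional to its size, and in stage two the same number of people is taken from every selected cluster. The cluster sizes and the function name are made up.

```python
from fractions import Fraction

def pps_inclusion_probabilities(cluster_sizes, n_clusters, n_per_cluster):
    """Overall chance that a person in each cluster ends up in the sample.

    Assumes n_clusters * size <= total and n_per_cluster <= size for
    every cluster, so that each value below is a valid probability.
    """
    total = sum(cluster_sizes.values())
    probs = {}
    for name, size in cluster_sizes.items():
        p_cluster = Fraction(n_clusters * size, total)  # stage one: proportional to size
        p_within = Fraction(n_per_cluster, size)        # stage two: same number per cluster
        probs[name] = p_cluster * p_within              # cluster size cancels out
    return probs

# Hypothetical clusters of unequal size (200 people in all).
sizes = {"A": 20, "B": 40, "C": 60, "D": 80}
probs = pps_inclusion_probabilities(sizes, n_clusters=2, n_per_cluster=10)
print(probs)
```

Under these assumptions, big clusters are more likely to be picked, but each person inside a big cluster is less likely to be picked in stage two, and the two effects cancel so that everyone has the same overall chance.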



Nonrandom Sampling Techniques

The other major type of sampling used in quantitative research is nonrandom sampling (i.e., when you do not use one of the random sampling techniques). There are four main types of nonrandom sampling:

• The first type of nonrandom sampling is called convenience sampling (i.e., it simply involves using the people who are the most available or the most easily selected to be in your research study).

• The second type of nonrandom sampling is called quota sampling (i.e., it involves setting quotas and then using convenience sampling to obtain those quotas). A set of quotas might be given to you as follows: find 25 African American males, 25 European American males, 25 African American females, and 25 European American females. You use convenience sampling to actually find the people, but you must make sure you have the right number of people for each quota.

• The third type of nonrandom sampling is called purposive sampling (i.e., the researcher specifies the characteristics of the population of interest and then locates individuals who match those characteristics). For example, you might decide that you want to only include "boys who are in the 7th grade and have been diagnosed with ADHD" in your research study. You would then try to find 50 students who meet your "inclusion criteria" and include them in your research study.

• The fourth type of nonrandom sampling is called snowball sampling (i.e., each research participant is asked to identify other potential research participants who have a certain characteristic). You start with one or a few participants, ask them for more, find those, ask them for some, and continue until you have a sufficient sample size.



Random Selection and Random Assignment

In random selection (using an equal probability selection method), you select a sample from a population using one of the random sampling techniques discussed earlier.

• The resulting random sample will be like a "mirror image" of the population, except for chance differences.

• For example, if you randomly select (e.g., using simple random sampling) 1000 people from the adult population in Ann Arbor, Michigan, the sample will look like the adult population of Ann Arbor.


In random assignment, you start with a set of people (you already have a sample, which very well may be a convenience sample), and then you randomly divide that set of people into two or more groups (i.e., you take the full set and randomly divide it into subsets).

• You are taking a set of people and “assigning” them to two or more groups.

• The groups or subsets will be "mirror images" of each other (except for chance differences).

• For example, if you start with a convenience sample of 100 people and randomly assign them to two groups of 50 people, the two groups will be "equivalent" on all known and unknown variables.

• Random assignment generates similar groups, and it is used in the strongest of the experimental research designs.
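Random assignment is easy to sketch in Python: shuffle the people you already have, then deal them out into groups. The function name and the list of 100 participant labels are hypothetical.

```python
import random

def randomly_assign(people, n_groups=2, seed=None):
    """Shuffle the participants and deal them into n_groups groups."""
    rng = random.Random(seed)
    shuffled = list(people)      # copy so the original list is left untouched
    rng.shuffle(shuffled)
    # Deal round-robin: person 0 to group 0, person 1 to group 1, and so on.
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Example: a convenience sample of 100 people split into two groups of 50.
participants = [f"P{i}" for i in range(1, 101)]
treatment, control = randomly_assign(participants, n_groups=2, seed=7)
print(len(treatment), len(control))
```

Notice that nothing here selects people from a population; the code only divides a set you already have, which is the difference between random assignment and random selection.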



Determining the Sample Size When Random Sampling is Used

Would you like to know the answer to the question "How big should my sample be?"

I will start with my four "simple" answers to your question:

• Try to get as big a sample as you can for your study (i.e., because the bigger the sample the better).

• If your population is size 100 or less, then include the whole population rather than taking a sample (i.e., don't take a sample; include the whole population).

• Look at other studies in the research literature and see how many they are selecting.

• For an exact number, just look at Figure 7.5, which shows recommended sample sizes.


[Figure 7.5: recommended sample sizes (image: Samplesize.jpg)]


Final Thoughts

I want to make a few more points about sample size in this chapter. In particular, note that you will need larger samples under these circumstances:

• When the population is very heterogeneous.

• When you want to break down the data into multiple categories.

• When you want a relatively narrow confidence interval (e.g., note that the estimate that 75% of teachers support a policy plus or minus 4% is narrower than the estimate of 75% plus or minus 5%).

• When you expect a weak relationship or a small effect.

• When you use a less efficient technique of random sampling (e.g., cluster sampling is less efficient than proportional stratified sampling).

• When you expect to have a low response rate. The response rate is the percentage of people in your sample who agree to be in your study.
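Two of these points can be made concrete with a little arithmetic. The sketch below is not the method behind Figure 7.5; it assumes the standard large-sample formula for estimating a proportion, n = z² · p(1 − p) / e², with z = 1.96 for 95% confidence, a simple random sample, and a population large enough that no finite population correction is needed. The function names and the 60% response rate are made up for illustration.

```python
import math

def sample_size_for_proportion(p, margin, z=1.96):
    """Approximate n needed to estimate a proportion p within +/- margin.

    Assumes simple random sampling from a large population and a
    normal approximation (z = 1.96 corresponds to 95% confidence).
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def people_to_contact(needed, response_rate):
    """Inflate the required sample size for an expected response rate (0-1)."""
    return math.ceil(needed / response_rate)

n_4 = sample_size_for_proportion(0.75, 0.04)   # 75% plus or minus 4%
n_5 = sample_size_for_proportion(0.75, 0.05)   # 75% plus or minus 5%
contact = people_to_contact(n_4, 0.60)         # hypothetical 60% response rate
print(n_4, n_5, contact)
```

Under these assumptions the plus-or-minus 4% estimate needs about 451 participants versus about 289 for plus or minus 5%, and with a 60% response rate you would have to select roughly 752 people to end up with 451.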
