Ann Indian Acad Neurol, v.14(4); Oct-Dec 2011

## Design, data analysis and sampling techniques for clinical research

Karthik Suresh
Department of Pulmonary and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, USA

Sanjeev V. Thomas
Department of Neurology, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Trivandrum, India

Geetha Suresh
Department of Justice Administration, University of Louisville, Louisville, USA

Statistical analysis is an essential technique that enables a medical researcher to draw meaningful inferences from data. Improper study design or data analysis may yield insufficient or misleading results and conclusions. Converting a medical problem into a statistical hypothesis with an appropriate methodological and logical design, and then back-translating the statistical results into relevant medical knowledge, is a real challenge. This article explains various sampling methods that can be appropriately used in medical research under different scenarios and challenges.

## Problem Identification

Clinical research often starts from questions raised at the bedside in hospital wards. Is there an association between neurocysticercosis (NCC) and epilepsy? Are magnetic resonance imaging changes good predictors of multiple sclerosis? Is there a benefit in using steroids in pyogenic meningitis? Typically, these questions lead us to set up more refined research questions. For example, do persons with epilepsy have a higher probability of having serological (or computed tomography [CT] scan) markers for NCC? What proportion of persons with multiple lesions in the brain has multiple sclerosis (MS)? Do children with pyogenic meningitis have a lesser risk of mortality if dexamethasone is used concomitantly with antibiotics?

Designing a clinical study involves narrowing a topic of interest into a single focused research question, with particular attention paid to the methods used to answer the research question from a cost, viability and overall effectiveness standpoint. In this paper, we focus attention on residents and younger faculty who are planning short-term research projects that could be completed in 2–3 years. Once we have a fairly well-defined research question, we need to consider the best strategy to address these questions. Further considerations in clinical research, such as the clinical setting, study design, selection criteria, data collection and analysis, are influenced by the disease characteristics, prevalence, time availability, expertise, research grants and several other factors. In the example of NCC, should we use serological markers or CT scan findings as evidence of NCC? Such a question raises further questions. How good are serologic markers compared with CT scans in terms of identifying NCC? Which test (CT or blood test) is easier, safer and acceptable for this study? Do we have the expertise to carry out these laboratory tests and make interpretations? Which procedure is going to be more expensive? It is very important that the researcher spend adequate time considering all these aspects of his study and engage in discussion with biostatisticians before actually starting the study.

The major objective of this article is to explain these initial steps. We do not intend to provide a tailor-made design. Our aim is to familiarize the reader with different sampling methods that can be appropriately used in medical research with different scenarios and challenges.

One of the first steps in clinical study is choosing an appropriate setting to conduct the study (i.e., hospital, population-based). Some diseases, such as migraine, may have a different profile when evaluated in the population than when evaluated in the hospital. On the other hand, acute diseases such as meningitis would have a similar profile in the hospital and in the community. The observations in a study may or may not be generalizable, depending on how closely the sample represents the population at large.

Consider the following studies. Both de Gans et al.[ 1 ] and Scarborough et al.[ 2 ] looked at the effect of adjunctive dexamethasone in bacterial meningitis. Both studies are good examples of using the hospital setting. Because the studies involved an acute condition, they could rely on the fact that sicker patients seek hospital care, which concentrated their ability to find patients with meningitis. By the same logic, it would be inappropriate to study less-acute conditions in such a fashion, as doing so would bias the study toward sicker patients.

On the other hand, consider the study by Holroyd et al .[ 3 ] investigating therapies in the treatment of migraine. Here, the authors intentionally chose an outpatient setting (the patients were referred to the study clinic from a network of other physician clinics as well as local advertisements) so that their population would not include patients with more severe pathology (requiring hospital admission).

If the sample was restricted to a particular age group, sex, socioeconomic background or stage of the disease, the results would be applicable to that particular group only. Hence, it is important to decide how you select your sample. After choosing an appropriate setting, attention must be turned to the inclusion and exclusion criteria, which are often locale specific. If we compare the exclusion criteria of the two meningitis studies mentioned above, we see that in the study by de Gans,[ 1 ] patients with shunts, prior neurosurgery and active tuberculosis were specifically excluded; in the Scarborough study, however, such considerations did not apply as the locale was considerably different (sub-Saharan Africa vs. Europe).

## Validity (Precision) and Reliability (Consistency)

Clinical research generally requires making use of an existing test or instrument. These instruments and investigations have usually been well validated in the past, although the populations in which such validations were conducted may be different. Many such questionnaires and patient self-rating scales (the MMSE or QOLIE, for instance) were developed in another part of the world. Therefore, in order to use these tests in clinical studies locally, they require validation. Socio-demographic characteristics and language differences often influence such tests considerably. For example, consider a scale that uses the ability to drive a motor car as a quality-of-life measure. Does this measure have the same relevance in India, where only a small minority of people drive their own vehicles, as it does in the USA? Hence, it is very important to ensure that the instruments we use have good validity.

Validity is the degree to which the investigative goals are measured accurately, that is, the degree to which the research truly measures what it intended to measure.[ 4 ] Peace, Parrillo and Hardy[ 5 ] explain that the validity of the entire research process must be critically analyzed to the greatest extent possible so that appropriate conclusions can be drawn and recommendations for the development of sound health policy and practice can be offered.

Another measurement issue is reliability. Reliability (consistency) is the extent to which a measuring technique provides the same results when the measurement is repeated under the same conditions, with the same or different subjects. The validity (accuracy) of a measuring instrument is high if it measures exactly what it is supposed to measure. Together, validity and reliability determine the accuracy of the measurement, which is essential for drawing valid statistical inferences in medical research.

Consider the following scenario. Kasner et al.[ 6 ] established the reliability and validity of a new National Institutes of Health Stroke Scale (NIHSS) generation method. This paper provides a good example of how to test a new instrument (NIHSS generation via retrospective chart review) with regard to its reliability and validity. To test reliability, the investigators had multiple physicians review the same set of charts and compared the variability of the scores calculated by these physicians. To test validity, the investigators compared the new test (NIHSS calculated by chart review) with the old test (NIHSS calculated at the bedside at the time of diagnosis). They reported that, overall, 88% of the estimated scores deviated by less than five points from the actual scores at both admission and discharge.

A major purpose of doing research is to infer or generalize research objectives from a sample to a larger population. The process of inference is accomplished by using statistical methods based on probability theory. A sample is a subset selected from the population that should be an unbiased representative of the larger population. Studies that use samples are less expensive, and studying the entire population is sometimes impossible. Thus, the goal of sampling is to ensure that the sample group is a true representative of the population, without errors. The term error includes sampling and nonsampling errors. Sampling errors, which are induced by the sampling design, include selection bias and random sampling error. Nonsampling errors are induced by data collection and processing problems, and include measurement, processing and data collection errors.

## Methods of sampling

To ensure reliable and valid inferences from a sample, probability sampling techniques are used to obtain unbiased results. The four most commonly used probability sampling methods in medicine are simple random sampling, systematic sampling, stratified sampling and cluster sampling.

In simple random sampling, every subject has an equal chance of being selected for the study. The most recommended way to select a simple random sample is to use a table of random numbers or a computer-generated list of random numbers. Consider the study by Kamal et al .[ 7 ] that aimed to assess the burden of stroke and transient ischemic attack in Pakistan. In this study, the investigators used a household list from census data and picked a random set of households from this list. They subsequently interviewed the members of the randomly chosen households and used this data to estimate cerebrovascular disease prevalence in a particular region of Pakistan. Prevalence studies such as this are often conducted by using random sampling to generate a sampling frame from preexisting lists (such as census lists, hospital discharge lists, etc.).
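As a minimal sketch of computer-generated simple random sampling (the household IDs below are hypothetical, not the actual census frame used in the Kamal study):

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from a sampling frame; every unit has an equal
    chance of selection, without replacement."""
    return random.Random(seed).sample(frame, n)

# Hypothetical sampling frame: household IDs from a census list.
households = [f"HH-{i:05d}" for i in range(1, 10001)]
sample = simple_random_sample(households, 250, seed=42)
```

Seeding the generator makes the selection reproducible, which is useful for documenting how the sampling frame was drawn.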

A systematic random sample is one in which every *k*th item is selected; *k* is determined by dividing the number of items in the sampling frame by the desired sample size.
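A hedged sketch of systematic sampling, using a hypothetical frame of 1,000 patient records:

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th unit after a random start, where
    k = len(frame) // n is the sampling interval."""
    k = len(frame) // n
    start = random.Random(seed).randrange(k)   # random start in [0, k)
    return frame[start::k][:n]

patients = list(range(1, 1001))                # hypothetical frame of 1,000 records
sample = systematic_sample(patients, 100, seed=7)   # interval k = 10
```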

A stratified random sample is one in which the population is first divided into relevant strata or subgroups and then, using the simple random sample method, a sample is drawn from each stratum. Deng et al.[ 8 ] studied IV tissue plasminogen activator (tPA) usage in acute stroke among hospitals in Michigan. In order to enroll patients across a wide array of hospitals, they employed stratified random sampling to construct the list of hospitals: they stratified hospitals by number of stroke discharges and then randomly picked an equal number of hospitals within each stratum. Stratified random sampling such as this can be used to ensure that the sample adequately reflects the nature of current practice (such as practice and management trends across the range of hospital patient volumes, for instance).
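A sketch of stratified random sampling loosely modeled on the Michigan example (the hospital names and the discharge cutoff of 100 are hypothetical):

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, seed=None):
    """Draw an equal-size simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)        # group units into strata
    return {s: rng.sample(members, per_stratum) for s, members in strata.items()}

# Hypothetical hospitals stratified by annual stroke-discharge volume.
hospitals = ([{"name": f"H{i}", "discharges": 40} for i in range(20)]
             + [{"name": f"H{i}", "discharges": 300} for i in range(20, 60)])
by_volume = lambda h: "high" if h["discharges"] >= 100 else "low"
picked = stratified_sample(hospitals, by_volume, per_stratum=5, seed=3)
```

Sampling an equal number per stratum deliberately over-represents the smaller stratum, which is the point: low-volume hospitals are guaranteed a voice in the sample.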

A cluster sample results from a two-stage process in which the population is divided into clusters, and a subset of the clusters is randomly selected. Clusters are commonly based on geographic areas or districts and, therefore, this approach is used more often in epidemiologic research than in clinical studies.[ 9 ]
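A two-stage cluster sample can be sketched as follows (residents and districts are hypothetical; in this simple variant the second stage takes every unit in each selected cluster):

```python
import random

def cluster_sample(population, cluster_of, n_clusters, seed=None):
    """Stage 1: randomly select clusters.  Stage 2: take every unit
    belonging to the selected clusters."""
    rng = random.Random(seed)
    clusters = sorted({cluster_of(u) for u in population})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in population if cluster_of(u) in chosen]

# Hypothetical residents labeled by district (8 districts of 50 residents).
residents = [(f"R{i}", f"district-{i % 8}") for i in range(400)]
sampled = cluster_sample(residents, cluster_of=lambda r: r[1],
                         n_clusters=2, seed=5)
```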

## Random samples and randomization

Random samples and randomization (aka, random assignment) are two different concepts. Although both involve the use of the probability sampling method, random sampling determines who will be included in the sample. Randomization, or random assignment, determines who will be in the treatment or control group. Random sampling is related to sampling and external validity (generalizability), whereas random assignment is related to design and internal validity.

In experimental studies such as randomized controlled trials, subjects are first selected for inclusion in the study on the basis of appropriate criteria; they are then assigned to different treatment modalities using random assignment. Because randomization balances all potential confounding variables (outside factors that could influence the variables under study) across groups, randomized controlled trials are considered the most reliable and impartial method of determining the impact of an experiment. Any differences in the outcome of the study are then more likely to be the result of differences in the treatments under consideration than of pre-existing differences between the groups.
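Random assignment, as distinct from random sampling, can be sketched like this (the subject IDs are hypothetical; the 420-subject count is purely illustrative):

```python
import random

def random_assignment(subjects, arms=("treatment", "placebo"), seed=None):
    """Randomly assign already-enrolled subjects to study arms
    in equal numbers."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)                      # randomize the order
    per_arm = len(shuffled) // len(arms)
    return {arm: shuffled[i * per_arm:(i + 1) * per_arm]
            for i, arm in enumerate(arms)}

# Hypothetical 420-subject trial (subjects already met inclusion criteria).
enrolled = [f"subject-{i}" for i in range(1, 421)]
groups = random_assignment(enrolled, seed=11)
```

Note the division of labor: random sampling would decide which subjects enter `enrolled` in the first place; random assignment only decides which arm each enrolled subject joins.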

Scarborough et al.,[ 2 ] in a trial published in the New England Journal of Medicine, looked at corticosteroid therapy for bacterial meningitis in sub-Saharan Africa to see whether the benefits seen with early corticosteroid administration in bacterial meningitis in the developed world also apply to the developing world. Interestingly, they found that adjuvant dexamethasone therapy did not improve outcomes in meningitis cases in sub-Saharan Africa. In this study, they performed random assignment of therapy (dexamethasone vs. placebo). It is useful to note that the process of random assignment usually involves multiple sub-steps, each designed to eliminate confounders. For instance, in the above-mentioned study, both steroid and placebo were packaged similarly, in opaque envelopes, and given to patients (who consented to enroll) in a randomized fashion, and care was taken to ensure that those administering the therapy were blinded to whether steroid or placebo was being given. These measures ensure the double-blind nature of the trial.

## Sample size

The most important question that a researcher should ask when planning a study is “How large a sample do I need?” If the sample size is too small, even a well-conducted study may fail to answer its research question, may fail to detect important effects or associations, or may estimate those effects or associations too imprecisely. Similarly, if the sample size is too large, the study will be more difficult and costly, and may even lead to a loss in accuracy. Hence, optimum sample size is an essential component of any research. Careful consideration of sample size and power analysis during the planning and design stages of clinical research is crucial.

Statistical power is the probability that an empirical test will detect a relationship when a relationship in fact exists. In other words, statistical power reflects the generalizability of the study results and the study's inferential power to explain population variability. Sample size is directly related to power; ceteris paribus, the bigger the sample, the higher the statistical power.[ 10 ] Low statistical power does not necessarily mean that undetected relationships exist, but it does indicate that the research is unlikely to find such links if they exist.[ 10 ] A flow chart relating the research question, sampling, research design and data analysis is shown in Figure 1.

Figure 1. Overall framework of research design

The power of a study tells us how confidently we can exclude an association between two parameters. For example, regarding the prior research question of the association between NCC and epilepsy, a negative result might lead one to conclude that there is no association between NCC and epilepsy. However, the study might not have been sufficiently powered to exclude any possible association, or the sample size might have been too small to reveal an association.

The sample sizes seen in the two meningitis studies mentioned earlier are calculated numbers. Using estimates of the prevalence of meningitis in their respective communities, along with variables such as the size of the expected effect (the expected rate difference between treated and untreated groups) and the level of significance, the investigators in both studies would have calculated their sample numbers ahead of enrolling patients. Sample sizes are calculated based on the magnitude of effect that the researcher would like to see in the treatment population (compared with placebo). It is important to note that variables such as prevalence, expected confidence level and expected treatment effect need to be predetermined in order to calculate sample size. As an example, Scarborough et al.[ 2 ] state that "on the basis of a background mortality of 56% and an ability to detect a 20% or greater difference in mortality, the initial sample size of 660 patients was modified to 420 patients to detect a 30% difference after publication of the results of a European trial that showed a relative risk of death of 0.59 for corticosteroid treatment."

Determining existing prevalence and effect size can be difficult in areas of research where such numbers are not readily available in the literature. Ensuring adequate sample size has implications for the final results of a trial, particularly negative trials. An improperly powered negative trial could fail to detect an existing association simply because not enough patients were enrolled. In other words, the sample analysis would have failed to reject the null hypothesis (that there is no difference between the new treatment and the alternate treatment) when in fact it should have been rejected; this is referred to as type II error. This statistical error arises because of inadequate power to explain population variability. Careful consideration of sample size and power analysis is one of the prerequisites of medical research.
Another prerequisite is appropriate and adequate research design, which will be addressed in the next issue.
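The sample-size reasoning above can be made concrete with the standard normal-approximation formula for comparing two independent proportions. This is a textbook formula, not necessarily the exact calculation used in either meningitis trial, and the 56% vs. 40% mortality figures below are purely illustrative:

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size to detect p1 vs. p2 with a two-sided z-test
    for two independent proportions (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_b = NormalDist().inv_cdf(power)           # quantile for desired power
    p_bar = (p1 + p2) / 2                       # pooled proportion under H0
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Illustrative: 56% background mortality, hoping to detect a drop to 40%.
n = n_per_group(0.56, 0.40)
print(n, "patients per arm")
```

Shrinking the expected effect (say, 56% vs. 46%) drives the required sample size up sharply, which is why the expected treatment effect must be fixed before enrollment begins.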

Source of Support: Nil,

Conflict of Interest: Nil.

## Importance sampling

by Marco Taboga, PhD

Importance sampling is a variance reduction technique. We use it to reduce the variance of the approximation error that we make when we approximate an expected value with Monte Carlo integration.

In this lecture, we explain how importance sampling works and then we show with an example how effective it can be.

## Equivalent expectations

Importance sampling is based on a simple fact: an expected value can be computed in many different but equivalent ways. For a discrete random vector $X$ with joint probability mass function $p$, and any other mass function $q$ that is strictly positive wherever $p$ is, we can write

$$\operatorname{E}_p\left[f(X)\right]=\sum_x f(x)\,p(x)=\sum_x f(x)\,\frac{p(x)}{q(x)}\,q(x)=\operatorname{E}_q\left[f(X)\,\frac{p(X)}{q(X)}\right].$$

An almost identical identity holds for continuous random vectors, with the sums replaced by integrals over densities. We can therefore approximate $\operatorname{E}_p[f(X)]$ by drawing a Monte Carlo sample from $q$ (the importance distribution) and reweighting each draw $x_i$ by the ratio $p(x_i)/q(x_i)$. This technique is called importance sampling.

The main takeaways are these: the reweighted sample yields an unbiased approximation of the original expected value, and the variance of the approximation error depends on the choice of $q$. The ideal importance distribution concentrates its mass where $\left|f(x)\right|p(x)$ is large; in the limiting case $q(x)\propto f(x)\,p(x)$ (achievable when $f$ is non-negative), the variance of the approximation error is zero.

Let us now illustrate importance sampling with an example.

As a consequence, if we use a standard Monte Carlo approximation based on draws from the original distribution, the approximation error has a high variance.
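The original page's Python example was not preserved here; the following is a self-contained sketch of the same kind of experiment, under an assumed rare-event target (estimating $P(X > 3)$ for a standard normal, true value about 0.00135), not the page's original code:

```python
import math
import random

def phi(x, mu=0.0):
    """Normal(mu, 1) probability density."""
    return math.exp(-(x - mu) ** 2 / 2) / math.sqrt(2 * math.pi)

rng = random.Random(0)
n = 100_000

# Plain Monte Carlo: almost every draw from N(0, 1) lands below 3 and
# contributes 0, so the estimate rests on a handful of lucky draws.
mc = sum(rng.gauss(0.0, 1.0) > 3.0 for _ in range(n)) / n

# Importance sampling: draw from N(4, 1), which puts most of its mass
# in the region that matters, and reweight each draw by p(x)/q(x).
total = 0.0
for _ in range(n):
    x = rng.gauss(4.0, 1.0)
    if x > 3.0:
        total += phi(x) / phi(x, mu=4.0)
is_est = total / n

print(mc, is_est)
```

Both estimates are unbiased, but the reweighted draws have a far smaller variance, so `is_est` sits much closer to the true tail probability run after run.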

If you run this example code, you can see that indeed the importance sampling approximation achieves a significant reduction in the approximation error (from 0.0080 to 0.0012).


## How to cite

Please cite as:

Taboga, Marco (2021). "Importance sampling", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/asymptotic-theory/importance-sampling.


- Original Paper
- Published: 31 July 2021

## The Importance of Importance Sampling: Exploring Methods of Sampling from Alternatives in Discrete Choice Models of Crime Location Choice

- Sophie Curtis-Ham (ORCID: orcid.org/0000-0001-8093-4804)
- Wim Bernasco
- Oleg N. Medvedev
- Devon L. L. Polaschek

Journal of Quantitative Criminology, volume 38, pages 1003–1031 (2022)


## Objectives

The burgeoning field of individual-level crime location choice research has required increasingly large datasets to model complex relationships between the attributes of potential crime locations and offenders' choices. This study tests sampling methods that aim to overcome the computational challenges involved in the use of such large datasets.

## Methods

Using police data on 38,120 residential and non-residential burglary, commercial and personal robbery and extra-familial sex offense locations and the offenders' pre-offense activity locations (e.g., home, family members' homes and prior crime locations), and in the context of the conditional logit formulation of the discrete spatial choice model, we tested a novel method for importance sampling of alternatives. The method over-samples potential crime locations near offenders' activity locations that are more likely to be chosen for crime. We compared variants of this method with simple random sampling.

## Results

Importance sampling produced results more consistent with those produced without sampling than did simple random sampling, and provided considerable computational savings. There were strong relationships between the locations of offenders' prior criminal and non-criminal activities and their crime locations.

## Conclusions

Importance sampling from alternatives is a relatively simple and effective method that enables future studies to use larger datasets (e.g., with more variables, wider study areas, or more granular spatial or spatio-temporal units) to yield greater insights into crime location choice. The substantive results, which cover previously under-examined crime types including non-residential burglary and sexual offenses in New Zealand, represent a novel contribution to the growing literature on offenders' spatial decision making.
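A hedged sketch of the sampling step described above (not the authors' code; the location IDs and distance-decay weights are hypothetical). Following the classic result on sampling of alternatives cited in the footnotes, the chosen location is always retained, the other alternatives are drawn with unequal probabilities $q_j$, and each retained alternative carries a correction term $\ln(1/q_j)$ to be added to its utility so that conditional logit estimates remain consistent:

```python
import math
import random

def sample_alternatives(all_locs, chosen, q, n, seed=None):
    """Importance-sample n alternatives for one choice observation.

    The chosen location is always kept; the remaining n - 1 alternatives
    are drawn without replacement with selection probabilities q.  Each
    retained alternative j is paired with the correction ln(1 / q_j)."""
    rng = random.Random(seed)
    others = [loc for loc in all_locs if loc != chosen]
    weights = [q[loc] for loc in others]
    sampled = []
    while len(sampled) < n - 1:                         # rejection loop
        pick = rng.choices(others, weights=weights)[0]  # for w/o replacement
        if pick not in sampled:
            sampled.append(pick)
    return [(loc, math.log(1.0 / q[loc])) for loc in [chosen] + sampled]

# Hypothetical: 10 locations, with selection probability decaying with
# distance from the offender's activity space (nearer => larger weight).
locs = list(range(10))
raw = {loc: 1.0 / (1 + loc) for loc in locs}
total = sum(raw.values())
q = {loc: w / total for loc, w in raw.items()}
alts = sample_alternatives(locs, chosen=2, q=q, n=4, seed=9)
```

Because nearby locations are over-sampled, far fewer alternatives per observation are needed than under simple random sampling, which is where the computational savings come from.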


We use the terms ‘decision’ and ‘choice’ to refer to location choice as revealed in behaviour. The choice may not feel like a choice to the offender and may not even take place consciously. It can also reflect a decision to visit a place for a non-criminal purpose, whereupon a crime opportunity is identified and acted on (Ruiter 2017 ).

A dataset might be too large to hold in RAM, or it might take days, weeks or even months to run the model, depending on the speed of the processor.

Note that the chosen alternative is always included in the sample; the random sample is taken from the remaining alternatives (McFadden 1977 ; Ben-Akiva and Lerman 1985 ).

The fact that both sample sizes (of alternatives and of robberies) totalled 6,000 is coincidental. There is no reason why this should be the case.

This study forms part of a wider programme of research for which the data were divided into ‘training’ and ‘test’ samples (50% each). The training data were used for all analyses where models were trained (such as the present study). The test data were reserved for later studies testing model accuracy when applied to new data.

The 2018 SA2 shapefile and metadata were downloaded from https://datafinder.stats.govt.nz/layer/92212-statistical-area-2-2018-generalised/ . We excluded 83 SA2s which cover large bodies of water along coastlines and over lakes.

There were large changes in residential population in many SA2s over the data period due to the Christchurch earthquakes and housing developments in response to increasing urban populations. Census 2013 data were used for offenses occurring between 2009 and 2015, and census 2018 data were used for offenses occurring between 2016 and 2018.

Business demography statistics remained consistent over the data period so 2018 was used for simplicity.

Industry categories included: G Retail Trade; H Accommodation and Food Services; K Financial and Insurance Services; L Rental, Hiring and Real Estate Services; M Professional, Scientific and Technical Services; N Administrative and Support Services; R Arts and Recreation Services; S Other Services. See Curtis-Ham et al. ( 2021 ) for details of how 'commercial' robberies were identified.

All industries as for commercial robbery plus: I Transport, Postal and Warehousing, J Information Media and Telecommunications, O Public Administration and Safety, P Education and Training, Q Health Care and Social Assistance.

We also compared bootstrapped versions of the single stratum importance sampling strategy (DIS1) and the smallest simple random sampling (SRS1), since the ‘strategy to beat’ to produce robust results with the smallest dataset was the single stratum importance sample, to which the smaller simple random sampling strategy was closest in sample size. The estimates and standard errors for 20 bootstrap iterations were combined using Rubin’s rule (Rubin 1987 ) implemented in the Amelia package in R (King et al. 2000 ). The bootstrapped strategies produced the same pattern as the single iterations, as shown in Fig. 6 in Appendix 5. Of note, bootstrapping the simple random sampling did not produce estimates any closer to those from the full model.

We are grateful to an anonymous reviewer for contributing this explanation.

Altizio A, York D (2007) Robbery of convenience stores. U.S. Department of Justice, Office of Community Oriented Policing Services, Washington, DC

Ben-Akiva ME, Bowman JL (1998) Integration of an activity-based model system and a residential location model. Urban Stud 35:1131–1153. https://doi.org/10.1080/0042098984529


Ben-Akiva ME, Lerman SR (1985) Discrete choice analysis: Theory and application to travel demand. MIT Press, Cambridge, MA


Bernasco W (2006) Co-offending and the choice of target areas in burglary. J Investig Psych Offender Profil 3:139–155. https://doi.org/10.1002/jip.49

Bernasco W (2010) Modeling micro-level crime location choice: application of the discrete choice framework to crime at places. J Quant Criminol 26:113–138. https://doi.org/10.1007/s10940-009-9086-6

Bernasco W (2017) Modeling offender decision making with secondary data. In: Bernasco W, Van Gelder J-L, Elffers H (eds) The Oxford handbook on offender decision making. Oxford University Press, Oxford, England, pp 569–586

Bernasco W (2019) Adolescent offenders’ current whereabouts predict locations of their future crimes. PLoS ONE 14:e0210733. https://doi.org/10.1371/journal.pone.0210733

Bernasco W, Jacques S (2015) Where do dealers solicit customers and sell them drugs? a micro-level multiple method study. J Contemp Crim Justice 31:376–408. https://doi.org/10.1177/1043986215608535

Bernasco W, Nieuwbeerta P (2005) How do residential burglars select target areas? a new approach to the analysis of criminal location choice. Br J Criminol 45:296–315. https://doi.org/10.1093/bjc/azh070

Bernasco W, Block R, Ruiter S (2013) Go where the money is: modeling street robbers’ location choices. J Econ Geogr 13:119–143. https://doi.org/10.1093/jeg/lbs005

Bernasco W, Johnson SD, Ruiter S (2015) Learning where to offend: effects of past on future burglary locations. Appl Geogr 60:120–129. https://doi.org/10.1016/j.apgeog.2015.03.014

Bernasco W, Ruiter S, Block R (2017) Do street robbery location choices vary over time of day or day of week? a test in Chicago. J Res Crime Delinq 54:244–275. https://doi.org/10.1177/0022427816680681

Bhat C, Govindarajan A, Pulugurta V (1998) Disaggregate attraction-end choice modeling formulation and empirical analysis. Transp Res Rec 1645:60–68. https://doi.org/10.3141/1645-08

Bichler G, Malm A, Christie-Merrall J (2012) Urban backcloth and regional mobility patterns as indicators of juvenile crime. In: Andresen MA, Kinney JB (eds) Patterns, prevention, and geometry of crime. Routledge, London, England, pp 118–136


## Acknowledgements

We gratefully acknowledge the assistance of the NZ Police staff who provided access to and advice on the data used in this research and who reviewed the manuscript prior to submission.

This research forms part of SCH’s PhD thesis, which is funded by a University of Waikato doctoral scholarship.

## Author information

Authors and affiliations.

Te Puna Haumaru NZ Institute of Security and Crime Science & Te Kura Whatu Oho Mauri School of Psychology, Te Whare Wānanga o Waikato University of Waikato, Hamilton, 3240, New Zealand

Sophie Curtis-Ham, Oleg N. Medvedev & Devon L. L. Polaschek

Netherlands Institute for the Study of Crime and Law Enforcement (NSCR), 1081 HV, Amsterdam, The Netherlands

Wim Bernasco

Department of Spatial Economics, School of Business and Economics, Vrije Universiteit Amsterdam, 1081 HV, Amsterdam, The Netherlands


## Contributions

Conceptualization: Sophie Curtis-Ham; Methodology: Sophie Curtis-Ham, Wim Bernasco; Formal analysis and investigation: Sophie Curtis-Ham; Writing—original draft preparation: Sophie Curtis-Ham; Writing—equations and accompanying text: Wim Bernasco; Writing—review and editing: Sophie Curtis-Ham, Wim Bernasco, Oleg Medvedev, Devon Polaschek; Funding acquisition: Sophie Curtis-Ham; Resources: Sophie Curtis-Ham; Supervision: Devon Polaschek, Oleg Medvedev. All authors read and approved the final manuscript.

## Corresponding author

Correspondence to Sophie Curtis-Ham .

## Ethics declarations

Conflicts of interest.

SCH is employed as a researcher at New Zealand Police. This study was not conducted as a part of that employment.

## Ethics approval

This research study was conducted retrospectively from data obtained for operational purposes. Ethics approval was obtained from the Psychology Research and Ethics Committee of the University of Waikato (reference #19:13). Approval of access to data for this study was obtained from the NZ Police Research Panel (reference EV-12–462). The results presented in this paper are the work of the authors and do not represent the views of New Zealand Police.

## Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Supplementary Information

Supplementary file 1 (PDF 134 kb)

Supplementary file 2 (TXT 59 kb)

## About this article

Cite this article.

Curtis-Ham, S., Bernasco, W., Medvedev, O.N. et al. The Importance of Importance Sampling: Exploring Methods of Sampling from Alternatives in Discrete Choice Models of Crime Location Choice. J Quant Criminol 38 , 1003–1031 (2022). https://doi.org/10.1007/s10940-021-09526-5


Accepted : 06 July 2021

Published : 31 July 2021

Issue Date : December 2022

DOI : https://doi.org/10.1007/s10940-021-09526-5


- Crime location choice
- Discrete choice modelling
- Police data
- Routine activity nodes
- Sampling from alternatives

## Sampling Methods | Types, Techniques & Examples

Published on September 19, 2019 by Shona McCombes . Revised on June 22, 2023.

When you conduct research about a group of people, it’s rarely possible to collect data from every person in that group. Instead, you select a sample. The sample is the group of individuals who will actually participate in the research.

To draw valid conclusions from your results, you have to carefully decide how you will select a sample that is representative of the group as a whole. This is called a sampling method. There are two primary types of sampling methods that you can use in your research:

- Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group.
- Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.

You should clearly explain how you selected your sample in the methodology section of your paper or thesis, as well as how you approached minimizing research bias in your work.

## Table of contents

- Population vs. sample
- Probability sampling methods
- Non-probability sampling methods
- Other interesting articles
- Frequently asked questions about sampling

## Population vs. sample

First, you need to understand the difference between a population and a sample, and identify the target population of your research.

- The population is the entire group that you want to draw conclusions about.
- The sample is the specific group of individuals that you will collect data from.

The population can be defined in terms of geographical location, age, income, or many other characteristics.

It is important to carefully define your target population according to the purpose and practicalities of your project.

If the population is very large, demographically mixed, and geographically dispersed, it might be difficult to gain access to a representative sample. A lack of a representative sample affects the validity of your results, and can lead to several research biases, particularly sampling bias.

## Sampling frame

The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population).

## Sample size

The number of individuals you should include in your sample depends on various factors, including the size and variability of the population and your research design. There are different sample size calculators and formulas depending on what you want to achieve with statistical analysis.


## Probability sampling methods

Probability sampling means that every member of the population has a chance of being selected. It is mainly used in quantitative research. If you want to produce results that are representative of the whole population, probability sampling techniques are the most valid choice.

There are four main types of probability sample.

## 1. Simple random sampling

In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population.

To conduct this type of sampling, you can use tools like random number generators or other techniques that are based entirely on chance.
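A simple random draw can be sketched in a few lines of Python; the sampling frame below (1,000 numbered individuals) and the sample size are hypothetical values chosen for illustration:

```python
import random

# Hypothetical sampling frame: ID numbers for every member of the population.
population = list(range(1, 1001))  # 1,000 individuals

random.seed(42)  # fixed seed so the example draw is reproducible

# random.sample draws without replacement, giving every member
# an equal chance of selection.
sample = random.sample(population, k=100)

print(len(sample))       # 100 selected individuals
print(len(set(sample)))  # 100 unique IDs, so no member appears twice
```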

## 2. Systematic sampling

Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.

If you use this technique, it is important to make sure that there is no hidden pattern in the list that might skew the sample. For example, if the HR database groups employees by team, and team members are listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting in a sample that is skewed towards senior employees.
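The interval-based selection can be sketched in Python; the frame size, sample size, and employee names are hypothetical:

```python
import random

# Hypothetical sampling frame of 1,000 employees; we want a sample of 50.
frame = [f"employee_{i}" for i in range(1000)]
n = 50
k = len(frame) // n  # sampling interval: select every k-th member

random.seed(7)
start = random.randrange(k)   # random starting point within the first interval
sample = frame[start::k][:n]  # every k-th member from the random start

print(len(sample))  # 50
```

Shuffling the frame first, or at least inspecting its ordering, guards against the hidden-pattern problem described above.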

## 3. Stratified sampling

Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample.

To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g., gender identity, age range, income bracket, job role).

Based on the overall proportions of the population, you calculate how many people should be sampled from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.
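A minimal sketch of proportional allocation in Python, assuming a hypothetical population with three age-bracket strata:

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical population: each tuple is (id, age bracket).
population = (
    [(f"a{i}", "18-34") for i in range(500)]
    + [(f"b{i}", "35-54") for i in range(300)]
    + [(f"c{i}", "55+") for i in range(200)]
)
sample_size = 100

# Group the population into strata by the relevant characteristic.
strata = {}
for person in population:
    strata.setdefault(person[1], []).append(person)

# Proportional allocation: each stratum contributes in line with its
# share of the population, sampled randomly within the stratum.
sample = []
for members in strata.values():
    n_stratum = round(sample_size * len(members) / len(population))
    sample.extend(random.sample(members, n_stratum))

print(Counter(p[1] for p in sample))  # 50 from 18-34, 30 from 35-54, 20 from 55+
```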

## 4. Cluster sampling

Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole sample. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.

If it is practically possible, you might include every individual from each sampled cluster. If the clusters themselves are large, you can also sample individuals from within each cluster using one of the techniques above. This is called multistage sampling.

This method is good for dealing with large and dispersed populations, but there is more risk of error in the sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the sampled clusters are really representative of the whole population.
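Both one-stage cluster sampling and the multistage variant can be sketched in Python; the schools and pupils below are hypothetical:

```python
import random

random.seed(3)

# Hypothetical population organised into 20 clusters (schools) of 50 pupils each.
clusters = {f"school_{c}": [f"pupil_{c}_{i}" for i in range(50)] for c in range(20)}

# Stage 1: randomly select whole clusters rather than individuals.
chosen = random.sample(list(clusters), k=4)

# One-stage cluster sample: every pupil in each selected school is included.
one_stage = [p for school in chosen for p in clusters[school]]

# Multistage sample: additionally subsample pupils within each selected school.
multistage = [p for school in chosen for p in random.sample(clusters[school], k=10)]

print(len(one_stage))   # 4 schools x 50 pupils = 200
print(len(multistage))  # 4 schools x 10 pupils = 40
```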

## Non-probability sampling methods

In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included.

This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probability samples, and your conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as representative of the population as possible.

Non-probability sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.

## 1. Convenience sampling

A convenience sample simply includes the individuals who happen to be most accessible to the researcher.

This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can’t produce generalizable results. Convenience samples are at risk for both sampling bias and selection bias.

## 2. Voluntary response sampling

Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g. by responding to a public online survey).

Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others, leading to self-selection bias.

## 3. Purposive sampling

This type of sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research.

It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a specific phenomenon rather than make statistical inferences, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make sure to describe your inclusion and exclusion criteria, and beware of observer bias affecting your arguments.

## 4. Snowball sampling

If the population is hard to access, snowball sampling can be used to recruit participants via other participants. The number of people you have access to “snowballs” as you get in contact with more people. The downside here is also representativeness, as you have no way of knowing how representative your sample is due to the reliance on participants recruiting others. This can lead to sampling bias.

## 5. Quota sampling

Quota sampling relies on the non-random selection of a predetermined number or proportion of units. This is called a quota.

You first divide the population into mutually exclusive subgroups (called strata) and then recruit sample units until you reach your quota. These units share specific characteristics, determined by you prior to forming your strata. The aim of quota sampling is to control what or who makes up your sample.
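Because recruitment stops once each quota is filled, quota sampling depends on arrival order rather than chance. A minimal sketch in Python, with an entirely hypothetical stream of volunteers:

```python
# Predetermined quotas per stratum (set before recruitment begins).
quotas = {"male": 3, "female": 3}
sample = []

# Hypothetical volunteers in the non-random order they show up.
arrivals = [("p1", "female"), ("p2", "female"), ("p3", "male"), ("p4", "female"),
            ("p5", "female"), ("p6", "male"), ("p7", "male"), ("p8", "male")]

for person, group in arrivals:
    if quotas.get(group, 0) > 0:  # recruit only while the group's quota is open
        sample.append(person)
        quotas[group] -= 1

print(sample)  # ['p1', 'p2', 'p3', 'p4', 'p6', 'p7'] (p5 and p8 arrive after their quotas fill)
```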

## Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

- Student’s t -distribution
- Normal distribution
- Null and Alternative Hypotheses
- Chi square tests
- Confidence interval
- Quartiles & Quantiles
- Cluster sampling
- Stratified sampling
- Data cleansing
- Reproducibility vs Replicability
- Peer review
- Prospective cohort study

Research bias

- Implicit bias
- Cognitive bias
- Placebo effect
- Hawthorne effect
- Hindsight bias
- Affect heuristic
- Social desirability bias


## Frequently asked questions about sampling

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.

In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.

In multistage sampling, or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.

## Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

McCombes, S. (2023, June 22). Sampling Methods | Types, Techniques & Examples. Scribbr. Retrieved November 28, 2023, from https://www.scribbr.com/methodology/sampling-methods/


## Random Assignment in Psychology: Definition & Examples

Julia Simkus (BA Hons Psychology, Princeton University), Editor at Simply Psychology

Saul Mcleod, PhD (University of Manchester), Educator and Researcher

In psychology, random assignment refers to the practice of allocating participants to different experimental groups in a study in a completely unbiased way, ensuring each participant has an equal chance of being assigned to any group.

In experimental research, random assignment, or random placement, organizes participants from your sample into different groups using randomization.

Random assignment uses chance procedures to ensure that each participant has an equal opportunity of being assigned to either a control or experimental group.

The control group does not receive the treatment in question, whereas the experimental group does receive the treatment.

When using random assignment, neither the researcher nor the participant can choose the group to which the participant is assigned. This ensures that any differences between and within the groups are not systematic at the onset of the study.

In a study to test the success of a weight-loss program, investigators randomly assigned a pool of participants to one of two groups.

Group A participants participated in the weight-loss program for 10 weeks and took a class where they learned about the benefits of healthy eating and exercise.

Group B participants read a 200-page book that explains the benefits of weight loss.

The researchers found that those who participated in the program and took the class were more likely to lose weight than those in the other group that received only the book.

## Importance

Random assignment helps ensure that the groups in an experiment are comparable before the independent variable is applied.

In experiments, researchers will manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables. Random assignment increases the likelihood that the treatment groups are the same at the onset of a study.

Thus, any changes that result from the independent variable can be assumed to be a result of the treatment of interest. This is particularly important for eliminating sources of bias and strengthening the internal validity of an experiment.

Random assignment is the best method for inferring a causal relationship between a treatment and an outcome.

## Random Selection vs. Random Assignment

Random selection (also called probability sampling or random sampling) is a way of randomly selecting members of a population to be included in your study.

On the other hand, random assignment is a way of sorting the sample participants into control and treatment groups.

Random selection ensures that everyone in the population has an equal chance of being selected for the study. Once the pool of participants has been chosen, experimenters use random assignment to assign participants into groups.

Random assignment is only used in between-subjects experimental designs, while random selection can be used in a variety of study designs.

## Random Assignment vs Random Sampling

Random sampling refers to selecting participants from a population so that each individual has an equal chance of being chosen. This method enhances the representativeness of the sample.

Random assignment, on the other hand, is used in experimental designs once participants are selected. It involves allocating these participants to different experimental groups or conditions randomly.

This helps ensure that any differences in results across groups are due to manipulating the independent variable, not preexisting differences among participants.

## When to Use Random Assignment

Random assignment is used in experiments with a between-groups or independent measures design.

In these research designs, researchers will manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables.

There is usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable at the onset of the study.

## How to Use Random Assignment

There are a variety of ways to assign participants into study groups randomly. Here are a handful of popular methods:

- Random Number Generator : Give each member of the sample a unique number; use a computer program to randomly generate a number from the list for each group.
- Lottery : Give each member of the sample a unique number. Place all numbers in a hat or bucket and draw numbers at random for each group.
- Flipping a Coin : Flip a coin for each participant to decide if they will be in the control group or experimental group (this method can only be used when you have just two groups).
- Roll a Die : For each member on the list, roll a die to decide which group they will be in. For example, rolling 1, 2, or 3 could place them in the control group, while rolling 4, 5, or 6 places them in the experimental group.
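The lottery method above can be sketched in Python as a shuffle-and-split, which, unlike per-participant coin flips, also keeps the two groups the same size; the participant pool is hypothetical:

```python
import random

random.seed(11)

# Hypothetical pool of 8 participants from the selected sample.
participants = [f"P{i}" for i in range(1, 9)]

# Shuffling then splitting is equivalent to drawing numbers from a hat:
# every participant has an equal chance of landing in either group.
shuffled = participants[:]
random.shuffle(shuffled)
half = len(shuffled) // 2
control = shuffled[:half]
experimental = shuffled[half:]

print(len(control), len(experimental))  # 4 4
```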

## When is Random Assignment not used?

- When it is not ethically permissible: Randomization is only ethical if the researcher has no evidence that one treatment is superior to the other or that one treatment might have harmful side effects.
- When answering non-causal questions : If the researcher is just interested in predicting the probability of an event, the causal relationship between the variables is not important and observational designs would be more suitable than random assignment.
- When studying the effect of variables that cannot be manipulated: Some risk factors cannot be manipulated and so it would not make any sense to study them in a randomized trial. For example, we cannot randomly assign participants into categories based on age, gender, or genetic factors.

## Drawbacks of Random Assignment

While randomization assures an unbiased assignment of participants to groups, it does not guarantee the equality of these groups. There could still be extraneous variables that differ between groups, or group differences that arise purely from chance.

Thus, researchers cannot produce perfectly equal groups for each specific study. Differences between the treatment group and control group might still exist, and the results of a randomized trial may sometimes be wrong, but this is an accepted limitation of the method.

Scientific evidence is a long and continuous process, and the groups will tend to be equal in the long run when data is aggregated in a meta-analysis.

Additionally, external validity (i.e., the extent to which the researcher can use the results of the study to generalize to the larger population) is compromised with random assignment.

Random assignment is challenging to implement outside of controlled laboratory conditions and might not represent what would happen in the real world at the population level.

Random assignment also can be more costly than simple observational studies where an investigator is just observing events without intervening with the population.

Randomization also can be time-consuming and challenging, especially when participants refuse to receive the assigned treatment or do not adhere to recommendations.

## What is the difference between random sampling and random assignment?

Random sampling refers to randomly selecting a sample of participants from a population. Random assignment refers to randomly assigning participants to treatment groups from the selected sample.

## Does random assignment increase internal validity?

Yes, random assignment ensures that there are no systematic differences between the participants in each group, enhancing the internal validity of the study.

## Does random assignment reduce sampling error?

With random assignment, participants have an equal chance of being assigned to either a control group or an experimental group, so the groups are, in theory, comparable at the outset.

Random assignment does not completely eliminate sampling error, however, because a sample is only an approximation of the population from which it is drawn. Random sampling, rather than random assignment, is the technique that minimizes sampling error.
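A quick simulation illustrates why randomly assigned groups tend to be comparable. This is a hedged sketch: the ages and group sizes below are invented for illustration.

```python
import random
import statistics

random.seed(9)

# Hypothetical participant pool with one background characteristic (age).
ages = [random.randint(18, 80) for _ in range(2_000)]

# Random assignment: shuffle the pool, then split it in half.
random.shuffle(ages)
control, treatment = ages[:1_000], ages[1_000:]

# With groups this size, the mean ages land close together purely by chance.
diff = abs(statistics.mean(control) - statistics.mean(treatment))
print(f"difference in mean age between groups: {diff:.2f} years")
```

Run repeatedly with different seeds, the difference stays small on average, which is exactly the sense in which randomization balances groups "in the long run."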

## When is random assignment not possible?

Random assignment is not possible when the experimenters cannot control the treatment or independent variable.

For example, if you want to compare how men and women perform on a test, you cannot randomly assign subjects to these groups.

Participants are not randomly assigned to different groups in this study, but instead assigned based on their characteristics.

## Does random assignment eliminate confounding variables?

Random assignment reduces the influence of confounding variables on the treatment because it distributes them at random among the study groups. On average, randomization breaks any systematic relationship between a confounding variable and the treatment, although chance imbalances can still occur in any single study.

## Why is random assignment of participants to treatment conditions in an experiment used?

Random assignment is used to ensure that all groups are comparable at the start of a study. This allows researchers to conclude that the outcomes of the study can be attributed to the intervention at hand and to rule out alternative explanations for study results.

Further Reading

Bogomolnaia, A., & Moulin, H. (2001). A new solution to the random assignment problem. Journal of Economic Theory, 100(2), 295-328.

Krause, M. S., & Howard, K. I. (2003). What random assignment does and does not do. Journal of Clinical Psychology, 59(7), 751-766.



By Aaron Moss, PhD, Cheskie Rosenzweig, PhD, & Leib Litman, PhD

## Online Researcher’s Sampling Guide, Part 1: What Is the Purpose of Sampling in Research?

Every ten years, the U.S. government conducts a census—a count of every person living in the country—as required by the Constitution. It’s a massive undertaking.

The Census Bureau sends a letter or a worker to every U.S. household and tries to gather data that will allow each person to be counted. After the data are gathered, they have to be processed, tabulated and reported. The entire operation takes years of planning and billions of dollars, which raises the question: Is there a better way?

As it turns out, there is.

Instead of contacting every person in the population, researchers can answer most questions by sampling people. In fact, sampling is what the Census Bureau does in order to gather detailed information about the population such as the average household income, the level of education people have, and the kind of work people do for a living. But what, exactly, is sampling, and how does it work?

At its core, a research sample is like any other sample: It’s a small piece or part of something that represents a larger whole.

So, just like the sample of glazed salmon you eat at Costco or the double chocolate brownie ice cream you taste at the ice cream shop, behavioral scientists often gather data from a small group (a sample) as a way to understand a larger whole (a population). Even when the population being studied is as large as the U.S.—about 330 million people—researchers often need to sample just a few thousand people in order to understand everyone.

Now, you may be asking yourself how that works. How can researchers accurately understand hundreds of millions of people by gathering data from just a few thousand of them? Your answer comes from Valery Ivanovich Glivenko and Francesco Paolo Cantelli.

Glivenko and Cantelli were mathematicians who studied probability. In the early 1900s, they proved that the empirical distribution of observations randomly drawn from a population converges to the population distribution as the number of observations grows. What this means in plain English is that, as long as researchers randomly sample from a population and obtain a sufficiently large sample, the sample will have characteristics that roughly mirror those of the population.
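To make this concrete, here is a small self-contained sketch in Python. The "population" is an invented lognormal income distribution, not real census data; the point is only that a few thousand random draws recover the population's summary statistics.

```python
import random
import statistics

random.seed(42)

# A hypothetical "population": 330,000 skewed incomes (invented numbers).
population = [random.lognormvariate(10.5, 0.6) for _ in range(330_000)]

# A random sample of just 2,000 people from that population.
sample = random.sample(population, 2_000)

pop_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# The sample mean lands close to the population mean.
print(f"population mean: {pop_mean:,.0f}")
print(f"sample mean:     {sample_mean:,.0f}")
print(f"relative error:  {abs(sample_mean - pop_mean) / pop_mean:.1%}")
```

The relative error is typically a percent or two, which is why pollsters and the Census Bureau's surveys can describe hundreds of millions of people from a few thousand responses.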

“Ok. That’s great,” you say. But what does it mean to randomly sample people, and how does a researcher do that?

Random sampling occurs when a researcher ensures every member of the population being studied has an equal chance of being selected to participate in the study. Importantly, ‘the population being studied’ is not necessarily all the inhabitants of a country or a region. Instead, a population can refer to people who share a common quality or characteristic. So, everyone who has purchased a Ford in the last five years can be a population and so can registered voters within a state or college students at a city university. A population is the group that researchers want to understand.

In order to understand a population using random sampling, researchers begin by identifying a sampling frame —a list of all the people in the population the researchers want to study. For example, a database of all landline and cell phone numbers in the U.S. is a sampling frame. Once the researcher has a sampling frame, he or she can randomly select people from the list to participate in the study.
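Assuming the sampling frame can be represented as a simple list, drawing a simple random sample from it takes one call. The phone numbers below are placeholders, not a real frame.

```python
import random

# Hypothetical sampling frame: a list of phone numbers (placeholders).
sampling_frame = [f"+1-555-{i:07d}" for i in range(100_000)]

random.seed(7)
# Simple random sample: every entry in the frame has an equal chance
# of selection, and sampling is without replacement (no duplicates).
respondents = random.sample(sampling_frame, k=1_000)

print(len(respondents))       # 1000
print(len(set(respondents)))  # 1000
```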

However, as you might imagine, it is not always practical or even possible to gather a sampling frame. There is not, for example, a master list of all the people who use the internet, purchase coffee at Dunkin’, have grieved the death of a parent in the last year, or consider themselves fans of the New York Yankees. Nevertheless, there are very good reasons why researchers may want to study people in each of these groups.

When it isn’t possible or practical to gather a random sample, researchers often gather a non-random sample. A non-random sample is one in which not every member of the population being studied has an equal chance of being selected into the study.

Because non-random samples do not select participants based on probability, it is often difficult to know how well the sample represents the population of interest. Despite this limitation, a wide range of behavioral science studies conducted within academia, industry and government rely on non-random samples. When researchers use non-random samples, it is common to control for any known sources of sampling bias during data collection. By controlling for possible sources of bias, researchers can maximize the usefulness and generalizability of their data.

## Why Is Sampling Important for Researchers?

Everyone who has ever worked on a research project knows that resources are limited; time, money and people never come in an unlimited supply. For that reason, most research projects aim to gather data from a sample of people, rather than from the entire population (the census being one of the few exceptions). This is because sampling allows researchers to save time, save money, and collect richer data.

Contacting everyone in a population takes time. And, invariably, some people will not respond to the first effort at contacting them, meaning researchers have to invest more time for follow-up. Random sampling is much faster than surveying everyone in a population, and obtaining a non-random sample is almost always faster than random sampling. Thus, sampling saves researchers lots of time.

The number of people a researcher contacts is directly related to the cost of a study. Sampling saves money by allowing researchers to gather the same answers from a sample that they would receive from the population.

Non-random sampling is significantly cheaper than random sampling, because it lowers the cost associated with finding people and collecting data from them. Because all research is conducted on a budget, saving money is important.

Sometimes, the goal of research is to collect a little bit of data from a lot of people (e.g., an opinion poll). At other times, the goal is to collect a lot of information from just a few people (e.g., a user study or ethnographic interview). Either way, sampling allows researchers to ask participants more questions and to gather richer data than does contacting everyone in a population.

Efficient sampling has a number of benefits for researchers. But just as important as knowing how to sample is knowing where to sample . Some research participants are better suited for the purposes of a project than others. Finding participants that are fit for the purpose of a project is crucial, because it allows researchers to gather high-quality data.

For example, consider an online research project. A team of researchers who decides to conduct a study online has several different sources of participants to choose from. Some sources provide a random sample, and many more provide a non-random sample. When selecting a non-random sample, researchers have several options to consider. Some studies are especially well-suited to an online panel that offers access to millions of different participants worldwide. Other studies, meanwhile, are better suited to a crowdsourced site that generally has fewer participants overall but more flexibility for fostering participant engagement.

To make these options more tangible, let’s look at examples of when researchers might use different kinds of online samples.

## Different Use Cases of Online Sampling

Academic researchers gather all kinds of samples online. Some projects require random samples based on probability sampling methods. Most other projects rely on non-random samples. In these non-random samples, researchers may sample a general audience from crowdsourcing websites or selectively target members of specific groups using online panels . The variety of research projects conducted within academia lends itself to many different types of online samples.

Market researchers often want to understand the thoughts, feelings and purchasing decisions of customers or potential customers. For that reason, most online market research is conducted in online panels that provide access to tens of millions of people and allow for complex demographic targeting. For some projects, crowdsourcing sites, such as Amazon Mechanical Turk, allow researchers to get more participant engagement than is typically available in online panels, because they allow researchers to select participants based on experience and to award bonuses.

Public polling is most accurate when it is conducted on a random sample of the population. Hence, lots of public polling is conducted with nationally representative samples. There are, however, an increasing number of opinion polls conducted with non-random samples. When researchers poll people using non-random methods, it is common to adjust for known sources of bias after the data are gathered.

User testing requires people to engage with a website or product. For this reason, user testing is best done on platforms that allow researchers to get participants to engage deeply with their study. Crowdsourcing platforms are ideal for user testing studies, because researchers can often control participant compensation and reward people who are willing to make the effort in a longer study.

Online research is big business. There are hundreds of companies that provide researchers with access to online participants, but only a few facilitate research across different types of online panels or direct researchers to the panel best suited to a given project.

## Continue Reading: The Online Researcher’s Guide to Sampling

## Part 2: How to Reduce Sampling Bias in Research

## Part 3: How to Build a Sampling Process for Marketing Research

## Part 4: Pros and Cons of Different Sampling Methods



## The Importance of Sampling Methods in Research Design

In research design, population and sampling are two important terms. A population is a group of individuals that share common connections. A sample is a subset of the population. The sample size is the number of individuals in a sample. The more representative the sample is of the population, the more confident the researcher can be in the quality of the results.

An illustration of the importance of sampling:

A researcher might want to study the adverse health effects associated with working in a coal mine. However, it would be impractical to study every coal worker. So, the researcher would need to narrow down the population and build a sample to collect data. This sample might be a group of coal workers in one city.

## Types of Sampling Methods

Sampling methods are as follows:

Probability Sampling is a method wherein each member of the population has the same probability of being a part of the sample.

Non-probability Sampling is a method wherein the members of the population do not all have an equal chance of being selected. It is used when the researcher wants to choose participants selectively. Both sampling techniques are frequently utilized, but one may work better than the other depending on the needs of the research.

## Qualitative and Quantitative Research

In Qualitative research , non-numerical data is used to study elements in their natural settings. This helps to interpret and measure how these elements affect humans or other living beings. There are three main types of qualitative sampling:

- Purposive sampling: Pre-selected criteria related to the research hypothesis determine the participants, for example, a study on cancer rates among individuals who live near a nuclear power station.
- Quota sampling: The researcher establishes participant quotas before forming a sample, selecting participants who meet certain traits such as gender, age, or health status.
- Snowball sampling: Participants in the study refer other individuals who fit the traits required for the study to the researcher.

Quantitative research is used to categorize, rank, and measure numerical data. Researchers establish general laws of behavior found in different contexts and settings. The goal is to test a theory and support or reject it.

The three main types of quantitative sampling are:

- Random sampling: Random sampling is when all individuals in a population have an equal chance of being selected.
- Stratified sampling: Stratified sampling is when the researcher defines the types of individuals in the population based on specific criteria for the study. For example, a study on smoking might need to break down its participants by age, race, or socioeconomic status.
- Systematic sampling: Systematic sampling is choosing a sample on an orderly basis. To build the sample, look at the target population and choose every fifth, tenth, or twentieth name, based upon the needs of the sample size.
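The three techniques above can be sketched in a few lines of Python. The roster and its age groups are invented for illustration, and the sample sizes are arbitrary.

```python
import random
from collections import defaultdict

random.seed(1)

# Hypothetical roster of study candidates, each with an age-group attribute.
roster = [{"id": i, "age_group": random.choice(["18-34", "35-54", "55+"])}
          for i in range(1_000)]

# Random sampling: every individual has an equal chance of selection.
simple = random.sample(roster, 50)

# Stratified sampling: partition by age group, then draw from each stratum.
strata = defaultdict(list)
for person in roster:
    strata[person["age_group"]].append(person)
stratified = [p for group in strata.values() for p in random.sample(group, 10)]

# Systematic sampling: take every twentieth name from the roster.
systematic = roster[::20]

print(len(simple), len(stratified), len(systematic))
```

Note that systematic sampling as written assumes the roster is not ordered in a way that correlates with the outcome; otherwise the every-k-th rule can introduce bias.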

## The Importance of Selecting an Appropriate Sampling Method

Sampling yields significant research results. However, because differences can exist between a population and a sample, sampling errors can occur. Therefore, it is essential to use the most relevant and useful sampling method.

Below are three of the most common sampling errors.

- Sampling bias occurs when the sample does not reflect the characteristics of the population.
- Sample frame errors occur when the wrong sub-population is used to select a sample. This can be due to gender, race, or economic factors.
- Systematic errors occur when the results from the sample differ significantly from the results of the population.


## Importance Sampling

Probability functions, look-up table, uniform vs. importance samples.

Numerical integration is a central component of a complex radiometric simulation tool. In order to accurately perform integrations of discrete functions in a reasonable amount of time, the function is usually sampled. Regular or uniform sampling of some discrete functions can give rise to artifacts, including aliasing. In addition, uniform sampling can be inefficient: computing the contribution for each sample of the function can be costly, and uniform sampling doesn’t prioritize getting the contributions from the "most important" regions of the function.

The basic idea behind importance-sampled integration is to evaluate an estimate of a function’s expected value. This is done by generating random samples that follow the approximate probability density of the function being integrated and using the result to estimate the integral.

Specifically, importance-sampled integration is based on the fact that an integral of arbitrary dimension, D, can be re-expressed as

\( \int_A f(a)\,da = \int_A \frac{f(a)}{p(a)}\,p(a)\,da = E\left\{\frac{f(a)}{p(a)}\right\} \)

where p( a ) is the probability density function corresponding to the D-dimensional vector a and A is the D-dimensional space over which f( a ) is integrated.

This equation shows that we can easily solve the integral if we know the weighted expected value of the function, E{f( a )/p( a )}. While it is difficult to evaluate the true expected value of an arbitrary function, we can estimate it by evaluating f( a )/p( a ) for N random samples a n drawn from the probability density function,

\( E\left\{\frac{f(a)}{p(a)}\right\} \approx \frac{1}{N}\sum_{n=1}^{N}\frac{f(a_n)}{p(a_n)}. \)

The law of large numbers ensures that we will obtain the expected value in the limit,

\begin{eqnarray*} \lim_{N \to \infty}\frac{1}{N}\sum_{n=1}^{N}\frac{f(a_n)}{p(a_n)}=E\left\{\frac{f(a)}{p(a)}\right\}. \end{eqnarray*}

The key idea for importance sampling is that there is no requirement that p( a ) have any particular form. In one dimension, and when the probability density function is uniform, the integral simplifies to

\begin{eqnarray*} \int_b^c f(a)da&=&(c-b)\cdot E\left\{f(a)\right\}, \end{eqnarray*}

which is the uniform/regular sampling case where the probability function is constant (with value 1/( c - b )).
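As a minimal numerical sketch of the estimator above (not taken from any particular simulation tool), the following Python snippet integrates f(x) = x² e⁻ˣ over [0, ∞), whose true value is Γ(3) = 2, using the standard exponential density p(x) = e⁻ˣ as the importance density. Both the integrand and the sample count are illustrative choices.

```python
import random

random.seed(3)

# Integrand f(x) = x^2 * exp(-x) on [0, inf); true integral = Gamma(3) = 2.
# Importance density p(x) = exp(-x) resembles the integrand's decay.
N = 200_000
total = 0.0
for _ in range(N):
    x = random.expovariate(1.0)  # draw x ~ p(x) = exp(-x)
    total += x * x               # each sample contributes f(x)/p(x) = x^2
estimate = total / N

print(f"importance-sampling estimate: {estimate:.3f} (true value: 2.000)")
```

Because p(x) tracks the shape of f(x), the estimate converges to 2 far faster than uniform sampling over a truncated interval would.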

We can improve upon the efficiency of the integral (i.e. the rate at which we converge to the true integral value) by having p( a ) closely resemble f( a ). In other words, if we place more samples where they matter most (where f( a ) is greatest), we’ll get to the final result a lot faster. This is important when we’re integrating over an optical property like a reflectance or a scattering phase function where we know the relative importance of a particular sample (i.e. the magnitude of the optical property), but the evaluation of that sample is expensive (e.g. tracing a ray through the scene to collect the incident radiance).

In order to illustrate the process for importance sampling a function, the following example function will be used:

\begin{equation*} f( \theta ) = \cos^5 \left ( \frac{|180 - \theta|}{2} \right ) \end{equation*}

This function approximates a "scattering phase function", which is a function that describes the probability of an incident photon interacting with a particle (or molecule) and scattering in a new direction. The magnitude of the function as a function of angle describes the directional probability. The plot below shows the example phase function plotted in polar coordinates. The magnitude of the function is significantly higher in the forward scattering direction (angles near 180 degrees).

In the context of a ray tracer, an efficient way to compute the new scattering direction for a photon is sought. As the function illustrates, a photon should be scattered into the forward directions with significantly higher probability. Therefore, we want a method that uses this function to drive a semi-random mechanism for computing new photon directions (angles) such that the ensemble direction statistics are correct. If a large number of new directions are pulled from this mechanism, the normalized histogram of the directions should reproduce the input function.

The plot below illustrates the same phase function as a 1D Cartesian function. The function has been area normalized to produce a probability distribution function (PDF).

The cumulative distribution function (CDF) is created by accumulating the PDF function. If the PDF was correctly area normalized, the CDF will eventually reach a peak value of 1.

The key component in an importance sampling scheme is the function that will reproject an otherwise uniform distribution of random samples. This function is usually implemented as either an analytical function (when one can be derived) or as a look-up table.

For this example, a look-up table (LUT) approach is used, where uniformly distributed random input values between 0 and 1 can be directly indexed into a finite array of output values. To create the LUT that maps input values to output values, the input angles for the CDF were range normalized and then the CDF was transposed. The plot below shows the look-up table (LUT) created from the CDF.

A diagonal line (slope = 1) would produce a null projection. In this example, it can be seen that an input value of 0.2 results in an output value of nearly 0.4 . The shape of this LUT projects the smaller (nearer 0) and larger (nearer 1) values closer to 0.5. This projection will end up remapping uniformly distributed points closer to the central value of 0.5, which is what is desired to get more samples closer to the center lobe of the input PDF.
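The PDF → CDF → look-up pipeline described above can be sketched as follows. The grid resolution and sample count are arbitrary choices, and a binary search over the tabulated CDF stands in for indexing a finite LUT.

```python
import bisect
import math
import random

# Example phase function from the text: f(theta) = cos^5(|180 - theta| / 2),
# theta in degrees on [0, 360], peaked in the forward direction (180).
def f(theta_deg):
    return math.cos(math.radians(abs(180.0 - theta_deg) / 2.0)) ** 5

# 1. Tabulate the function on a fine angular grid and area-normalize it (PDF).
n_bins = 3600
thetas = [i * 360.0 / n_bins for i in range(n_bins + 1)]
weights = [f(t) for t in thetas]
total = sum(weights)
pdf = [w / total for w in weights]

# 2. Accumulate the PDF into a CDF; it rises monotonically from 0 to 1.
cdf = []
acc = 0.0
for p in pdf:
    acc += p
    cdf.append(acc)

# 3. The look-up step: invert the CDF so a uniform value in [0, 1)
#    maps back to an angle.
def sample_theta(u):
    return thetas[min(bisect.bisect_left(cdf, u), n_bins)]

random.seed(5)
draws = [sample_theta(random.random()) for _ in range(100_000)]

# The ensemble statistics mirror the phase function: most draws land in the
# forward-scattering lobe near 180 degrees.
forward = sum(1 for t in draws if abs(180.0 - t) < 60.0) / len(draws)
print(f"fraction of draws within 60 degrees of forward: {forward:.2f}")
```

A normalized histogram of `draws` reproduces the input phase function, which is the correctness check described in the text.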

The plot below shows how the PDF would be sampled with 20 uniformly distributed samples. The green points on the curve show the locations of the 20 samples, which uniformly sample the regions of the function with low magnitude as well as those regions where the magnitude is much higher.

The plot below shows how the same PDF would be sampled if the same 20 uniformly distributed samples are reprojected by the look-up table (LUT). Unlike the uniform sampling scheme, these samples are clustered near the highest magnitude regions of the PDF.

Although the concept of importance sampling was illustrated here for a 1D function, the same techniques can be applied to multi-dimensional functions. The challenge in developing importance sampling schemes is creating an efficient redistribution mechanism.

The DIRSIG model uses importance sampling in the following areas:

- Sampling (1D) scattering phase functions
- Sampling (2D) bi-directional reflectance distribution functions (BRDFs)



## Random Assignment – A Simple Introduction with Examples


Completing a research or thesis paper is more work than most students imagine. For instance, you must conduct experiments before coming up with conclusions. Random assignment, a key methodology in academic research, ensures every participant has an equal chance of being placed in any group within an experiment. In experimental studies, the random assignment of participants is a vital element, which this article will discuss.

Table of Contents

- 1 Random Assignment – In a Nutshell
- 2 Definition: Random assignment
- 3 Importance of random assignment
- 4 Random assignment vs. random sampling
- 5 How to use random assignment
- 6 When random assignment is not used

## Random Assignment – In a Nutshell

- Random assignment is where you randomly place research participants into specific groups.
- This method eliminates bias in the results by ensuring that all participants have an equal chance of getting into either group.
- Random assignment is usually used in independent measures or between-group experiment designs.

## Definition: Random assignment

Random assignment is the random placement of participants into different groups in experimental research. A study that uses it entails a sample, a control group, an experimental group, and a randomized design.

## Importance of random assignment

Random assignment is essential for strengthening the internal validity of experimental research. Internal validity helps make conclusions about a causal relationship reliable and trustworthy.

In experimental research, researchers isolate independent variables and manipulate them while assessing their impact and controlling other variables. To achieve this, applying different levels of the independent variable to different groups of participants is vital. This experimental design is called an independent-measures or between-group design.

Example: Different levels of an independent variable

- In a medical study, you can research the impact of iron supplements on the immune response (supplement dosage = independent variable, immune response = dependent variable)

Three levels of the independent variable, one per participant group, are applicable here:

- Control group (given no iron supplements)
- The first experimental group (low dosage)
- The second experimental group (high dosage)

This assignment technique ensures there is no bias between the treatment groups at the beginning of the trials. If you do not use it, you will not be able to exclude alternative explanations for your findings.

In the research experiment above, you can recruit participants by handing out flyers at public spaces like gyms, cafés, and community centers. Crucially, do not place each recruitment site's group into its own condition (cafés in the control group, community centers in the low-dosage group, gyms in the high-dosage group), since that would tie the treatment to the recruitment source. Instead, randomly assign every participant, regardless of where they were recruited, to one of the three groups.

Even with random participant assignment, other extraneous variables may still create some bias in the experiment results. However, these variations are usually small and should not undermine your research. Using random assignment is therefore highly advisable wherever it is ethical and feasible for your research subject.

## Random assignment vs. random sampling

Simple random sampling is a method of choosing the participants for a study. Random assignment, on the other hand, involves sorting the participants selected through random sampling into control and experimental groups. Another difference is that random sampling is used in several types of studies, while random assignment is applied only in between-subjects experimental designs.

Suppose your study researches the impact of technology on productivity in a specific company.

In such a case, you have access to the entire staff. So, you can assign each employee a number and apply a random number generator to pick a specific sample.

For instance, from 500 employees, you can pick 200. These 200 employees are your full sample.
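The sampling step above can be sketched with Python's standard library; the population of 500 numbered employees and the fixed seed are illustrative assumptions, not part of any real study:

```python
import random

# Hypothetical population: 500 numbered employees.
population = list(range(1, 501))

random.seed(42)  # fixed seed so the illustrative draw is reproducible

# Draw a simple random sample of 200 without replacement: every
# employee has an equal chance of ending up in the sample.
sample = random.sample(population, k=200)

print(len(sample))  # 200
```

Because `random.sample` draws without replacement, no employee can appear in the sample twice.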

Random sampling enhances external validity, as it ensures that the study sample is unbiased and representative of the whole population. This way, you can conclude that the results of your study can be attributed to the independent variable.

After determining the full sample, you can break it down into two groups using random assignment. In this case, the groups are:

- The control group (does not get access to technology)
- The experimental group (gets access to technology)

Using random assignment assures you that any differences in productivity between the groups are not due to bias, which will help the company make a decision.

## How to use random assignment

Firstly, give each participant a unique number as an identifier. Then, use a tool to assign the participants to the sample groups at random. Some tools you can use are:

- A random number generator
- The lottery method (drawing numbers from a hat)
- A coin flip (when you have two groups)
- A die roll (when you have three groups)

Simple random assignment is a powerful technique for placing participants in specific groups because each person has a fair opportunity of being put in any group.
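As a minimal sketch of simple random assignment (the 30 numbered participants and the seed are made up for illustration), you can shuffle the numbered list and split it into equal groups:

```python
import random

# Hypothetical sample: 30 participants, each with a unique number.
participants = list(range(1, 31))

random.seed(0)  # fixed seed so the illustration is reproducible
random.shuffle(participants)  # a random ordering amounts to random assignment

# Split the shuffled list into three equal treatment groups.
control = participants[:10]     # no dosage
low_dose = participants[10:20]  # low dosage
high_dose = participants[20:]   # high dosage
```

Because every ordering of the shuffled list is equally likely, each participant has the same chance of landing in any of the three groups.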

## Random assignment in block experimental designs

In complex experimental designs, you must group your participants into blocks before using the random assignment technique.

You can create participant blocks depending on demographic variables, working hours, or scores. However, the blocks imply that you will require a bigger sample to attain high statistical power.

After grouping the participants in blocks, you can use random assignment inside each block to allocate the members to a specific treatment condition. Doing this will help you examine whether the blocking characteristic affects the outcome of the treatment.

You can also use blocking in matched experimental designs, matching up participants within each block based on their unique characteristics. Then, you can randomly allot each participant in a matched pair to one of the treatments in the research and compare the results.
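A rough sketch of blocked random assignment follows; the age bands, participant IDs, and seed are hypothetical, chosen only to make the mechanics concrete:

```python
import random

random.seed(1)  # reproducible for illustration

# Hypothetical participants, blocked on a demographic variable (age band).
blocks = {
    "18-35": ["p01", "p02", "p03", "p04"],
    "36-60": ["p05", "p06", "p07", "p08"],
    "60+":   ["p09", "p10", "p11", "p12"],
}

treatments = ["control", "treatment"]
assignment = {}

# Within each block, shuffle the members, then alternate treatments so
# that every block contributes equally to each condition.
for block, members in blocks.items():
    shuffled = members[:]
    random.shuffle(shuffled)
    for i, person in enumerate(shuffled):
        assignment[person] = (block, treatments[i % len(treatments)])
```

Each block now contains exactly two participants per condition, so a block-level comparison of outcomes is balanced by construction.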

## When random assignment is not used

As powerful a tool as it is, random assignment does not apply in all situations, such as the following:

## Comparing different groups

When the purpose of your study is to assess the differences between participants, random assignment may not work.

For example, if you want to compare teens and the elderly, or people with and without a specific health condition, the participants must already have those characteristics. Therefore, you cannot pick them randomly.

In such a study, the characteristic of interest (e.g., the medical condition) is the independent variable, and the participants are grouped based on its different levels (e.g., teens vs. the elderly). All participants are tested the same way, and their outcomes are then compared at the group level.

## No ethical justifiability

Another situation where you cannot use random assignment is if it is ethically not permitted.

If your study involves unhealthy or dangerous behaviors, such as drug use, you cannot ethically assign participants to groups at random. Instead, you can conduct quasi-experimental research.

When using a quasi-experimental design , you examine the outcomes of pre-existing groups you have no control over, such as existing drug users. While you cannot randomly assign them to groups, you can treat the groups as comparable when variables like age, years of drug use, or socioeconomic status are controlled for.

## What is the definition of random assignment?

It is an experimental research technique that involves randomly placing participants from your sample into different groups. It ensures that every sample member has an equal chance of being placed in any group (control or experimental).

## When is random assignment applicable?

You can use this placement technique in experiments featuring an independent measures design. It helps ensure that all your sample groups are comparable.

## What is the importance of random assignment?

It can help you enhance your study's validity. This technique also ensures that every sample member has an equal chance of being assigned to a control or trial group.

## When should you NOT use random assignment?

You should not use this technique if your study focuses on comparisons between pre-existing groups or if it is not ethically permissible.



## Random Assignment in Experiments | Introduction & Examples

Published on 6 May 2022 by Pritha Bhandari . Revised on 13 February 2023.

In experimental research, random assignment is a way of placing participants from your sample into different treatment groups using randomisation.

With simple random assignment, every member of the sample has a known or equal chance of being placed in a control group or an experimental group. Studies that use simple random assignment are also called completely randomised designs .

Random assignment is a key part of experimental design . It helps you ensure that all groups are comparable at the start of a study: any differences between them are due to random factors.

## Table of contents

- Why does random assignment matter
- Random sampling vs random assignment
- How do you use random assignment
- When is random assignment not used
- Frequently asked questions about random assignment

Random assignment is an important part of control in experimental research, because it helps strengthen the internal validity of an experiment.

In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables. To do so, they often use different levels of an independent variable for different groups of participants.

This is called a between-groups or independent measures design.

For example, you use three groups of participants that are each given a different level of the independent variable:

- A control group that’s given a placebo (no dosage)
- An experimental group that’s given a low dosage
- A second experimental group that’s given a high dosage

Random assignment helps you make sure that the treatment groups don’t differ in systematic or biased ways at the start of the experiment.

If you don’t use random assignment, you may not be able to rule out alternative explanations for your results.

For example, suppose you assign participants to groups based on where they were recruited:

- Participants recruited from pubs are placed in the control group
- Participants recruited from local community centres are placed in the low-dosage experimental group
- Participants recruited from gyms are placed in the high-dosage group

With this type of assignment, it’s hard to tell whether the participant characteristics are the same across all groups at the start of the study. Gym users may tend to engage in more healthy behaviours than people who frequent pubs or community centres, and this would introduce a healthy user bias in your study.

Although random assignment helps even out baseline differences between groups, it doesn’t always make them completely equivalent. There may still be extraneous variables that differ between groups, and there will always be some group differences that arise from chance.

Most of the time, the random variation between groups is low, and, therefore, it’s acceptable for further analysis. This is especially true when you have a large sample. In general, you should always use random assignment in experiments when it is ethically possible and makes sense for your study topic.


Random sampling and random assignment are both important concepts in research, but it’s important to understand the difference between them.

Random sampling (also called probability sampling or random selection) is a way of selecting members of a population to be included in your study. In contrast, random assignment is a way of sorting the sample participants into control and experimental groups.

While random sampling is used in many types of studies, random assignment is only used in between-subjects experimental designs.

Some studies use both random sampling and random assignment, while others use only one or the other.

Random sampling enhances the external validity or generalisability of your results, because it helps to ensure that your sample is unbiased and representative of the whole population. This allows you to make stronger statistical inferences .

You use a simple random sample to collect data. Because you have access to the whole population (all employees), you can assign all 8,000 employees a number and use a random number generator to select 300 employees. These 300 employees are your full sample.

Random assignment enhances the internal validity of the study, because it ensures that there are no systematic differences between the participants in each group. This helps you conclude that the outcomes can be attributed to the independent variable .

- A control group that receives no intervention
- An experimental group that has a remote team-building intervention every week for a month

You use random assignment to place participants into the control or experimental group. To do so, you take your list of participants and assign each participant a number. Again, you use a random number generator to place each participant in one of the two groups.

To use simple random assignment, you start by giving every member of the sample a unique number. Then, you can use computer programs or manual methods to randomly assign each participant to a group.

- Random number generator: Use a computer program to generate random numbers from the list for each group.
- Lottery method: Place all numbers individually into a hat or a bucket, and draw numbers at random for each group.
- Flip a coin: When you only have two groups, for each number on the list, flip a coin to decide if they’ll be in the control or the experimental group.
- Roll a die: When you have three groups, for each number on the list, roll a die to decide which of the groups they will be in. For example, assume that rolling 1 or 2 lands them in a control group; 3 or 4 in an experimental group; and 5 or 6 in a second experimental group.

This type of random assignment is the most powerful method of placing participants in conditions, because each individual has an equal chance of being placed in any one of your treatment groups.
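The die-roll method above for three groups can be sketched in a few lines; the 60 participants and the fixed seed are illustrative assumptions:

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

groups = {"control": [], "experimental": [], "second_experimental": []}

# For each numbered participant, roll a die:
# 1-2 -> control, 3-4 -> experimental, 5-6 -> second experimental group.
for participant in range(1, 61):
    roll = random.randint(1, 6)
    if roll <= 2:
        groups["control"].append(participant)
    elif roll <= 4:
        groups["experimental"].append(participant)
    else:
        groups["second_experimental"].append(participant)
```

Note that, unlike shuffling and splitting a list, the die-roll method produces group sizes that vary by chance: with 60 rolls each group gets roughly, but not exactly, 20 participants.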

## Random assignment in block designs

In more complicated experimental designs, random assignment is only used after participants are grouped into blocks based on some characteristic (e.g., test score or demographic variable). These groupings mean that you need a larger sample to achieve high statistical power .

For example, a randomised block design involves placing participants into blocks based on a shared characteristic (e.g., college students vs graduates), and then using random assignment within each block to assign participants to every treatment condition. This helps you assess whether the characteristic affects the outcomes of your treatment.

In an experimental matched design , you use blocking and then match up individual participants from each block based on specific characteristics. Within each matched pair or group, you randomly assign each participant to one of the conditions in the experiment and compare their outcomes.
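A small sketch of this matched design follows; the participants and their baseline scores are invented for illustration. Sort on the matching score, pair adjacent participants, then randomise within each pair:

```python
import random

random.seed(3)  # reproducible for illustration

# Hypothetical participants with a baseline score used for matching.
scores = {"a": 55, "b": 57, "c": 70, "d": 72, "e": 88, "f": 90}

# Sort by score and pair adjacent participants (closest matches).
ordered = sorted(scores, key=scores.get)
pairs = [ordered[i:i + 2] for i in range(0, len(ordered), 2)]

assignment = {}
# Within each matched pair, randomly assign one member to each condition.
for first, second in pairs:
    if random.random() < 0.5:
        assignment[first], assignment[second] = "treatment", "control"
    else:
        assignment[first], assignment[second] = "control", "treatment"
```

Because each pair contributes one participant to each condition, comparisons between conditions are balanced on the matching score by construction.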

Sometimes, it’s not relevant or ethical to use simple random assignment, so groups are assigned in a different way.

## When comparing different groups

Sometimes, differences between participants are the main focus of a study, for example, when comparing children and adults or people with and without health conditions. Participants are not randomly assigned to different groups, but instead assigned based on their characteristics.

In this type of study, the characteristic of interest (e.g., gender) is an independent variable, and the groups differ based on the different levels (e.g., men, women). All participants are tested the same way, and then their group-level outcomes are compared.

## When it’s not ethically permissible

When studying unhealthy or dangerous behaviours, it’s not possible to use random assignment. For example, if you’re studying heavy drinkers and social drinkers, it’s unethical to randomly assign participants to one of the two groups and ask them to drink large amounts of alcohol for your experiment.

When you can’t assign participants to groups, you can also conduct a quasi-experimental study . In a quasi-experiment, you study the outcomes of pre-existing groups who receive treatments that you may not have any control over (e.g., heavy drinkers and social drinkers).

These groups aren’t randomly assigned, but may be considered comparable when some other variables (e.g., age or socioeconomic status) are controlled for.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomisation. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalisability of your results, while random assignment improves the internal validity of your study.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.

## Cite this Scribbr article


Bhandari, P. (2023, February 13). Random Assignment in Experiments | Introduction & Examples. Scribbr. Retrieved 28 November 2023, from https://www.scribbr.co.uk/research-methods/random-assignment-experiments/

