What Is Design Effect?
Design effect (DEFF) is a number that tells you how much statistical precision you lose when you use a complex sampling design instead of simple random sampling. It is the correction factor that keeps your sample size honest when your field reality is not a perfect random draw from the target population.
The practical meaning: if your DEFF is 1.8, each cluster-sampled interview carries about 56% of the information a simple random interview would (1/1.8). To reach the same precision as a simple random sample of 300, you need 540 cluster interviews. Skip the correction and you have bought a survey that cannot answer the question you designed it to answer.
Design effect is not a penalty for cluster sampling. Cluster sampling is often the only practical choice: no usable household list, geographic dispersion too costly to traverse, population boundaries that match administrative clusters. The correction simply keeps the sample size calculation matched to the field design.
When Does It Apply?
Design effect applies whenever your sampling design groups respondents in ways that make within-group observations more similar than between-group observations. In M&E practice, that is most commonly cluster sampling, where you select villages or schools first and then households or students within them.
| Sampling design | DEFF applies? | Typical range |
|---|---|---|
| Simple random sampling | No | 1.0 |
| Stratified random sampling | No, may be slightly less than 1.0 | 0.9-1.0 |
| Cluster sampling (single stage) | Yes | 1.3-2.5 |
| Cluster sampling (multi-stage) | Yes, compounds across stages | 1.5-3.0 |
| Systematic sampling in homogeneous frames | Usually no | ~1.0 |
Stratified sampling, where you divide the population into subgroups before sampling, can actually produce a DEFF slightly below 1.0 because the subgroup structure reduces variance. Most M&E programs do not take advantage of this; they use cluster sampling to manage logistics and then fail to apply the corresponding DEFF adjustment. The result is a systematic tendency to undersample.
For definitions of the sampling approaches themselves, see cluster sampling, random sampling, and sampling methods.
The Formula and What It Means
The standard design effect formula is:
DEFF = 1 + (m - 1) x ICC
Three inputs:
- m is the average number of completed interviews per cluster (the cluster size)
- ICC is the intraclass correlation coefficient, a number between 0 and 1 that measures how similar observations within a cluster are relative to observations across clusters
- The 1 is the baseline DEFF for simple random sampling
An example. You plan to survey 20 households in each of 30 villages. Your ICC for household food security outcomes in similar contexts is 0.05. DEFF equals 1 + (20 - 1) x 0.05 = 1 + 0.95 = 1.95. Your required sample size under simple random sampling is 350. Multiplied by DEFF, your actual required sample is 683 interviews (682.5, rounded up), distributed across the 30 villages at roughly 23 per village, not 20.
What the formula tells you: bigger clusters mean more within-cluster similarity and a larger DEFF. Smaller clusters mean less similarity and a smaller DEFF. If ICC were zero (observations within clusters no more similar than across clusters), DEFF would be exactly 1.0 no matter the cluster size.
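The formula is simple enough to sketch directly. A minimal Python helper, reproducing the worked example above (function names are illustrative):

```python
import math

def design_effect(m, icc):
    """DEFF = 1 + (m - 1) * ICC, for average cluster size m."""
    return 1 + (m - 1) * icc

# Worked example from the text: 20 households per cluster, ICC = 0.05
deff = design_effect(20, 0.05)      # 1.95
required = math.ceil(350 * deff)    # 682.5 rounds up to 683
```

Note the `ceil`: required sample sizes are always rounded up, never to the nearest whole number, so the precision target is met rather than narrowly missed.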
Most M&E practitioners do not calculate ICC from scratch. They use empirical DEFF values from prior surveys in similar contexts. DHS surveys, MICS, and donor-commissioned evaluations in the same sector and geography often publish their DEFF values; these are a reasonable starting point.
Typical DEFF Values in M&E
| Survey type | Typical DEFF | Typical ICC |
|---|---|---|
| Household food security outcomes | 1.5-2.0 | 0.03-0.08 |
| WASH behaviors and access | 1.5-2.5 | 0.05-0.12 |
| Immunization and health service coverage | 2.0-3.0 | 0.10-0.15 |
| Education outcomes in cluster (school) surveys | 2.0-3.5 | 0.10-0.20 |
| Livelihoods and income | 1.3-1.8 | 0.03-0.06 |
| Gender-based violence (highly context-specific) | 2.0-4.0 | 0.10-0.25 |
The pattern: outcomes that are strongly shaped by the cluster itself (a school's teaching quality, a health facility's protocols, a village's water source) have higher ICCs and therefore higher DEFFs than outcomes that vary more by individual circumstance. When in doubt and without empirical values for your context, a DEFF of 2.0 is a defensible planning assumption for household cluster surveys. Under-adjusting is a larger risk than over-adjusting, because under-adjustment cannot be fixed after fieldwork.
Four Steps to Apply Design Effect
Design effect belongs in the sample size calculation before the field schedule is finalized. Four steps.
Step 1: Calculate the simple random sample size. Start with the standard sample size formula for your indicator type: a proportion (e.g., 40% of households practicing safe water storage), a mean (e.g., average income per month), or a change detection target (e.g., detect a 10 percentage point change in indicator X). Your margin of error, confidence level, and expected variability feed this calculation. This is your SRS baseline.
Step 2: Select a DEFF value. Choose from empirical prior-survey values in your sector and geography if available. If not, use 1.5-2.0 for household surveys, 2.0-2.5 for school or health facility surveys, 2.5-3.0 for highly heterogeneous or service-delivery-dependent outcomes. Document your choice and the reasoning; reviewers will ask.
Step 3: Multiply. Required cluster sample size = SRS sample size x DEFF. If SRS is 400 and DEFF is 1.8, cluster sample size is 720.
Step 4: Apply the non-response buffer on top of the DEFF-adjusted sample. If your expected non-response rate is 15%, divide the DEFF-adjusted n by 0.85. 720 / 0.85 = 847.1, rounded up to 848 selected households. This is your final field target. See common sampling mistakes for why skipping the non-response buffer compounds the design effect problem.
The Sampling Calculator performs all four steps automatically: input your population size, required precision, confidence level, DEFF, and non-response rate, and it returns your final field target along with a cluster allocation plan.
Where the Numbers Come From
The inputs to DEFF do not require you to run a complicated ICC calculation on every new survey. Most M&E programs reuse empirical values from prior surveys and standard references.
Prior surveys in the same context: The strongest evidence. If your program area has been surveyed previously (DHS, MICS, SMART, KAP, baseline studies), the DEFFs from those surveys are directly applicable. The methodology section of the report will usually cite them.
Published reference tables: DHS Sampling and Household Listing Manual, MICS survey design guidance, and WHO EPI cluster sampling manuals publish typical ICC and DEFF values by outcome domain. These are general-purpose defaults, less precise than a local prior-survey value but more robust than guessing.
Post-hoc calculation from pilot data: If you run a pilot of 3-5 clusters before scaling to the full survey, you can calculate an empirical ICC from the pilot and feed it back into the final sample size calculation. This is the most context-specific approach but adds 2-4 weeks to the timeline.
Default assumption when no data is available: DEFF = 2.0 for household cluster surveys, DEFF = 2.5 for school or facility cluster surveys, DEFF = 3.0 for highly heterogeneous service-delivery outcomes. These are conservative defaults that err toward oversampling rather than undersampling.
Document whichever source you use. "DEFF of 1.8 based on the 2022 DHS in [country] for comparable coverage indicators" is a defensible audit trail. "DEFF = 1.5 assumed" is not.
Sector Examples
Health: Vaccination coverage survey in East Africa
A district health team designed a vaccination coverage survey using a standard 30-by-7 cluster design (30 clusters, 7 interviews each). The team used simple random sampling formulas to calculate n = 210, then treated that number as the cluster sample size. Post-analysis ICC was 0.13, producing a DEFF of 1.78. The effective sample size was 118 interviews, not 210. Confidence intervals on the coverage estimate were plus or minus 9 percentage points. The program needed 5-point precision to detect whether a new outreach strategy was working. The survey could not confirm or refute the strategy's effect, and the program ran a second survey three months later with a properly sized cluster allocation at 1.6x the original cost.
WASH: Household water storage survey in West Africa
A WASH program designed a household survey to measure safe water storage practices across 40 villages, with a planned 15 interviews per village. SRS n was 380. The team applied a DEFF of 2.0 based on a published MICS survey in the same region, producing a target sample of 760. After the 15% non-response buffer, the field target was 894 households, or 22 per village. The final completed sample was 746. Confidence intervals on the safe storage estimate were plus or minus 3.8 points at 95% confidence, sufficient to detect the 7-point improvement the program had targeted. The program confirmed its outcome was met with defensible precision.
Education: Learning outcome survey in South Asia
A school-based learning assessment drew from 25 schools in a program's catchment, with 40 students per school. The education team used an ICC of 0.18 based on a prior cluster-randomized trial in a similar geography. DEFF was 1 + (39 x 0.18) = 8.02, a very high value reflecting the strong clustering of learning outcomes by school. The team's initial SRS calculation of n = 400 translated to a required cluster sample of roughly 3,200 students. The team recognized this was infeasible within the program budget and redesigned the study to 60 schools with 25 students each (n = 1,500); at 25 students per school, DEFF = 1 + (24 x 0.18) = 5.32, putting the effective sample at roughly 282 SRS-equivalent interviews. The redesign gave the program a workable study at a cost 2.5x the original budget but one that could actually answer the learning question.
Food security: Consumption survey in the Sahel
A food security program ran a quarterly consumption survey across 20 pastoralist communities, with 12 households per community. The ICC for food consumption scores in pastoralist contexts in prior surveys was 0.04, producing a DEFF of 1 + (11 x 0.04) = 1.44. The team applied the DEFF to its SRS baseline of 280 to reach a required sample of 403, then added a 20% non-response buffer (communities dispersed during dry season migration) to reach a field target of 504 households. The survey was completed at 472 households, meeting the precision requirement of plus or minus 4 points on the food consumption score.
Common Mistakes
Mistake 1: Using an SRS sample size for cluster fieldwork. Calculating the sample as if observations are independent when they are grouped in clusters is the most common DEFF error in M&E practice. The field team selects 30 villages with 15 households each, then reports on 450 interviews as if each one carried full SRS information. Confidence intervals come out too narrow, results appear more precise than they are, and reviewer pushback eventually forces a re-analysis. Apply the DEFF multiplier before the sample size is committed.
Mistake 2: Applying DEFF at the analysis stage instead of the design stage. Some teams recognize the clustering issue only in analysis and apply survey-weighted estimation. This produces correct confidence intervals but cannot add the interviews that were never conducted. The confidence intervals are wider, the precision is lower, and the program cannot confirm whether its target was met. Analysis-stage correction is better than no correction, but it cannot rescue a fundamentally underpowered study.
Mistake 3: Choosing a DEFF value without evidence. A DEFF of 1.2 sounds defensible but, without a prior-survey reference or published source, it may well be wrong. Most M&E household cluster surveys produce empirical DEFFs of 1.5-2.0. Under-specifying the DEFF produces an undersized sample; over-specifying wastes resources but is safer. When in doubt, use 2.0 and document the reasoning.
Mistake 4: Forgetting that DEFF compounds in multi-stage designs. A two-stage cluster sample (villages, then households within villages) has a compounded design effect, not the single-stage DEFF. The compounding depends on the ICCs at each stage, but a useful rule of thumb is that multi-stage DEFFs are 20-40% higher than comparable single-stage DEFFs. Factor this in when the survey design uses nested clusters (districts, then villages, then households).
Mistake 5: Not documenting the DEFF source. The DEFF value should appear in the survey methodology with a citation: the prior survey it came from, the published reference, or the assumption basis. External reviewers, future replications, and endline comparisons all need this trail. A survey that does not document its DEFF source is harder to defend and harder to compare against.
Design Effect Checklist
Run through this before committing to your cluster survey field plan.
Design stage:
- DEFF value selected with documented source (prior survey, published reference, or conservative default)
- DEFF multiplier applied to SRS sample size in the planning calculation
- Multi-stage clustering factored in if your design uses nested clusters
- Cluster allocation plan produced (how many clusters, how many interviews per cluster)
Fieldwork stage:
- Cluster assignments documented before fieldwork starts
- Enumerator training covers the cluster design, not just the questionnaire
- Actual cluster size per village tracked in field logs (m may drift from planned m)
Analysis stage:
- Survey-weighted estimation specified in the analysis plan before data collection
- Primary sampling unit and strata variables included in the dataset
- Confidence intervals reported using design-corrected standard errors, not naive SRS standard errors
For sample size calculation with design effect and non-response buffer, run the Sampling Calculator. For related sampling design decisions, see cluster vs. stratified sampling and probability vs. non-probability sampling. For the full pre-fieldwork sampling workflow, see common sampling mistakes.