Sample Size Calculator
Calculate the required sample size for surveys, polls, and research studies. Supports both proportion and mean calculations with finite population correction.
What is sample size and why does it matter?
Sample size (n) is the number of observations in a study. Larger samples give more precise results but cost more. Too small → unreliable results, high margin of error. Too large → waste of resources. Example: Poll 50 people about an election → about +/-14% margin of error (useless). Poll 1,000 → about +/-3% (reliable). Sample size affects: statistical power, confidence interval width, ability to detect effects. Most surveys use 1,000-2,000 respondents for roughly +/-2-3% accuracy at 95% confidence.
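The polling example above can be checked with a minimal sketch. It assumes the usual normal-approximation margin of error for a proportion with the worst case p=0.5; the function name margin_of_error is illustrative, not part of the calculator.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the normal-approximation interval for a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

for n in (50, 1000, 2000):
    print(f"n={n}: +/-{100 * margin_of_error(n):.1f}%")
```

With 50 respondents the result is close to +/-14%, and with 1,000 it is close to +/-3%.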
How does confidence level affect required sample size?
Higher confidence requires a larger sample. Each confidence level has a critical value: 90%→1.645, 95%→1.96, 99%→2.576. The formula includes z^2, so the effect is amplified. Example with +/-5% margin (p=0.5): 90% confidence needs n=271, 95% needs n=385 (+42%), 99% needs n=664 (+145%). Trade-off: higher confidence = more certainty but higher cost.
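As a rough sketch, the critical values and the resulting sample sizes can be derived rather than looked up. The function names here are illustrative assumptions. The result is always rounded up, so it can differ by one from figures obtained by rounding down.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_proportion(moe, confidence=0.95, p=0.5):
    """n = z^2 * p * (1 - p) / MOE^2, rounded up to a whole respondent."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z={z_critical(level):.3f}, n={sample_size_proportion(0.05, level)}")
```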
What is margin of error and how does it relate to sample size?
Margin of Error (MOE) is the +/- range around your estimate. It has an inverse-square relationship with sample size: n ∝ 1/MOE^2. To halve MOE, quadruple the sample size. Example (95% confidence, p=0.5): MOE +/-5% needs n=385. For +/-2.5% (half), need n=1,537 (about 4x). For +/-10% (double), need n=97 (about ¼). This is why most polls use +/-3% (n≈1,000) - good balance. Below +/-2% gets very expensive. Formula: n = (z^2*p*(1-p))/MOE^2.
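The inverse-square rule is easy to see numerically. This sketch assumes z=1.96 and p=0.5 and works from the unrounded formula, so the rounded-up figures can differ slightly from ones obtained by scaling an already rounded n. The name raw_sample_size is illustrative.

```python
import math

Z_95 = 1.96  # critical value for 95% confidence, as used in the text

def raw_sample_size(moe, p=0.5, z=Z_95):
    """Unrounded n = z^2 * p * (1 - p) / MOE^2."""
    return z ** 2 * p * (1 - p) / moe ** 2

base = raw_sample_size(0.05)
for moe in (0.10, 0.05, 0.025):
    n = raw_sample_size(moe)
    # The ratio column shows the 1/MOE^2 scaling relative to +/-5%.
    print(f"MOE +/-{moe:.1%}: n={math.ceil(n)} ({n / base:.2f}x the +/-5% sample)")
```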
What is population proportion (p) and what value should I use?
Population proportion (p) is the expected share of the population with the characteristic you're measuring. Examples: voting preference (50%), disease prevalence (5%), defect rate (1%). When unknown, use p=0.5 (50%), which maximizes p*(1-p) and so gives the largest (most conservative) sample size. Example at 95% confidence, +/-5% MOE: p=0.5 needs n=385, p=0.8 needs n=246, p=0.9 needs n=139. Use p=0.5 for polls and elections. Use a known rate for quality control and for medical studies with historical data.
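A short sketch of why p=0.5 is the conservative choice: the variance term p*(1-p) peaks there and is symmetric around it. The function name is illustrative, and z=1.96 with a +/-5% margin is assumed to match the example.

```python
import math

def sample_size_for_p(p, moe=0.05, z=1.96):
    """Required n for an assumed proportion p, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

for p in (0.1, 0.2, 0.5, 0.8, 0.9):
    print(f"p={p}: p*(1-p)={p * (1 - p):.2f}, n={sample_size_for_p(p)}")
```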
How do I calculate sample size for comparing two groups?
For a two-group comparison (e.g., A/B test, drug vs placebo), you need larger samples. Formula (per group): n = 2(z_(α/2) + z_β)^2*p*(1-p)/d^2, where z_(α/2) is the critical value for the significance level, z_β is the critical value for the desired power, and d = minimum detectable difference. Example: Detect a 5-percentage-point difference from a 50% baseline conversion rate, 80% power, 95% confidence → n≈1,570 per group (about 3,140 total). Factors: (1) Baseline rate, (2) Minimum detectable effect (smaller = more samples), (3) Power (usually 80%), (4) Significance level (usually 5%). Online calculators are available for this.
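A minimal sketch of the per-group formula, assuming a two-sided test and a single baseline rate p for the variance term. Other formulations (pooled or separate variances, different z rounding) give slightly different figures. The name n_per_group is illustrative.

```python
import math
from statistics import NormalDist

def n_per_group(p, d, alpha=0.05, power=0.80):
    """Per-group n = 2 * (z_alpha/2 + z_beta)^2 * p * (1 - p) / d^2, rounded up."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = norm.inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / d ** 2)

per_group = n_per_group(p=0.5, d=0.05)
print(per_group, "per group,", 2 * per_group, "total")
```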
What is statistical power and how does it affect sample size?
Statistical power = probability of detecting a real effect when it exists (1-β). The standard is 80% power (β=0.20). Higher power needs larger samples. Example detecting a 10% difference (50% baseline, 95% confidence): 80% power needs n=393 per group, 90% power needs n=526 (+34%), 95% power needs n=650 (+65%). Power analysis reduces the risk of a Type II error (missing a real effect). Low power → wasted study, can't detect effects. Used in clinical trials and experiments. Trade-off: higher power = more certainty but higher cost.
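Power can also be read in the other direction: given a per-group sample size, how likely is the study to detect the effect? The sketch below inverts the two-group proportion formula n = 2(z_(α/2) + z_β)^2*p*(1-p)/d^2 to solve for power. It assumes a 50% baseline and a two-sided 5% significance level, and the name achieved_power is illustrative.

```python
import math
from statistics import NormalDist

def achieved_power(n_per_group, d, p=0.5, alpha=0.05):
    """Approximate power from z_beta = d * sqrt(n / (2 * p * (1 - p))) - z_alpha/2."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_beta = d * math.sqrt(n_per_group / (2 * p * (1 - p))) - z_alpha
    return norm.cdf(z_beta)

for n in (100, 393, 526):
    print(f"n={n} per group: power={achieved_power(n, d=0.10):.2f}")
```

With only 100 per group the power is well below 50%, which is the "wasted study" case described above.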
How does population size affect required sample size?
Surprisingly, population size barely matters for large populations. Formula: n_finite = n/(1 + n/N), where n is the infinite-population sample size and N is the population size. For an infinite population at 95% confidence, +/-5% MOE: n=385. If N = 1,000, adjusted n≈278. If N = 10,000, adjusted n≈370. If N = 1,000,000, adjusted n≈384. The correction only matters when the sample is more than about 5% of the population. Consider a census (sample everyone) when the population is below 200-300. This is why national polls of 300M people only need ~1,000 samples!
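The finite population correction can be sketched as follows. This applies the formula as written above to the unrounded infinite-population value (about 384.16 at 95% confidence and +/-5% MOE) and rounds to the nearest whole number. Rounding up instead would be the more conservative choice. The function name is illustrative.

```python
def finite_population_n(n_infinite, population):
    """Apply n_finite = n / (1 + n / N) to an infinite-population sample size."""
    return n_infinite / (1 + n_infinite / population)

n0 = 1.96 ** 2 * 0.5 * 0.5 / 0.05 ** 2  # about 384.16 before rounding

for N in (1_000, 10_000, 1_000_000):
    print(f"N={N:>9,}: adjusted n is about {round(finite_population_n(n0, N))}")
```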
What sample size do I need for different types of studies?
SURVEYS/POLLS: 1,000-2,000 (+/-3% MOE). ACADEMIC RESEARCH: 30-500 depending on effect size. A/B TESTING: 1,000-10,000 per variant (detect 2-10% changes). CLINICAL TRIALS: 100-10,000 (depends on risk/effect). FOCUS GROUPS: 6-10 per group. USABILITY TESTING: 5-8 users (Nielsen). MARKET RESEARCH: 200-500. QUALITY CONTROL: Depends on lot size and AQL. Always do power analysis before expensive studies!
What are common mistakes in sample size calculation?
MISTAKE 1: Not accounting for non-response. If a 50% response rate is expected, double the sample size. MISTAKE 2: Using the total sample size when a per-group size is needed (comparing groups needs n per group). MISTAKE 3: Ignoring subgroup analysis - if analyzing 5 segments, each segment needs an adequate sample, so the overall sample must be larger. MISTAKE 4: Not considering dropout in longitudinal studies (add 20-30%). MISTAKE 5: Using p=0.5 when you have a better estimate (wastes resources). MISTAKE 6: Forgetting the finite population correction for small populations.
How do I calculate sample size for continuous variables (means)?
For continuous data (height, income, test scores), formula: n = (z*σ/E)^2 where σ=population SD, E=desired margin of error. Example: Estimate average height +/-2cm, σ=10cm, 95% confidence: n = (1.96*10/2)^2 = 96.04, rounded up to 97. You need an estimate of the SD from a pilot study, the literature, or similar studies, or assume SD = range/4. For comparing two means: n = 2(z_(α/2) + z_β)^2*σ^2/d^2 per group, where d=minimum detectable difference. Larger SD or smaller difference → larger sample needed.
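Both formulas for means can be sketched in a few lines. The function names are illustrative, the SD is assumed known, and results are rounded up to the next whole observation.

```python
import math
from statistics import NormalDist

def sample_size_mean(sd, moe, confidence=0.95):
    """n = (z * sd / E)^2 for estimating a single mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / moe) ** 2)

def sample_size_two_means(sd, d, alpha=0.05, power=0.80):
    """Per-group n = 2 * (z_alpha/2 + z_beta)^2 * sd^2 / d^2."""
    norm = NormalDist()
    z_sum = norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)
    return math.ceil(2 * z_sum ** 2 * sd ** 2 / d ** 2)

print(sample_size_mean(sd=10, moe=2))      # height example: SD 10cm, +/-2cm
print(sample_size_two_means(sd=10, d=5))   # hypothetical: detect a 5cm difference
```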
What is the relationship between sample size and cost-effectiveness?
Sample size follows the law of diminishing returns. Each doubling of n shrinks the margin of error by only about 29% (1/sqrt(2)≈0.71). Cost-benefit analysis: Going from +/-10% to +/-5% MOE: 4x cost for 2x precision. From +/-5% to +/-2.5%: 4x cost for 2x precision. Optimal point: Usually +/-3-5% for surveys (n=400-1,000). Below +/-2% gets very expensive. Factors: (1) Data collection cost, (2) Value of precision, (3) Budget constraints. Example: Phone survey at $50/response → n=1,000 costs $50,000 for about +/-3% vs n=4,000 at $200,000 for about +/-1.5%.
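To illustrate the cost curve, this sketch prices a survey at a fixed cost per response. It assumes z=1.96 and p=0.5 and uses the exact required n rather than round figures, so the totals come out a little higher than a back-of-the-envelope estimate. The name survey_cost and the $50 rate are illustrative.

```python
import math

def survey_cost(moe, cost_per_response, p=0.5, z=1.96):
    """Return (required n, total cost) for a target margin of error."""
    n = math.ceil(z ** 2 * p * (1 - p) / moe ** 2)
    return n, n * cost_per_response

for moe in (0.05, 0.03, 0.015):
    n, cost = survey_cost(moe, cost_per_response=50)
    print(f"+/-{moe:.1%}: n={n}, cost=${cost:,}")
```

Halving the margin of error from +/-3% to +/-1.5% roughly quadruples the cost.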
How do I adjust sample size for expected attrition or non-response?
Inflate initial sample size by expected loss rate: n_adjusted = n_required/(1 - attrition_rate). Example: Need n=200 completes, expect 30% dropout → recruit n=200/(1-0.30)=286. Common rates: Online surveys 10-30%, Mail surveys 30-50%, Longitudinal studies 20-40%, Clinical trials 10-20%. Calculate at each stage: Initial contact → Agreement → Completion. Example: 40% agree, 80% complete → n=200/(0.40*0.80)=625 initial contacts. Always overrecruit to ensure adequate final sample.
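The stage-by-stage adjustment can be sketched as a single helper that divides the required completes by the product of the retention rates. The name recruits_needed is illustrative, and each rate is the share retained at that stage (1 - attrition rate).

```python
import math

def recruits_needed(completes, stage_rates):
    """Initial contacts needed so that `completes` remain after every stage."""
    retained = math.prod(stage_rates)  # overall share surviving all stages
    return math.ceil(completes / retained)

print(recruits_needed(200, [0.70]))        # 30% dropout → 70% retained
print(recruits_needed(200, [0.40, 0.80]))  # 40% agree, then 80% complete
```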