Probability and Statistics Fundamentals

Core concepts of probability and statistics for data science

#Statistics #Probability #Mathematics

Probability Basics

Basic Concepts

Sample Space (Ω): The set of all possible outcomes

Event (E): A subset of the sample space

Probability Axioms:

  1. P(E) ≥ 0, for any event E
  2. P(Ω) = 1
  3. For mutually exclusive events A and B (A ∩ B = ∅): P(A ∪ B) = P(A) + P(B)

Conditional Probability

P(A|B) = P(A ∩ B) / P(B),  where P(B) > 0

Bayes’ Theorem: P(A|B) = P(B|A) × P(A) / P(B)
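
As a small worked example of Bayes' Theorem, the sketch below computes P(A|B) for a diagnostic-test scenario. The function name `bayes_posterior` and all numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are illustrative assumptions, not values from this page. P(B) is expanded with the law of total probability.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / evidence


# Hypothetical inputs: P(A) = 0.01, P(B|A) = 0.95, P(B|not A) = 0.05
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(A|B) = {posterior:.3f}")
```

Even with a fairly accurate test, the posterior stays well below the sensitivity because the prior P(A) is small.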

Common Distributions

Discrete Distributions:

Distribution | Probability Mass Function | Expectation | Variance
Bernoulli | P(X=k) = p^k (1-p)^(1-k) | p | p(1-p)
Binomial | P(X=k) = C(n,k) p^k (1-p)^(n-k) | np | np(1-p)
Poisson | P(X=k) = λ^k e^(-λ) / k! | λ | λ
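
The expectation and variance entries for the binomial distribution can be checked directly from its probability mass function. The following is a minimal sketch. The helper `binomial_pmf` and the parameters n = 10, p = 0.3 are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3  # hypothetical parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                  # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))    # should match n*p
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))  # n*p*(1-p)

print(f"sum = {total:.6f}, mean = {mean:.4f}, variance = {variance:.4f}")
```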

Continuous Distributions:

Distribution | Probability Density Function | Expectation | Variance
Uniform | f(x) = 1/(b-a), for a ≤ x ≤ b | (a+b)/2 | (b-a)²/12
Normal | f(x) = (1/√(2πσ²)) e^(-(x-μ)²/(2σ²)) | μ | σ²
Exponential | f(x) = λe^(-λx), for x ≥ 0 | 1/λ | 1/λ²
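
One way to sanity-check a row of the continuous table is by simulation. The sketch below draws exponential samples with the standard library's `random.expovariate` and compares the sample mean and variance with 1/λ and 1/λ². The rate λ = 2, the seed, and the sample count are arbitrary assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

lam = 2.0  # hypothetical rate parameter λ
samples = [random.expovariate(lam) for _ in range(100_000)]

sample_mean = statistics.fmean(samples)
sample_var = statistics.pvariance(samples)

print(f"mean:     {sample_mean:.3f} (theory {1 / lam:.3f})")
print(f"variance: {sample_var:.3f} (theory {1 / lam**2:.3f})")
```

The simulated values only approximate the theoretical ones. The gap shrinks as the number of samples grows.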

Descriptive Statistics

Measures of Central Tendency

Mean: μ = Σxᵢ / n
Median: The middle value after sorting
Mode: The value with highest frequency

Measures of Dispersion

Variance: σ² = Σ(xᵢ - μ)² / n (population); for a sample, s² = Σ(xᵢ - x̄)² / (n - 1)
Standard Deviation: σ = √σ²
Range: R = x_max - x_min
Interquartile Range: IQR = Q3 - Q1
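
The measures above map onto Python's standard `statistics` module. This is a minimal sketch on a made-up data set. Note that quartile conventions differ between tools, so Q1 and Q3 (and therefore the IQR) may not match other software exactly; `statistics.quantiles` uses its default "exclusive" method here.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical observations

mean = statistics.fmean(data)
median = statistics.median(data)
mode = statistics.mode(data)

pop_variance = statistics.pvariance(data)  # divides by n
pop_std = statistics.pstdev(data)
sample_variance = statistics.variance(data)  # divides by n - 1

value_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
iqr = q3 - q1

print(f"mean={mean}, median={median}, mode={mode}")
print(f"population variance={pop_variance:.3f}, std={pop_std:.3f}")
print(f"sample variance={sample_variance:.3f}")
print(f"range={value_range}, IQR={iqr}")
```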

Distribution Shape

Skewness: Measures the asymmetry of a distribution

  • Skewness > 0: Right-skewed (positively skewed)
  • Skewness < 0: Left-skewed (negatively skewed)
  • Skewness = 0: No skew, as for any symmetric distribution

Kurtosis: Measures the tail weight of a distribution (often loosely described as peakedness)

  • Kurtosis > 3: Leptokurtic (heavy tails)
  • Kurtosis < 3: Platykurtic (light tails)
  • Kurtosis = 3: Mesokurtic (same as the normal distribution)
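
Skewness and kurtosis can be computed as the third and fourth standardized moments. The sketch below uses the population (divide-by-n) versions and the non-excess kurtosis convention, where the normal distribution scores 3. The function names and data sets are illustrative assumptions, and some libraries report excess kurtosis (this value minus 3) instead.

```python
import statistics


def standardized_moment(data, order):
    """Mean of ((x - mean) / population std) ** order."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data, mu)
    return sum(((x - mu) / sigma) ** order for x in data) / len(data)


def skewness(data):
    return standardized_moment(data, 3)


def kurtosis(data):
    return standardized_moment(data, 4)


symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical data with a long right tail

print(f"symmetric:    skew={skewness(symmetric):.3f}, kurt={kurtosis(symmetric):.3f}")
print(f"right-skewed: skew={skewness(right_skewed):.3f}, kurt={kurtosis(right_skewed):.3f}")
```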

Inferential Statistics

Central Limit Theorem

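For independent, identically distributed observations from a population with mean μ and finite variance σ², the sampling distribution of the sample mean is approximately normal when the sample size n is sufficiently large, regardless of the shape of the population distribution.

X̄ ~ N(μ, σ²/n), approximately, for large n

Standardization: Z = (X̄ - μ) / (σ/√n) ~ N(0,1), approximately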
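
The theorem can be illustrated by simulation. The sketch below repeatedly draws samples of size n from a clearly non-normal population, an exponential distribution with μ = 1 and σ = 1, and looks at the distribution of the sample means. The population, n = 50, the repetition count, and the seed are all assumptions made for the demonstration.

```python
import random
import statistics
from math import sqrt

random.seed(42)  # fixed seed so the run is reproducible

mu, sigma = 1.0, 1.0   # mean and std of Exponential(rate=1)
n = 50                 # size of each sample
repetitions = 5_000    # number of sample means to collect

sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(repetitions)
]

mean_of_means = statistics.fmean(sample_means)
std_of_means = statistics.pstdev(sample_means)

# Standardize each sample mean and count how many fall within ±1.96
z_scores = [(m - mu) / (sigma / sqrt(n)) for m in sample_means]
within_196 = sum(abs(z) <= 1.96 for z in z_scores) / repetitions

print(f"mean of sample means: {mean_of_means:.3f} (theory {mu:.3f})")
print(f"std of sample means:  {std_of_means:.3f} (theory {sigma / sqrt(n):.3f})")
print(f"share with |Z| <= 1.96: {within_196:.3f} (normal theory 0.950)")
```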

Confidence Interval

Confidence Interval for Population Mean (σ known):

CI = X̄ ± z_(α/2) × (σ/√n)

Common Confidence Levels:

  • 90% → z = 1.645
  • 95% → z = 1.96
  • 99% → z = 2.576
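
The interval formula can be sketched with the standard library, using `statistics.NormalDist().inv_cdf` (Python 3.8+) to obtain z_(α/2) instead of a lookup table. The function name `z_confidence_interval` and the inputs (X̄ = 50, σ = 10, n = 100) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_(alpha/2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample summary
low, high = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(f"95% CI: ({low:.2f}, {high:.2f})")

# Critical values for the common confidence levels
for level in (0.90, 0.95, 0.99):
    print(level, round(NormalDist().inv_cdf(1 - (1 - level) / 2), 3))
```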

Hypothesis Testing

Basic Steps:

  1. Establish hypotheses: H₀ (null hypothesis), H₁ (alternative hypothesis)
  2. Choose significance level α (typically 0.05)
  3. Calculate the test statistic
  4. Calculate the p-value
  5. Make a decision: reject H₀ if p < α; otherwise, fail to reject H₀

Common Tests:

Test | Use Case | Statistic
Z-test | Mean test, large sample (or σ known) | Z = (X̄ - μ₀) / (σ/√n)
t-test | Mean test, small sample (σ unknown) | t = (X̄ - μ₀) / (s/√n)
Chi-square test | Goodness of fit / Independence | χ² = Σ(O-E)²/E
F-test | Variance comparison | F = s₁²/s₂²
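
The five basic steps can be walked through with the Z-test statistic. The sketch below runs a two-sided test of H₀: μ = μ₀ against H₁: μ ≠ μ₀. The function name `two_sided_z_test` and the sample values are assumptions, and a one-sided test would compute the p-value differently.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, reject_h0) for H0: mu = mu0 vs H1: mu != mu0."""
    # Step 3: test statistic
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4: two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: decision rule
    return z, p_value, p_value < alpha


# Hypothetical data: sample mean 52.5 from n = 100, known sigma = 10, H0: mu = 50
z, p_value, reject = two_sided_z_test(sample_mean=52.5, mu0=50.0, sigma=10.0, n=100)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```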

Correlation and Regression

Correlation Coefficient:

r = Σ(xᵢ-x̄)(yᵢ-ȳ) / √[Σ(xᵢ-x̄)² × Σ(yᵢ-ȳ)²]

-1 ≤ r ≤ 1. The larger |r|, the stronger the linear correlation.

Simple Linear Regression:

y = β₀ + β₁x + ε

β₁ = Σ(xᵢ-x̄)(yᵢ-ȳ) / Σ(xᵢ-x̄)²

β₀ = ȳ - β₁x̄
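
The correlation and regression formulas translate almost line for line into code. The following is a minimal sketch in plain Python. The function names and the five data points are made up for illustration.

```python
from math import sqrt


def centered_sums(xs, ys):
    """Return (Sxy, Sxx, Syy): sums of products of deviations from the means."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    sxy, sxx, syy = centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares estimates (beta0, beta1) for y = beta0 + beta1 * x."""
    sxy, sxx, _ = centered_sums(xs, ys)
    beta1 = sxy / sxx
    beta0 = sum(ys) / len(ys) - beta1 * sum(xs) / len(xs)
    return beta0, beta1


# Hypothetical data lying close to the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

r = pearson_r(xs, ys)
beta0, beta1 = fit_line(xs, ys)
print(f"r = {r:.4f}, beta0 = {beta0:.2f}, beta1 = {beta1:.2f}")
```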