Proportion of red beads
Introduction
Can a small scoop of beads reveal the secret makeup of the entire urn?
Using simulated data from 1,000 beads, this project investigates how sampling and prediction work together to estimate the true proportion of red beads in a population. However, the validity of our conclusions depends on the assumption that the sample accurately represents the population; if the beads are not thoroughly mixed, even slight sampling bias could distort the results. To predict whether a bead is red, I use a logistic model that incorporates both size and coating status as explanatory variables.
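As a rough illustration of the sampling idea, the sketch below builds a hypothetical urn of 1,000 beads and takes repeated scoops from it. The true proportion of red beads (0.40), the scoop size (50), and all object names are assumptions chosen for the example, not values from this project.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical urn: 1,000 beads, 400 of them red (assumed, not from the source)
urn <- rep(c("red", "white"), times = c(400, 600))

# One scoop of 50 beads drawn without replacement from a well-mixed urn
scoop <- sample(urn, size = 50)
p_hat <- mean(scoop == "red")  # sample proportion of red beads

# Repeat the scoop many times to see how much the estimate varies
p_hats <- replicate(1000, mean(sample(urn, size = 50) == "red"))
se_sim <- sd(p_hats)           # simulated standard error of the proportion
```

With thorough mixing, the scoop proportions should cluster around the assumed true value. A non-random draw would shift that center, which is the sampling-bias concern raised above.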
Model Structure
We represent the general form of our model as:
\[ \text{logit}\left(P(Y = 1)\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k \]
We express the fitted model as:
\[ \widehat{\text{logit}\left(P(\text{color} = \text{red})\right)} = -0.66 + 0.85 \cdot \text{size} + 1.20 \cdot \text{coating} \]
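To show how a fitted logit translates into a probability, the following sketch plugs a hypothetical bead into the fitted equation using base R's `plogis()`, the inverse logit. The bead's values (size = 1, coating = 1) and the 0/1 coding of coating are assumptions for illustration only.

```r
# Coefficients copied from the fitted equation
b0 <- -0.66
b_size <- 0.85
b_coating <- 1.20

# Hypothetical bead (assumed values): size = 1, coated (coating = 1)
size <- 1
coating <- 1

# Linear predictor on the log-odds scale: -0.66 + 0.85 + 1.20 = 1.39
log_odds <- b0 + b_size * size + b_coating * coating

# Inverse logit, equivalent to 1 / (1 + exp(-log_odds))
p_red <- plogis(log_odds)
p_red  # roughly 0.80
```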
Logistic Regression Estimates for Predicting Bead Color

Variable | Estimate | Lower 95% CI | Upper 95% CI |
---|---|---|---|
(Intercept) | 0.35 | −0.31 | 1.02 |
size | 0.00 | −0.13 | 0.13 |
coatingyes | 0.11 | −0.15 | 0.36 |
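
Because the estimates in the table are on the log-odds scale, exponentiating them gives odds ratios, which can be easier to read. The sketch below retypes the table values by hand into a data frame (the object and column names are my own, hypothetical choices) and applies `exp()` to each column.

```r
# Table values retyped by hand (log-odds scale)
est <- data.frame(
  term     = c("(Intercept)", "size", "coatingyes"),
  estimate = c(0.35, 0.00, 0.11),
  lower    = c(-0.31, -0.13, -0.15),
  upper    = c(1.02, 0.13, 0.36)
)

# Exponentiate to move from log-odds to odds (intercept) and odds ratios (slopes)
odds <- transform(
  est,
  estimate = exp(estimate),
  lower    = exp(lower),
  upper    = exp(upper)
)

# Flag intervals that contain 1, i.e. no clear change in the odds
odds$includes_one <- odds$lower < 1 & odds$upper > 1
odds
```

An interval on the odds-ratio scale that contains 1 corresponds to a log-odds interval that contains 0, which is the case for all three rows here.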