Proportion of red beads

Author

Sajida Rehman

Introduction

Can a small scoop of beads reveal the secret makeup of the entire urn?
Using simulated data from 1,000 beads, this project investigates how sampling and prediction work together to estimate the true proportion of red beads in a population. However, the validity of our conclusions depends on the assumption that the sample accurately represents the population; if the beads are not thoroughly mixed, even slight sampling bias could distort the results. To predict whether a bead is red, I use a logistic model that incorporates both size and coating status as explanatory variables.
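The sampling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis behind this report: the true share of red beads (0.4), the scoop size (50), and the helper names `make_urn` and `scoop_proportion` are all hypothetical choices made for the example.

```python
import random


def make_urn(n_beads=1000, prop_red=0.4, seed=1):
    """Build a hypothetical urn: a shuffled list of 'red' and 'white' beads."""
    rng = random.Random(seed)
    n_red = round(n_beads * prop_red)
    urn = ["red"] * n_red + ["white"] * (n_beads - n_red)
    rng.shuffle(urn)  # thorough mixing, the assumption the text relies on
    return urn


def scoop_proportion(urn, scoop_size, rng):
    """Draw one scoop without replacement and return its share of red beads."""
    scoop = rng.sample(urn, scoop_size)
    return scoop.count("red") / scoop_size


urn = make_urn()  # assumed values: 1,000 beads, 40% red
rng = random.Random(2)

# Repeat the scoop many times to see how the estimates behave on average.
estimates = [scoop_proportion(urn, 50, rng) for _ in range(1000)]
mean_estimate = sum(estimates) / len(estimates)

print(f"true proportion of red: {urn.count('red') / len(urn):.3f}")
print(f"mean of 1,000 scoop estimates: {mean_estimate:.3f}")
```

Any single scoop can miss the true proportion by several percentage points, but with a well-mixed urn the estimates center on the true value. Sampling from a poorly mixed urn would break that property.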

Model Structure

We represent the general form of our model as:

\[ \text{logit}\left(P(Y = 1)\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k \]

We express the fitted model as:

\[ \widehat{\text{logit}\left(P(\text{color} = \text{red})\right)} = 0.35 + 0.00 \cdot \text{size} + 0.11 \cdot \text{coating} \]

Logistic Regression Estimates for Predicting Bead Color
Variable	Estimate	Lower 95% CI	Upper 95% CI
(Intercept)	0.35	−0.31	1.02
size	0.00	−0.13	0.13
coatingyes	0.11	−0.15	0.36
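To read these estimates as probabilities, the log-odds can be passed through the inverse logit, \(p = 1 / (1 + e^{-\eta})\). The sketch below does this with the point estimates from the table. The function names and the bead size of 1.0 are hypothetical choices for the example; because the size estimate is 0.00, the chosen size does not change the result.

```python
import math

# Point estimates copied from the table above.
INTERCEPT = 0.35
B_SIZE = 0.00
B_COATING = 0.11  # the "coatingyes" row: coated versus uncoated


def inv_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def predict_red(size, coated):
    """Predicted probability that a bead is red, from the point estimates."""
    log_odds = INTERCEPT + B_SIZE * size + B_COATING * (1 if coated else 0)
    return inv_logit(log_odds)


# size=1.0 is an arbitrary illustrative value.
for coated in (False, True):
    p = predict_red(size=1.0, coated=coated)
    print(f"coated={coated}: P(red) = {p:.3f}")
```

With these estimates the predicted probability is about 0.59 for an uncoated bead and about 0.61 for a coated one. The 95% interval for the coating estimate (−0.15 to 0.36) includes zero, so this small gap should be read with caution.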

Predicted probability of red beads by coating status