Introduction to Statistics Part I
Like most of my posts, the relevant topic is prepended by some sort of try-hard analogy. So if you’re a bit fuzzy on what is meant by ‘statistics’, fear mildly.
Since the dawn of Al Gore’s legacy, there have been, and will continue to be, an infinite number of unborn memes in the world. Yet only a select few of these internet inside jokes truly break the standards of innovation to make it to golden virality.
Doge’s rise to fame is a great example. The idea that a Comic Sans-spammed shibe photo acquired such a huge fandom is seriously impressive. How did doge succeed where so many others failed? Is it that much funnier? Was it just in the right thread at the right time? Did it originate from a source that already had credible meme-making merit?
These burning questions need answers. Fortunately, the nascent discipline of (what we call) [modern] statistics emerged in the 1920s and ’30s through the work of R. A. Fisher. Statistics lets us make a range of claims, either by following the life cycle of every meme to sashay its way onto the internet, or by comparing sample data sets to infer properties of the population.
In this section I’ll just be covering the 2 broad classifications of Statistics.
Descriptive
Representing a large raw data set in a more generalized form to accurately summarize the findings.
This also involves measures of central tendency (mean, median, mode) and dispersion (variance, standard deviation, etc.), which I’ll get into more. Maybe.
Let’s say you wanted to display the popularity of a single meme, like the trusty classic, Foul Bachelor Frog:
You would take data sets scraped from FBF’s origination, up through its virality on 4chan and Reddit, possibly measuring the variance in responses/upvotes of top posts. You would then turn that raw data into a user-friendly visualization, maybe a pie chart or a table, so nobody has to parse the mass of raw numbers.
The key factor in descriptive statistics is that it’s only concerned with the properties of the observed data. Descriptive statistics make no assumption that the set came from some larger population and no generalizations beyond the data actually in hand.
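To make that concrete, here’s a minimal Python sketch (the upvote counts are invented, obviously) that boils a raw data set down to the usual descriptive numbers:

```python
import statistics

# Hypothetical upvote counts scraped from top Foul Bachelor Frog posts
upvotes = [120, 340, 95, 410, 230, 120, 560, 310, 120, 275]

# Central tendency: where the data clusters
print("mean:  ", statistics.mean(upvotes))    # arithmetic average
print("median:", statistics.median(upvotes))  # middle value when sorted
print("mode:  ", statistics.mode(upvotes))    # most frequent value

# Dispersion: how spread out the data is
print("variance:", statistics.variance(upvotes))  # sample variance
print("std dev: ", statistics.stdev(upvotes))     # sample standard deviation
```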
That being said, I would never actually recommend memes as an analytical target for descriptive statistical testing. Aggregating that data would be damned near impossible.
Inferential
Extrapolating a representative sample set to the population as a whole, to allow well-reasoned inferences. As with any partial representation, this is inherently error-prone: outliers and improper targeting can bring noise into your findings.
Larger sample sizes can help reduce the impact of atypical results, but there are trade-offs in acquiring them. It’s best to do a preliminary analysis to decide on the most representative sample.
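Here’s a quick sketch of that trade-off, with a completely fabricated ‘population’ of meme scores: the bigger the sample, the closer the sample mean tends to land to the true population mean.

```python
import random
import statistics

random.seed(42)

# A hypothetical population: a score for every post of some meme
population = [random.gauss(100, 30) for _ in range(100_000)]
true_mean = statistics.mean(population)

# Watch how far the sample mean drifts for small vs. large samples
for n in (10, 100, 10_000):
    sample = random.sample(population, n)
    error = abs(statistics.mean(sample) - true_mean)
    print(f"n={n:>6}: sample mean off by {error:.2f}")
```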
There are also 2 categories of inferential statistics:
Estimation Theory
Estimation theory seeks to best describe the data’s parameters. Think of a parameter as just a representative number (like a mean): a measurable characteristic of the population as a whole. The value of such parameters shapes the distribution of the data.
An estimator will attempt to approximate these unknown values using measurements. Let me hammer this ambiguity home with 4 differently similar explanations:
An estimator is a rule for calculating an estimate of a quantity based on observed data: thus the rule (estimator), the quantity of interest (estimand), and the result (estimate) are distinguished.
Estimator: the rule applied to the observed measurements to produce a value for the parameter.
Estimand: what is to be estimated; closely linked to the objective or purpose of the analysis.
Estimate: the resulting value you get from applying the estimator to your data.
And one final way to visualize it, cuz fuckit ynot
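(Below is a minimal code sketch standing in for that visualization; the ‘true’ value and the noise level are invented purely so we can grade ourselves.)

```python
import random
import statistics

random.seed(1)

# Estimand: the quantity we want to know (unknown in real life,
# hard-coded here so we can check our work)
true_mean = 42.0

# Observed data: measurements of the estimand corrupted by random noise
data = [true_mean + random.gauss(0, 5) for _ in range(50)]

# Estimator: the rule we apply to the data (here, the sample mean)
estimator = statistics.mean

# Estimate: the result of applying the estimator to this data set
estimate = estimator(data)
print(f"estimand: {true_mean}, estimate: {estimate:.2f}")
```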
A key thing to note about estimation theory is that the measured data has a randomness component, which is exactly what the estimator has to see through.
Example
Radar is a super legit example for understanding what is meant by “random”. The goal in radar is to estimate the range of objects by analyzing the two-way transit time of echoed pulses. The receiving radar (usually the same system as the transmitting radar) processes the reflected microwaves to make the closest inference on the object’s properties.
This randomness exists because electrical noise will inevitably be embedded in the pulses, so their measured values are randomly distributed. This puts a limiting factor on the accuracy of radar, as reflected signals decline rapidly over distance, and there will always be a noise floor generated by the electronics. So the next time you decide to build you some radar, make sure your signal-to-noise ratio is above 1:1!
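Here’s a back-of-the-envelope sketch of exactly that (target distance and noise level invented): estimate range from noisy two-way echo times by averaging over a pile of pulses.

```python
import random

random.seed(7)

C = 299_792_458.0      # speed of light, m/s
true_range = 12_000.0  # hypothetical target distance, meters

# True two-way transit time for a single pulse
true_delay = 2 * true_range / C

# Each measured delay is the true delay plus electrical noise
delays = [true_delay + random.gauss(0, 5e-9) for _ in range(1000)]

# Estimator: average the noisy delays, then convert back to range
avg_delay = sum(delays) / len(delays)
estimated_range = C * avg_delay / 2
print(f"true range: {true_range:.1f} m, estimate: {estimated_range:.1f} m")
```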
Hypothesis Testing
Hypothesis testing is likewise used to draw conclusions about the population’s parameters. A statistical hypothesis is based on observing a process modeled by a set of random variables, with the aim of inferring whether any effect is significant.
This hypothesis is proposed as an alternative to an idealized null hypothesis, which claims zero significant relationship between the data sets. The test starts out assuming the null is true; our goal is to gather enough evidence to reject this predisposed null hypothesis.
The comparison is deemed statistically significant if the p-value is less than the significance level, α.
The p-value is the probability of obtaining results at least as extreme as the findings, assuming the null is true.
The significance level, α, is the probability of rejecting the null hypothesis given that it’s true.
Rejecting a null hypothesis that is actually true is called a Type I error, false hit, or false positive.
An important thing to note is that hypothesis testing can be implemented even while no scientific theory exists.
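To see what α actually buys you, here’s a quick simulation (all parameters made up): test a genuinely fair coin a couple thousand times at α = 0.05, and the false-positive rate lands around 5% (a bit under, in fact, because the binomial is discrete).

```python
import random
from math import comb

random.seed(3)

def binom_p_value(heads, n, p=0.5):
    """Two-sided exact p-value: probability of a head count at least
    as far from n*p as the one observed, assuming the null is true."""
    observed_dev = abs(heads - n * p)
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n + 1)
        if abs(k - n * p) >= observed_dev
    )

ALPHA = 0.05
N_FLIPS = 100
N_EXPERIMENTS = 2000

# The null hypothesis is TRUE here: the coin really is fair.
# Every rejection below is therefore a Type I error.
false_positives = 0
for _ in range(N_EXPERIMENTS):
    heads = sum(random.random() < 0.5 for _ in range(N_FLIPS))
    if binom_p_value(heads, N_FLIPS) < ALPHA:
        false_positives += 1

print(f"Type I error rate: {false_positives / N_EXPERIMENTS:.3f} (target ~{ALPHA})")
```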
Example
In a famous hypothesis test, a colleague of Fisher’s told him that she was able to tell the difference between a cup where the milk was poured into the tea and one where the tea was poured into the milk. Fisher gave her 8 cups in random order, 4 of each type.
What is the probability of her getting any given number of cups correct by chance alone?
The null hypothesis claim is that she has no predictive capability.
The test statistic is just a count of the number of successes in selecting the 4 cups.
In the end Fisher asserted that no alternative hypothesis was (ever) required. She identified every cup correctly, which would be considered a statistically significant result.
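You can actually put a number on her odds. She knows there are exactly 4 cups of each type, so guessing comes down to picking which 4 of the 8 cups are milk-first, and the count of correct picks follows a hypergeometric distribution:

```python
from math import comb

# 8 cups, 4 milk-first; she must pick which 4 are the milk-first ones.
# P(exactly k of her 4 picks are correct) = C(4, k) * C(4, 4 - k) / C(8, 4)
total = comb(8, 4)  # 70 equally likely ways to choose 4 cups

for k in range(5):
    ways = comb(4, k) * comb(4, 4 - k)
    print(f"P({k} correct) = {ways}/{total} = {ways / total:.4f}")

# All 4 right by pure luck: 1/70 ~ 0.014, below the usual alpha = 0.05
```

Getting every cup right happens by chance only about 1.4% of the time, which is why the perfect score counts as significant without any alternative hypothesis ever being spelled out.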
an dasit!
I’ll try to stay on top of doing micro posts on Statistics, but the next blog I finish will likely be Chapter 2 of HTTP Protocols: URLs n shit.
I’ve made a point of avoiding more scientifically explicit examples for the base definitions of these categories but, as one would expect, as I focus in on specific topics there will be a noticeable pivot toward harder explanations.