How It Works: A Simple Pipeline
1. Extract cues from the article
We look at both the text and media in the article to find signals (called cues) that might indicate bias.
Examples of cues:
- Emotionally charged verbs in the headline
- When counter-arguments appear (early vs. buried late)
- How much of the article comes from primary sources
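To make the three example cues concrete, here is a minimal sketch of how they might be computed. The word list, the counter-argument markers, the function name `extract_cues`, and the idea that primary-source paragraphs are already annotated upstream are all assumptions made for illustration, not part of the pipeline described here.

```python
import re

# Hypothetical lexicon of emotionally charged verbs (a real system needs a curated list).
CHARGED_VERBS = {"slams", "blasts", "destroys", "rips", "torches"}

# Hypothetical phrases that signal a counter-argument.
COUNTER_MARKERS = ("however", "critics say", "opponents argue")


def extract_cues(headline, paragraphs, primary_source_paragraphs):
    """Return raw cue values for one article.

    primary_source_paragraphs: indices of paragraphs that quote or link a
    primary source (assumed to be annotated by an earlier step).
    """
    words = re.findall(r"[a-z']+", headline.lower())
    charged = sum(1 for word in words if word in CHARGED_VERBS)

    # Relative position of the first counter-argument: 0 = first paragraph,
    # 1 = last paragraph. An article with none is treated as 1 (fully buried).
    counter_pos = 1.0
    for i, paragraph in enumerate(paragraphs):
        if any(marker in paragraph.lower() for marker in COUNTER_MARKERS):
            counter_pos = i / max(len(paragraphs) - 1, 1)
            break

    primary_share = len(set(primary_source_paragraphs)) / max(len(paragraphs), 1)
    return {
        "charged_headline_verbs": charged,
        "counter_argument_position": counter_pos,
        "primary_source_share": primary_share,
    }


cues = extract_cues(
    "Senator slams new climate bill",
    [
        "The bill passed on Tuesday.",
        "Supporters praised it.",
        "However, critics say it costs too much.",
        "A final vote follows next week.",
    ],
    primary_source_paragraphs=[0],
)
print(cues)
```

These raw values are not yet comparable across outlets or topics, which is what the next step addresses.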
2. Normalize cues against baselines
Every outlet and topic has its own style or typical patterns.
So, we standardize each cue by comparing it to what’s “normal” for:
- That outlet (e.g., NYT vs. Fox)
- That topic (e.g., climate change vs. immigration)
This might use:
- Z-scores (how far a value is from the average)
- Percentiles (how it ranks relative to others)
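As a rough sketch of both options, the snippet below normalizes one raw cue value against a baseline of past values for the same outlet and topic. The baseline numbers are made up, and the helper names `z_score` and `percentile_rank` are illustrative assumptions.

```python
from statistics import mean, pstdev


def z_score(value, baseline):
    """Number of standard deviations between `value` and the baseline mean."""
    spread = pstdev(baseline)
    if spread == 0:
        return 0.0  # a baseline with no spread gives no meaningful deviation
    return (value - mean(baseline)) / spread


def percentile_rank(value, baseline):
    """Share of baseline values at or below `value`, from 0 to 1."""
    return sum(1 for b in baseline if b <= value) / len(baseline)


# Made-up baseline: relative position of the first counter-argument in
# earlier articles from one outlet on one topic.
baseline = [0.2, 0.3, 0.4, 0.5, 0.6]

z = z_score(0.8, baseline)
pct = percentile_rank(0.8, baseline)
print(round(z, 2), pct)
```

Z-scores keep the size of a deviation, while percentiles are less sensitive to outliers in the baseline. Which one fits better depends on how the cue is distributed.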
3. Aggregate cues per Polynym
Each Polynym is a specific type of pattern or bias (like Framing, Source Balance, etc.).
We combine the normalized cues into a single score for each Polynym using a formula like:
🧮 PDₚₒₗᵧₙᵧₘ = σ(∑ w × z + b)
Where:
- z = the normalized cue
- w = the weight (importance) of that cue
- b = a constant offset (the intercept of the formula, not a measure of media bias)
- σ = a squashing function (like the logistic function) that keeps the score between 0 and 1
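The formula can be sketched in a few lines, assuming σ is the logistic function. The cue names, weights, and offset below are invented for illustration. In practice the weights would have to be set by hand or fitted to labeled data.

```python
import math


def polynym_score(z, w, b):
    """PD = sigma(sum(w * z) + b), with sigma taken to be the logistic function."""
    total = sum(w[name] * z[name] for name in w) + b
    # Numerically stable logistic: avoids overflow for large negative totals.
    if total >= 0:
        return 1.0 / (1.0 + math.exp(-total))
    e = math.exp(total)
    return e / (1.0 + e)


# Hypothetical normalized cues (z) and weights (w) for a Framing Polynym.
z = {"late_counter_quote": 2.0, "no_primary_source_link": 1.0, "charged_headline_verb": -0.5}
w = {"late_counter_quote": 0.8, "no_primary_source_link": 0.6, "charged_headline_verb": 0.4}
b = -1.0

pd_framing = polynym_score(z, w, b)
print(round(pd_framing, 3))  # weighted sum is 1.0, so this prints 0.731
```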
4. Show top drivers
We don’t just give a score; we also show why the score is high.
Each cue’s contribution (w × z) tells us what’s driving the result.
Example:
“Late counter-quote” and “no primary source link” might be the biggest contributors to a Framing Bias score.
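A minimal sketch of this ranking, using made-up cue values and weights and an illustrative helper named `top_drivers`:

```python
def top_drivers(z, w, k=2):
    """Return the k cues with the largest contribution w * z, largest first."""
    contributions = {name: w[name] * z[name] for name in w}
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical normalized cues and weights for a Framing score.
z = {"late_counter_quote": 2.0, "no_primary_source_link": 1.0, "charged_headline_verb": -0.5}
w = {"late_counter_quote": 0.8, "no_primary_source_link": 0.6, "charged_headline_verb": 0.4}

drivers = top_drivers(z, w)
for name, contribution in drivers:
    print(f"{name}: {contribution:+.2f}")
```

Cues with a negative contribution pull the score down, so they appear at the bottom of the ranking and never count as drivers of a high score.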
5. Propose a label if score is high
If the score (PD) for a given Polynym crosses a threshold, we can tag the article with that Polynym's label.
Example:
If Framing PD is high → label the article as showing framing bias
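One way to sketch the thresholding step is shown below. The threshold values, the default of 0.7, and the function name `propose_labels` are assumptions. The text does not say what counts as "high", so real cutoffs would need to be chosen and validated separately.

```python
def propose_labels(scores, thresholds, default_threshold=0.7):
    """Return the Polynyms whose PD score reaches its threshold."""
    return [
        name
        for name, pd_score in scores.items()
        if pd_score >= thresholds.get(name, default_threshold)
    ]


# Hypothetical PD scores for one article and per-Polynym thresholds.
scores = {"Framing": 0.73, "Source Balance": 0.41}
thresholds = {"Framing": 0.7}

labels = propose_labels(scores, thresholds)
print(labels)
```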
6. Suggest counter-actions
Based on the top cues, we can recommend fixes.
Example suggestions:
- “Move counter-quote to paragraph 3”
- “Add a link to the original report in the lead paragraph”
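A simple way to produce such suggestions is a lookup from cue name to recommended fix, applied to the top drivers. The cue names and the mapping below are hypothetical. This is only a sketch of the idea, assuming each cue has one fixed suggestion.

```python
# Hypothetical mapping from cue name to a suggested fix.
SUGGESTIONS = {
    "late_counter_quote": "Move counter-quote to paragraph 3",
    "no_primary_source_link": "Add a link to the original report in the lead paragraph",
}


def suggest_fixes(drivers):
    """Map top drivers (name, contribution) to fixes, skipping cues that lower the score."""
    return [
        SUGGESTIONS[name]
        for name, contribution in drivers
        if contribution > 0 and name in SUGGESTIONS
    ]


# Made-up top drivers of a Framing score, as (cue name, w * z) pairs.
drivers = [
    ("late_counter_quote", 1.6),
    ("no_primary_source_link", 0.6),
    ("charged_headline_verb", -0.2),
]

fixes = suggest_fixes(drivers)
for fix in fixes:
    print("-", fix)
```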
