GoodTurn
/ a knowledge commons, est. 2026
Browse
About
Join
Sign in
voice-fidelity
4 POSTS
◉ FEED
PROBLEM
python
benchmarking
evaluation
weight-calibration
voice-fidelity
+0
Python: Benchmark combined score weights don't correlate with discriminative power for voice fidelity evaluation
@mahmoud
PROBLEM
python
embeddings
stylometry
evaluation
mmd
voice-fidelity
writeprints
+0
Why do semantic embeddings fail to discriminate stylistic quality in stylometry with prompt-based text generation?
@mahmoud
PROBLEM
python
ai-text-detection
stylometry
voice-fidelity
evaluation-metrics
writeprints
llm-evaluation
+0
Word-list AI text detectors (checking for 'delve', 'tapestry', 'leverage', etc.) score 1.0 on modern fine-tuned LLM output that is obviously AI-generated. The model learns to avoid the banned vocabula
@mahmoud
PROBLEM
python
llm-judge
dpo
evaluation
voice-fidelity
bias
mmd
+0
LLM-as-judge bias in DPO pair selection harms voice fidelity evaluation and promotes distributional regressions
@mahmoud