Why the sleep-window learner kept collapsing to the wrong four hours
The v1 sleep-window learner for a side-project safety app produced a sleep window of 01:00–05:00 for me. My actual sleep is more like 23:00–07:00. The pre-alert cron was firing at 05:00 and 06:05 and I was tired of pulling my phone out from under the pillow to silence it.
What was happening
v1 scored every hour of the day (0..23) using
rawScore[h] = 0.4 * activityScore[h] + 0.6 * gapScore[h]
then searched windows of duration 4..12 hours and picked the window with the lowest mean score. The lowest mean over a 4-hour window almost always beats the lowest mean over an 8-hour window — because the 4-hour window can fit entirely inside the deepest valley while the 8-hour window has to include shoulders.
So the learner collapsed to the narrowest possible quiet stretch. For me, that was 01:00–05:00, the dead-quiet middle of my real sleep, and the alerting logic treated 05:00 to 07:00 as awake hours.
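The collapse is easy to reproduce on a made-up profile. The sketch below is not the app's code: it is a minimal Python re-creation of the v1 search as described above, with hypothetical function names and an invented score profile in which a lower score means a quieter hour.

```python
def window_mean(scores, start, duration):
    """Mean of `duration` consecutive hourly scores from `start`, wrapping past midnight."""
    return sum(scores[(start + i) % 24] for i in range(duration)) / duration


def learn_window_v1(scores, durations=range(4, 13)):
    """Return (start, duration) of the window with the lowest mean score."""
    candidates = [
        (window_mean(scores, start, duration), duration, start)
        for duration in durations
        for start in range(24)
    ]
    _, duration, start = min(candidates)
    return start, duration


# Invented profile (assumption): low score = quiet. Real sleep is 23:00-07:00,
# deepest at 01:00-05:00, with slightly noisier shoulders on either side.
scores = [1.0] * 24
for h in (23, 0, 5, 6):
    scores[h] = 0.3  # shoulders: falling asleep, early stirring
for h in (1, 2, 3, 4):
    scores[h] = 0.1  # deepest valley

start, duration = learn_window_v1(scores)
print(f"{start:02d}:00 for {duration}h")  # 01:00 for 4h
```

Every window longer than four hours has to take in at least one 0.3 shoulder, so its mean is strictly worse than the 0.1 of the valley on its own.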
What I found
Two compounding issues:
- Scoring by mean rewards short windows. The sweep over durations 4..12 did run, but the mean let the shortest duration win almost every time, which defeated the point of sweeping.
- A single noisy hour in the middle of sleep (a 3am pee-trip ping) distorted the boundary heavily.
The fix
v2 changes both:
// 3-hour rolling mean of the raw score — one anomalous hour
// no longer distorts the boundary
for ($h = 0; $h < 24; $h++) {
    $sleepScore[$h] = (
        $rawScore[($h - 1 + 24) % 24] +
        $rawScore[$h] +
        $rawScore[($h + 1) % 24]
    ) / 3.0;
}
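// Score windows by SUM, not mean. Bias toward 8h windows.
// The multiplier shrinks the score of off-8h durations, so it only acts
// as a penalty if a higher $sleepScore means "more sleep-like" and the
// best window is the highest-scoring one. That is assumed here.
$best = null;
foreach (range(6, 10) as $duration) {
    foreach (range(0, 23) as $start) {
        $score = sumWindow($sleepScore, $start, $duration);
        $score *= 1 - abs($duration - 8) * 0.08;
        if ($best === null || $score > $best['score']) {
            $best = ['start' => $start, 'duration' => $duration, 'score' => $score];
        }
    }
}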
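To see the two changes working together, here is a minimal Python sketch of the same idea on invented data. It is not the app's code: the function names are hypothetical and `sum_window` is a guess at what `sumWindow` does. It also assumes a higher score means a more sleep-like hour, so that the best window is the one with the largest biased sum.

```python
def smooth(raw):
    """3-hour circular rolling mean; Python's % already wraps -1 to 23."""
    return [(raw[(h - 1) % 24] + raw[h] + raw[(h + 1) % 24]) / 3.0 for h in range(24)]


def sum_window(scores, start, duration):
    """Sum of `duration` consecutive hourly scores from `start`, wrapping past midnight."""
    return sum(scores[(start + i) % 24] for i in range(duration))


def learn_window_v2(raw, durations=range(6, 11), preferred=8, penalty=0.08):
    """Return (sleep_start, sleep_end) for the window with the highest biased sum."""
    sleep_score = smooth(raw)
    best = None
    for duration in durations:
        for start in range(24):
            score = sum_window(sleep_score, start, duration)
            score *= 1 - abs(duration - preferred) * penalty  # prior toward 8h
            if best is None or score > best[0]:
                best = (score, start, duration)
    _, start, duration = best
    return start, (start + duration) % 24


# Invented profile (assumption): 1.0 = looks like sleep, 0.0 = looks awake.
# Sleep runs 23:00-07:00 with one noisy hour at 03:00.
raw = [0.0] * 24
for h in (23, 0, 1, 2, 4, 5, 6):
    raw[h] = 1.0
raw[3] = 0.2  # the 3am ping

print(learn_window_v2(raw))  # (23, 7)
```

On this profile the 8-hour window starting at 23:00 scores about 6.53. The best 9-hour window sums to more (about 6.87) but drops to about 6.32 after the 0.92 multiplier, so the length prior is what stops the sum from simply growing with duration.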
Also added a grace ramp at wake time. The old code went binary
("in sleep" → 10h threshold, "out of sleep" → 2h threshold) at
the exact hour sleep_end_hour. So at 07:01, a user who had
been quiet since midnight (a perfectly normal 7-hour gap) was
suddenly compared to a 2-hour threshold and pre-alerted. The
fix is a 1-hour linear ramp from sleep threshold down to active
threshold:
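// $nowHour is assumed to be fractional (07:15 is 7.25). With whole
// hours $t is always 0 and the ramp collapses to a one-hour step.
// The subtraction does not wrap past midnight, so this assumes the
// evening part of the sleep window is handled before this point.
$hoursIntoMorning = $nowHour - $sleepEndHour;
if ($hoursIntoMorning >= 0 && $hoursIntoMorning < 1.0) {
    $t = $hoursIntoMorning; // 0..1 across the ramp hour
    $threshold = $sleepThreshold + $t * ($activeThreshold - $sleepThreshold);
} else {
    $threshold = $hoursIntoMorning < 0 ? $sleepThreshold : $activeThreshold;
}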
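A few concrete values make the ramp easier to check. The Python sketch below mirrors the snippet above using the 10h and 2h thresholds mentioned earlier. The function name and the `ramp_hours` parameter are hypothetical, the hour is assumed to be fractional, and like the original it does not wrap past midnight.

```python
def alert_threshold(now_hour, sleep_end_hour,
                    sleep_threshold=10.0, active_threshold=2.0, ramp_hours=1.0):
    """Inactivity threshold in hours, ramped linearly after wake time.

    Assumes now_hour is fractional (07:15 -> 7.25) and that hours before
    midnight inside the sleep window are handled by the caller.
    """
    hours_into_morning = now_hour - sleep_end_hour
    if hours_into_morning < 0:
        return sleep_threshold
    if hours_into_morning < ramp_hours:
        t = hours_into_morning / ramp_hours  # 0..1 across the ramp
        return sleep_threshold + t * (active_threshold - sleep_threshold)
    return active_threshold


for label, now in [("06:30", 6.5), ("07:00", 7.0), ("07:15", 7.25),
                   ("07:30", 7.5), ("08:00", 8.0)]:
    print(label, alert_threshold(now, sleep_end_hour=7))
# 06:30 10.0
# 07:00 10.0
# 07:15 8.0
# 07:30 6.0
# 08:00 2.0
```

If the pre-alert fires when the quiet gap exceeds the threshold (an assumption about the surrounding logic), a user quiet since midnight is fine at 07:15 (gap 7.25h against 8.0h) but would cross the line at about 07:20, where 7 + t equals 10 - 8t. The ramp removes the cliff at 07:01; how much slack it gives depends on the ramp length.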
Validated by forcing a re-learn against my 90 days of activity. v2
produced sleep_start=23, sleep_end=7, which matched a quick
Python simulation I'd written against the same data.
What I'd do differently
The v1 mistake was using the same metric for ranking that I used
for filtering. Search over duration with a length-normalized metric
like the mean and you get the shortest viable duration almost every
time. If you want to compare windows of different lengths, you have
to either fix the length and compare positions, or use a metric that
scales with length (sum) plus an explicit prior for the length you
want (my 1 - |duration-8| * 0.08 bias).
I also should have shipped the grace ramp the day I shipped the window — the binary cliff at wake time was obvious in retrospect and not obvious to me at the time.