Audit of clock usage across sentry-java (as of b988b37): wall clock used to measure intervals, monotonic time converted to dates, and the dual-semantics ICurrentDateProvider interface. Each finding is classified actual bug vs theoretical with an urgency rating.
Found: 1 design flaw (root cause), 4 actual bugs, ~10 theoretical issues. Notably, every current ICurrentDateProvider pairing turned out to be correct — but only via careful manual wiring that nothing enforces.
A. Root design hazard
A1. ICurrentDateProvider has two implementations with different clock semantics — Design flaw · HIGH
CurrentDateProvider returns System.currentTimeMillis() (wall/epoch); AndroidCurrentDateProvider returns SystemClock.uptimeMillis() (monotonic, pauses in deep sleep) — same method name getCurrentTimeMillis().
Every consumer must hand-pick the impl matching whatever it compares against:
- AnrV2Integration / TombstoneIntegration deliberately use the wall one (warning comment: "AppExitInfo uses System.currentTimeMillis") to compare against ApplicationExitInfo.getTimestamp().
- LifecycleWatcher needs wall because it seeds lastUpdatedSession from Session.getStarted().getTime() (epoch).
- AndroidEnvelopeCache needs uptime because it subtracts TimeSpan.getStartUptimeMs() (uptime base) for startup-crash detection.
- ANRWatchDog bypasses both with an inline () -> SystemClock.uptimeMillis() lambda.
- RateLimiter and ReplayIntegration get the wall one even on Android (correct: both need epoch).
A wrong wiring is silent (tests inject fakes) and catastrophic — e.g. uptime fed into Timer.schedule(task, absoluteDate) produces a 1970 date that fires instantly.
Fix: split into two explicitly-named interfaces (e.g. wall epochMillis() vs monotonic elapsedMillis()), or rename methods per impl so a mismatch can't compile.
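The split proposed in the fix could look roughly like the sketch below. All type and method names here (EpochClock, MonotonicClock, StartupWindow) are hypothetical and do not exist in sentry-java; the point is only that a consumer measuring an interval declares the monotonic type, so passing a wall clock becomes a compile error rather than a silent mis-wiring.

```java
// Hypothetical names throughout; none of these types exist in sentry-java.

/** Wall clock: milliseconds since the Unix epoch. May jump when the clock is adjusted. */
interface EpochClock {
  long epochMillis();
}

/** Monotonic clock: milliseconds since an arbitrary origin. Only differences are meaningful. */
interface MonotonicClock {
  long elapsedMillis();
}

final class SystemEpochClock implements EpochClock {
  @Override
  public long epochMillis() {
    return System.currentTimeMillis();
  }
}

final class NanoTimeMonotonicClock implements MonotonicClock {
  @Override
  public long elapsedMillis() {
    // Assumption: on Android this could delegate to SystemClock.uptimeMillis() instead.
    return System.nanoTime() / 1_000_000L;
  }
}

/** Example consumer: it measures an interval, so it can only accept the monotonic type. */
final class StartupWindow {
  private final MonotonicClock clock;
  private final long startedAt;

  StartupWindow(MonotonicClock clock) {
    this.clock = clock;
    this.startedAt = clock.elapsedMillis();
  }

  boolean isWithin(long windowMillis) {
    return clock.elapsedMillis() - startedAt <= windowMillis;
  }
}
```

With this shape, new StartupWindow(new SystemEpochClock()) does not compile, which is the enforcement the current single interface lacks.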
B. Actual bugs
B1. java.util.Timer for transaction idle/deadline timeouts — MEDIUM-HIGH
SentryTracer.java#L102 (also scheduleFinish/scheduleDeadlineTimeout). Timer deadlines are wall-clock based and its internal Object.wait() does not progress during Android deep sleep:
- App backgrounded mid-ui.load → device sleeps before the 30s deadline fires → timer fires at wake (potentially hours later); forceFinish stamps unfinished spans with dateProvider.now() → multi-hour transactions/spans (the classic "absurdly long ui.load transaction" artifact).
- Wall-clock steps (NTP/user) shift firing on any platform.
Fix: schedule on SentryExecutorService (nanoTime-based delays); clamp finish timestamps when the deadline fires late.
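A minimal sketch of the two parts of this fix, using the plain JDK ScheduledExecutorService as a stand-in for SentryExecutorService. DeadlineSketch and both method names are hypothetical. The clamp shown is the simplest possible policy (never stamp a finish time later than start + deadline); the real fix may want a tolerance or a different source for the start time.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of sentry-java.
final class DeadlineSketch {

  /** Schedules the deadline as a relative delay instead of an absolute wall-clock Date. */
  static ScheduledFuture<?> scheduleDeadline(
      ScheduledExecutorService executor, long deadlineMillis, Runnable onDeadline) {
    return executor.schedule(onDeadline, deadlineMillis, TimeUnit.MILLISECONDS);
  }

  /**
   * Returns the timestamp to stamp on spans that are force-finished by the deadline.
   * If the callback ran late (for example after the device woke up), the result is
   * capped at the intended deadline instead of the actual fire time.
   */
  static long clampFinishEpochMillis(
      long startEpochMillis, long deadlineMillis, long fireEpochMillis) {
    return Math.min(fireEpochMillis, startEpochMillis + deadlineMillis);
  }
}
```

For a transaction started at epoch 1 000 ms with a 30 s deadline whose callback only runs two hours later, the clamp yields 31 000 ms rather than the wake-up time.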
B2. Session-end timer has the same Timer mechanics — MEDIUM
LifecycleWatcher.java#L121. Device sleeps within the 30s background window → session ends only at wake; Session.end() stamps wake time → inflated session durations in release health; replay stop() and ContinuousProfiler.close(false) also run hours late. The foreground check lastUpdatedSession + sessionIntervalMillis <= now is also a wall-clock interval (clock step → spurious or missed session rotation).
B3. Session Replay: wall clock used for all interval math — MEDIUM (actual bug when the clock steps mid-recording)
Epoch is required for RRWeb payload timestamps (that part is correct), but the same wall values also drive windows/durations:
- SessionCaptureStrategy.kt#L81 (+L106, L162): segment durations and the 1h max-session deadline are now - startEpoch diffs.
- BufferCaptureStrategy trim-to-last-30s and ReplayCache.createVideoOf iterate epoch-millis windows; frame files are named by epoch millis.
- ReplayGestureConverter.kt#L56: timeOffset = now - touchMoveBaseline.
A backward step mid-recording → frames "newer than now": trim can wipe valid frames, segment windows miss/duplicate frames, gesture offsets go negative. Forward step → premature 1h cutoff. NTP/carrier/user steps on phones are realistic.
Fix: keep epoch in RRWeb payloads, drive windows/trim from a monotonic clock with one epoch anchor per segment.
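One way to read "one epoch anchor per segment" is sketched below in Java (the replay module itself is Kotlin). SegmentClock is a hypothetical name. Callers pass clock readings in explicitly so the class stays testable; which monotonic source is appropriate on Android (one that counts deep sleep or one that does not) is left open here.

```java
// Hypothetical helper, not part of sentry-java.
final class SegmentClock {
  private final long anchorEpochMillis; // wall clock, read once when the segment starts
  private final long anchorNanos;       // monotonic clock, read at the same moment

  SegmentClock(long epochMillisNow, long nanoTimeNow) {
    this.anchorEpochMillis = epochMillisNow;
    this.anchorNanos = nanoTimeNow;
  }

  /** Monotonic offset since the segment started; unaffected by wall-clock steps. */
  long offsetMillis(long nanoTimeNow) {
    return (nanoTimeNow - anchorNanos) / 1_000_000L;
  }

  /** Epoch timestamp for payloads, derived from the anchor plus the monotonic offset. */
  long toEpochMillis(long nanoTimeNow) {
    return anchorEpochMillis + offsetMillis(nanoTimeNow);
  }

  /** Window and deadline checks use only the monotonic offset. */
  boolean isExpired(long nanoTimeNow, long maxDurationMillis) {
    return offsetMillis(nanoTimeNow) >= maxDurationMillis;
  }
}
```

In production the arguments would come from System.currentTimeMillis() and System.nanoTime() (or an Android equivalent); timestamps within a segment then stay ordered even if the wall clock steps mid-recording.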
B4. Cron check-in durations measured with wall clock — LOW-MEDIUM
CheckInUtils.java#L64 and the same pattern in SentryCheckInAdvice (sentry-spring, -jakarta, -7): duration = currentTimeMillis() - start. Cron jobs run long → wide exposure to clock steps → wrong/negative durations. Pure interval → should be System.nanoTime().
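As an illustration of the nanoTime pattern, a hedged sketch follows. CheckInTiming and Timed are hypothetical names, and reporting the duration in seconds as a double is an assumption about what the check-in payload expects.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of sentry-java.
final class CheckInTiming {

  /** Result of a timed job: its return value plus the elapsed time in seconds. */
  static final class Timed<T> {
    final T result;
    final double durationSeconds;

    Timed(T result, double durationSeconds) {
      this.result = result;
      this.durationSeconds = durationSeconds;
    }
  }

  /** Runs the job and measures it with System.nanoTime(), which wall-clock steps do not affect. */
  static <T> Timed<T> time(Supplier<T> job) {
    long startNanos = System.nanoTime();
    T result = job.get();
    long elapsedNanos = System.nanoTime() - startNanos;
    return new Timed<>(result, elapsedNanos / 1_000_000_000.0);
  }
}
```

Because both readings come from the same monotonic source within one JVM, the difference cannot go negative when NTP or a user adjusts the system time during a long-running job.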
C. Theoretical issues (need a clock step / edge condition)
- AndroidProfiler measurement re-anchoring · LOW-MEDIUM — AndroidProfiler.java#L309: timestampDiff (elapsedRealtime↔wall offset) computed once at profile end and applied to every wall-stamped PerformanceCollectionData sample → a wall step during the profile shifts all earlier CPU/memory samples relative to the trace. Fix: stamp samples with elapsedRealtimeNanos on Android.
- Session seq is raw epoch millis · LOW-MEDIUM — Session.java#L309: backward step between updates → newer update has smaller seq → server can discard the latest session state (lost end/error counts). Also calculateDurationTime uses Math.abs, masking negative durations. Fix: seq = max(prevSeq + 1, now).
- RateLimiter wall-clock deadlines · LOW — retry-after stored as epoch Dates (self-consistent), but a backward step silently extends the drop window; the "limit lifted" observer callback is Timer.schedule(task, absoluteDate) → shifted by steps (continuous profilers resume late).
- App-start anchor projection · LOW — TimeSpan (uptime + wall anchor) is the right pattern, but the wall anchor is captured once and setStartedAt() back-projects assuming no step since process start; NTP sync shortly after boot shifts app-start span timestamps relative to later-anchored spans.
- SpanFrameMetricsCollector.toNanoTime() · LOW — re-anchors wall-based SentryLongDates into the nanoTime base using the current offset; wrong by any step since the date was created, and across deep sleep → frames attributed to wrong span windows.
- Wall-clock TTLs/cleanup · LOW — HostnameCache 5h TTL; Sentry.classCreationTimestamp vs File.lastModified() for profiling-traces cleanup; CacheStrategy envelope rotation ordered by lastModified(); DefaultCompositePerformanceCollector 30s auto-stop via wall diff + sampling on java.util.Timer.
- Informational — Breadcrumb.compareTo orders by captured System.nanoTime() (restored-from-disk breadcrumbs get fresh nanos at parse → cross-restart ordering is parse-order); cross-type SentryDate arithmetic (SentryNanotimeDate vs SentryLongDate) silently degrades to ms-precision wall math.
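The seq = max(prevSeq + 1, now) fix from the Session seq item above can be sketched as follows. SessionSequence is a hypothetical name, and the synchronization is an assumption about how updates might be serialized, not a description of the existing Session class.

```java
// Hypothetical helper, not part of sentry-java.
final class SessionSequence {
  private long lastSeq;

  SessionSequence(long initialSeq) {
    this.lastSeq = initialSeq;
  }

  /**
   * Returns a strictly increasing sequence number. It tracks epoch millis while the
   * wall clock moves forward and falls back to prevSeq + 1 after a backward step.
   */
  synchronized long next(long nowEpochMillis) {
    lastSeq = Math.max(lastSeq + 1, nowEpochMillis);
    return lastSeq;
  }
}
```

Starting from 0, the readings 1000, 900 (a backward step), and 2000 produce 1000, 1001, and 2000, so a later update never carries a smaller seq than an earlier one.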
D. Checked and confirmed correct (coverage)
ANRWatchDog (all-uptime); AnrV2/Tombstone 90-day threshold (wall vs wall); AndroidEnvelopeCache startup-crash window (uptime vs uptime); AndroidConnectionStatusProvider cache TTL (uptime); Debouncer on uptime (no events during sleep anyway); DeviceInfoUtil boot time (wall − elapsedRealtime); AndroidCpuCollector (elapsedRealtimeNanos deltas); AndroidProfiler per-frame clock conversion; span/transaction SentryNanotimeDate anchor pattern; LoggerBatchProcessor/BackpressureMonitor (ScheduledExecutorService); OkHttp HTTP_START/END_TIMESTAMP (deliberately epoch for RRWeb). Swept clean: apollo*, graphql*, kafka, quartz, spotlight, reactor, ndk, fragment, navigation, distribution, jul/logback/log4j2, async-profiler.
Suggested fix order
- A1 — split/rename ICurrentDateProvider semantics (prevents all future regressions; internal API).
- B1 + B2 — replace java.util.Timer with SentryExecutorService for tracer idle/deadline + session-end (and the RateLimiter "limit lifted" observer timer from section C); clamp late-fire timestamps.
- B3 — monotonic windows in replay capture strategies.
- B4 — nanoTime for check-in durations (CheckInUtils + 3 spring files).
- C1/C2 — profiler measurement stamping; session seq monotonicity.
- Rest are doc comments / opportunistic.
Verification idea for fixes: unit tests already inject ICurrentDateProvider fakes everywhere — add cases simulating backward/forward clock steps and assert intervals are unaffected.
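A clock-step test could be built on a fake along these lines. SteppableClock is a hypothetical test double; the fakes already used in sentry-java's tests may be shaped differently. The key property is that "time passing" moves both clocks while a "step" moves only the wall clock.

```java
// Hypothetical test double, not part of sentry-java.
final class SteppableClock {
  private long epochMillis;
  private long monotonicNanos;

  SteppableClock(long epochMillis, long monotonicNanos) {
    this.epochMillis = epochMillis;
    this.monotonicNanos = monotonicNanos;
  }

  /** Real time passes: both clocks advance together. */
  void advanceMillis(long millis) {
    epochMillis += millis;
    monotonicNanos += millis * 1_000_000L;
  }

  /** Simulates an NTP or user adjustment: only the wall clock moves. */
  void stepWallMillis(long deltaMillis) {
    epochMillis += deltaMillis;
  }

  long currentTimeMillis() {
    return epochMillis;
  }

  long nanoTime() {
    return monotonicNanos;
  }
}
```

A test would record both readings, call advanceMillis(5_000) followed by stepWallMillis(-3_600_000), and assert that the code under test still reports a 5 s interval; a wall-clock diff over the same span would come out as -3 595 000 ms.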