Reviews Kotlin code for common structured-concurrency violations: leaked scopes, runBlocking in the wrong place, caught exceptions that swallow cancellation, orphaned coroutines.
Best for: Kotlin engineers shipping concurrent code who want a second set of eyes on scope management.
Creator's repository · chrisbanes/skills
License: Apache-2.0
---
name: kotlin-coroutines-structured-concurrency
description: Use when writing or reviewing Kotlin code that stores CoroutineScope, launches from init/non-suspending APIs, calls runBlocking, or catches broad exceptions around suspend calls.
---
# Kotlin coroutines: structured concurrency
## Core principle
A well-structured coroutine is a self-contained unit of asynchronous work — single entry, single exit, scoped to a lifecycle known at the call site.
**Scopes should usually be tied to the caller's lifecycle, not stored as a property on the callee.** A stored `CoroutineScope` is a strong review signal: the class must prove it owns cancellation, error reporting, restart behavior, and lifecycle. Most repositories, managers, use cases, and data sources cannot prove that, so they should expose `suspend` APIs instead.
The fix is almost always the same: **make the API `suspend` and let the caller own the scope.**
## When to use this skill
You're writing or reviewing Kotlin code and you see any of these:
- A class with `private val scope: CoroutineScope` (constructor param stored as a property)
- An `init { scope.launch { ... } }` block
- A non-suspending public function whose body is `scope.launch { ... }`
- `runBlocking { ... }` in suspend-capable application code, or in tests where `runTest` should apply
- `runCatching { suspendCall() }` or a `catch` on `Exception` / `Throwable` around a `suspend` call without rethrowing `CancellationException`
- A `catch (e: CancellationException)` (or equivalent) around suspension that does not rethrow
## The silent-cancellation bug
The reason an unowned `CoroutineScope` property is so dangerous: **once a scope is cancelled, every future `launch` on it silently completes as cancelled — no exception, no log, nothing.** The work just doesn't happen. This is one of the hardest coroutine bugs to diagnose, and it appears when a class holds a long-lived reference to a lifecycle it does not own.
If APIs are `suspend`, this can't happen: the caller's scope is either alive (work runs) or the call site cancels (the caller knows).
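A minimal sketch of that failure mode, assuming `kotlinx-coroutines-core` is on the classpath. The helper name `launchOnCancelledScope` is hypothetical and exists only for this illustration.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.cancel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper: returns (did the body run?, was the job cancelled?).
fun launchOnCancelledScope(): Pair<Boolean, Boolean> = runBlocking {
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    scope.cancel() // e.g. the real owner of the lifecycle went away

    val ran = AtomicBoolean(false)
    // No exception is thrown here: launch hands back an already-cancelled Job.
    val job = scope.launch { ran.set(true) }
    job.join()

    ran.get() to job.isCancelled
}

fun main() {
    val (ran, cancelled) = launchOnCancelledScope()
    println("body ran: $ran, job cancelled: $cancelled") // body ran: false, job cancelled: true
}
```

Nothing at the `launch` call site signals the problem; the only evidence is the state of the returned `Job`, which fire-and-forget callers never inspect.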
## Anti-patterns and fixes
### 1. CoroutineScope stored as a property
```kotlin
// ❌ BAD
@Inject
class UserRepository(
    private val scope: CoroutineScope,
    private val api: UserApi,
) {
    fun refresh() {
        scope.launch { _state.value = api.fetchUser() }
    }
}

// ✅ GOOD
@Inject
class UserRepository(
    private val api: UserApi,
) {
    suspend fun refresh(): User = api.fetchUser()
}
```
The repository no longer needs to know about scopes at all. The caller (a ViewModel, a use case) decides which scope to use, how errors are handled, and what the cancellation semantics are.
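As a rough sketch of what the caller gains, the plain-Kotlin example below calls the suspending `refresh()` and handles the failure itself. `FakeUserApi` and `describeRefresh` are hypothetical names introduced for illustration, and the `User` and `UserApi` shapes are assumed.

```kotlin
import kotlinx.coroutines.runBlocking
import java.io.IOException

// Assumed shapes for the types used in the example above.
data class User(val name: String)
interface UserApi { suspend fun fetchUser(): User }

class UserRepository(private val api: UserApi) {
    suspend fun refresh(): User = api.fetchUser()
}

// Hypothetical fake: succeeds with "Alice" unless a failure is supplied.
class FakeUserApi(private val failure: IOException? = null) : UserApi {
    override suspend fun fetchUser(): User = failure?.let { throw it } ?: User("Alice")
}

// The caller decides how a failure is reported; here it becomes a message.
// Only IOException is caught, so cancellation still propagates.
suspend fun describeRefresh(repo: UserRepository): String =
    try {
        "loaded ${repo.refresh().name}"
    } catch (e: IOException) {
        "failed: ${e.message}"
    }

fun main() = runBlocking {
    println(describeRefresh(UserRepository(FakeUserApi())))                       // loaded Alice
    println(describeRefresh(UserRepository(FakeUserApi(IOException("offline"))))) // failed: offline
}
```

With the `scope.launch` version, neither the result nor the `IOException` would ever reach this call site.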
### 2. init-block launches
```kotlin
// ❌ BAD: construction-time side effect, unbounded work
class UserSession(private val scope: CoroutineScope, private val api: Api) {
    init { scope.launch { _user.value = api.load() } }
}
```
The constructor returns immediately. The caller can't `await` the load, can't see errors, can't cancel. The class is "alive" but its state is undefined.
```kotlin
// ✅ GOOD: explicit bootstrap, caller owns the suspension
class UserSession(private val api: Api) {
    private var _user: User? = null
    val user: User get() = checkNotNull(_user) { "Call init() first" }

    suspend fun init() { _user = api.load() }
}
```
### 3. Fire-and-forget from non-UI classes
This is a non-suspending public function on a **non-UI class** (repository, manager, use case, data source) that launches into a class-owned scope. The caller gets no result, no error, no cancellation, and no guarantee the work ever ran.
```kotlin
// ❌ BAD: repository with stored scope and fire-and-forget public API
class AnalyticsClient(private val scope: CoroutineScope, private val api: Api) {
    fun track(event: Event) {
        scope.launch { api.send(event) } // caller has no idea what happens
    }

    fun signOut() {
        scope.launch { api.signOut() } // silent failure if scope cancelled
    }
}
```
```kotlin
// ✅ GOOD
class AnalyticsClient(private val api: Api) {
    suspend fun track(event: Event) = api.send(event)
    suspend fun signOut() = api.signOut()
}
```
#### Carve-out: the UI ↔ state-holder boundary
UI frameworks are non-suspending. A Composable's `onClick`, a View's `OnKeyListener.onKey`, an Activity's `onNewIntent`: none can `suspend`. The state holder (ViewModel, Decompose Component, feature model, or anything else whose role is to absorb UI events and hold UI state) **is** the boundary that translates one-shot UI events into asynchronous work bound to the UI lifecycle. That's its job.
```kotlin
// ✅ GOOD: state holder absorbs a non-suspending UI event onto its scope
class FavouritesViewModel(private val repo: FavouritesRepository) : ViewModel() {
    fun onToggleFavourite(item: Item) {
        viewModelScope.launch { repo.toggleFavourite(item) }
    }
}

// in Compose:
ListItem(onClick = { viewModel.onToggleFavourite(item) })
```
This is **not** the fire-and-forget anti-pattern. All three conditions must hold:
1. **State holder for a UI surface** — a ViewModel, Decompose Component, feature model, or equivalent UI state holder. Not a repository, manager, use case, or data source.
2. **Lifecycle-bound scope** — `viewModelScope`, a Component's `coroutineScope` that's cancelled on destroy, a Composable's `rememberCoroutineScope()`. Not `AppScope`, not an injected long-lived scope, not an ad-hoc `CoroutineScope(...)`.
3. **Caller really is a UI event** — Composable callback, key handler, lifecycle hook. Not another business-logic class calling through the state holder.
The repository / use case / data source layers underneath still expose `suspend` APIs. The state holder is the *only* layer where the non-suspending → suspending translation belongs.
"It feels like a state holder" isn't enough. The question is "does the UI directly bind to this?" If no, the carve-out doesn't apply.
### 4. Stored scopes that aren't injected
The same anti-pattern, without an injected scope:
```kotlin
// ❌ BAD: same problem, scope is constructed in-class instead of injected
class FooManager {
    private val scope = MainScope()
    // or, equally problematic:
    // private val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
}
```
Nothing owns the scope's lifecycle now, so it lives forever. Replace with `suspend` APIs.
The same is true if the instantiation is nested inside a function body — `fun foo() { CoroutineScope(...).launch { … } }` is just a stored scope with extra steps. Each call leaks a new uncancellable scope; bundling it into a `by lazy` property doesn't fix the underlying issue (the scope shouldn't exist at all).
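The difference can be observed directly. The sketch below assumes `kotlinx-coroutines-core`; `leakyWork`, `structuredWork`, and the `completesDespiteCancel` harness are hypothetical names. It cancels the caller and then checks whether the work still finished.

```kotlin
import kotlinx.coroutines.*

// ❌ Ad-hoc scope: the work has no parent, so cancelling the caller never reaches it.
fun leakyWork(done: CompletableDeferred<Unit>) {
    CoroutineScope(Dispatchers.Default).launch {
        delay(50)
        done.complete(Unit)
    }
}

// ✅ Structured: the work is a child of whoever calls this function.
suspend fun structuredWork(done: CompletableDeferred<Unit>) {
    coroutineScope {
        launch {
            delay(50)
            done.complete(Unit)
        }
    }
}

// Hypothetical harness: run `block` in a caller coroutine, cancel that caller,
// then report whether the work completed anyway.
fun completesDespiteCancel(block: suspend (CompletableDeferred<Unit>) -> Unit): Boolean =
    runBlocking {
        val done = CompletableDeferred<Unit>()
        // UNDISPATCHED runs the caller up to its first suspension before we cancel it.
        val caller = launch(start = CoroutineStart.UNDISPATCHED) {
            block(done)
            awaitCancellation()
        }
        caller.cancelAndJoin()
        withTimeoutOrNull(500) { done.await() } != null
    }

fun main() {
    println("leaky: ${completesDespiteCancel { leakyWork(it) }}")           // leaky: true
    println("structured: ${completesDespiteCancel { structuredWork(it) }}") // structured: false
}
```

The leaked coroutine outlives its cancelled caller, while the structured version stops with it.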
### 5. DI-bound singletons / initializers that launch
A specific pattern that is hard to spot: a DI-bound class (`@SingleIn(AppScope)`, `@Singleton`, an `Initializer.initialize()`) launches a coroutine from its constructor / `init` block / `initialize()`. The launched work then has:
- **A non-deterministic start time** — whenever the graph realizes the binding. Cold-start ordering is invisible.
- **No observable lifecycle.** Nothing else in the codebase can see whether it's running or has crashed.
- **No `stop()` / restart path.** If upstream enters a bad state, the loop is uncancellable.
- **No calling code to grep for.** Readers can't find "who starts this and when".
§1 says scopes should be tied to the caller's lifecycle. The DI-bound variant violates this indirectly: the *scope* may be injected, but the *launch* is hidden inside construction — same effect, harder to see.
```kotlin
// ❌ BAD: singleton boots work as a side effect of being constructed
@SingleIn(AppScope::class)
@Inject
class TokenRefresher(
    @ForScope(AppScope::class) private val scope: CoroutineScope,
    private val auth: AuthService,
) {
    init {
        scope.launch {
            while (isActive) {
                delay(5.minutes)
                auth.refreshIfNeeded()
            }
        }
    }
}

// ❌ ALSO BAD: Initializer.initialize() that *launches*, not just registers
class TokenInvalidatorInitializer @Inject constructor(
    @ForScope(AppScope::class) private val scope: CoroutineScope,
    private val store: AuthStore,
    private val invalidator: TokenInvalidator,
) : Initializer {
    override fun initialize() {
        scope.launch { store.tokenChanges.collect { invalidator.invalidate() } }
    }
}
```
Both look like "application-scoped singletons", but the **When NOT to apply** carve-out is *not* permission to launch from `init` / `initialize()`. It's permission for a singleton to own a scope when its API is suspending.
#### First ask: does this background-loop class need to exist at all?
Most background-loop classes exist only because no one inverted the observation. Three answers, in order of preference:
**Pattern 1 — invert into the consumer.** The class observes state forever to react when it changes. But *someone* mutates the state — sign-out flow, profile switch, flag-update handler. That mutation site is already in a coroutine context and is the natural place to do the work directly.
```kotlin
// ✅ GOOD: no background loop, no scope, no class. The mutation site does the work.
class Authenticator(
    private val authStore: AuthStore,
    private val tokenInvalidator: TokenInvalidator,
) {
    suspend fun signOut() {
        authStore.clearTokens()
        tokenInvalidator.invalidate() // direct call at the mutation site
    }
}
```
The background-loop class is **deleted**. The work happens where the state changes.
When this applies: the consumer of the state has a clear lifecycle (a use case, an Authenticator, a service handler) and can perform the reaction inline.
**Pattern 2 — scheduled work.** Genuinely periodic or deferred. Use WorkManager / BGTaskScheduler. The enqueue is one-shot; make it suspending and call it once from an orchestrator that already runs at startup.
**Pattern 3 — explicit named launch site.** Sometimes the consumer is a synchronous API with no observable lifecycle (e.g., OpenTelemetry's `Sampler.shouldSample(...)`, an AIDL stub fanout, a broadcast receiver bridge). The observation has to live somewhere coroutine-aware, but it must live at an *explicit named call site* — not in the class's own `init`.
```kotlin
// ✅ GOOD: work is named; an explicit call site owns the launch
@SingleIn(AppScope::class)
class OtelConfigurableSampler(...) : Sampler {
    @Volatile private var delegate: Sampler = ...

    suspend fun observeRate(featureFlags: FeatureFlags) {
        featureFlags.observe(OTEL_SAMPLING_RATE).collect { rate ->
            delegate = Sampler.traceIdRatioBased(rate.coerceIn(0.0, 1.0))
        }
    }

    override fun shouldSample(...) = delegate.shouldSample(...)
}

// wired explicitly at the OTel SDK init module:
applicationScope.launch { otelSampler.observeRate(featureFlags) }
```
When this applies: the consumer is a synchronous API that calls *into* you with no observable lifecycle. The launch can't be inverted into the consumer, but it must still be visible at a named call site.
#### Test for which pattern fits
"Is the consumer's lifecycle observable to me?"
- **Yes, and they're already in a coroutine context** → Pattern 1. Push the subscription into them; delete the background-loop class.
- **The work is periodic / deferred** → Pattern 2. Suspend enqueue called once.
- **No, they're a synchronous API with no observable lifecycle** → Pattern 3. Explicit launch site, not `init`.
If a fourth answer seems to fit — e.g., "I want a `Bootable` interface that launches everything for me" — that's the same anti-pattern with an extra layer of abstraction. The whole point is that launches be *visible*; auto-discovery by interface defeats it.
#### Initializers are still fine — *if they only register*
The `Initializer` pattern is correct when `initialize()` *registers* a listener or hook. The bug is when `initialize()` *launches* a coroutine.
```kotlin
// ✅ GOOD Initializer: registers a contributor, doesn't launch
class FavouritesContributorInitializer @Inject constructor(
    private val registry: ContributorRegistry,
    private val favouritesContributor: FavouritesContributor,
) : Initializer {
    override fun initialize() {
        registry.register(favouritesContributor)
    }
}
```
**`Initializer.initialize()` must not `launch` a coroutine.** If yours does, it's a Pattern 1/2/3 candidate.
#### Diagnostic for review
- Where is the start moment defined? If "wherever DI realizes me", bad.
- Who can observe whether the work is running? If "no one", bad.
- Who can stop or restart it? If "no one", bad.
- Can a reader grep for the launch site? If no, bad.
If the answers are "the consumer / the orchestrator / the named call site" — you're good.
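One way a class can pass this diagnostic is sketched below, using the `TokenRefresher` idea from earlier in this section. It assumes `kotlinx-coroutines-core`; `CountingAuthService`, the `intervalMs` parameter, and the `runThenStop` orchestrator are hypothetical stand-ins, and the short interval exists only to keep the demo fast.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for AuthService that counts refresh calls.
class CountingAuthService {
    val refreshes = AtomicInteger()
    suspend fun refreshIfNeeded() { refreshes.incrementAndGet() }
}

class TokenRefresher(private val auth: CountingAuthService, private val intervalMs: Long) {
    // Loops until the calling coroutine is cancelled; delay() is the cancellation point.
    suspend fun run() {
        while (true) {
            delay(intervalMs)
            auth.refreshIfNeeded()
        }
    }
}

// Hypothetical startup orchestrator: the launch is explicit and the Job is held,
// so the work can be observed and stopped. Returns (count at stop, count later).
fun runThenStop(): Pair<Int, Int> = runBlocking {
    val auth = CountingAuthService()
    val refresher = TokenRefresher(auth, intervalMs = 10)

    val refreshJob = launch { refresher.run() } // the one greppable launch site
    delay(100)
    refreshJob.cancelAndJoin()                  // an explicit stop path

    val atStop = auth.refreshes.get()
    delay(50)                                   // nothing should tick after the stop
    atStop to auth.refreshes.get()
}

fun main() {
    val (atStop, later) = runThenStop()
    println("refreshes at stop: $atStop, after waiting: $later")
}
```

The start moment, the observer, and the stop path all live in `runThenStop`, not inside `TokenRefresher`.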
### 6. Swallowing `CancellationException`
A `catch` clause around a `suspend` call that matches `CancellationException` — directly, or through `Exception` / `Throwable` — and doesn't rethrow usually turns cancellation into silent success. The parent coroutine thinks the child finished; the child keeps running (or its side effects do); the cancellation contract is broken.
Same failure shape as §1's stored-scope bug, viewed from the other end: §1 hides the work *from* the caller's lifecycle; this hides cancellation *from* the work.
```kotlin
// ❌ BAD: catches CancellationException, never rethrows
suspend fun fetch() {
    try {
        api.load()
    } catch (e: Exception) { // matches CancellationException too
        logger.warn("load failed", e)
    }
}

// ❌ ALSO BAD: runCatching has the same problem
suspend fun fetch() {
    runCatching { api.load() }
        .onFailure { logger.warn("load failed", it) }
}
```
The acceptable shapes:
```kotlin
// ✅ Separate catch first
try { api.load() }
catch (e: CancellationException) { throw e }
catch (e: Exception) { logger.warn("load failed", e) }

// ✅ Conditional rethrow inside the broad catch
try { api.load() }
catch (e: Exception) {
    if (e is CancellationException) throw e
    logger.warn("load failed", e)
}

// ✅ ensureActive(): good when the catch handles ordinary failures and you only need
// to rethrow if the current coroutine is cancelled
try { api.load() }
catch (e: Exception) {
    currentCoroutineContext().ensureActive()
    logger.warn("load failed", e)
}

// ✅ runCatching with explicit guard
runCatching { api.load() }
    .onFailure {
        if (it is CancellationException) throw it
        logger.warn("load failed", it)
    }

// ✅ runCatching terminated with getOrThrow (cancellation flows back out)
runCatching { api.load() }.getOrThrow()
```
The trigger is "a suspend call inside the `try`", not "the enclosing function is declared `suspend`". This applies inside any suspending body — `suspend fun`, a `launch { … }` lambda, a Flow `collect { … }`, etc.
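The effect of a swallowed cancellation can be observed with a small experiment. This is a sketch assuming `kotlinx-coroutines-core`; `swallowing`, `rethrowing`, and the `logAfterCancel` harness are hypothetical names, and `delay` stands in for any suspend call.

```kotlin
import kotlinx.coroutines.*

// ❌ Broad catch: cancellation is handled like a failure and execution continues.
suspend fun swallowing(log: MutableList<String>) {
    try {
        delay(1_000)
    } catch (e: Exception) {
        log += "caught"
    }
    log += "ran after cancel"
}

// ✅ Cancellation is rethrown, so nothing after the try runs.
suspend fun rethrowing(log: MutableList<String>) {
    try {
        delay(1_000)
    } catch (e: CancellationException) {
        throw e
    } catch (e: Exception) {
        log += "caught"
    }
    log += "ran after cancel"
}

// Hypothetical harness: start `block`, cancel it while it is suspended in delay(),
// and return whatever it logged.
fun logAfterCancel(block: suspend (MutableList<String>) -> Unit): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val job = launch(start = CoroutineStart.UNDISPATCHED) { block(log) }
    job.cancelAndJoin()
    log
}

fun main() {
    println(logAfterCancel { swallowing(it) }) // [caught, ran after cancel]
    println(logAfterCancel { rethrowing(it) }) // []
}
```

In the swallowing version, code after the `try` still executes on a coroutine that has already been cancelled.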
The common carve-out is an intentionally local timeout: catching `TimeoutCancellationException` from your own `withTimeout` and converting it to a domain result can be correct. Keep that catch narrow and close to the timeout. Do not use it as permission to swallow arbitrary cancellation.
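A narrow timeout catch might look like the sketch below. It assumes `kotlinx-coroutines-core`; `LoadResult` and `loadWithTimeout` are hypothetical names introduced for illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical domain result type for this sketch.
sealed interface LoadResult {
    data class Loaded(val value: String) : LoadResult
    object TimedOut : LoadResult
}

suspend fun loadWithTimeout(timeoutMs: Long, load: suspend () -> String): LoadResult =
    try {
        LoadResult.Loaded(withTimeout(timeoutMs) { load() })
    } catch (e: TimeoutCancellationException) {
        // Narrow catch, right next to our own withTimeout. Any other
        // CancellationException still propagates to the caller.
        LoadResult.TimedOut
    }

fun main() = runBlocking {
    val fast = loadWithTimeout(1_000) { "fast" }
    val slow = loadWithTimeout(20) { delay(5_000); "slow" }
    println(fast == LoadResult.Loaded("fast")) // true
    println(slow === LoadResult.TimedOut)      // true
}
```

Because only `TimeoutCancellationException` is matched, cancelling the coroutine that calls `loadWithTimeout` still cancels it as usual.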
Catching a non-cancellation subtype (`IOException`, your own exception types) is fine — they don't extend `CancellationException`.
### 7. `runBlocking`
`runBlocking` parks the current thread until the lambda finishes. Inside suspend-capable or lifecycle-scoped application paths it is wrong: a thread that was meant to stay free for asynchronous work is now blocked, structured concurrency is broken, and cancellation from upstream cannot reach the work inside the block. It is the "callee makes a structural decision for the caller" anti-pattern at its most direct.
```kotlin
// ❌ BAD: bridging to suspend by blocking the calling thread
fun saveUser(user: User) {
    runBlocking { repository.save(user) }
}
```
Three fixes, by context:
**Suspend-capable application code** — make the function `suspend`:
```kotlin
// ✅ GOOD
suspend fun saveUser(user: User) = repository.save(user)
```
If the immediate caller can't suspend either (a non-suspending UI callback, a `BroadcastReceiver` hook), use the existing lifecycle-bound scope at the boundary — see §3's UI ↔ state-holder carve-out. The fix is at the boundary, not inside `saveUser`.
Legitimate blocking boundaries exist: `main` in a CLI tool, Java interop APIs that must return synchronously, framework callbacks with no suspending alternative, and migration shims. Keep `runBlocking` at that outer boundary, keep the body small, and call suspending code immediately.
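A sketch of such a boundary, assuming a caller that must receive a value synchronously (Java code, for instance). `ConfigService` and `BlockingConfigService` are hypothetical names, and the `delay` stands in for real I/O.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical suspending service; the rest of the app calls this directly.
class ConfigService {
    suspend fun load(): String {
        delay(10)
        return "config-v1"
    }
}

// Hypothetical facade for synchronous callers. Only this thin outer layer blocks,
// and its body does nothing except call into the suspending code.
class BlockingConfigService(private val delegate: ConfigService) {
    fun loadBlocking(): String = runBlocking { delegate.load() }
}

fun main() {
    println(BlockingConfigService(ConfigService()).loadBlocking()) // config-v1
}
```

Keeping the blocking call in a separate, clearly named facade makes it easy to find and to delete once the synchronous callers are gone.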
**Tests** — use `runTest`:
```kotlin
// ❌ BAD: real time, slow tests, no virtual delay
@Test fun loadsUser() = runBlocking {
    assertThat(repository.load().name).isEqualTo("Alice")
}

// ✅ GOOD
@Test fun loadsUser() = runTest {
    assertThat(repository.load().name).isEqualTo("Alice")
}
```
`runTest` gives you virtual time (`delay()` returns immediately), `TestDispatcher` integration, and proper coroutine cleanup. Real-time `runBlocking` in tests makes them slow and flaky.
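The virtual clock can be seen directly in a small sketch. This assumes `kotlinx-coroutines-test` (1.6 or later) is on the classpath; `slowLoad` and `virtualTimeDemo` are hypothetical names, and `currentTime` is marked experimental in that library, hence the opt-in.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.test.currentTime
import kotlinx.coroutines.test.runTest

// Hypothetical slow dependency: waits one minute before answering.
suspend fun slowLoad(): String {
    delay(60_000)
    return "Alice"
}

// Returns the virtual milliseconds that elapsed inside runTest.
@OptIn(ExperimentalCoroutinesApi::class)
fun virtualTimeDemo(): Long {
    var elapsedVirtualMs = -1L
    runTest {
        check(slowLoad() == "Alice")
        elapsedVirtualMs = currentTime // virtual clock, advanced by the skipped delay
    }
    return elapsedVirtualMs
}

fun main() {
    println(virtualTimeDemo()) // 60000
}
```

The one-minute `delay` advances the virtual clock by 60,000 ms without the test waiting for it in real time; the same body under `runBlocking` would take the full minute.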
**`ContentProvider` carve-out** — Android's `ContentProvider` methods (`query`, `insert`, `update`, `delete`, `onCreate`, `call`) are synchronous from outside the process. There is no way to suspend them. Inside *member functions* of a `ContentProvider` subclass (direct or indirect — not companion objects), `runBlocking` is the unavoidable bridge. Keep the body as short as possible and call into suspending code immediately:
```kotlin
// ✅ Acceptable in ContentProvider members only
class MyProvider : ContentProvider() {
    override fun query(...): Cursor? = runBlocking { dao.query(...) }
}
```
This carve-out is for `android.content.ContentProvider` subclasses *only*. "It's like a `ContentProvider`" doesn't apply, and a `runBlocking` in a `ContentProvider`'s companion object is still a regular violation — the helper isn't part of the framework's synchronous surface.
## Quick reference
| Symptom | Anti-pattern | Fix |
|---|---|---|
| Class has `private val scope: CoroutineScope` | Stored scope on the callee | Remove. Make public APIs `suspend`. |
| `init { scope.launch { ... } }` | Construction-time launch | Move to `suspend fun init()` / `login()` |
| `fun foo() { scope.launch { ... } }` on a repository/manager/use case | Fire-and-forget from non-UI class | `suspend fun foo()`, let UI state holder pick the scope |
| `fun onClick() { viewModelScope.launch { ... } }` on a state holder, called from UI | UI ↔ state-holder boundary — fine | Keep as-is (see §3 carve-out) |
| `private val scope = MainScope()` | Internally-constructed stored scope | Same — remove, make APIs `suspend` |
| `@SingleIn(AppScope) class X(scope) { init { scope.launch { … } } }` | DI-bound opaque launch (§5) | Expose `suspend fun run()`, launch from startup orchestrator |
| `class Y : Initializer { override fun initialize() { scope.launch { … } } }` | Initializer that launches, not registers (§5) | Same — `suspend fun run()`, orchestrator owns lifecycle |
| `try { suspendCall() } catch (e: Exception\|Throwable\|CancellationException) { … }` with no rethrow | Swallowed cancellation (§6) | Prefer `catch (e: CancellationException) { throw e }`; use `ensureActive()` only when that matches the intent |
| `runCatching { suspendCall() }.onFailure { … }` with no cancellation guard | Same shape as above (§6) | Add `if (it is CancellationException) throw it`, or terminate with `.getOrThrow()` |
| `runBlocking { … }` inside suspend-capable app code | Thread-blocking bridge (§7) | Make caller `suspend`; or use a lifecycle scope at the boundary |
| `runBlocking { … }` in a test | Same — real-time bridging (§7) | Use `runTest { … }` |
| `runBlocking { … }` inside a `ContentProvider.query`/`insert`/… member | Carve-out (§7) | Acceptable; keep the body minimal |
## Refactoring guidance
Removing an existing offender:
1. **Start at the leaf.** Pick the class farthest from any UI — usually a repository or data source. Its public surface should be the easiest to convert.
2. **Convert public functions to `suspend`** one at a time. The compiler will surface every caller.
3. **At each caller, choose the scope deliberately:** `viewModelScope`, `lifecycleScope`, `coroutineScope { }`, or an explicit job. This is the choice that was missing before.
4. **Delete the `CoroutineScope` constructor parameter** once nothing uses it. Remove the injection binding.
Don't try to fix every class in one MR. Removing an anti-pattern is incremental work.
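For step 3, a caller that needs two results at once might pick `coroutineScope { }`. The sketch below assumes `kotlinx-coroutines-core`; `ProfileRepository` and `loadSummary` are hypothetical, with `delay` standing in for real I/O.

```kotlin
import kotlinx.coroutines.*

// Hypothetical leaf class after conversion: suspend APIs, no stored scope.
class ProfileRepository {
    suspend fun loadName(): String { delay(20); return "Alice" }
    suspend fun loadFavouriteCount(): Int { delay(20); return 3 }
}

// The caller picks the scope: coroutineScope { } ties both loads to this call.
// If either load fails or the caller is cancelled, the other load is cancelled too.
suspend fun loadSummary(repo: ProfileRepository): String = coroutineScope {
    val name = async { repo.loadName() }
    val count = async { repo.loadFavouriteCount() }
    "${name.await()} has ${count.await()} favourites"
}

fun main() = runBlocking {
    println(loadSummary(ProfileRepository())) // Alice has 3 favourites
}
```

The repository stays unaware of concurrency; the decision to run the two loads in parallel is made, and visible, at the call site.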
## When NOT to apply
- **UI state holders absorbing UI events.** A ViewModel/Component/feature model with `fun onClick(...) { viewModelScope.launch { ... } }` is correct — that's the boundary the framework needs. See §3 carve-out.
- **Lifecycle owners with explicit cancellation and error policy.** Actors/services, app infrastructure, or application-scoped singletons may own a scope when they expose clear `close`/`cancel`/restart behavior or otherwise map directly to an application lifecycle. Inject `Application.applicationScope` explicitly rather than creating one ad-hoc. **This is not permission to launch from `init` / `initialize()`** — see §5.
- **Already-suspending APIs** don't need any of this work.
- **Tests** sometimes use `TestScope` as a deliberate ambient scope — that's a different pattern with explicit virtual-time control.
## Red flags during review
These thoughts mean the anti-pattern is back:
| Thought | Reality |
|---|---|
| "I'll just add a `CoroutineExceptionHandler` to the scope" | The problem isn't error handling. The problem is the scope shouldn't exist. |
| "I need to launch from `init` so the data's ready when consumers arrive" | Consumers reading state that isn't ready is the bug. Use an explicit `suspend fun init()` phase that the caller awaits (§2). |
| "The caller doesn't want to deal with `suspend`" | Then the caller chooses fire-and-forget at their scope. Don't decide for them. |
| "It's just a small fire-and-forget call" | Silent cancellation makes every fire-and-forget a potential silent failure. |
| "We caught and logged the exception, so we're fine" | Did the catch rethrow `CancellationException`? If no, the coroutine is silently un-cancelled. (§6) |
| "It's just one `runBlocking`, in a non-critical path" | Every `runBlocking` asserts the caller has no async option. If they do, it's the wrong primitive. (§7) |
| "Tests are simpler with `runBlocking`" | They run in real time, can't fast-forward `delay`, and lose `TestDispatcher` semantics. Use `runTest`. (§7) |
## Related
- [`kotlin-flow-state-event-modeling`](../kotlin-flow-state-event-modeling/SKILL.md) — `StateFlow`, `SharedFlow`, `Channel`, `stateIn`, one-shot events, and related modeling.