Error handling across the stack

Apr 23, 2026

Things break. The question is not whether your app will hit an error, it is how gracefully it recovers. Here is how I think about error handling across the frontend, backend, and queue consumers, with real patterns from the apps I run.

Peter Oliha

A few days ago I wrote about validation across the stack. This is the companion piece.

Validation tells you what should not happen. Error handling is what you do when it happens anyway.

Same note as last time on the code: most of my samples are from NestJS services and Nuxt apps, because that is what I build in. The patterns translate.

Things break

The question is not whether your app will hit an error. It will. Third-party APIs go down. Users disconnect mid-request. A vendor quietly deprecates a field. Your AI provider rate-limits you on a Tuesday afternoon. Someone on your team ships a regression.

The real question is how your app behaves when it does. A graceful recovery, an honest error message to the user, a signal to you that something is off. That is a well-built system. A blank screen, a generic "Something went wrong", or worse, silent data corruption. That is where trust gets destroyed.

Error handling is not defensive coding. It is acknowledging that you do not control everything and building for that honestly.

Classify first

The biggest mistake I see in error handling code is treating every error the same. A catch block that logs and moves on, or one that throws a generic InternalServerError, is a missed opportunity. Different errors want different responses.

Here is a small utility from the Jobven scraper that I end up reaching for constantly.

export function isRateLimitError(error: unknown): boolean {
  if (!error || typeof error !== 'object') return false
  const err = error as Record<string, unknown>

  if (err.status === 429 || err.code === 429) return true

  const message = String(err.message || '').toLowerCase()
  return (
    message.includes('rate limit') ||
    message.includes('quota exceeded') ||
    message.includes('too many requests')
  )
}

export function isTransientError(error: unknown): boolean {
  if (!error || typeof error !== 'object') return false
  const err = error as Record<string, unknown>

  if (err.isTransientError === true) return true

  const status = (err.status || err.code) as number | undefined
  if (typeof status === 'number' && status >= 500 && status < 600) return true

  if (err.code === 'ECONNRESET' || err.code === 'ETIMEDOUT') return true
  if (err.code === 'ECONNREFUSED' || err.code === 'ENOTFOUND') return true

  return false
}

There is also a third function, extractRetryAfterMs, that pulls the Retry-After header out of the error if the provider sent one.
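For completeness, here is roughly what that helper looks like. A sketch, not the exact Jobven code: it assumes the HTTP client hangs the response headers off the error object, which most do, but check yours.

export function extractRetryAfterMs(error: unknown): number | null {
  if (!error || typeof error !== 'object') return null
  const err = error as Record<string, any>

  // Most HTTP clients attach response headers to the error.
  const headers = err.headers || err.response?.headers
  const retryAfter = headers?.['retry-after']
  if (!retryAfter) return null

  // Retry-After is either a number of seconds...
  const seconds = Number(retryAfter)
  if (!Number.isNaN(seconds)) return seconds * 1000

  // ...or an HTTP date.
  const date = Date.parse(String(retryAfter))
  return Number.isNaN(date) ? null : Math.max(0, date - Date.now())
}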

These three helpers let every consumer in the system respond to errors the same way: rate limits get a long pause and a requeue, transient errors get a short pause and a requeue, permanent errors get logged and dropped. Without them, every retry loop reinvents the same logic and gets it subtly wrong in different places.

The lesson is not "write these specific functions". It is "classify errors before you decide what to do with them". Rate limit, transient, permanent, user error, data error: whatever categories make sense for your domain, make them explicit, and let the handling follow from the category.

Backend errors

On the backend, errors come from every direction. A database call fails. A third-party service returns a 500. A bug throws an unexpected TypeError. Your job is to make sure all of that is caught, logged with enough context to debug, and returned to the client in a form they can act on.

I run a global exception filter in Jobven's API that does four things for every unhandled error.

It generates a request ID. Every error gets a UUID that shows up in the log line, in the response body, and in the Discord notification. When a user reports a broken request, they send me the ID and I can pull the exact trace immediately.

It sanitises what goes back to the client. 4xx errors are safe to show directly to users (validation failures, permissions, not-found). 5xx errors get a generic "An internal server error occurred. Please try again later or contact support with your request ID." The real message and stack trace stay in the server logs where they belong.

private getSanitizedMessage(status: number, originalMessage: string): string {
  if (status >= 400 && status < 500) {
    return originalMessage
  }

  if (this.isProduction) {
    return 'An internal server error occurred. Please try again later or contact support with your request ID.'
  }

  return originalMessage
}

It logs with context. Every error log includes the request ID, method, path, status code, error name, sanitised body, query params, user agent, and IP. Logging the message alone is a waste. You need the shape of the request to reproduce the bug.

It notifies on 5xx. For server errors, the filter fires off an async Discord notification. Not a blocking call, not a dependency. If Discord is down, the error still gets handled. Async side effects for monitoring are a pattern worth copying.
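Put together, the filter looks roughly like this. A sketch under assumptions, not the production code: the Discord helper body and the exact log fields are stand-ins, and getSanitizedMessage is the method shown above.

import {
  ArgumentsHost,
  Catch,
  ExceptionFilter,
  HttpException,
  HttpStatus,
  Logger,
} from '@nestjs/common'
import { randomUUID } from 'node:crypto'
import type { Request, Response } from 'express'

@Catch()
export class AllExceptionsFilter implements ExceptionFilter {
  private readonly logger = new Logger(AllExceptionsFilter.name)
  private readonly isProduction = process.env.NODE_ENV === 'production'

  catch(exception: unknown, host: ArgumentsHost) {
    const ctx = host.switchToHttp()
    const request = ctx.getRequest<Request>()
    const response = ctx.getResponse<Response>()

    // 1. One ID that shows up in the log line, the response, and the alert.
    const requestId = randomUUID()

    const status =
      exception instanceof HttpException
        ? exception.getStatus()
        : HttpStatus.INTERNAL_SERVER_ERROR
    const message =
      exception instanceof Error ? exception.message : String(exception)

    // 2. Log the shape of the request, not just the message.
    this.logger.error({
      requestId,
      status,
      method: request.method,
      path: request.url,
      errorName: exception instanceof Error ? exception.name : 'Unknown',
      query: request.query,
      userAgent: request.headers['user-agent'],
      ip: request.ip,
    })

    // 3. Async side effect for monitoring; never block the response on it.
    if (status >= 500) {
      this.notifyDiscord(requestId, message).catch(() => {})
    }

    // 4. Sanitised message back to the client, request ID included.
    response.status(status).json({
      statusCode: status,
      message: this.getSanitizedMessage(status, message),
      requestId,
    })
  }

  private async notifyDiscord(requestId: string, message: string): Promise<void> {
    // POST to a webhook; implementation omitted here.
  }

  private getSanitizedMessage(status: number, originalMessage: string): string {
    if (status >= 400 && status < 500) return originalMessage
    return this.isProduction
      ? 'An internal server error occurred. Please try again later or contact support with your request ID.'
      : originalMessage
  }
}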

The pattern is consistent: catch everything, log everything internally, show the user only what they can act on.

Frontend errors

On the frontend, errors are part of the user experience. A generic "Something went wrong" toast is a failure of imagination. The user knows something went wrong, that is why they are reading the toast. They want to know what to do next.

Take a login flow. The backend can reject a login for a few different reasons: wrong credentials, account locked, email not verified. They all come back as authentication errors, but they mean very different things to the user. The tempting thing to do is to branch on the error message and route the user accordingly.

async function onSubmit(payload: FormSubmitEvent<Schema>) {
  try {
    await login({ email: payload.data.email, password: payload.data.password })
    await navigateTo(APP_PATHS.DASHBOARD.HOME)
  } catch {
    const errorMessage = authError.value?.toLowerCase() || ''

    if (errorMessage.includes('not verified')) {
      toast.add({
        title: 'Email Not Verified',
        description: 'Please verify your email before logging in.',
        color: 'warning'
      })
      await navigateTo({
        path: APP_PATHS.AUTH.VERIFY_EMAIL,
        query: { email: payload.data.email }
      })
      return
    }

    toast.add({
      title: 'Login Failed',
      description: authError.value || 'Please check your credentials and try again.',
      color: 'error'
    })
  }
}

The UX win is obvious. Without the branch, an unverified email would have produced a generic red error toast and left the user to figure out the next step on their own. Instead, the page detects the case, shows a warning toast, and navigates to the verification page with the email pre-filled.

The enumeration tradeoff

Here is where this specific pattern gets complicated. Login is a security boundary, and differentiating error messages on a security boundary leaks information.

Consider an attacker cycling through a list of email addresses with a wrong password. If "account does not exist" and "wrong password" return the same generic error, the attacker learns nothing. But if "email not verified" is a distinguishable response, the attacker now knows which emails in their list have accounts on your platform. That is called a user enumeration attack, and it is the kind of thing that turns a leaked email list into a targeted phishing campaign.

The same logic applies to signup ("that email is already in use"), password reset ("no account found"), and anywhere else you tell the user something specific about the state of an account before they have proven they own it.

On a security boundary, uniform error responses win. Something like:

catch {
  toast.add({
    title: 'Login Failed',
    description: 'Invalid email or password. If you recently signed up, please check your inbox for a verification link.',
    color: 'error'
  })
}

Same message for every auth failure. No branching. No routing. The verification flow moves out of the error path entirely: a standalone "Resend verification email" page the user can visit whenever they need to, with a uniform response that is safe to send to anyone.
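That standalone flow can stay uniform too. A hypothetical handler, not Jobven's actual route: it only sends mail when an unverified account exists, but the response body is identical either way.

// Hypothetical NestJS handler: the response never reveals whether the account exists.
@Post('resend-verification')
@HttpCode(200)
async resendVerification(@Body() dto: ResendVerificationDto) {
  const user = await this.users.findByEmail(dto.email)
  if (user && !user.emailVerified) {
    await this.mail.sendVerificationEmail(user)
  }
  // Uniform response, safe to send to anyone.
  return { message: 'If an account exists for that email, a verification link is on its way.' }
}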

The principle still holds

None of this contradicts the original point. Errors on the frontend deserve thoughtful user paths. The nuance is that the right path depends on the boundary you are standing on.

On a security boundary, thoughtful means uniform. Recovery flows live elsewhere, where the user is not trying to prove identity.

Off a security boundary, thoughtful means routing users to the right next step. An upload that fails because the file is too large should say so. A payment that fails because the card was declined should say so. A save that fails because someone else edited the record first should say so, and offer to reload.
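For instance, a save handler off the security boundary can branch on a structured code. A sketch: edit_conflict and refreshRecord are placeholders for whatever your API and page actually expose.

try {
  await saveRecord(payload)
} catch (error: any) {
  // Hypothetical structured code from the backend.
  if (error?.data?.code === 'edit_conflict') {
    toast.add({
      title: 'Someone else edited this record',
      description: 'Reloading the latest version so you can reapply your changes.',
      color: 'warning'
    })
    await refreshRecord()
    return
  }
  toast.add({
    title: 'Save failed',
    description: 'Please try again.',
    color: 'error'
  })
}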

Good frontend error handling is not about catching exceptions. It is about knowing which errors deserve different user paths, and knowing when the right path is the same one for everyone.

Recovery patterns

Some errors are not meant to be shown to anyone. They are meant to be handled: retried, requeued, paused, or failed over.

Queue consumers are where recovery patterns earn their keep. Here is the core of the Jobven enrichment consumer, which calls an AI provider for every scraped job.

try {
  // ...enrich the job...
} catch (error) {
  this.logger.error(`Error enriching job ${msg.jobId}:`, error)

  if (isRateLimitError(error) || (error as any).isRateLimitError) {
    const retryAfterMs = extractRetryAfterMs(error) || this.DEFAULT_PAUSE_MS
    await this.pauseEnrichment(retryAfterMs)
    return new Nack(true)
  }

  if (isTransientError(error) || (error as any).isTransientError) {
    await this.pauseEnrichment(this.TRANSIENT_PAUSE_MS)
    return new Nack(true)
  }

  this.publishEnrichmentUsage({ /* ...failure analytics... */ })
  await this.markJobFailed(msg.jobId, (error as Error).message)
  return new Nack(false)
}

Three categories, three different responses.

Rate limits get a long pause (respecting the Retry-After header if the provider sent one) and the message goes back on the queue. The consumer will pick it back up after the pause, by which point the window has opened again.

Transient errors (5xx, connection resets, timeouts) get a short pause and a requeue. The assumption is that whatever went wrong is probably resolved by now.

Permanent errors (bad data, a schema violation, something the enrichment provider flatly refuses) get logged to the analytics pipeline, marked as failed on the job record, and dropped. No requeue. Retrying a permanent failure is just shouting into the void.

The other detail worth calling out is how the pause works. pauseEnrichment writes a pause_until key to Valkey (a Redis-compatible store) that all enrichment consumers across all workers check before processing. One consumer hitting a rate limit pauses the whole fleet. That prevents a thundering herd where every worker hammers the same rate-limited endpoint at the same time.
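A minimal sketch of that shared pause, assuming an ioredis-style client (Valkey is wire-compatible with Redis); the key name is illustrative.

// Shared pause across every worker, stored in Valkey.
private readonly PAUSE_KEY = 'enrichment:pause_until' // illustrative key name

private async pauseEnrichment(ms: number): Promise<void> {
  const pauseUntil = Date.now() + ms
  // PX sets a TTL so the key cleans itself up once the pause expires.
  await this.redis.set(this.PAUSE_KEY, String(pauseUntil), 'PX', ms)
}

private async isPaused(): Promise<boolean> {
  const pauseUntil = await this.redis.get(this.PAUSE_KEY)
  return pauseUntil !== null && Number(pauseUntil) > Date.now()
}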

Similar patterns show up elsewhere. Batch enrichment jobs that fail to commit to the database trigger a cancellation on the external batch provider, so you do not get charged for work whose result you cannot save. Scrapers with persistent failures flip a markManualReview flag and stop retrying. The pattern is always: decide what "recovery" means for this error, do it, and make sure the system does not keep repeating the same mistake.

Frontend and backend together

Error handling is one of those places where the frontend and backend have to speak the same language. If the backend throws a generic 500 with "An error occurred", the frontend has nothing to work with. If the backend returns specific, documented error codes or messages, the frontend can actually do something useful.

The login flow from earlier is a deliberate exception. Outside of security boundaries, specific error shapes pay off: the frontend can route users directly to the fix.

A good example is a metered API plan. If a user exceeds their monthly quota, the backend returns a 429 with a structured error code (quota_exceeded) and the relevant plan context in the payload. The frontend knows that code, shows a toast with the right tone, and offers a button that opens the upgrade page with the recommended plan preselected. The user's next action is one click away.
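In code, that handshake might look like this on each side. A sketch: the error shape and the APP_PATHS entry are illustrative, not Jobven's exact contract.

// Backend: a structured 429 with a machine-readable code.
throw new HttpException(
  {
    code: 'quota_exceeded',
    message: 'Monthly quota exceeded.',
    recommendedPlan: 'pro',
  },
  HttpStatus.TOO_MANY_REQUESTS,
)

// Frontend: branch on the code, not the message text.
catch (error: any) {
  if (error?.data?.code === 'quota_exceeded') {
    toast.add({
      title: 'Monthly quota reached',
      description: 'Upgrade your plan to keep going.',
      color: 'warning'
    })
    await navigateTo({
      path: APP_PATHS.BILLING.UPGRADE, // illustrative path
      query: { plan: error.data.recommendedPlan }
    })
  }
}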

That handshake only works because the backend is deliberate about its error shape and the frontend knows to look for it.

You can get fancier with custom error codes, typed error responses, or shared error enums between a BFF and its UI. The principle is the same. Errors are an interface between your layers, and like any interface, they should be designed, not accidental.

Closing

Error handling is your app's insurance policy. Things will go wrong. The only question is whether you recover gracefully, give the user something useful, and learn from what happened.

The patterns that matter are simple:

  • Classify before you handle. Rate limit, transient, permanent, user, system: each deserves a different response.
  • Catch everything on the backend, log with full context, and send the user only what they can act on.
  • On the frontend, treat errors as part of the user experience. Route them somewhere useful, not to a red toast that explains nothing.
  • For queue consumers and async work, build pause-and-requeue logic around classification, and make sure the system stops repeating failures.
  • Treat errors as an interface between layers. Design them on purpose.

None of this is glamorous work. Error handling is where a lot of the craft of building software actually lives, and it is what separates apps that feel solid from apps that feel brittle. Spend the time.

Have any questions, want to share your thoughts or just say Hi? I'm always excited to connect! Follow me on Bluesky, LinkedIn or Twitter for more insights and discussions. If you've found this valuable, please consider sharing it on your social media. Your support through shares and follows means a lot to me!
