Identity & OTP - Proving Who You Are Without Sharing a Password // Megha Bose

Helpful context:

REST APIs - Resources, Verbs, and the Architecture of the Web

You click “Login with Google.” Three seconds later you’re in. You created no password. You entered no username. You didn’t register with this application before. The site has never seen you. And yet it knows who you are, trusts that you are who you claim to be, and hands you a session.

What happened in those three seconds is one of the more elegant pieces of protocol engineering in modern software - OAuth 2.0 and OpenID Connect, two overlapping standards that handle the question of identity at internet scale. Understanding them also means understanding JWT revocation, TOTP’s cryptographic foundation, why “stateless is better” is often wrong, and why the future of authentication probably doesn’t involve passwords at all.

The Problem OAuth 2.0 Solves

Before OAuth, the way a third-party application accessed your data on another service was: you gave it your password. Yelp wanted to access your Google Contacts to suggest friends? You gave Yelp your Google password. This is catastrophically bad: the third party has full access to your account, you can’t revoke that access without changing your password, and if Yelp is breached, your Google credentials are compromised.

OAuth 2.0 (2012) was designed to solve the delegation problem: how can a user authorize an application to access resources on their behalf without sharing credentials? The protocol separates four roles:

Resource Owner: the user who owns the data
Client: the application requesting access (Yelp, in the example above)
Authorization Server: the server that authenticates the user and issues tokens (Google’s auth infrastructure)
Resource Server: the API that holds the protected resources (Google Contacts API)

The authorization code flow - the one you experience as “Login with Google” - works as follows:

1. The client redirects the user to the authorization server with a client_id, requested scope, and a redirect_uri
2. The authorization server authenticates the user (Google asks for your Google credentials)
3. If the user consents, the authorization server redirects back to the client with a short-lived authorization code
4. The client exchanges the authorization code for an access token, by calling the token endpoint with the code and a client_secret
5. The client uses the access token to call the resource server on the user’s behalf

The authorization code is short-lived (typically seconds to a few minutes; RFC 6749 recommends a maximum lifetime of 10 minutes) and single-use. The access token is what the client actually uses. The user’s credentials never leave Google.
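A minimal sketch of steps 1, 3, and 4 of the flow above, seen from the client, using only Python's standard library. The endpoint URL, client_id, and redirect URI are hypothetical placeholders, and the state parameter (an anti-CSRF value defined by the OAuth 2.0 spec) is an addition not described above. Nothing here touches the network; the functions only build and parse the messages involved.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values, for illustration only.
AUTHORIZE_URL = "https://auth.example.com/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(scope: str, state: str) -> str:
    """Step 1: the URL the client redirects the user's browser to."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # echoed back so the client can match response to request
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def parse_callback(callback_url: str, expected_state: str) -> str:
    """Step 3: pull the authorization code out of the redirect back to the client."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return query["code"][0]


def build_token_request(code: str, client_secret: str) -> dict:
    """Step 4: the form fields the client POSTs to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    }
```

The code arrives through the browser (the front channel), while the exchange in step 4 is a direct server-to-server call, which is why the client_secret can be included there.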

PKCE: OAuth for Mobile Apps

Step 4 requires a client_secret. But a mobile app cannot keep a secret - anything bundled into an app binary can be extracted by a determined attacker. So how do mobile and single-page apps use OAuth safely?

PKCE (Proof Key for Code Exchange, pronounced “pixie”) is the answer. Before the authorization request, the client generates a random code_verifier. It hashes the verifier with SHA-256 and base64url-encodes the digest to produce a code_challenge. The authorization request includes the code_challenge along with the method used to derive it (S256).

When the client later exchanges the authorization code for a token, it sends the original code_verifier. The authorization server verifies that hash(code_verifier) == code_challenge. This proves the token request comes from the same client that initiated the authorization - without needing a static secret that could be stolen from the app binary.
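The verifier/challenge relationship can be sketched in a few lines of Python. This assumes the S256 method from RFC 7636; the function names are illustrative, not taken from any particular library.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Random URL-safe verifier (RFC 7636 allows 43 to 128 characters)."""
    return secrets.token_urlsafe(32)  # 32 random bytes -> 43 characters


def make_code_challenge(code_verifier: str) -> str:
    """S256: base64url(SHA-256(verifier)) with the '=' padding stripped."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(code_verifier: str, stored_challenge: str) -> bool:
    """The check the authorization server runs at the token endpoint."""
    return hmac.compare_digest(make_code_challenge(code_verifier), stored_challenge)


verifier = make_code_verifier()         # kept by the client
challenge = make_code_challenge(verifier)  # sent in the authorization request
```

An attacker who intercepts the authorization code has seen only the challenge, and cannot produce the verifier without inverting SHA-256.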

PKCE is now recommended for all OAuth clients, including confidential clients (web servers with secrets). It defends against authorization code interception attacks that are possible in some environments.

OpenID Connect: Identity on Top of OAuth

OAuth 2.0 handles authorization - “this token grants access to these resources.” It deliberately says nothing about who the user is. The access token tells the resource server that someone authorized access; it doesn’t say who that someone is.

OpenID Connect (OIDC) adds an identity layer. OIDC extends the OAuth 2.0 flow with an additional token: the ID token. The ID token is a JWT containing claims about the user’s identity: sub (subject - a stable, unique user identifier from the authorization server’s perspective), email, name, iss (issuer), aud (audience - the client the token was issued for), and exp (expiration).

The client verifies the ID token’s signature (using the authorization server’s public key, typically fetched from a well-known JWKS endpoint), checks that iss matches the expected issuer and aud matches its own client ID, and can now trust the identity claims without any additional network call.
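The claim checks described above might look like the following sketch. It assumes the signature has already been verified by a JWT library and covers only iss, aud, and exp; the function name and error messages are hypothetical.

```python
import time
from typing import Optional


def validate_id_token_claims(claims: dict, expected_issuer: str, client_id: str,
                             now: Optional[float] = None) -> str:
    """Check iss, aud, and exp on an ID token payload whose signature
    has ALREADY been verified elsewhere. Returns the sub claim."""
    now = time.time() if now is None else now

    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")

    # aud may be a single string or a list of audiences.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        raise ValueError("token was not issued for this client")

    if "exp" not in claims or now >= claims["exp"]:
        raise ValueError("token has expired")

    return claims["sub"]
```

The aud check matters as much as the signature: without it, a validly signed ID token issued to some other application would be accepted as a login here.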

“Sign in with Google,” “Sign in with GitHub,” and “Sign in with Apple” are all OIDC flows. The access token lets the app call APIs; the ID token tells the app who is logged in.

JWT: Structure and the Stateless Tradeoff

A JWT (JSON Web Token) has three base64url-encoded sections separated by dots: header.payload.signature.

The header specifies the signing algorithm. RS256 (RSA with SHA-256, signed with the authorization server’s private key) is common for public OIDC providers; HS256 (HMAC-SHA256 with a shared secret) is common for internal systems. The payload contains claims. The signature is computed over the encoded header and payload joined by a dot, so changing either one invalidates it.

The appeal of JWTs is that they are self-contained. A resource server can verify a JWT’s validity by checking the signature with the public key - no database lookup, no session store query. This makes JWTs horizontally scalable: any server instance can verify any token independently.
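To make the three-part structure concrete, here is a sketch of HS256 signing and verification using only the standard library. It is illustrative rather than production-ready: it does not check exp or other claims, and a real system would normally use a maintained JWT library. RS256 works the same way structurally but needs an RSA implementation, which Python's standard library does not provide.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the payload if the signature is valid, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    # Pin the algorithm instead of trusting whatever the header asks for.
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))
```

Verification needs only the token and the key, which is the whole of the "no database lookup" property.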

The JWT Revocation Problem

Here is the uncomfortable reality that “JWT is stateless therefore better” advocates tend to skip: a valid JWT is valid until it expires, regardless of what happens after it’s issued.

If a user logs out, the JWT is still valid. If an account is compromised and you want to invalidate all sessions immediately, the JWTs are still valid. If an employee is fired and you revoke their access in your identity provider, existing JWTs are still valid until expiry.

The solutions:

Short expiry + refresh tokens: access tokens expire in 15 minutes. Clients use a long-lived refresh token to get a new access token silently. Refresh tokens are stored server-side and can be revoked. When you revoke the refresh token, the next access token refresh fails and the user is effectively logged out - within 15 minutes of the last access token’s issue time.
Token blocklist: maintain a server-side set of revoked token IDs (the jti claim). On every request, check the token’s jti against the blocklist. This reintroduces state - you now have a distributed cache or database query on every authenticated request. You’ve lost most of the scalability benefit of stateless JWTs.
Short expiry and accept the gap: for low-risk use cases, simply accept that revoked tokens remain valid for a few minutes. Logging out from a consumer app doesn’t require immediate invalidation in most threat models.

The right choice depends on your threat model. For high-security systems (banking, healthcare), token blocklists are worth the operational cost. For most consumer applications, short-lived access tokens with refresh token revocation are sufficient.
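The first two approaches can be sketched together with an in-memory stand-in for the server-side state. The class and method names are hypothetical, access tokens are represented by their claims only (no signing), and a real deployment would back this with a shared store such as Redis rather than process-local dictionaries.

```python
import secrets
import time
from typing import Optional

ACCESS_TOKEN_TTL = 15 * 60  # seconds, matching the 15-minute example above


class TokenService:
    """In-memory stand-in for a refresh-token store and a jti blocklist."""

    def __init__(self) -> None:
        self.refresh_tokens = {}  # refresh token -> user id
        self.blocklist = {}       # jti -> expiry time of the blocked token

    def login(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self.refresh_tokens[token] = user_id
        return token

    def issue_access_claims(self, refresh_token: str, now: Optional[float] = None) -> dict:
        now = time.time() if now is None else now
        user_id = self.refresh_tokens.get(refresh_token)
        if user_id is None:
            raise PermissionError("refresh token revoked or unknown")
        return {"sub": user_id, "jti": secrets.token_hex(8), "exp": now + ACCESS_TOKEN_TTL}

    def revoke_refresh_token(self, refresh_token: str) -> None:
        # Existing access tokens stay valid until they expire.
        self.refresh_tokens.pop(refresh_token, None)

    def block(self, claims: dict) -> None:
        # Immediate revocation; entries can be dropped once exp has passed.
        self.blocklist[claims["jti"]] = claims["exp"]

    def is_access_allowed(self, claims: dict, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return now < claims["exp"] and claims["jti"] not in self.blocklist
```

The blocklist lookup in is_access_allowed is the per-request state check the second option pays for; without it, only the expiry comparison remains and the gap described in the third option applies.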

TOTP: The Math Behind Google Authenticator

Time-based One-Time Passwords (RFC 6238) generate a 6-digit code that changes every 30 seconds. The algorithm is worth understanding because it’s elegant.

During setup, the server generates a random 160-bit secret $K$ and shares it with the user - typically encoded as a QR code containing an otpauth:// URI. The secret is stored on both sides: in the user’s authenticator app and in the server’s database.

At verification time, both sides independently compute:

$$TOTP(K, t) = HOTP(K, \lfloor t / T_0 \rfloor)$$

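where $t$ is the current Unix timestamp and $T_0 = 30$ seconds is the time step. (RFC 6238 itself names the step $X$ and reserves $T0$ for the start of the epoch, which defaults to 0; the notation here folds the two together.) The time counter $\lfloor t / T_0 \rfloor$ increments every 30 seconds, synchronized by the global Unix clock.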

HOTP (the underlying hash-based OTP algorithm) computes:

$$HOTP(K, C) = \text{Truncate}(HMAC\text{-}SHA1(K, C))$$

HMAC-SHA1 produces a 20-byte hash. The truncation step reads the low 4 bits of the hash’s last byte as an offset (0 to 15), extracts the 4 bytes starting at that offset, clears the top bit to get a 31-bit unsigned integer, then takes that integer modulo $10^6$ and left-pads with zeros to produce a 6-digit code.

The result: both the authenticator app and the server derive the same 6-digit code from the same shared secret and the current time, without any network communication. Each code belongs to one 30-second window. Servers typically accept codes from the current window and the adjacent windows (±1) to tolerate clock skew between the user’s device and the server, so a given code can remain usable for up to about 90 seconds.
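The whole algorithm fits in a short standard-library sketch. This follows RFC 4226 (HOTP) and RFC 6238 (TOTP) with SHA-1, 6 digits, and a 30-second step; the function names and the window parameter are illustrative choices.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low nibble of the last byte: 0..15
    # 4 bytes at the offset, top bit cleared -> 31-bit unsigned integer.
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def totp(secret: bytes, t: Optional[float] = None, step: int = 30) -> str:
    t = time.time() if t is None else t
    return hotp(secret, int(t // step))


def verify_totp(secret: bytes, submitted: str, t: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept the current time step and `window` steps on either side."""
    t = time.time() if t is None else t
    counter = int(t // step)
    return any(
        hmac.compare_digest(hotp(secret, counter + delta).encode(), submitted.encode())
        for delta in range(-window, window + 1)
        if counter + delta >= 0
    )
```

With the RFC test secret `b"12345678901234567890"`, `hotp(secret, 0)` gives `"755224"`, the first value in RFC 4226's test table. This sketch does not track which codes have already been used; a server that wants strict one-time semantics would also remember the last accepted counter.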

The security model: an attacker who intercepts a code cannot reuse it once its acceptance window has passed. An attacker who obtains the shared secret $K$ can generate valid codes indefinitely - which is why storing TOTP secrets securely in the server’s database is critical.

SMS OTP and Why It’s Weak

SMS one-time passwords are better than no second factor. They’re not much better. The primary attack is SIM swapping: an attacker calls the mobile carrier, social-engineers the support representative into reassigning the victim’s phone number to a SIM the attacker controls, and then receives all SMS codes sent to that number. This attack has been used successfully against high-profile targets including the CEO of Twitter and numerous cryptocurrency holders.

For sensitive systems, TOTP or hardware security keys (FIDO2/WebAuthn) are strongly preferred over SMS.

Session Tokens vs JWTs: The Real Comparison

The debate between session tokens and JWTs is often framed as “stateful vs stateless,” but the operational reality is more nuanced.

Session tokens are opaque strings. The server looks up the session in a store (Redis, database) on every request. Revocation is instant - delete the session. The session store is the single source of truth. The cost: every authenticated request requires a round trip to the session store.

JWTs are self-contained. Verification requires only the public key (cached locally). No session store round trip. The cost: revocation is hard (see above). JWTs also leak information: anyone who intercepts the token and base64-decodes the payload reads all the claims in plaintext (JWTs are signed, not encrypted - use JWE for encryption).
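The "signed, not encrypted" point is easy to demonstrate: reading the claims takes no key at all. A minimal sketch, with a hypothetical function name:

```python
import base64
import json


def read_claims_without_verifying(token: str) -> dict:
    """Decode the middle segment of a JWT. No key is involved, so this
    says nothing about whether the token is genuine."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Anything placed in the payload should therefore be treated as readable by whoever holds the token, including the user and any intermediary that logs it.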

At scale, the session store round trip is often the bottleneck. But “at scale” means millions of requests per second. For most applications, a Redis session store with sub-millisecond lookup latency handles the load without issue. The decision between sessions and JWTs should be driven by your security requirements (revocation) and your architecture (single backend vs many independent services), not by a general preference for “stateless.”

AWS Cognito, Auth0, and the Build vs Buy Decision

Building an identity system from scratch is a bad idea for most teams. The attack surface is large, the edge cases are numerous (rate limiting failed login attempts, account recovery flows, bot detection, compliance requirements), and the consequences of getting it wrong are severe.

AWS Cognito is tightly integrated with the AWS ecosystem. User pools handle authentication (storing users, managing passwords, TOTP, social login via OIDC federation). Identity pools handle authorization (mapping authenticated users to IAM roles, enabling direct access to AWS services). If you’re already on AWS and don’t need complex customization, Cognito is the practical choice. Its rough edges (confusing documentation, limited customization of hosted UI) are well-known.

Auth0 (now Okta CIC) and Okta are enterprise-grade identity platforms with more flexibility. Auth0 supports custom Actions (small functions that run during the authentication flow), extensive social connection support, and a more polished developer experience. The cost scales with Monthly Active Users and gets expensive at scale.

Clerk and WorkOS are newer entrants optimized for B2B SaaS - enterprise SSO (SAML), directory sync (SCIM), and fine-grained organization management. If you’re building a product that enterprise customers need to connect to their Active Directory or Okta tenant, WorkOS is purpose-built for that use case.

The build vs buy calculus: if authentication is not your core business, buy. The ongoing maintenance cost of a homegrown identity system - keeping up with OAuth2 spec updates, implementing MFA, managing security disclosures - is significant.

Service-to-Service Authentication: IAM Role Assumption

Human authentication is only part of the problem. In a microservices architecture, services also need to authenticate to each other. A payment service calling a fraud service needs to prove it is the payment service, not a compromised container that has taken its place.

AWS IAM handles this via role assumption. Each ECS task or Lambda function runs with an associated IAM role. The AWS SDK automatically picks up temporary credentials (access key, secret key, session token) from the runtime - the container credentials endpoint for ECS tasks, environment variables for Lambda, the instance metadata service for EC2 - and those credentials expire after a few hours and are rotated automatically. The fraud service’s resource policy allows only the payment service’s IAM role. No long-lived credentials are stored anywhere.

Long-lived API keys (the old alternative) are dangerous: they don’t expire, they’re often accidentally committed to version control, and revoking them is manual. IAM role-based auth with short-lived credentials is the modern standard. Cloudflare Access uses a similar model with JWT-based service tokens for its Zero Trust network model.

Passkeys and WebAuthn: The Password-Free Future

WebAuthn is a W3C standard that enables authentication using public-key cryptography. During registration, the authenticator (a device’s built-in secure enclave or a hardware key like a YubiKey) generates a public-private key pair. The public key is stored on the server. The private key never leaves the device.

At login, the server sends a challenge. The authenticator signs the challenge with the private key (using the device’s biometric or PIN to authorize the signing operation). The server verifies the signature with the stored public key. There is no password to phish, no credential to steal, and no shared secret that can leak from a server breach.

Passkeys are a platform-level implementation of WebAuthn that sync across devices via cloud key sync (iCloud Keychain on Apple, Google Password Manager on Android). They combine WebAuthn’s security with the convenience of being available on all your devices.

The critical property: passkeys are origin-bound. The key pair is scoped to a relying party ID, which is a domain such as yelp.com, and the browser only offers it to pages served from that domain. A phishing site at ye1p.com cannot request a signature using the Yelp passkey - the origin doesn’t match, and the authenticator refuses. This makes passkeys phishing-resistant in a way that passwords fundamentally cannot be.
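Origin binding is also visible on the server side. In WebAuthn, the browser includes a clientDataJSON blob in each assertion, recording the origin it saw and the challenge it was given. A sketch of the server's checks on that blob follows; it deliberately omits the signature and authenticator-data verification, which need a public-key library, and the function name is hypothetical.

```python
import base64
import hmac
import json


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def check_client_data(client_data_json: bytes, expected_challenge: bytes,
                      expected_origin: str) -> None:
    """Validate the browser-supplied clientDataJSON for a login assertion.
    Raises ValueError on any mismatch."""
    data = json.loads(client_data_json)
    if data.get("type") != "webauthn.get":
        raise ValueError("not an authentication assertion")
    # The challenge is carried as base64url text and must match what we sent.
    if not hmac.compare_digest(b64url_decode(data.get("challenge", "")), expected_challenge):
        raise ValueError("challenge mismatch")
    # The browser, not the page, fills in the origin it actually loaded.
    if data.get("origin") != expected_origin:
        raise ValueError("origin mismatch")
```

Because the signature covers a hash of this blob, a response produced on ye1p.com carries that origin inside the signed data and fails the last check even if it is relayed to the real site.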

Apple, Google, and Microsoft all ship passkey support. The transition is already underway. The question for engineers building new systems is not whether to support passkeys, but whether to make them the primary authentication method from the start.

OAuth2’s Complexity Problem

The OAuth2 ecosystem has accumulated seven grant types. The original spec included the implicit flow - designed for single-page apps to receive tokens directly in the URL fragment without a backend round trip. The implicit flow has a subtle but serious security issue: the access token travels through the browser’s front channel. It sits in the address bar and browser history, is readable by any script running on the page, and can be carried to another site by an open redirect. Nothing ties the token to the client that requested it, so a leaked token can be replayed as-is.

The authorization code flow with PKCE is now the standard replacement for SPAs. The implicit flow is deprecated. But the existence of seven grant types - authorization code, implicit, resource owner password credentials (never use this), client credentials, device flow, refresh token, and OIDC hybrid - spread across the core spec and later extensions means OAuth2 libraries and clients have historically implemented incompatible subsets. The security properties of each grant type differ, and choosing the wrong one has security consequences.

The OAuth 2.1 draft consolidates and cleans this up, mandating PKCE for all authorization code flows, deprecating implicit and password grant types, and generally making it harder to make the common mistakes. It’s a meaningful improvement.

Multi-Region Identity

Identity systems present specific challenges in multi-region deployments. User sessions created in us-east-1 must be valid when a request routes to eu-west-1 - either the session store is globally replicated, or tokens are self-contained (JWTs). Data sovereignty requirements complicate this: GDPR’s right to erasure means that deleting a user must propagate to all regions. If authentication data is replicated globally, deletion must be globally consistent and verifiable.

AWS Cognito User Pools are region-specific - users in a US pool are not available in an EU pool. Multi-region identity federation requires either Cognito’s federation features (social logins via OIDC) or a third-party provider that handles global replication. Auth0 and Okta operate globally with built-in data residency controls.

Future Outlook

Passwordless authentication via passkeys is the clear direction. Major platforms are investing heavily. The developer ecosystem (FIDO2 libraries, browser WebAuthn APIs) is mature. The remaining friction is enterprise adoption - enterprises with legacy identity infrastructure (LDAP, legacy SAML IdPs) move slowly.

Decentralized identity (W3C DID standard, Verifiable Credentials) represents a longer-horizon shift: identity credentials issued by one party and verifiable by any party without a central authority. Useful for cross-organizational identity (proving your university degree to an employer without the employer calling the university). Not replacing OAuth2 for consumer applications anytime soon.

Summary

Mechanism	Revocable	Phishing-resistant	Server-side state	Best for
Session token	Yes (instant)	No	Yes (session store)	Web apps with backend
JWT (short-lived + refresh)	Yes (on refresh)	No	Minimal (refresh token only)	APIs, microservices
TOTP	Not applicable	No (can be phished)	Shared secret (server DB)	MFA second factor
WebAuthn/Passkeys	Yes	Yes (origin-bound)	Public key only	Primary auth, passwordless
SMS OTP	Not applicable	No	Pending code (short-lived)	Weak MFA, legacy systems
