A new engineer building their first cloud service tends to think of storage as one thing. You put data somewhere, you get it back. The cloud gives you several ways to do this, and the choice between them has significant consequences for performance, cost, scalability, and what your application can actually do with the data.

The three fundamental models are block storage, file storage, and object storage. They are not interchangeable. Each one reflects a different set of tradeoffs, and understanding those tradeoffs is what lets you make the right choice rather than defaulting to whatever is familiar.


Block Storage

A block storage volume looks exactly like a hard drive. The operating system sees a raw device - a sequence of fixed-size blocks, addressable by offset. The OS formats it with a filesystem, mounts it, and applications read and write files as if the disk were physically attached to the machine.
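To make "a sequence of fixed-size blocks, addressable by offset" concrete, here is a minimal sketch in Python. It assumes a 4096-byte block size and uses a temporary file as a stand-in for the raw device; the helper names (write_block, read_block, demo) are hypothetical and exist only for illustration. A real filesystem layers files and directories on top of exactly this kind of interface.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size; real devices commonly use 512 or 4096 bytes


def write_block(fd: int, block_no: int, data: bytes) -> None:
    """Write exactly one block at the byte offset implied by its block number."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("a block write must be exactly BLOCK_SIZE bytes")
    os.pwrite(fd, data, block_no * BLOCK_SIZE)


def read_block(fd: int, block_no: int) -> bytes:
    """Read one block back by block number."""
    return os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)


def demo() -> bytes:
    # A temporary file stands in for the raw device (illustration only).
    with tempfile.TemporaryFile() as f:
        f.truncate(16 * BLOCK_SIZE)  # a 16-block "volume"
        fd = f.fileno()
        write_block(fd, 7, b"x" * BLOCK_SIZE)
        return read_block(fd, 7)
```

Nothing in this interface knows about files, names, or permissions. That is the point: the device only stores numbered blocks, and everything else is the filesystem's job.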

This is what GCP’s Persistent Disk and AWS’s EBS (Elastic Block Store) provide. Underneath, the actual storage is network-attached rather than local, but the interface is indistinguishable from a local disk.

The characteristics:

Low latency. Block storage provides the lowest latency of the three models - typically sub-millisecond for reads and writes on SSD-backed volumes. Reads and writes go through a high-speed network path optimized for the block access pattern. This matters for databases, which do random small reads and writes at high volume.

Attached to one instance. A standard persistent disk volume can be attached in read-write mode to only one VM at a time (the behavior Kubernetes calls the ReadWriteOnce access mode). You cannot mount the same block device on two VMs simultaneously and have both writing to it. There are multi-attach modes, but coordinating concurrent writes from multiple machines to a raw block device is the application’s problem - there is no filesystem-level coordination.

Sized at creation. You provision a block volume with a specific capacity. Running out of space means resizing the volume (which may require a brief outage depending on the OS and filesystem) or migrating data. You pay for the provisioned capacity even if you use only half of it.
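The cost of paying for provisioned rather than used capacity is easy to put a number on. The sketch below uses a hypothetical price of $0.10 per GB per month, chosen only to keep the arithmetic readable; it is not a quoted price for any provider, and the function name is made up for this example.

```python
def effective_price_per_used_gb(
    price_per_gb: float, provisioned_gb: float, used_gb: float
) -> float:
    """Monthly price per GB actually used, when billing is on provisioned GB."""
    monthly_bill = price_per_gb * provisioned_gb
    return monthly_bill / used_gb


# Hypothetical numbers: a 500 GB volume that is half full.
half_full = effective_price_per_used_gb(0.10, provisioned_gb=500, used_gb=250)
```

Under these assumed numbers the half-full volume costs about twice the list price per GB of data actually stored, which is why over-provisioning block volumes "to be safe" adds up quickly.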

What it is used for: databases (MySQL, PostgreSQL, MongoDB) that need low latency random I/O. Boot volumes for virtual machines. Any workload that requires POSIX filesystem semantics with strict latency requirements.

On GCP, block storage comes in tiers: standard (HDD-backed, cheaper, higher latency), balanced (SSD, good balance of cost and performance), SSD (higher IOPS and throughput at a higher price), and Hyperdisk (newer generation, with IOPS and throughput configurable independently of capacity).


File Storage

File storage presents a shared filesystem that multiple VMs can mount simultaneously. The abstraction is a directory tree with files - POSIX semantics including permissions, directory listings, and file locking. Multiple machines mount the same filesystem and see the same files.

On GCP this is Filestore; on AWS it is EFS (Elastic File System). Under the covers, a managed NFS (Network File System) server serves the filesystem over the network; some other managed file services speak SMB instead. The clients see what looks like a local filesystem, but every operation is a network request.

The characteristics:

Shared access. Multiple VMs can mount the same Filestore instance simultaneously with read-write access. This is the defining feature that block storage cannot provide without application-level coordination.

POSIX semantics. File locking, directory listing, permissions - applications written to expect a normal filesystem work without modification. This is important for lifting and shifting legacy applications that were written for shared network storage.
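As an illustration of what "file locking" buys you, here is a minimal sketch of several processes safely incrementing a counter stored in a shared file. It uses Python's fcntl.lockf, which takes a POSIX advisory lock; the function name and file layout are hypothetical. Whether such locks are honored across clients of a network filesystem depends on the NFS version and mount options, so treat this as a sketch of the pattern rather than a guarantee about any particular service.

```python
import fcntl
import os


def increment_counter(path: str) -> int:
    """Increment an integer stored in a file while holding an exclusive lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        try:
            text = f.read().strip()
            value = int(text) + 1 if text else 1
            f.seek(0)
            f.truncate()
            f.write(str(value))
            f.flush()
            os.fsync(f.fileno())  # push the update out before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
    return value
```

The same code runs unchanged against a local disk or a mounted shared filesystem, which is what makes file storage attractive for applications that were never written with the cloud in mind.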

Higher latency than block storage. NFS adds network overhead even when optimized. For latency-sensitive database operations, file storage is noticeably slower than block storage. For applications that read and write files at moderate frequency, the difference is acceptable.

Throughput scales with capacity. Filestore’s throughput is tied to provisioned capacity - a larger instance gives more throughput. This differs from block storage such as Hyperdisk, where IOPS and throughput can be configured independently of capacity.

What it is used for: shared media directories (image uploads that multiple application servers need to read), shared code repositories in CI/CD pipelines, ML training data that multiple training workers need to read from simultaneously, legacy applications that require shared filesystem access.


Object Storage

Object storage is the most radical departure from traditional storage models. There is no filesystem, no directory tree (directories are simulated by naming conventions), no in-place modification of part of an object (ranged reads are possible, but changing an object means rewriting it whole), no POSIX semantics. Instead, there are objects - blobs of data of arbitrary size, each identified by a key that is unique within its bucket and carrying associated metadata. Operations are simple: put an object, get an object, delete an object, list objects by key prefix.
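The whole model fits in a few lines. The toy class below is an in-memory stand-in for a bucket, written only to show the shape of the interface; ToyBucket and its method names are hypothetical and are not the API of any real client library.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


class ToyBucket:
    """In-memory stand-in for a bucket: a flat map from key to object."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # A put replaces the whole object; there is no partial update.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> bytes:
        return self._objects[key].data

    def delete(self, key: str) -> None:
        del self._objects[key]

    def list_keys(self, prefix: str = "") -> list[str]:
        # "Directories" are nothing more than keys that share a prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket()
bucket.put("photos/2024/cat.jpg", b"...", content_type="image/jpeg")
bucket.put("photos/2024/dog.jpg", b"...")
bucket.put("logs/app.log", b"...")
photo_keys = bucket.list_keys("photos/")
```

Listing with the prefix "photos/" returns the two photo keys, even though no "photos" directory was ever created. The slash is just a character in the key.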

On GCP this is Cloud Storage (GCS); on AWS it is S3. The interface is an HTTP API - you PUT a file to a URL and GET it back by the same URL.

The characteristics:

Massive scale. Object storage is designed for virtually unlimited capacity. You do not provision a specific amount of space; you pay for what you use, per byte per month. A bucket can hold a single small text file or petabytes of data - the interface is the same either way.

Global accessibility. Objects are accessible via HTTPS from anywhere - your application servers, external users, other cloud services. No mounting required. Any process with the right credentials can access any object directly.

High durability. GCS standard class provides 99.999999999% (eleven nines) annual durability. The storage system distributes data across multiple zones and replicates it multiple times. Data loss from hardware failure is, for practical purposes, not something you plan for.
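To get a feel for what eleven nines means, the sketch below treats the durability figure as an independent per-object probability of surviving one year. That interpretation is a simplifying assumption made for this example, not a statement of how any provider defines or measures the number.

```python
from fractions import Fraction

# Eleven nines, as quoted in the text, kept as an exact fraction.
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year under the simplified model."""
    return object_count * (1 - ANNUAL_DURABILITY)


losses_per_billion = expected_annual_losses(1_000_000_000)  # Fraction(1, 100)
```

Under that model, storing a billion objects gives an expected loss of one hundredth of an object per year, or roughly one object per century. Accidental deletion and application bugs are far more likely causes of data loss than hardware.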

Higher latency for small random I/O. Object storage is optimized for large object reads and writes, not random small I/O. An application that issues thousands of small random reads and writes to object storage will see far worse performance than one using block storage. Object storage is not a database substitute.
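A rough back-of-the-envelope calculation shows why. The latencies below are assumptions picked to match the orders of magnitude discussed here (0.5 ms per block read, 50 ms per object read); real numbers vary by service, region, and object size.

```python
def total_seconds(request_count: int, latency_us: int, concurrency: int = 1) -> float:
    """Wall-clock time for request_count requests of latency_us microseconds each,
    with `concurrency` requests in flight at once (idealized, no queuing)."""
    return request_count * latency_us / concurrency / 1_000_000


BLOCK_LATENCY_US = 500      # assumed: 0.5 ms per small read on block storage
OBJECT_LATENCY_US = 50_000  # assumed: 50 ms per small read on object storage

block_time = total_seconds(10_000, BLOCK_LATENCY_US)                    # 5.0 seconds
object_time = total_seconds(10_000, OBJECT_LATENCY_US)                  # 500.0 seconds
object_parallel = total_seconds(10_000, OBJECT_LATENCY_US, concurrency=100)  # 5.0 seconds
```

Ten thousand sequential small reads take seconds on block storage and minutes on object storage under these assumptions. Issuing requests in parallel recovers throughput, but each individual request still waits the full round trip, which is what hurts a database.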

Consistency. Historically, some object storage systems offered only eventual consistency - a write might not be immediately visible on a subsequent read. GCS and modern S3 now offer strong consistency for most operations: after a successful write, the next read returns the new data. But this guarantee is per object; there is no transaction that spans multiple objects, so an update touching several objects can be observed half-finished.

What it is used for: storing and serving static assets (images, videos, CSS, JavaScript) for web applications. Data lake storage for analytics and ML - raw data is stored as files (Parquet, CSV, Avro) and queried by systems like BigQuery, Spark, or Dataflow. Backups and archives. Distributing build artifacts. Anything that needs to be stored at large scale and accessed as whole objects via an API.


Storage Classes: Trading Cost for Access Speed

Within object storage, GCP and other providers offer multiple storage classes at different price points. The underlying durability is the same; what changes is the cost per GB per month and the cost of data access.

Standard: the highest storage price per GB, no retrieval fees, no minimum storage duration, designed for frequently accessed data. Examples: serving images to a web application, storing a dataset you query daily.

Nearline: lower monthly cost, a retrieval fee per byte accessed, and a 30-day minimum storage duration, designed for data accessed roughly once per month. Examples: infrequent backups, monthly reports.

Coldline: lower monthly cost than Nearline, higher per-access cost, and a 90-day minimum storage duration, designed for data accessed roughly once per quarter. Example: disaster recovery data you hope never to need.

Archive: the cheapest monthly storage cost, the highest access cost, and a 365-day minimum storage duration (deleting an object earlier incurs an early-deletion charge). Examples: long-term regulatory archives, historical data that might be needed for a compliance audit in five years.
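Whether a colder class actually saves money depends on how much of the data you read back. The sketch below works out the break-even point against Standard, which has no retrieval fee. All prices are hypothetical placeholders chosen for round numbers, not published rates, and minimum storage durations and operation charges are ignored.

```python
def monthly_cost(
    stored_gb: float, read_gb: float, storage_price: float, retrieval_price: float
) -> float:
    """Monthly bill: storage charge plus per-GB retrieval charge."""
    return stored_gb * storage_price + read_gb * retrieval_price


def break_even_read_fraction(
    standard_storage: float, cold_storage: float, cold_retrieval: float
) -> float:
    """Fraction of the stored data read per month at which the colder class
    costs the same as Standard (which has no retrieval fee)."""
    return (standard_storage - cold_storage) / cold_retrieval


# Hypothetical prices in dollars per GB: Standard 0.020 storage,
# colder class 0.010 storage plus 0.010 retrieval.
fraction = break_even_read_fraction(0.020, 0.010, 0.010)
```

With these placeholder prices the break-even fraction is about 1.0: the colder class is cheaper only as long as you read back less than the full dataset each month. Plugging in the real prices for your region gives the real threshold.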

Object Lifecycle Management automates moving objects between classes. You define rules: “move to Nearline after 30 days; move to Coldline after 90 days; delete after 1 year.” The system applies the rules automatically as objects age. This is how you achieve cost-efficient storage without manually managing thousands of objects.
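The example rule set can be sketched in two forms: a small function that mirrors what the rules do to an object of a given age, and a dictionary in the JSON shape that GCS lifecycle configurations use (a "rule" list of action/condition pairs). The function and constant names are hypothetical, and the configuration should be checked against the current GCS documentation before use.

```python
import json
from typing import Optional

# The rules quoted above: (age in days, target storage class).
LIFECYCLE_RULES = [(30, "NEARLINE"), (90, "COLDLINE")]
DELETE_AFTER_DAYS = 365


def storage_class_for_age(age_days: int) -> Optional[str]:
    """Class an object should be in at a given age, or None once it is deleted."""
    if age_days >= DELETE_AFTER_DAYS:
        return None
    current = "STANDARD"
    for threshold, storage_class in LIFECYCLE_RULES:
        if age_days >= threshold:
            current = storage_class
    return current


def as_lifecycle_config() -> dict:
    """The same rules expressed as a lifecycle configuration document."""
    rules = [
        {
            "action": {"type": "SetStorageClass", "storageClass": storage_class},
            "condition": {"age": days},
        }
        for days, storage_class in LIFECYCLE_RULES
    ]
    rules.append({"action": {"type": "Delete"}, "condition": {"age": DELETE_AFTER_DAYS}})
    return {"rule": rules}


config_json = json.dumps(as_lifecycle_config(), indent=2)
```

Keeping the thresholds in one place makes it easy to reason about where an object of a given age will sit, and therefore what it will cost, before the rules are applied to a real bucket.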


How They Compare

Dimension          | Block                     | File                           | Object
Access pattern     | Random I/O via filesystem | POSIX filesystem, shareable    | HTTP API, key-value
Latency            | Sub-millisecond           | Milliseconds                   | Tens to hundreds of milliseconds
Concurrent writers | One (usually)             | Many                           | Many (object-level)
Capacity limit     | Provisioned size          | Provisioned size               | Effectively unlimited
Pricing model      | Pay for provisioned GB    | Pay for provisioned GB         | Pay per GB stored
Best for           | Databases, boot volumes   | Shared app data, NFS workloads | Static assets, data lakes, backups

The Decision in Practice

Use block storage when your workload requires the lowest latency or POSIX filesystem semantics on a single machine. Databases almost always need block storage. If you are running MySQL, PostgreSQL, or Cassandra yourself on VMs, block storage is effectively required.

Use file storage when multiple machines need to share access to the same files with standard filesystem semantics. Shared upload directories, CI/CD artifact sharing, legacy applications that require NFS - these are the file storage use cases.

Use object storage for almost everything else. Storing user-uploaded images, distributing ML training data, keeping backups, serving large files to clients, archiving logs - object storage is cheaper, more durable, more scalable, and globally accessible. The constraint is that it is not a filesystem and is not appropriate for random small I/O workloads.

In practice, most cloud applications use all three: object storage for user data and static assets, block storage for the database, and potentially file storage for a shared configuration or media directory that multiple application servers read.
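The decision rules above can be condensed into a small helper. This is a deliberately coarse sketch: the three boolean inputs and the names StorageModel and choose_storage are hypothetical, and real decisions also weigh cost, existing tooling, and managed-service options.

```python
from enum import Enum


class StorageModel(Enum):
    BLOCK = "block"
    FILE = "file"
    OBJECT = "object"


def choose_storage(
    needs_low_latency_random_io: bool,
    needs_posix: bool,
    shared_across_machines: bool,
) -> StorageModel:
    """Map workload requirements to a storage model, following the rules in the text."""
    # Several machines sharing files with filesystem semantics: file storage.
    if shared_across_machines and needs_posix:
        return StorageModel.FILE
    # Lowest latency or POSIX semantics on a single machine: block storage.
    if needs_low_latency_random_io or needs_posix:
        return StorageModel.BLOCK
    # Almost everything else: object storage.
    return StorageModel.OBJECT


database = choose_storage(True, True, False)         # StorageModel.BLOCK
shared_uploads = choose_storage(False, True, True)   # StorageModel.FILE
static_assets = choose_storage(False, False, True)   # StorageModel.OBJECT
```

A typical application calls for all three answers at once, one per component, which matches the pattern of a database on block storage, shared media on file storage, and everything else in objects.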

