Simplifying Snowflake Architecture: What You Need to Know
Table of contents
- Introduction
- The Three Layers of Snowflake Architecture
- 1. Storage Layer - The Library of Data
- 2. Processing Layer - The Chef of Snowflake
- 3. Cloud Services Layer - The Manager of Everything
- 5. How Data is Stored in Snowflake
- 6. Understanding Virtual Warehouses
- 7. Connecting to Snowflake
- Conclusion
- What is a Virtual Warehouse?
- Why Are Virtual Warehouses Needed?
- Types of Virtual Warehouses
- Scaling Virtual Warehouses
- Multi-Cluster Warehouses
- How to Create a Warehouse in Snowflake
- Key Benefits of Virtual Warehouses
- Mind Map Representation
- Conclusion
Understanding Snowflake Architecture in Simple Terms
Introduction
Snowflake is a cloud-based data warehouse that is widely used for storing and analyzing data. It is known for its scalability, ease of use, and cost-effectiveness. In this guide, we will break down the Snowflake architecture into three layers, explain their functionalities, and use real-life examples, tables, and mind maps to help you understand it better.
The Three Layers of Snowflake Architecture
Snowflake's architecture consists of three key layers:
| Layer | Description | Real-Life Example |
| --- | --- | --- |
| Storage Layer | Stores all the data in a structured format | Like a library, where books (data) are arranged on shelves (storage) |
| Processing Layer | Executes queries using virtual warehouses | Like a chef preparing meals in a kitchen (query execution) |
| Cloud Services Layer | Manages access control, metadata, security, and optimizations | Like a restaurant manager, ensuring smooth operations |
Let's explore each of these layers in detail.
1. Storage Layer - The Library of Data
This is where Snowflake stores all your data.
Data is stored in a compressed, columnar format.
It is split into micro-partitions so queries can skip data they do not need.
Snowflake manages the storage itself, so you do not have to allocate or tune it.
Real-Life Example:
Imagine a library where books are categorized by topic and stored on different shelves. Instead of scanning the entire library, you just go to the relevant shelf (partition) and pick the book you need.
Key Features:
✅ Automatically handles storage allocation
✅ Data is stored in an optimized, compressed format
✅ No manual intervention required
🧠 Mind Map - Storage Layer
Storage Layer
│
├── Data Storage
│   ├── Columnar format
│   └── Compressed data
│
├── Micro-partitions
│   ├── Organizes data efficiently
│   └── Faster access and retrieval
│
└── Managed by Snowflake
    ├── No manual setup
    └── Automatic scaling
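The library analogy can be made concrete with a small sketch. This is a toy model, not Snowflake's internal implementation: it assumes each partition keeps the minimum and maximum of one column as metadata, and the names `make_partition` and `prune` are made up for illustration.

```python
def make_partition(values):
    """Toy micro-partition: the stored values plus min/max metadata."""
    return {"values": values, "min": min(values), "max": max(values)}


def prune(partitions, wanted):
    """Keep only partitions whose min/max range could contain `wanted`."""
    return [p for p in partitions if p["min"] <= wanted <= p["max"]]


# Three "shelves" of order years (hypothetical data).
partitions = [
    make_partition([2019, 2019, 2020]),
    make_partition([2021, 2021, 2022]),
    make_partition([2023, 2024, 2024]),
]

wanted_year = 2021
candidates = prune(partitions, wanted_year)

# Only the surviving partitions are actually scanned row by row.
matches = [v for p in candidates for v in p["values"] if v == wanted_year]
print(f"scanned {len(candidates)} of {len(partitions)} partitions, "
      f"found {len(matches)} rows")
```

The point is that the metadata check is cheap, so most of the "library" is never opened for a selective query.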
2. Processing Layer - The Chef of Snowflake
Also called the Query Execution Layer.
Uses Virtual Warehouses (compute clusters) to process queries.
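Each warehouse is an independent compute cluster, so queries on one warehouse do not slow down queries on another.
You pay only for the time a warehouse is running, billed per second with a 60-second minimum each time it starts.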
Real-Life Example:
Think of this as a restaurant kitchen. Different chefs (virtual warehouses) prepare meals (queries) based on customer orders (user requests). Each chef works independently, making the service faster.
Key Features:
✅ Uses Virtual Warehouses (compute clusters)
✅ Runs multiple queries simultaneously
✅ Can be resized on demand and suspends automatically when idle
🧠 Mind Map - Processing Layer
Processing Layer
│
├── Virtual Warehouses
│   ├── Execute queries
│   └── Can be resized easily
│
├── Independent Compute
│   ├── Each warehouse runs separately
│   └── No impact on other warehouses
│
└── Cost-Effective
    ├── Pay per compute usage
    └── Auto-suspend feature
3. Cloud Services Layer - The Manager of Everything
Manages authentication, security, and access control.
Optimizes query performance and stores metadata.
Handles automatic scaling and infrastructure management.
Real-Life Example:
Think of this as a restaurant manager who ensures that chefs (processing layer) and the kitchen (storage layer) work efficiently. They also handle customer check-ins (authentication) and security.
Key Features:
✅ Controls access and authentication
✅ Manages query optimization
✅ Ensures data security and governance
🧠 Mind Map - Cloud Services Layer
Cloud Services Layer
│
├── Security & Authentication
│   ├── Manages user logins
│   └── Controls access
│
├── Query Optimization
│   ├── Speeds up query execution
│   └── Reduces compute cost
│
└── Infrastructure Management
    ├── Handles scaling automatically
    └── Ensures system reliability
5. How Data is Stored in Snowflake
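Snowflake stores table data in micro-partitions: small, compressed, columnar blocks, each holding roughly 50 MB to 500 MB of data before compression. Snowflake keeps metadata about every micro-partition, such as the range of values in each column, so a query can skip partitions that cannot contain matching rows. Compression also reduces storage costs.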
| Feature | Traditional Databases | Snowflake |
| --- | --- | --- |
| Storage Format | Typically row-based | Columnar |
| Optimization | Manual tuning required | Largely automatic tuning |
| Performance | Slower for large analytical queries | Faster for large analytical queries |
| Cost | Often higher storage costs | Often lower storage costs, since data is compressed |
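To see why the storage format matters for analytics, here is a minimal sketch that lays the same data out both ways. The table and its column names are invented for the example, and plain Python lists stand in for real storage files.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "customer": "Ana", "amount": 120.0},
    {"order_id": 2, "customer": "Ben", "amount": 80.0},
    {"order_id": 3, "customer": "Cy", "amount": 200.0},
]

# Totalling one field still walks through every whole record.
row_total = sum(row["amount"] for row in rows)

# Column-oriented layout: each column is stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# The same total now reads a single column and ignores the others.
column_total = sum(columns["amount"])

print(row_total, column_total)
```

Both layouts give the same answer; the columnar one simply touches less data when a query needs only a few columns, and values of the same type sitting next to each other tend to compress well.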
6. Understanding Virtual Warehouses
Snowflake uses Virtual Warehouses (VW) to execute queries.
A warehouse can be resized up or down based on demand.
Warehouses support Auto-Suspend and Auto-Resume to save costs.
🧠 Mind Map - Virtual Warehouses
Virtual Warehouses
│
├── Compute Resources
│   ├── Executes queries
│   └── Allocates processing power
│
├── Resizing
│   ├── Can be resized anytime
│   └── Supports multiple sizes (XS, S, M, L, etc.)
│
└── Cost Management
    ├── Auto-suspend saves cost
    └── Pay-per-use billing
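A rough way to picture the savings from Auto-Suspend and Auto-Resume is to simulate when a warehouse would be running. The sketch below is a simplified model under stated assumptions: the warehouse resumes the instant a query arrives, suspends exactly `auto_suspend_s` seconds after its last query ends, and the query timings are made up. Real suspend timing is less exact than this.

```python
def running_windows(queries, auto_suspend_s):
    """Merge (start, end) query times into the periods the warehouse is up."""
    windows = []
    for start, end in sorted(queries):
        if windows and start <= windows[-1][1]:
            # Warehouse is still up: the idle timer restarts after this query.
            windows[-1][1] = max(windows[-1][1], end + auto_suspend_s)
        else:
            # Warehouse was suspended: auto-resume opens a new running window.
            windows.append([start, end + auto_suspend_s])
    return windows


def running_seconds(queries, auto_suspend_s):
    """Total seconds the warehouse is running across all windows."""
    return sum(stop - start for start, stop in running_windows(queries, auto_suspend_s))


# Hypothetical workload: two queries close together, one much later.
queries = [(0, 30), (50, 90), (1000, 1010)]

with_suspend = running_seconds(queries, auto_suspend_s=60)
always_on = max(end for _, end in queries) - min(start for start, _ in queries)

print(f"with auto-suspend: {with_suspend}s, always on: {always_on}s")
```

In this made-up workload the warehouse runs for a fraction of the always-on time, because the long idle gap before the third query is not billed.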
7. Connecting to Snowflake
There are multiple ways to connect to Snowflake and interact with the data:
Web Interface – Access Snowflake through a browser.
Command Line (SnowSQL) – Run SQL queries through a command-line tool.
JDBC/ODBC Drivers – Connect Snowflake to applications such as Tableau and Power BI.
ETL Tools (Informatica, Talend) – Load and transform data into Snowflake.
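Besides the options listed, Snowflake also publishes a Python connector (`snowflake-connector-python`). The sketch below shows roughly what a connection looks like. The environment variable names and the helper functions are assumptions of this example, not Snowflake conventions, and the query only runs if the package is installed and real credentials are supplied.

```python
import os


def connection_params(env=None):
    """Collect connection settings; the variable names are this sketch's own."""
    env = os.environ if env is None else env
    params = {
        "account": env["SNOWFLAKE_ACCOUNT"],
        "user": env["SNOWFLAKE_USER"],
        "password": env["SNOWFLAKE_PASSWORD"],
    }
    if "SNOWFLAKE_WAREHOUSE" in env:
        params["warehouse"] = env["SNOWFLAKE_WAREHOUSE"]
    return params


def current_version(params):
    """Open a connection, run one query, and always close the connection."""
    # Imported lazily so the rest of the sketch works without the package.
    import snowflake.connector

    conn = snowflake.connector.connect(**params)
    try:
        cur = conn.cursor()
        cur.execute("SELECT CURRENT_VERSION()")
        return cur.fetchone()[0]
    finally:
        conn.close()


# Placeholder values only; replace with real settings before connecting.
example = connection_params({
    "SNOWFLAKE_ACCOUNT": "myorg-myaccount",
    "SNOWFLAKE_USER": "ana",
    "SNOWFLAKE_PASSWORD": "not-a-real-password",
})
```

Calling `current_version(connection_params())` with real environment variables set would return the Snowflake version string for the account.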
Conclusion
Snowflake is a powerful cloud-based data warehouse that offers a simple, scalable, and cost-effective solution for data storage and processing. Its three-layer architecture ensures optimal performance, security, and cost management.
🔹 Storage Layer: Stores data efficiently in micro-partitions.
🔹 Processing Layer: Executes queries using virtual warehouses.
🔹 Cloud Services Layer: Manages authentication, security, and performance optimization.
With its automated scaling, cost-effective compute model, and high performance, Snowflake is a great choice for modern data warehousing needs.
Understanding Snowflake Virtual Warehouses in Simple Terms
What is a Virtual Warehouse?
A Virtual Warehouse in Snowflake is like a kitchen in a restaurant. Just as a kitchen processes raw ingredients into meals, a virtual warehouse processes queries and data operations.
Real-World Analogy:
Imagine a restaurant with multiple kitchens (virtual warehouses).
Some kitchens are small (extra-small warehouse), handling only a few orders at a time.
Some kitchens are large (extra-large warehouse), handling bulk orders efficiently.
The number of chefs (compute resources) in each kitchen determines how fast orders (queries) are processed.
Why Are Virtual Warehouses Needed?
They process data queries and operations.
They allow scalability – you can increase or decrease the warehouse size as needed.
They ensure faster query execution by distributing workload across compute nodes.
Types of Virtual Warehouses
| Warehouse Size | Compute Nodes (Servers) | Ideal Use Case |
| --- | --- | --- |
| Extra Small (XS) | 1 | Small queries, lightweight operations |
| Small (S) | 2 | Medium-sized queries, moderate workloads |
| Medium (M) | 4 | Business reports, data transformations |
| Large (L) | 8 | Large datasets, heavy data processing |
| X-Large (XL) | 16 | Massive computations, high concurrency |
| 2X-Large (2XL) | 32 | Enterprise-grade analytics, ML workloads |
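The doubling pattern in the table carries over to cost. As a minimal sketch, assume the standard credit rates (1 credit per hour for XS, doubling with each size) and per-second billing with a 60-second minimum per start. The function name `credits_used` is made up for this example.

```python
# Assumed credits per hour for standard warehouses; they double with each size.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16, "2XL": 32}


def credits_used(size, running_seconds, minimum_seconds=60):
    """Credits for one run of a warehouse, billed per second with a minimum."""
    billed_seconds = max(running_seconds, minimum_seconds)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


# A Medium warehouse running for 15 minutes.
medium_run = credits_used("M", 15 * 60)

# A Large warehouse that finishes the same job in half the time.
large_run = credits_used("L", 15 * 60 / 2)

# A 10-second query on XS is still billed for the 60-second minimum.
short_run = credits_used("XS", 10)

print(medium_run, large_run, round(short_run, 4))
```

Under these assumptions a larger warehouse is not automatically more expensive: if doubling the size halves the run time, the credit cost comes out the same.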
Scaling Virtual Warehouses
Scaling is like upgrading or downgrading your mobile data plan based on usage.
Scale-Up (Vertical Scaling):
Increases the size of the virtual warehouse.
Useful when a query takes too long due to heavy computation.
Example: Upgrading from Small to Medium warehouse when handling bigger data.
Scale-Out (Horizontal Scaling):
Increases the number of clusters in a multi-cluster warehouse.
Helps when multiple users are running queries simultaneously.
Example: If 10 users run queries at the same time, adding more clusters helps avoid queuing.
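Both kinds of scaling come down to an `ALTER WAREHOUSE` statement. The sketch below only builds the SQL text, so you can see the shape of each command. The helper names are hypothetical, the warehouse name is just an example, and the multi-cluster settings require a Snowflake edition that supports multi-cluster warehouses (Enterprise or higher).

```python
def scale_up_sql(warehouse, new_size):
    """Vertical scaling: change the size of the existing warehouse."""
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{new_size}';"


def scale_out_sql(warehouse, min_clusters, max_clusters):
    """Horizontal scaling: allow more clusters of the same size."""
    return (
        f"ALTER WAREHOUSE {warehouse} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters};"
    )


# One slow, heavy query: give the warehouse more power.
up = scale_up_sql("Data_Processing_WH", "MEDIUM")

# Many users at once: let the warehouse add clusters to avoid queuing.
out = scale_out_sql("Data_Processing_WH", 1, 3)

print(up)
print(out)
```

Scale up when a single query is slow; scale out when queries are waiting in line behind each other.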
Multi-Cluster Warehouses
A multi-cluster warehouse is like having multiple checkout counters at a supermarket. More counters = faster service.
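Auto-scale Mode: Snowflake starts and stops clusters between the minimum and maximum count as demand changes.
Maximized Mode: The minimum and maximum counts are equal, so all clusters run whenever the warehouse is running.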
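The difference between the two modes can be sketched with a toy calculation. This is a simplification, not Snowflake's actual scaling algorithm: it assumes each cluster can run a fixed number of queries at once (`per_cluster_slots`, an illustrative figure), and the function name is made up.

```python
import math


def active_clusters(concurrent_queries, per_cluster_slots, min_clusters, max_clusters):
    """Clusters running for a given load, clamped to the configured range."""
    wanted = math.ceil(concurrent_queries / per_cluster_slots)
    return max(min_clusters, min(max_clusters, wanted))


# Auto-scale style: minimum 1, maximum 3, clusters follow demand.
light_load = active_clusters(10, per_cluster_slots=8, min_clusters=1, max_clusters=3)
heavy_load = active_clusters(40, per_cluster_slots=8, min_clusters=1, max_clusters=3)

# Maximized style: minimum equals maximum, so all clusters always run.
maximized = active_clusters(2, per_cluster_slots=8, min_clusters=3, max_clusters=3)

print(light_load, heavy_load, maximized)
```

With the heavy load above, demand exceeds what three clusters can absorb in this model, so the remaining queries would wait in the queue until a slot frees up.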
How to Create a Warehouse in Snowflake
1. Go to Admin Panel → select Warehouses.
2. Click + Warehouse.
3. Choose:
   - Name (e.g., "Data_Processing_WH")
   - Size (Small, Medium, Large, etc.)
   - Type (Standard or Snowpark-optimized)
   - Multi-Cluster Mode (Auto-scale or Maximized)
4. Click Create and start running queries!
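The same choices can also be expressed in SQL with `CREATE WAREHOUSE`. The sketch below builds that statement as text. The helper name and its defaults are assumptions of this example, and the cluster settings are only added when more than one cluster is requested, since multi-cluster warehouses are not available on every edition.

```python
def create_warehouse_sql(name, size="SMALL", warehouse_type="STANDARD",
                         min_clusters=1, max_clusters=1, auto_suspend_s=60):
    """Build a CREATE WAREHOUSE statement mirroring the choices in the UI."""
    options = [
        f"WAREHOUSE_TYPE = '{warehouse_type}'",
        f"WAREHOUSE_SIZE = '{size}'",
        f"AUTO_SUSPEND = {auto_suspend_s}",  # seconds of idle time before suspending
        "AUTO_RESUME = TRUE",
        "INITIALLY_SUSPENDED = TRUE",        # do not start billing until first use
    ]
    if max_clusters > 1:
        options.append(f"MIN_CLUSTER_COUNT = {min_clusters}")
        options.append(f"MAX_CLUSTER_COUNT = {max_clusters}")
    return f"CREATE WAREHOUSE IF NOT EXISTS {name}\n  " + "\n  ".join(options) + ";"


single = create_warehouse_sql("Data_Processing_WH")
multi = create_warehouse_sql("Data_Processing_WH", size="MEDIUM", max_clusters=3)

print(single)
print(multi)
```

Setting the minimum and maximum cluster counts to the same value corresponds to Maximized mode; different values correspond to Auto-scale mode.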
Key Benefits of Virtual Warehouses
✅ Pay-as-you-go: You are billed for the time a warehouse is running, not for idle capacity.
✅ Cost Control: Warehouses can auto-suspend when not in use and auto-resume when a query arrives.
✅ Scalability: Easily scale up or down based on workload.
✅ Parallel Processing: Multiple clusters process queries simultaneously.
Mind Map Representation
          [Virtual Warehouse]
                   |
       -------------------------
       |                       |
  [Scale-Up]              [Scale-Out]
       |                       |
 [More Power]           [More Clusters]
Conclusion
A Virtual Warehouse is the core processing unit in Snowflake that helps execute queries and manage workloads efficiently. Understanding how to size and scale it can help optimize costs and performance.