Summary
Graph databases treat relationships as first-class citizens, making connected data queries natural and performant.
Core concepts:
- Property graph model: nodes (entities) + relationships (connections)
- Cypher query language for pattern matching
- Built-in graph algorithms
Use cases: Social networks, fraud detection, recommendation engines, knowledge graphs
vs. Relational: Graphs excel at deep traversals and relationship queries; relational wins at aggregations and transactions.
When to use: Multi-hop queries, flexible schema, relationship-centric data model.
Graph Databases: When Relations Matter Most
Introduction: The World is a Graph
The world is inherently connected. People know people. Products relate to categories. Transactions flow between accounts. Yet for decades, we forced this interconnected reality into the rigid rows and columns of relational databases.
Graph databases flip the script: relationships are first-class citizens. Instead of encoding connections as foreign keys and resolving them with JOIN operations, a graph stores them directly, so queries that would need complex multi-table JOINs in a relational schema become short pattern matches.
Consider finding friends-of-friends in a social network:
-- Relational (MySQL): Complex self-join
SELECT DISTINCT u.name
FROM users u
JOIN friendships f1 ON u.id = f1.user2_id
JOIN friendships f2 ON f1.user1_id = f2.user2_id
WHERE f2.user1_id = 123
AND u.id != 123;
-- Each additional hop adds another self-join, so cost grows rapidly with depth
// Graph (Neo4j): Natural traversal
MATCH (me:User {id: 123})-[:FRIEND]->()-[:FRIEND]->(friend)
WHERE friend <> me
RETURN DISTINCT friend.name;
// Cost depends on the size of the neighborhood visited, not on the total size of the graph
This article explores when and how to use graph databases.
Graph Fundamentals
The Property Graph Model
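Most graph databases use the property graph model: nodes represent entities, relationships are typed, directed connections between nodes, and both can carry key-value properties.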
// Example: Social network
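To make the model concrete, here is a minimal Python sketch of a tiny social network. The class names (`Node`, `Relationship`), the sample users, and the `friends_of` helper are all hypothetical and chosen for illustration; a real graph database stores this data very differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    labels: set                      # e.g. {"User"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    type: str                        # e.g. "FRIEND"
    start: int                       # id of the source node
    end: int                         # id of the target node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: three users and two directed FRIEND relationships.
nodes = {
    1: Node(1, {"User"}, {"name": "Alice"}),
    2: Node(2, {"User"}, {"name": "Bob"}),
    3: Node(3, {"User"}, {"name": "Carol"}),
}
relationships = [
    Relationship("FRIEND", 1, 2, {"since": 2019}),
    Relationship("FRIEND", 2, 3, {"since": 2021}),
]


def friends_of(node_id):
    """Names of nodes reached by an outgoing FRIEND relationship."""
    return [
        nodes[r.end].properties["name"]
        for r in relationships
        if r.type == "FRIEND" and r.start == node_id
    ]
```

Note that the relationship carries its own properties (`since`), which is the part a plain foreign key cannot express without an extra join table.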
When to Choose Graph Databases
Use Case 2: Fraud Detection
Detect fraud rings by finding suspicious connection patterns:
// Find accounts sharing suspicious attributes
MATCH (a1:Account)-[:HAS_IP]->(ip:IP)<-[:HAS_IP]-(a2:Account)
MATCH (a1)-[:HAS_DEVICE]->(device:Device)<-[:HAS_DEVICE]-(a2)
WHERE a1.id < a2.id  // report each pair once instead of as both (a1, a2) and (a2, a1)
WITH a1, a2, COUNT(DISTINCT ip) + COUNT(DISTINCT device) AS sharedAttributes
WHERE sharedAttributes >= 2
RETURN a1.id, a2.id, sharedAttributes
ORDER BY sharedAttributes DESC;
// Detect circular money flow (money laundering)
MATCH path = (start:Account)-[:TRANSFERRED*3..6]->(start)
WHERE ALL(r IN relationships(path) WHERE r.amount > 10000)
RETURN path, [r IN relationships(path) | r.amount] AS amounts;
Why graphs win: Pattern matching across relationships is a natural strength of graph databases.
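For intuition about what the circular-flow query has to do, the following Python sketch searches for cycles of 3 to 6 large transfers with a depth-limited walk. It is an illustration under assumptions, not how a graph engine evaluates the pattern: the function name, the `(sender, receiver, amount)` tuple layout, and the sample data are hypothetical, and unlike the Cypher pattern it never revisits an intermediate account.

```python
def find_cycles(transfers, start, min_len=3, max_len=6, min_amount=10_000):
    """Return cycles start -> ... -> start as lists of account ids.

    transfers: iterable of (sender, receiver, amount) tuples.
    Only transfers with amount > min_amount are followed.
    """
    # Keep only large transfers, grouped by sender.
    outgoing = {}
    for sender, receiver, amount in transfers:
        if amount > min_amount:
            outgoing.setdefault(sender, []).append(receiver)

    cycles = []

    def walk(account, path):
        for nxt in outgoing.get(account, []):
            hops = len(path)  # transfers used once we step to nxt
            if nxt == start:
                if min_len <= hops <= max_len:
                    cycles.append(path + [start])
            elif nxt not in path and hops < max_len:
                walk(nxt, path + [nxt])

    walk(start, [start])
    return cycles


# Hypothetical sample data.
transfers = [
    ("A", "B", 20_000),
    ("B", "C", 15_000),
    ("C", "A", 12_000),
    ("A", "D", 500),      # too small, ignored
    ("D", "A", 50_000),
]
rings = find_cycles(transfers, "A")
```

The depth bound matters: without `max_len`, the number of candidate paths can grow very quickly on a dense transfer graph.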
Use Case 3: Recommendation Engines
// Collaborative filtering: users who liked X also liked Y
MATCH (user:User {id: 123})-[:LIKED]->(item:Product)<-[:LIKED]-(other:User)
MATCH (other)-[:LIKED]->(recommendation:Product)
WHERE NOT (user)-[:LIKED]->(recommendation)
RETURN recommendation.name, COUNT(*) AS score
ORDER BY score DESC
LIMIT 10;
// Content-based: similar items
MATCH (item:Product {id: 456})-[:IN_CATEGORY]->(cat:Category)<-[:IN_CATEGORY]-(similar:Product)
WHERE item <> similar
RETURN similar.name, COUNT(cat) AS sharedCategories
ORDER BY sharedCategories DESC;
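The scoring in the collaborative-filtering query can be mirrored in a few lines of Python. This is a sketch under assumptions: `likes` is a hypothetical in-memory mapping from user to liked items, and `recommend` is an illustrative name. As in the query, an item's score is the number of (shared item, other user) pairs that lead to it.

```python
from collections import Counter


def recommend(likes, user, limit=10):
    """likes: {user: set of liked items}. Returns [(item, score), ...]."""
    mine = likes.get(user, set())
    scores = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        shared = len(mine & theirs)      # items both users liked
        if shared == 0:
            continue
        for item in theirs - mine:       # skip items the user already liked
            scores[item] += shared
    return scores.most_common(limit)


# Hypothetical sample data.
likes = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"a", "d"},
    "u4": {"e"},
}
top = recommend(likes, "u1")
```

Here `c` outranks `d` because the user who liked `c` overlaps with `u1` on two items rather than one.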
Graph Algorithms
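Many graph databases ship graph algorithms or offer them as an add-on library. The examples below use the `algo.*` procedures of the Neo4j Graph Algorithms library, which has since been superseded by the Graph Data Science (GDS) library, so procedure names differ in current releases: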
1. Shortest Path
// Dijkstra for weighted paths
MATCH (start:Station {name: "A"}), (end:Station {name: "Z"})
CALL algo.shortestPath.stream(start, end, "distance")
YIELD nodeId, cost
RETURN algo.getNodeById(nodeId).name AS station, cost;
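For readers who want to see what the procedure computes, here is a compact Python sketch of Dijkstra's algorithm using a binary heap. The nested-dict graph layout, the function name, and the station data are assumptions made for this example; edge weights must be non-negative.

```python
import heapq


def shortest_path(graph, start, end):
    """graph: {node: {neighbor: distance}}. Returns (total, path).

    Returns (inf, []) when end is unreachable from start.
    """
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue                     # stale heap entry
        visited.add(node)
        if node == end:
            break
        for nbr, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(nbr, float("inf")):
                dist[nbr] = candidate
                prev[nbr] = node
                heapq.heappush(heap, (candidate, nbr))
    if end not in dist:
        return float("inf"), []
    # Walk predecessors back from end to start.
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[end], path[::-1]


# Hypothetical station network.
stations = {
    "A": {"B": 4, "C": 1},
    "C": {"B": 2, "Z": 7},
    "B": {"Z": 1},
}
total, route = shortest_path(stations, "A", "Z")
```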
2. Community Detection
// Louvain algorithm for community detection
CALL algo.louvain.stream("Person", "FRIEND")
YIELD nodeId, community
RETURN community, COLLECT(algo.getNodeById(nodeId).name) AS members
ORDER BY SIZE(members) DESC;
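Louvain works by repeatedly moving nodes between communities to increase a score called modularity. The sketch below does not implement Louvain itself; it only computes the modularity of a given assignment, which is the quantity the algorithm tries to maximize. The function name and the sample graph (two triangles joined by one edge) are assumptions for illustration.

```python
def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: {node: community id}.
    Uses Q = sum over communities of  L_c / m - (d_c / 2m)^2,
    where L_c is the number of internal edges and d_c the total degree.
    """
    m = len(edges)
    internal = {}   # community -> edges with both endpoints inside it
    degree = {}     # community -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


# Two triangles (1-2-3 and 4-5-6) connected by the edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
split = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
merged = {n: 0 for n in range(1, 7)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Separating the two triangles scores higher than lumping all six nodes together, which is the kind of move Louvain would keep.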
3. Centrality Measures
// Betweenness centrality: identify bridges
CALL algo.betweenness.stream("Person", "FRIEND")
YIELD nodeId, centrality
RETURN algo.getNodeById(nodeId).name, centrality
ORDER BY centrality DESC;
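Betweenness centrality counts, for each node, the share of shortest paths between other node pairs that pass through it. A common way to compute it is Brandes' algorithm; the following Python sketch handles the unweighted, undirected case. The function name and adjacency layout are assumptions, and every neighbor must also appear as a key in `adj`.

```python
from collections import deque


def betweenness(adj):
    """adj: {node: set of neighbors} for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # first time w is seen
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical examples: a three-node path and a star with three leaves.
path_scores = betweenness({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}})
star_scores = betweenness(
    {"h": {"x", "y", "z"}, "x": {"h"}, "y": {"h"}, "z": {"h"}}
)
```

In both examples the bridging node (`b`, then `h`) is the only one with a non-zero score.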
Implementation: Building a Graph Database
// Simplified in-memory graph database
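As one possible shape for such a store, the Python sketch below keeps nodes in a dictionary and relationships in per-node adjacency lists, so following a relationship is a direct lookup with no join. The `GraphDB` class and its method names are hypothetical, and persistence, transactions, and a query language are deliberately left out.

```python
from collections import defaultdict


class GraphDB:
    def __init__(self):
        self.nodes = {}                     # node id -> {"labels", "props"}
        self.outgoing = defaultdict(list)   # node id -> [(type, target, props)]
        self.incoming = defaultdict(list)   # node id -> [(type, source, props)]
        self._next_id = 0

    def add_node(self, labels, **props):
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = {"labels": set(labels), "props": props}
        return node_id

    def add_relationship(self, start, rel_type, end, **props):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist")
        self.outgoing[start].append((rel_type, end, props))
        self.incoming[end].append((rel_type, start, props))

    def neighbors(self, node_id, rel_type):
        """Targets of outgoing relationships of the given type."""
        return [
            target
            for (kind, target, _) in self.outgoing.get(node_id, [])
            if kind == rel_type
        ]

    def traverse(self, start, rel_type, hops):
        """Node ids reachable in exactly `hops` steps along rel_type."""
        frontier = {start}
        for _ in range(hops):
            frontier = {
                t for n in frontier for t in self.neighbors(n, rel_type)
            }
        return frontier


# Hypothetical usage: friends-of-friends for one user.
db = GraphDB()
ann = db.add_node(["User"], name="Ann")
ben = db.add_node(["User"], name="Ben")
cy = db.add_node(["User"], name="Cy")
db.add_relationship(ann, "FRIEND", ben)
db.add_relationship(ben, "FRIEND", cy)
fof = sorted(
    db.nodes[n]["props"]["name"] for n in db.traverse(ann, "FRIEND", 2) - {ann}
)
```

Each hop touches only the adjacency lists of the current frontier, which is why traversal cost tracks the neighborhood size instead of the total node count.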
Indexing and Performance
// Index structures for graph databases
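A traversal still needs a starting point: a pattern such as `(me:User {id: 123})` has to locate its first node before any relationship can be followed. A simple way to support that is a pair of hash indexes, one by label and one by label plus property value, sketched below in Python. The `NodeIndex` class is hypothetical, and property values are assumed to be hashable.

```python
from collections import defaultdict


class NodeIndex:
    def __init__(self):
        self.by_label = defaultdict(set)      # label -> node ids
        self.by_property = defaultdict(set)   # (label, key, value) -> node ids

    def add(self, node_id, labels, props):
        for label in labels:
            self.by_label[label].add(node_id)
            for key, value in props.items():
                self.by_property[(label, key, value)].add(node_id)

    def lookup(self, label, key=None, value=None):
        """All nodes with a label, optionally filtered by one property."""
        if key is None:
            return set(self.by_label.get(label, ()))
        return set(self.by_property.get((label, key, value), ()))


# Hypothetical sample data.
index = NodeIndex()
index.add(0, ["User"], {"id": 123, "name": "Ann"})
index.add(1, ["User"], {"id": 456, "name": "Ben"})
index.add(2, ["Product"], {"id": 123})
start_nodes = index.lookup("User", "id", 123)
```

Once the start node is found through the index, the rest of the query can proceed along adjacency lists without further index lookups.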
Distributed Graph Databases
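Scaling graphs across machines is challenging, because any partitioning cuts some relationships and every traversal that crosses a cut needs a network hop: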
// Graph partitioning strategies
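One way to compare partitioning strategies is to count the relationships whose endpoints land on different machines (the edge cut). The Python sketch below contrasts simple modulo hashing of integer node ids with a hand-picked assignment that follows the graph's structure. Function names and the sample graph are assumptions for illustration.

```python
def hash_partition(node_ids, num_partitions):
    """Assign integer node ids to partitions by modulo hashing."""
    return {n: n % num_partitions for n in node_ids}


def edge_cut(edges, assignment):
    """Number of edges whose endpoints live on different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Two triangles (1-2-3 and 4-5-6) connected by the edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
node_ids = range(1, 7)

hashed = hash_partition(node_ids, 2)
clustered = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}   # hand-picked assignment

hashed_cut = edge_cut(edges, hashed)
clustered_cut = edge_cut(edges, clustered)
```

Hashing spreads load evenly but ignores structure, so on this graph it cuts most edges, while the structure-aware assignment cuts only the bridge. Finding such an assignment automatically on a large, changing graph is the hard part.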
Graph vs. Relational: When to Choose
// Decision matrix
Conclusion
Graph databases shine when relationships are as important as the data itself. From social networks to fraud detection to knowledge graphs, they provide a natural way to model and query connected data.
Key takeaways:
- Graphs make relationship queries simple and fast
- Built-in graph algorithms (shortest path, PageRank, community detection)
- Schema flexibility enables rapid iteration
- But: Less mature than relational, harder to scale
Choose graphs when:
- Query depth > 3 hops
- Relationships have properties
- Schema evolves frequently
- Pattern matching is core to the application
The world is a graph—sometimes your database should be too.