Graph-Powered Twitter Trend Analysis: Building Recommendations with Neo4j and Spring Data

Explore how to use the Neo4j graph database for real-time Twitter trend analysis and user recommendations with Spring Data Neo4j and Spring Social.

Gonnect Team
January 14, 2024 (11 min read)
Tags: Neo4j, Spring Boot, Spring Data Neo4j, Spring Social, Twitter API

Introduction

Traditional relational databases excel at storing structured data, but they struggle when relationships become the primary focus of queries. Social networks, recommendation engines, and trend analysis are domains where graph databases shine, offering intuitive modeling and exceptional query performance for connected data.

This article explores a practical implementation that combines Neo4j, a widely used graph database, with Spring Data Neo4j and Spring Social to analyze live Twitter streams, detect trends, and generate user recommendations.

Key Insight: Graph databases eliminate the need for complex JOIN operations by making relationships first-class citizens, enabling queries that would be impractical in relational systems.

Why Graph Databases for Social Analysis?

[Figure: Event-Driven Architecture]

The Problem with Relational Models

Consider modeling Twitter data in a relational database:

-- Users table
CREATE TABLE users (id BIGINT PRIMARY KEY, username VARCHAR(255));

-- Tweets table
CREATE TABLE tweets (id BIGINT PRIMARY KEY, user_id BIGINT, content TEXT);

-- Follows relationship
CREATE TABLE follows (follower_id BIGINT, following_id BIGINT);

-- Hashtags
CREATE TABLE hashtags (id BIGINT PRIMARY KEY, tag VARCHAR(255));

-- Tweet-Hashtag relationship
CREATE TABLE tweet_hashtags (tweet_id BIGINT, hashtag_id BIGINT);

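A seemingly simple query like "find users who follow someone who tweeted about a trending topic" must join follows, tweets, tweet_hashtags, and hashtags. Every additional hop in the relationship chain adds further joins, so both the SQL and its execution cost grow quickly with depth.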

The Graph Advantage

In Neo4j, the same data is modeled naturally:

┌─────────────────────────────────────────────────────────────────┐
│                     Twitter Graph Model                         │
└─────────────────────────────────────────────────────────────────┘

    (User:alice)─[:FOLLOWS]─>(User:bob)
         │                       │
    [:POSTED]               [:POSTED]
         │                       │
         ▼                       ▼
    (Tweet:t1)             (Tweet:t2)
         │                       │
    [:TAGGED]               [:TAGGED]
         │                       │
         ▼                       ▼
    (Tag:#java)            (Tag:#spring)

Aspect                 Relational               Graph
Relationship Queries   Complex JOINs            Native traversal
Performance            Degrades with depth      Traversal cost tracks the subgraph visited, not total data size
Schema Evolution       Rigid migrations         Dynamic properties
Intuitive Modeling     Normalization required   Matches mental model
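
To make "native traversal" concrete, here is a minimal in-memory sketch in plain Java. It is not the Neo4j API: the TinyGraph class and its method names are hypothetical, and the data is the small example graph drawn above. It answers the earlier question ("users who follow someone who tweeted about a topic") by walking FOLLOWS, POSTED, and TAGGED edges directly instead of joining tables.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a graph store (illustration only).
class TinyGraph {
    // relationship type -> (source node -> target nodes)
    private final Map<String, Map<String, Set<String>>> edges = new HashMap<>();

    void relate(String from, String type, String to) {
        edges.computeIfAbsent(type, t -> new HashMap<>())
             .computeIfAbsent(from, f -> new TreeSet<>())
             .add(to);
    }

    // Outgoing neighbours of a node for one relationship type.
    Set<String> out(String from, String type) {
        return edges.getOrDefault(type, Collections.emptyMap())
                    .getOrDefault(from, Collections.emptySet());
    }

    // Users who follow someone who posted a tweet tagged with the given tag.
    Set<String> followersOfAuthorsTagged(Collection<String> users, String tag) {
        Set<String> result = new TreeSet<>();
        for (String user : users) {
            for (String followed : out(user, "FOLLOWS")) {
                boolean tweetedTag = out(followed, "POSTED").stream()
                        .anyMatch(tweet -> out(tweet, "TAGGED").contains(tag));
                if (tweetedTag) {
                    result.add(user);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TinyGraph g = new TinyGraph();
        g.relate("alice", "FOLLOWS", "bob");
        g.relate("alice", "POSTED", "t1");
        g.relate("bob", "POSTED", "t2");
        g.relate("t1", "TAGGED", "#java");
        g.relate("t2", "TAGGED", "#spring");
        // prints [alice]
        System.out.println(g.followersOfAuthorsTagged(Arrays.asList("alice", "bob"), "#spring"));
    }
}
```

Each hop only touches the neighbours of the current node, which is the property the table summarizes as native traversal.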

Architecture Overview

Microservices Architecture

┌─────────────────────────────────────────────────────────────────┐
│                   Twitter Stream API                            │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                 Spring Social Twitter                           │
│            (OAuth + Stream Processing)                          │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                Spring Boot Application                          │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────┐  │
│  │  User Service   │  │  Tweet Service  │  │  Tag Service   │  │
│  └────────┬────────┘  └────────┬────────┘  └───────┬────────┘  │
│           │                    │                   │            │
│           └────────────────────┼───────────────────┘            │
│                                │                                │
│                    Spring Data Neo4j                            │
└────────────────────────────────┼────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Neo4j Graph Database                         │
│               (Dockerized, Bolt Protocol)                       │
└─────────────────────────────────────────────────────────────────┘

Technology Stack

Component            Technology          Purpose
Database             Neo4j               Graph storage and Cypher queries
Framework            Spring Boot         Application infrastructure
Data Access          Spring Data Neo4j   OGM (Object-Graph Mapping)
Social Integration   Spring Social       Twitter API connectivity
Build Tool           Maven               Dependency management

Implementation Deep Dive

Domain Model with Neo4j Annotations

The graph model is expressed using Neo4j OGM annotations:

// User Node
@NodeEntity
public class User {

    @Id
    @GeneratedValue
    private Long id;

    @Property("twitterId")
    private String twitterId;

    @Property("username")
    private String username;

    @Property("displayName")
    private String displayName;

    @Property("followersCount")
    private int followersCount;

    @Relationship(type = "POSTED", direction = Relationship.OUTGOING)
    private Set<Tweet> tweets = new HashSet<>();

    @Relationship(type = "FOLLOWS", direction = Relationship.OUTGOING)
    private Set<User> following = new HashSet<>();

    @Relationship(type = "FOLLOWS", direction = Relationship.INCOMING)
    private Set<User> followers = new HashSet<>();

    public void post(Tweet tweet) {
        tweets.add(tweet);
        tweet.setAuthor(this);
    }

    public void follow(User user) {
        following.add(user);
    }

    // Constructors, getters, setters...
}

// Tweet Node
@NodeEntity
public class Tweet {

    @Id
    @GeneratedValue
    private Long id;

    @Property("tweetId")
    private String tweetId;

    @Property("content")
    private String content;

    @Property("createdAt")
    private Date createdAt;

    @Property("retweetCount")
    private int retweetCount;

    @Relationship(type = "POSTED", direction = Relationship.INCOMING)
    private User author;

    @Relationship(type = "TAGGED", direction = Relationship.OUTGOING)
    private Set<Tag> tags = new HashSet<>();

    @Relationship(type = "MENTIONS", direction = Relationship.OUTGOING)
    private Set<User> mentions = new HashSet<>();

    public void addTag(Tag tag) {
        tags.add(tag);
        tag.getTweets().add(this);
    }

    // Constructors, getters, setters...
}

// Tag (Hashtag) Node
@NodeEntity
public class Tag {

    @Id
    @GeneratedValue
    private Long id;

    @Property("name")
    @Index(unique = true)
    private String name;

    @Property("tweetCount")
    private int tweetCount;

    @Relationship(type = "TAGGED", direction = Relationship.INCOMING)
    private Set<Tweet> tweets = new HashSet<>();

    // Constructors, getters, setters...
}

Repository Layer with Custom Cypher Queries

Spring Data Neo4j provides powerful repository support with custom Cypher queries:

public interface UserRepository extends Neo4jRepository<User, Long> {

    Optional<User> findByTwitterId(String twitterId);

    Optional<User> findByUsername(String username);

    // Find users who follow a specific user
    @Query("MATCH (u:User)-[:FOLLOWS]->(target:User {username: $username}) " +
           "RETURN u")
    List<User> findFollowersOf(@Param("username") String username);

    // Recommendation: Users followed by people I follow (2nd degree)
    // Aggregate and sort in a WITH clause, then return only the node so each row maps directly to User
    @Query("MATCH (me:User {username: $username})-[:FOLLOWS]->(friend)-[:FOLLOWS]->(recommended) " +
           "WHERE NOT (me)-[:FOLLOWS]->(recommended) AND me <> recommended " +
           "WITH recommended, count(friend) AS mutualFriends " +
           "ORDER BY mutualFriends DESC " +
           "LIMIT $limit " +
           "RETURN recommended")
    List<User> findRecommendedUsers(
        @Param("username") String username,
        @Param("limit") int limit);

    // Find users who tweeted about a specific tag
    @Query("MATCH (u:User)-[:POSTED]->(t:Tweet)-[:TAGGED]->(tag:Tag {name: $tagName}) " +
           "RETURN DISTINCT u " +
           "LIMIT $limit")
    List<User> findUsersByTag(
        @Param("tagName") String tagName,
        @Param("limit") int limit);
}
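
To show what the second-degree recommendation query computes, the following plain-Java sketch mirrors its logic over an in-memory follow map. The FollowRecommender class, its data layout, and the alphabetical tie-break are assumptions made for illustration; in the application the work is done by Neo4j through the Cypher query above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: friend-of-friend recommendations ranked by mutual friends.
class FollowRecommender {

    // follows.get(u) is the set of usernames that u follows.
    static List<String> recommend(Map<String, Set<String>> follows, String me, int limit) {
        Set<String> mine = follows.getOrDefault(me, Collections.emptySet());
        Map<String, Integer> mutualFriends = new HashMap<>();
        for (String friend : mine) {
            for (String candidate : follows.getOrDefault(friend, Collections.emptySet())) {
                // Mirrors: WHERE NOT (me)-[:FOLLOWS]->(recommended) AND me <> recommended
                if (!candidate.equals(me) && !mine.contains(candidate)) {
                    mutualFriends.merge(candidate, 1, Integer::sum);
                }
            }
        }
        return mutualFriends.entrySet().stream()
                // Most mutual friends first; ties broken by name (an assumption, Cypher leaves ties unordered)
                .sorted((a, b) -> {
                    int byCount = Integer.compare(b.getValue(), a.getValue());
                    return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
                })
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Set<String>> follows = new HashMap<>();
        follows.put("alice", new HashSet<>(Arrays.asList("bob", "carol")));
        follows.put("bob", new HashSet<>(Arrays.asList("dave", "erin", "alice")));
        follows.put("carol", new HashSet<>(Arrays.asList("dave", "bob")));
        // dave is followed by two of alice's friends, erin by one: prints [dave, erin]
        System.out.println(recommend(follows, "alice", 10));
    }
}
```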

public interface TweetRepository extends Neo4jRepository<Tweet, Long> {

    @Query("MATCH (t:Tweet)-[:TAGGED]->(tag:Tag {name: $tagName}) " +
           "RETURN t " +
           "ORDER BY t.createdAt DESC " +
           "LIMIT $limit")
    List<Tweet> findByTag(
        @Param("tagName") String tagName,
        @Param("limit") int limit);

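    // Find trending tweets (most retweeted in last 24h).
    // Note: the comparison assumes createdAt is stored as a native temporal value.
    // If the Date field is persisted as an ISO-8601 string, compare datetime(t.createdAt) instead.
    @Query("MATCH (t:Tweet) " +
           "WHERE t.createdAt > datetime() - duration('P1D') " +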
           "RETURN t " +
           "ORDER BY t.retweetCount DESC " +
           "LIMIT $limit")
    List<Tweet> findTrendingTweets(@Param("limit") int limit);
}

public interface TagRepository extends Neo4jRepository<Tag, Long> {

    Optional<Tag> findByName(String name);

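    // Find trending hashtags (most tweets in last 24h).
    // Note: assumes createdAt is a native temporal value; for string-typed dates use datetime(t.createdAt).
    @Query("MATCH (tag:Tag)<-[:TAGGED]-(t:Tweet) " +
           "WHERE t.createdAt > datetime() - duration('P1D') " +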
           "RETURN tag, count(t) as tweetCount " +
           "ORDER BY tweetCount DESC " +
           "LIMIT $limit")
    List<Map<String, Object>> findTrendingTags(@Param("limit") int limit);

    // Find related tags (co-occurrence)
    @Query("MATCH (tag:Tag {name: $tagName})<-[:TAGGED]-(t:Tweet)-[:TAGGED]->(related:Tag) " +
           "WHERE tag <> related " +
           "RETURN related, count(t) as coOccurrences " +
           "ORDER BY coOccurrences DESC " +
           "LIMIT $limit")
    List<Map<String, Object>> findRelatedTags(
        @Param("tagName") String tagName,
        @Param("limit") int limit);
}
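
The trending queries combine a 24-hour window with a count per tag. As a rough illustration of that logic outside the database, here is a plain-Java sketch. The TrendingTagsSketch and TweetSample types are hypothetical stand-ins for the graph data, and the alphabetical order for ties is an assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

class TrendingTagsSketch {

    // Minimal stand-in for a Tweet node: creation time plus its hashtags.
    static final class TweetSample {
        final Instant createdAt;
        final Set<String> tags;

        TweetSample(Instant createdAt, String... tags) {
            this.createdAt = createdAt;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    static List<String> trendingTags(List<TweetSample> tweets, Instant now, int limit) {
        // Same window as datetime() - duration('P1D') in the Cypher query
        Instant cutoff = now.minus(Duration.ofDays(1));
        // TreeMap keeps tags sorted by name, so the stable sort below breaks ties alphabetically
        Map<String, Long> counts = tweets.stream()
                .filter(t -> t.createdAt.isAfter(cutoff))
                .flatMap(t -> t.tags.stream())
                .collect(Collectors.groupingBy(tag -> tag, TreeMap::new, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-14T12:00:00Z");
        List<TweetSample> tweets = Arrays.asList(
                new TweetSample(now.minus(Duration.ofHours(1)), "java", "spring"),
                new TweetSample(now.minus(Duration.ofHours(2)), "java"),
                new TweetSample(now.minus(Duration.ofHours(30)), "neo4j", "java"), // outside the window
                new TweetSample(now.minus(Duration.ofHours(3)), "spring", "neo4j"));
        // prints [java, spring]
        System.out.println(trendingTags(tweets, now, 2));
    }
}
```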

Twitter Stream Integration

The application connects to Twitter's streaming API to capture live tweets:

@Service
public class TwitterStreamService {

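    // SLF4J logger (org.slf4j.Logger, org.slf4j.LoggerFactory) used by the listener callbacks
    private static final Logger log = LoggerFactory.getLogger(TwitterStreamService.class);

    private final Twitter twitter;
    private final TweetProcessor tweetProcessor;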

    public TwitterStreamService(Twitter twitter, TweetProcessor tweetProcessor) {
        this.twitter = twitter;
        this.tweetProcessor = tweetProcessor;
    }

    public void startStreamingByKeywords(List<String> keywords) {
        StreamListener listener = new StreamListener() {
            @Override
            public void onTweet(Tweet tweet) {
                tweetProcessor.process(tweet);
            }

            @Override
            public void onDelete(StreamDeleteEvent deleteEvent) {
                // Handle deletions if needed
            }

            @Override
            public void onLimit(int numberOfLimitedTweets) {
                log.warn("Rate limited: {} tweets", numberOfLimitedTweets);
            }

            @Override
            public void onWarning(StreamWarningEvent warningEvent) {
                log.warn("Stream warning: {}", warningEvent.getMessage());
            }
        };

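        // track(...) takes a single string, so the keywords are joined with commas
        FilterStreamParameters params = new FilterStreamParameters()
            .track(String.join(",", keywords));

        // filter(...) expects a list of listeners (java.util.Collections)
        twitter.streamingOperations().filter(params, Collections.singletonList(listener));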
    }
}

@Component
public class TweetProcessor {

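    private final UserRepository userRepository;
    private final TweetRepository tweetRepository;
    private final TagRepository tagRepository;

    // Constructor injection: the final fields must be assigned for the class to compile
    public TweetProcessor(UserRepository userRepository,
                          TweetRepository tweetRepository,
                          TagRepository tagRepository) {
        this.userRepository = userRepository;
        this.tweetRepository = tweetRepository;
        this.tagRepository = tagRepository;
    }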

    @Transactional
    public void process(org.springframework.social.twitter.api.Tweet tweet) {
        // Find or create user
        User user = userRepository.findByTwitterId(
                String.valueOf(tweet.getFromUserId()))
            .orElseGet(() -> createUser(tweet));

        // Create tweet node
        Tweet tweetNode = new Tweet();
        tweetNode.setTweetId(String.valueOf(tweet.getId()));
        tweetNode.setContent(tweet.getText());
        tweetNode.setCreatedAt(tweet.getCreatedAt());
        tweetNode.setRetweetCount(tweet.getRetweetCount());

        // Extract and link hashtags
        extractHashtags(tweet.getText()).forEach(tagName -> {
            Tag tag = tagRepository.findByName(tagName)
                .orElseGet(() -> {
                    Tag newTag = new Tag();
                    newTag.setName(tagName);
                    return tagRepository.save(newTag);
                });
            tweetNode.addTag(tag);
            tag.setTweetCount(tag.getTweetCount() + 1);
        });

        // Link to user
        user.post(tweetNode);

        userRepository.save(user);
    }

    private Set<String> extractHashtags(String text) {
        Set<String> hashtags = new HashSet<>();
        Matcher matcher = Pattern.compile("#(\\w+)").matcher(text);
        while (matcher.find()) {
            hashtags.add(matcher.group(1).toLowerCase());
        }
        return hashtags;
    }

    private User createUser(org.springframework.social.twitter.api.Tweet tweet) {
        User user = new User();
        user.setTwitterId(String.valueOf(tweet.getFromUserId()));
        user.setUsername(tweet.getFromUser());
        user.setDisplayName(tweet.getUser().getName());
        user.setFollowersCount(tweet.getUser().getFollowersCount());
        return user;
    }
}
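
One detail of the hashtag regex is worth illustrating: in Java, \w matches only ASCII word characters unless Pattern.UNICODE_CHARACTER_CLASS is set, so hashtags with accented or non-Latin characters are truncated or missed. The sketch below compares both behaviours. HashtagExtractor is a hypothetical standalone helper, not part of the project.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashtagExtractor {
    // Same pattern as extractHashtags above: \w is ASCII-only by default
    private static final Pattern ASCII = Pattern.compile("#(\\w+)");
    // Variant where \w also matches Unicode letters and digits
    private static final Pattern UNICODE =
            Pattern.compile("#(\\w+)", Pattern.UNICODE_CHARACTER_CLASS);

    static Set<String> extract(String text, boolean unicodeAware) {
        Set<String> tags = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        Matcher matcher = (unicodeAware ? UNICODE : ASCII).matcher(text);
        while (matcher.find()) {
            tags.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return tags;
    }

    public static void main(String[] args) {
        // prints [java, spring_boot]
        System.out.println(extract("Loving #Java and #java with #Spring_Boot!", false));
        // ASCII mode cuts the tag at the accented character: prints [caf, neo4j]
        System.out.println(extract("#caf\u00e9 #Neo4j", false));
    }
}
```

Whether Unicode-aware matching is wanted depends on the tweets being tracked; the original pattern is sufficient for ASCII-only hashtags.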

REST API Endpoints

The application exposes REST endpoints for querying the graph:

@RestController
@RequestMapping("/api")
public class TrendController {

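    private final UserRepository userRepository;
    private final TweetRepository tweetRepository;
    private final TagRepository tagRepository;

    // Constructor injection: the final fields must be assigned for the class to compile
    public TrendController(UserRepository userRepository,
                           TweetRepository tweetRepository,
                           TagRepository tagRepository) {
        this.userRepository = userRepository;
        this.tweetRepository = tweetRepository;
        this.tagRepository = tagRepository;
    }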

    // GET /api/users - List all users
    @GetMapping("/users")
    public Iterable<User> getUsers() { // findAll() returns Iterable in OGM-based Spring Data Neo4j
        return userRepository.findAll();
    }

    // GET /api/users/search/{username} - Search user
    @GetMapping("/users/search/{username}")
    public ResponseEntity<User> searchUser(@PathVariable String username) {
        return userRepository.findByUsername(username)
            .map(ResponseEntity::ok)
            .orElse(ResponseEntity.notFound().build());
    }

    // GET /api/users/{username}/recommendations - Get recommendations
    @GetMapping("/users/{username}/recommendations")
    public List<User> getRecommendations(
            @PathVariable String username,
            @RequestParam(defaultValue = "10") int limit) {
        return userRepository.findRecommendedUsers(username, limit);
    }

    // GET /api/tweets - List tweets
    @GetMapping("/tweets")
    public Iterable<Tweet> getTweets() { // findAll() returns Iterable in OGM-based Spring Data Neo4j
        return tweetRepository.findAll();
    }

    // GET /api/tweets/trend - Get trending tweets
    @GetMapping("/tweets/trend")
    public List<Tweet> getTrendingTweets(
            @RequestParam(defaultValue = "20") int limit) {
        return tweetRepository.findTrendingTweets(limit);
    }

    // GET /api/tags - List hashtags
    @GetMapping("/tags")
    public Iterable<Tag> getTags() { // findAll() returns Iterable in OGM-based Spring Data Neo4j
        return tagRepository.findAll();
    }

    // GET /api/tags/trending - Get trending hashtags
    @GetMapping("/tags/trending")
    public List<Map<String, Object>> getTrendingTags(
            @RequestParam(defaultValue = "10") int limit) {
        return tagRepository.findTrendingTags(limit);
    }

    // GET /api/tags/{tagName}/related - Get related hashtags
    @GetMapping("/tags/{tagName}/related")
    public List<Map<String, Object>> getRelatedTags(
            @PathVariable String tagName,
            @RequestParam(defaultValue = "10") int limit) {
        return tagRepository.findRelatedTags(tagName, limit);
    }
}

Configuration

Application Properties

# Neo4j Connection
spring.data.neo4j.uri=bolt://localhost:7687
spring.data.neo4j.username=neo4j
spring.data.neo4j.password=secret

# Twitter API Credentials
spring.social.twitter.appId=YOUR_APP_ID
spring.social.twitter.appSecret=YOUR_APP_SECRET
spring.social.twitter.accessToken=YOUR_ACCESS_TOKEN
spring.social.twitter.accessTokenSecret=YOUR_ACCESS_TOKEN_SECRET

# Logging
logging.level.org.neo4j=INFO
logging.level.org.springframework.data.neo4j=DEBUG

Docker Setup for Neo4j

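# Run Neo4j with Docker.
# The image is pinned to 4.4 because Neo4j 5 images reject passwords shorter than 8 characters, such as "secret".
docker run \
  --name neo4j-twitter \
  -p 7474:7474 \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/secret \
  -v "$HOME/neo4j/data:/data" \
  -d neo4j:4.4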

Running the Application

# Clone the repository
git clone https://github.com/mgorav/neo4j-twitter-trend-recomendentation.git
cd neo4j-twitter-trend-recomendentation

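# Start Neo4j (skip this step if a Neo4j container from the Docker setup is already running on these ports).
# The 4.4 tag is used because Neo4j 5 images reject passwords shorter than 8 characters.
docker run -d --name neo4j -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/secret neo4j:4.4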

# Configure Twitter credentials in application.properties

# Build and run
mvn clean install
mvn spring-boot:run

API Endpoints Summary

Endpoint                                Method   Description
/api/users                              GET      List all users
/api/users/search/{username}            GET      Search user by username
/api/users/{username}/recommendations   GET      Get user recommendations
/api/tweets                             GET      List all tweets
/api/tweets/trend                       GET      Get trending tweets
/api/tags                               GET      List all hashtags
/api/tags/trending                      GET      Get trending hashtags
/api/tags/{tagName}/related             GET      Get related hashtags
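
As a usage illustration, the endpoints can be called from any HTTP client. The sketch below uses the JDK's java.net.http client (Java 11+). The TrendApiClient class is hypothetical, and the base URL assumes the application runs locally on Spring Boot's default port 8080, which the article does not state.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for the REST API summarized above.
class TrendApiClient {
    // Assumption: local instance on Spring Boot's default port
    static final String BASE_URL = "http://localhost:8080/api";

    // Builds the URI for GET /api/users/{username}/recommendations?limit=N
    static URI recommendationsUri(String username, int limit) {
        // Twitter usernames contain only letters, digits, and underscores, so no escaping is applied
        return URI.create(BASE_URL + "/users/" + username + "/recommendations?limit=" + limit);
    }

    // Sends a GET request and returns the raw JSON body
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With the application running, TrendApiClient.fetch(TrendApiClient.recommendationsUri("alice", 5)) would return the JSON list produced by the recommendations endpoint.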

Use Cases and Applications

Graph-based analysis of connected data is particularly valuable in the following industries:

Industry             Use Case
Financial Services   Fraud detection through relationship patterns
Marketing            Influencer identification and campaign targeting
IoT                  Device relationship and network topology analysis
Security             Threat actor network mapping
E-commerce           Recommendation engines

Conclusion

Combining Neo4j with Spring Data Neo4j provides an elegant solution for social data analysis. The graph model naturally represents relationships, while Cypher queries enable complex traversals with simple, readable syntax. Key takeaways:

  • Graph databases excel when relationships are the primary focus
  • Spring Data Neo4j provides familiar repository patterns for graph access
  • Cypher queries enable powerful graph traversals
  • Real-time streaming with Spring Social enables live data capture
  • Recommendation algorithms become natural graph queries

The neo4j-twitter-trend-recomendentation project demonstrates these concepts in a production-ready implementation that can be extended for various social analysis use cases.


Further Reading