rosalinatorres888/linkedin-brand-analyzer

LinkedIn Brand Analyzer

NLP and network analytics platform for measuring professional brand performance on LinkedIn.

Built as a portfolio project demonstrating ML/AI engineering capabilities: sentiment analysis, topic modeling, network graph analytics, and interactive visualization.

Overview

The LinkedIn Brand Analyzer transforms your LinkedIn data export into actionable insights about your professional brand. It answers questions like:

  • What topics drive the most engagement with my content?
  • How does my posting sentiment correlate with audience response?
  • Who are the high-value connections in my network (recruiters, decision-makers)?
  • How do I compare to industry benchmarks?

Sample Visualizations

  • Sentiment Analysis
  • Topic Distribution
  • Network Graph
  • Engagement ROI
  • Target Companies

Technical Architecture

LinkedIn Data Export → Ingestion Pipeline → NLP Analysis → Network Analytics → Dashboard
                              ↓                   ↓                ↓
                        Cleaned CSVs         Sentiment         Community
                                             Topics            Centrality
                                             Entities          PageRank

Core Components

| Module | Technology | Purpose |
|---|---|---|
| Sentiment Analysis | VADER + TextBlob | Classify post/comment tone (positive, neutral, negative) |
| Topic Modeling | Keyword Classification + LDA | Cluster content into thematic categories |
| Network Analysis | NetworkX + Louvain | Community detection, centrality metrics, connection graphs |
| Recruiter Detection | Custom NER + Title Matching | Identify high-value engagers (recruiters, hiring managers) |
| Visualization | Plotly + PyVis + Streamlit | Interactive dashboards and network graphs |

Features

1. Sentiment Analysis

  • VADER lexicon-based sentiment scoring
  • Sentiment-engagement correlation analysis
  • Temporal sentiment trends
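
To make the sentiment-engagement correlation concrete, here is a minimal, dependency-free sketch. It assumes each post already has a VADER-style compound score in [-1, 1]; the ±0.05 cut-offs are the conventional VADER thresholds, and the sample scores and reaction counts are made up for illustration. The function names are hypothetical and are not taken from this project's modules.

```python
import math


def label_from_compound(score):
    """Map a VADER-style compound score in [-1, 1] to a tone label."""
    if score >= 0.05:
        return "positive"
    if score <= -0.05:
        return "negative"
    return "neutral"


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-post data: compound sentiment score and total reactions.
compound_scores = [0.62, -0.31, 0.04, 0.80, -0.10]
reactions = [40, 12, 18, 55, 15]

labels = [label_from_compound(s) for s in compound_scores]
r = pearson(compound_scores, reactions)
print(labels)
print(f"sentiment-engagement correlation: {r:.2f}")
```

A coefficient near +1 or -1 suggests tone and engagement move together; with only a handful of posts the value is noisy, so treat it as a prompt for further analysis rather than a conclusion.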

2. Topic Classification

  • Configurable keyword-based topic matching
  • Support for BERTopic/LDA topic modeling
  • Engagement metrics by topic category
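
The keyword-based matching and per-topic engagement described above could look roughly like the sketch below. The topic names, keywords, and sample posts are hypothetical, and simple substring counting is an assumption; the project's own classifier may work differently.

```python
from collections import defaultdict

# Hypothetical topic configuration; a real setup would load this from config.
TOPIC_KEYWORDS = {
    "machine_learning": ["machine learning", "model", "nlp"],
    "career": ["hiring", "interview", "job"],
}


def classify_topic(text, topic_keywords=TOPIC_KEYWORDS):
    """Return the topic with the most keyword hits, or 'other' if none match."""
    lowered = text.lower()
    hits = {
        topic: sum(lowered.count(kw) for kw in keywords)
        for topic, keywords in topic_keywords.items()
    }
    best_topic, best_hits = max(hits.items(), key=lambda item: item[1])
    return best_topic if best_hits > 0 else "other"


def mean_engagement_by_topic(posts):
    """Average engagement per topic for (text, engagement) pairs."""
    by_topic = defaultdict(list)
    for text, engagement in posts:
        by_topic[classify_topic(text)].append(engagement)
    return {topic: sum(vals) / len(vals) for topic, vals in by_topic.items()}


# Hypothetical posts: (text, total engagement count).
posts = [
    ("Shipping a new NLP model this week", 42),
    ("Tips for your next interview", 30),
    ("Weekend hiking photos", 5),
    ("Why machine learning needs clean data", 58),
]
topic_means = mean_engagement_by_topic(posts)
print(topic_means)
```

Substring matching is cheap and transparent but also matches inside longer words, so short keywords deserve care.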

3. Network Intelligence

  • Professional network graph construction
  • Community detection using Louvain algorithm
  • Centrality metrics (degree, betweenness, PageRank)
  • Interactive network visualization
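
For readers unfamiliar with the centrality metrics listed above, the following dependency-free sketch computes degree centrality and PageRank on a tiny undirected graph. The project itself uses NetworkX for this; the code below is an illustration only, with hypothetical node names, and it assumes every node has at least one edge.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neigh) / (n - 1) for node, neigh in adj.items()}


def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank; assumes every node has at least one edge."""
    n = len(adj)
    rank = {node: 1.0 / n for node in adj}
    for _ in range(iterations):
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[nb] / len(adj[nb]) for nb in adj[node])
            for node in adj
        }
    return rank


# Hypothetical connection graph.
edges = [
    ("you", "alice"), ("you", "bob"), ("you", "carol"),
    ("alice", "bob"), ("carol", "dave"),
]
adj = build_adjacency(edges)
dc = degree_centrality(adj)
pr = pagerank(adj)
print(dc["you"], max(pr, key=pr.get))
```

Degree centrality only counts direct connections, while PageRank also rewards being connected to well-connected nodes, which is why the two are usually reported together.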

4. Recruiter & High-Value Connection Detection

  • Title-based recruiter identification
  • Tier-1 company classification
  • Engagement scoring weighted by connection value
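
As a rough illustration of title-based detection and value-weighted scoring, consider the sketch below. The title pattern, company names, and weights are all hypothetical placeholders, not the values used by this project's scorer.

```python
import re

# Hypothetical title patterns and company tiers, for illustration only.
RECRUITER_PATTERN = re.compile(
    r"\b(recruit\w*|talent acquisition|sourcer|hiring manager)\b", re.IGNORECASE
)
TIER1_COMPANIES = {"example corp", "sample labs"}


def is_recruiter(title):
    """True when the job title matches a recruiter-style keyword."""
    return bool(RECRUITER_PATTERN.search(title or ""))


def connection_weight(title, company):
    """Base weight 1.0, plus bonuses for recruiter titles and tier-1 companies."""
    weight = 1.0
    if is_recruiter(title):
        weight += 1.0
    if (company or "").strip().lower() in TIER1_COMPANIES:
        weight += 0.5
    return weight


def weighted_engagement(interactions):
    """Sum interaction counts scaled by each engager's weight."""
    return sum(
        count * connection_weight(title, company)
        for title, company, count in interactions
    )


# Hypothetical engagers: (job title, company, number of interactions).
interactions = [
    ("Senior Technical Recruiter", "Example Corp", 3),  # weight 2.5
    ("Data Engineer", "Sample Labs", 2),                # weight 1.5
    ("Product Designer", "Other Co", 4),                # weight 1.0
]
score = weighted_engagement(interactions)
print(score)
```

Keeping the pattern and the company tiers as plain data makes them easy to move into a configuration file later.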

5. Benchmark Comparison

  • Configurable influencer benchmark data
  • Percentile ranking across key metrics
  • Industry-specific comparisons
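
Percentile ranking against a benchmark can be sketched in a few lines. The definition used here (share of benchmark values less than or equal to yours) is one common convention and an assumption about how the comparison is done; the benchmark numbers are invented.

```python
from bisect import bisect_right


def percentile_rank(value, benchmark_values):
    """Percentage of benchmark values that are less than or equal to `value`."""
    if not benchmark_values:
        raise ValueError("benchmark_values must not be empty")
    ordered = sorted(benchmark_values)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical benchmark: engagement rates (%) for a set of comparison profiles.
benchmark_engagement = [1.2, 2.0, 2.5, 3.1, 4.8, 6.0, 7.5, 9.0]
my_engagement = 3.4

rank = percentile_rank(my_engagement, benchmark_engagement)
print(f"{rank:.1f}th percentile")
```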

Installation

Prerequisites

  • Python 3.9+
  • LinkedIn data export (Settings → Data Privacy → Get a copy of your data)

Setup

git clone https://github.com/rosalinatorres888/linkedin-brand-analyzer.git
cd linkedin-brand-analyzer
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm

Add Your LinkedIn Data

  1. Request your data export from LinkedIn (Settings → Data Privacy → Get a copy of your data)
  2. Wait for the confirmation email from LinkedIn (24-72 hours)
  3. Extract to data/raw/linkedin_export/
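
Once the export is extracted, the CSV files can be read with the standard library. The sketch below assumes a connections file whose header row begins with `First Name` and may be preceded by a few lines of notes; both the column names and the preamble are assumptions, since LinkedIn's export layout can change, so check your own files. The function name is hypothetical and is not the project's `linkedin_loader.py` API.

```python
import csv
import io


def read_connections(lines, header_start="First Name"):
    """Parse connection rows, skipping any preamble before the header row."""
    lines = list(lines)
    for index, line in enumerate(lines):
        if line.startswith(header_start):
            break
    else:
        raise ValueError(f"no header row starting with {header_start!r}")
    return list(csv.DictReader(lines[index:]))


# Made-up sample standing in for an exported connections file.
sample = """Notes:
"This preamble stands in for any text placed above the header."

First Name,Last Name,Company,Position
Ada,Lovelace,Example Corp,Technical Recruiter
Alan,Turing,Sample Labs,Research Engineer
"""

rows = read_connections(io.StringIO(sample))
print(rows[0]["Position"])
```

With a real file, pass an open file handle instead of `io.StringIO`, for example `open(path, newline="", encoding="utf-8")`.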

Launch Dashboard

streamlit run app/streamlit_app.py

Project Structure

linkedin-brand-analyzer/
├── config/
│   └── settings.yaml           # Configuration & influencer benchmarks
├── data/
│   ├── raw/                    # LinkedIn export data
│   ├── processed/              # Cleaned datasets
│   └── benchmarks/             # Industry comparison data
├── src/
│   ├── ingestion/              # Data loaders & parsers
│   │   └── linkedin_loader.py
│   ├── nlp/                    # NLP analysis modules
│   │   └── analyzer.py         # Sentiment, topics, NER
│   ├── network/                # Graph analytics
│   │   └── graph_builder.py    # NetworkX graphs, community detection
│   ├── engagement/             # Scoring & detection
│   │   └── scorer.py           # Recruiter detection, engagement metrics
│   └── visualization/          # Chart generation
├── app/
│   └── streamlit_app.py        # Interactive dashboard
├── notebooks/                  # Jupyter exploration & analysis
├── tests/                      # Unit tests
├── images/                     # Generated visualizations
├── requirements.txt
└── run.py                      # CLI entry point

Key Metrics

| Metric | Description | Technical Implementation |
|---|---|---|
| Engagement Rate by Topic | Which content themes drive the most interaction | Keyword classifier + engagement aggregation |
| Recruiter Engagement Index | Are decision-makers engaging with your posts | NER + title matching + company tier scoring |
| Network Centrality Score | How connected are you in your professional graph | NetworkX degree, betweenness, PageRank |
| Sentiment-Engagement Correlation | Does posting tone affect audience response | VADER sentiment + Pearson correlation |
| Benchmark Percentile | How you compare to industry influencers | Configurable baseline comparison |

Tech Stack

  • NLP: VADER, TextBlob, spaCy, BERTopic
  • Network Analysis: NetworkX, python-louvain, PyVis
  • Visualization: Plotly, Streamlit, Matplotlib
  • Data Processing: Pandas, NumPy

Usage Examples

CLI

# Launch interactive dashboard
python run.py --dashboard

# Process LinkedIn data export
python run.py --process /path/to/linkedin/export

# Run full analysis pipeline
python run.py --analyze
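
For orientation, flags like these could be parsed with `argparse` as in the sketch below. This is not the project's actual `run.py`; treating the three flags as mutually exclusive and the helper names used here are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="LinkedIn Brand Analyzer CLI (sketch)")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--dashboard", action="store_true", help="launch the dashboard")
    group.add_argument("--process", metavar="EXPORT_DIR", help="process a data export")
    group.add_argument("--analyze", action="store_true", help="run the analysis pipeline")
    return parser


def describe(args):
    """Return a short string naming the selected action."""
    if args.dashboard:
        return "dashboard"
    if args.process:
        return f"process:{args.process}"
    return "analyze"


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--process", "/path/to/linkedin/export"])
action = describe(args)
print(action)
```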

Python API

from src.nlp.analyzer import SentimentAnalyzer, KeywordTopicClassifier
from src.network.graph_builder import NetworkBuilder, NetworkAnalyzer
from src.engagement.scorer import RecruiterDetector

# Sentiment analysis
# `posts` is assumed to be a pandas DataFrame with a text column named 'content'
analyzer = SentimentAnalyzer(method='vader')
results = analyzer.analyze_batch(posts['content'].tolist())

# Network analysis
# `connections_df` is assumed to be a DataFrame of connections loaded from your export
builder = NetworkBuilder()
graph = builder.build_from_connections(connections_df)
network_analyzer = NetworkAnalyzer(graph)
communities = network_analyzer.detect_communities()

# Recruiter detection (reuses `connections_df`)
detector = RecruiterDetector()
recruiters = detector.detect_batch(connections_df)

Learning Outcomes

This project demonstrates proficiency in:

  1. NLP Engineering: Sentiment analysis, topic modeling, named entity recognition
  2. Graph Analytics: Network construction, community detection, centrality measures
  3. Data Engineering: ETL pipelines, data quality, schema design
  4. ML Engineering: Classification, clustering, feature engineering
  5. Full-Stack Development: Streamlit dashboards, interactive visualization

Future Enhancements

  • BERTopic integration for advanced topic modeling
  • Time-series forecasting for engagement prediction
  • A/B testing framework for content optimization
  • LinkedIn API integration for real-time data
  • Export to Memory Brain for cross-system intelligence

Author

Rosalina Torres
MS Data Analytics Engineering, Northeastern University
LinkedIn | GitHub


License

MIT License - see LICENSE for details.
