Dibas Kumar Borborah

Full Stack Engineer | AI/ML Enthusiast


🎬 Building an Agentic AI Movie Recommendation System

Using Graph Databases, Semantic Search, and Community Signals

Finding a movie that matches your mood or taste shouldn't feel like solving a puzzle. That's why we built an agentic AI movie recommendation system: a smart assistant that understands natural language queries and surfaces meaningful recommendations using a combination of graph reasoning, semantic similarity, and community discussions.

🔗 Project Links

⚙️ Tech Stack

Here's a peek into the technologies powering this agentic movie recommendation system:

🧠 AI & Backend Intelligence

  • FastAPI (Python): High-performance API orchestration
  • LangChain: LLM chain management, entity extraction & tool routing
  • LLaMA 3.3: Core LLM for NER, sentiment understanding, and query parsing
  • Jina Embeddings v3: Converts text into vectors for semantic matching
  • Qdrant: Vector database for similarity search
  • Neo4j Aura: Graph database for movie relationships (genres, directors, etc.)
  • Brave Search: Real-time web search for Reddit and Letterboxd integration

🔄 Real-time Query Flow

  • Server-Sent Events (SSE): Streams intermediate updates like:
    • ✅ Entity extracted: { "genre": "thriller", "director": "Nolan" }
    • 📊 Cypher query executed...
    • 🔍 Vector search complete...
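
A minimal sketch of how such progress updates could be framed as SSE messages. The event names (`entities`, `status`) and the helper functions are assumptions for illustration, not the project's actual code; only the wire format (`event:` and `data:` lines followed by a blank line) comes from the SSE standard.

```python
import json
from typing import Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event: event name, JSON data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def progress_stream() -> Iterator[str]:
    # Hypothetical stages; a real pipeline would yield each frame as its step finishes.
    yield format_sse("entities", {"genre": "thriller", "director": "Nolan"})
    yield format_sse("status", {"message": "Cypher query executed"})
    yield format_sse("status", {"message": "Vector search complete"})


frames = list(progress_stream())
```

In FastAPI, a generator like this can be returned as `StreamingResponse(progress_stream(), media_type="text/event-stream")` so the browser receives each frame as soon as it is yielded.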

🎨 Frontend Framework

  • Vite: Frontend build tool with a fast dev server

🧰 State & Styling

  • nuqs: Syncs search state to the URL for shareable sessions
  • zustand: Lightweight and reactive state store
  • shadcn/ui: Accessible UI components powered by Radix
  • Tailwind CSS: Utility-first styling for consistent design

Let's break down how it works 👇


🧠 Natural Language Understanding with LLMs

When a user enters a search query, we use an LLM (LLaMA 3.3) to process the text. The model performs Named Entity Recognition (NER) to extract important information like:

  • 🎬 Movie titles
  • 👨‍🎤 Actors / Directors
  • 📅 Years
  • 🎭 Genres
  • 🌐 Search modifiers (e.g., "Search Reddit", "Letterboxd")
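
One way to handle the extraction result, assuming the LLM is prompted to reply with a JSON object. The key names (`titles`, `people`, `years`, `genres`, `source`) and the `QueryEntities` class are hypothetical; the source does not specify the output schema.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryEntities:
    titles: List[str] = field(default_factory=list)
    people: List[str] = field(default_factory=list)
    years: List[int] = field(default_factory=list)
    genres: List[str] = field(default_factory=list)
    source: Optional[str] = None  # e.g. "reddit" or "letterboxd"; None if absent


def parse_entities(llm_output: str) -> QueryEntities:
    """Parse the model's JSON reply, tolerating missing keys and invalid JSON."""
    try:
        raw = json.loads(llm_output)
    except json.JSONDecodeError:
        return QueryEntities()
    if not isinstance(raw, dict):
        return QueryEntities()
    source = raw.get("source")
    return QueryEntities(
        titles=list(raw.get("titles", [])),
        people=list(raw.get("people", [])),
        years=[int(y) for y in raw.get("years", [])],
        genres=[g.lower() for g in raw.get("genres", [])],
        source=source.lower() if isinstance(source, str) else None,
    )


reply = '{"people": ["Christopher Nolan"], "genres": ["Action"]}'
entities = parse_entities(reply)
```

Falling back to an empty `QueryEntities` keeps a malformed model reply from crashing the request; the caller can then treat the query as free text.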

🕸️ Querying the Graph Database (Neo4j)

If the query includes structured entities like genres, directors, or release years, we map the extracted elements into a prebuilt Cypher query to search our Neo4j movie graph.

📊 What's in the Graph?

  • 49,369 nodes including movies, people, genres, and years
  • 132,772 relationships, like:
    • ACTED_IN
    • DIRECTED_BY
    • HAS_GENRE
    • RELEASED_IN

Example:
"Show me action movies directed by Christopher Nolan"
→ The Cypher query matches all movies in the action genre whose DIRECTED_BY relationship points to Christopher Nolan.
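
A sketch of how extracted entities might be mapped onto a parameterized Cypher query. The relationship types come from the list above; the node labels (`Movie`, `Person`, `Genre`, `Year`), the property names (`name`, `value`, `title`), the relationship directions, and the `build_cypher` helper are assumptions about the schema.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_cypher(
    genre: Optional[str] = None,
    director: Optional[str] = None,
    year: Optional[int] = None,
    limit: int = 10,
) -> Tuple[str, Dict[str, Any]]:
    """Assemble a parameterized Cypher query from whichever entities are present."""
    patterns: List[str] = ["(m:Movie)"]
    params: Dict[str, Any] = {"limit": limit}
    if genre is not None:
        patterns.append("(m)-[:HAS_GENRE]->(:Genre {name: $genre})")
        params["genre"] = genre
    if director is not None:
        patterns.append("(m)-[:DIRECTED_BY]->(:Person {name: $director})")
        params["director"] = director
    if year is not None:
        patterns.append("(m)-[:RELEASED_IN]->(:Year {value: $year})")
        params["year"] = year
    query = "MATCH " + ", ".join(patterns) + " RETURN m.title AS title LIMIT $limit"
    return query, params


query, params = build_cypher(genre="Action", director="Christopher Nolan")
```

Values travel as query parameters rather than being interpolated into the string, so user text never becomes part of the Cypher itself. With the official Neo4j Python driver, the pair can be passed to `session.run(query, params)`.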


🧬 Semantic Search for Similar Movies

If the user asks for movies similar to another movie, e.g.
"Show me movies like The Departed",
we switch to semantic search.

How It Works:

  1. Extract the movie title (The Departed) using NER.
  2. Fetch its summary embedding from Qdrant (the vector database).
  3. Perform cosine similarity search against other embedded movie summaries.
  4. Return the top N most similar results.

Embeddings are generated using jina-embeddings-v3 and stored in Qdrant.
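
To make the ranking step concrete, here is a self-contained sketch of cosine-similarity search over an in-memory dictionary. In the real system the vector database performs this search; the three-dimensional vectors below are invented toy values, not real jina-embeddings-v3 output, and `most_similar` is a hypothetical helper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    title: str, vectors: Dict[str, List[float]], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Rank every other movie by similarity to the stored vector for `title`."""
    query = vectors[title]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != title  # never recommend the movie the user named
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vectors for illustration only.
toy_vectors = {
    "The Departed": [0.9, 0.1, 0.0],
    "Infernal Affairs": [0.8, 0.2, 0.1],
    "Heat": [0.7, 0.1, 0.4],
    "Paddington": [0.0, 0.2, 0.9],
}
results = most_similar("The Departed", toy_vectors, top_n=2)
```

Excluding the query title matters in practice: a movie's own summary is always its nearest neighbor, so without the filter the first result would be the movie the user already knows.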


🧑‍🤝‍🧑 Community-Driven Discovery (Reddit + Letterboxd)

Our agent can also leverage online movie communities:

🔍 Reddit Integration

If the query contains "Search Reddit", the agent:

  1. Uses Brave Search to find top reddit.com links.
  2. Scrapes comment threads from those posts.
  3. Uses the LLM to extract movie mentions from user comments.
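
The first step above involves narrowing generic web search results down to Reddit discussion threads. A minimal sketch of that filter, assuming the search step returns a plain list of URLs; the helper names and the sample URLs are hypothetical, and the `/comments/` path check is an assumption about which Reddit pages are worth scraping.

```python
from typing import List
from urllib.parse import urlparse


def is_reddit_thread(url: str) -> bool:
    """True for reddit.com (or a subdomain) URLs that point at a comment thread."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    on_reddit = host == "reddit.com" or host.endswith(".reddit.com")
    return on_reddit and "/comments/" in parsed.path


def pick_threads(urls: List[str], limit: int = 5) -> List[str]:
    """Keep unique thread URLs in their original (search-rank) order."""
    seen = set()
    picked = []
    for url in urls:
        if is_reddit_thread(url) and url not in seen:
            seen.add(url)
            picked.append(url)
    return picked[:limit]


search_results = [
    "https://www.reddit.com/r/movies/comments/abc123/best_heist_movies/",
    "https://example.com/blog/heist-movies",
    "https://old.reddit.com/r/MovieSuggestions/comments/def456/like_heat/",
    "https://www.reddit.com/r/movies/",
    "https://notreddit.com/comments/xyz/",
]
threads = pick_threads(search_results)
```

Checking the parsed hostname rather than searching the raw string for "reddit.com" avoids matching lookalike domains such as the last sample URL.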

📽️ Letterboxd Integration

For "Search Letterboxd", the agent:

  1. Finds top lists or reviews via Brave.
  2. Extracts movie names directly from curated Letterboxd lists (no comment scraping).

Query-to-Movie Semantic Search

We also support full-query semantic matching, ideal when the user describes the kind of movie they want without naming one.

Example:
"Movies where the protagonist invents a time machine and travels back in time to his childhood to fix his mistakes"

How it works:

  1. Embed the entire query using jina-embeddings-v3.
  2. Search the Qdrant vector database using cosine similarity.
  3. Fetch top-N most semantically similar movie summaries.

🧰 Databases Used

1. Neo4j Graph Database

  • Stores structured relationships between movies, people, genres, and time.
  • Powers reasoning-style queries based on graph paths.

2. Qdrant Vector Database

  • Stores dense embeddings of movie summaries.
  • Enables similarity search using cosine distance.

🚀 Bringing It All Together

This hybrid system empowers users to explore the movie world using just natural language. Whether you're searching by genre, similarity, director, or online opinions, the AI understands your intent and routes the query through the best tool for the job.
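
The routing decision can be pictured with a small rule-based stand-in. The project handles tool routing through LangChain, so this is not its actual logic; the tool names, the entity keys, and the priority order are all assumptions made for illustration.

```python
from typing import Any, Dict


def choose_tool(entities: Dict[str, Any]) -> str:
    """Pick one tool per query from the extracted entities (hypothetical rules)."""
    source = entities.get("source")
    if source in ("reddit", "letterboxd"):
        return f"{source}_search"  # explicit community modifier wins
    if entities.get("similar_to"):
        return "title_similarity"  # "movies like X"
    if any(entities.get(key) for key in ("genres", "people", "years")):
        return "graph_query"  # structured filters map to the graph
    return "query_similarity"  # free-form description: embed the whole query


route = choose_tool({"genres": ["action"], "people": ["Christopher Nolan"]})
```

Ordering the checks from most explicit (a named community source) to least explicit (a free-form description) gives every query exactly one destination, with whole-query semantic search as the fallback.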

This is agentic AI in action: combining structured reasoning, semantic intelligence, and real-time web knowledge to help you find your next favorite film 🎥🍿