tensorlakeai · Ashish-Abraham · Nov 5, 2024
diff --git a/examples/conversation_extraction/README.md b/examples/conversation_extraction/README.md
@@ -0,0 +1,92 @@
+# Meeting Conversation Extractor with Indexify
+
+This project demonstrates how to build a meeting conversation extraction pipeline using Indexify. The pipeline processes audio files, transcribes them, classifies the content, and generates structured summaries based on the meeting type.
+
+## Features
+
+- Speech-to-text transcription using Faster Whisper
+- Meeting type classification using Llama.cpp
+- Structured summaries for different meeting types:
+  - Strategy meetings
+  - Sales/Marketing/Product calls
+  - R&D brainstorming sessions
+
+## Prerequisites
+
+- Python 3.9+
+- Docker and Docker Compose (for containerized setup)
+
+## Installation and Usage
+
+### Option 1: Local Installation - In Process
+
+1. Clone this repository:
+   ```
+   git clone https://github.com/tensorlakeai/indexify
+   cd indexify/examples/conversation_extraction
+   ```
+
+2. Create a virtual environment and activate it:
+   ```
+   python -m venv venv
+   source venv/bin/activate
+   ```
+
+3. Install the required dependencies:
+   ```
+   pip install -r requirements.txt
+   ```
+
+4. Run the main script:
+   ```
+   python main.py --mode in-process-run
+   ```
+
+### Option 2: Using Docker Compose - Deployed Graph
+
+1. Clone this repository:
+   ```
+   git clone https://github.com/tensorlakeai/indexify
+   cd indexify/examples/conversation_extraction
+   ```
+
+2. Ensure Docker and Docker Compose are installed on your system.
+
+3. Build the Docker images for each function in the pipeline.
+
+4. Start the services:
+   ```
+   docker-compose up --build
+   ```
+
+5. Deploy the graph:
+   ```
+   python main.py --mode remote-deploy
+   ```
+
+6. Run the workflow:
+   ```
+   python main.py --mode remote-run
+   ```
+
+## Workflow
+
+1. **Audio Processing:**
+   - Transcription: Converts speech to text using Faster Whisper, including speaker diarization
+   - Meeting Classification: Uses an LLM (via Llama.cpp) to determine the type of meeting
+
+2. **Content Analysis:**
+   Based on the meeting type classification, the system generates structured summaries:
+   - Strategy Meetings: Key decisions, action items, and strategic initiatives
+   - Sales/Marketing/Product Calls: Customer details, pain points, and next steps
+   - R&D Brainstorms: Innovative ideas, technical challenges, resource requirements, and potential impacts
+
+## Graph Structure
+
+The project uses the following Indexify graph:
+
+```
+transcribe_audio -> classify_meeting_intent -> router -> summarize_strategy_meeting
+                                                      -> summarize_sales_call
+                                                      -> summarize_rd_brainstorm
+```
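
Note on the graph above: the routing step can be illustrated with a minimal plain-Python sketch. This is not the code from `main.py` and does not use the Indexify API. The label strings (`"strategy"`, `"sales"`, `"rd"`), the keyword-based classifier, and the summary dictionaries are all assumptions made for illustration; only the node names come from the README.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the three summarizer nodes named in the README.
# The real functions would prompt an LLM; these only tag the transcript.
def summarize_strategy_meeting(transcript: str) -> dict:
    return {"type": "strategy", "excerpt": transcript[:80]}


def summarize_sales_call(transcript: str) -> dict:
    return {"type": "sales", "excerpt": transcript[:80]}


def summarize_rd_brainstorm(transcript: str) -> dict:
    return {"type": "rd", "excerpt": transcript[:80]}


# Assumed label -> summarizer mapping; the PR does not show the label format.
ROUTES: Dict[str, Callable[[str], dict]] = {
    "strategy": summarize_strategy_meeting,
    "sales": summarize_sales_call,
    "rd": summarize_rd_brainstorm,
}


def classify_meeting_intent(transcript: str) -> str:
    """Keyword placeholder for the LLM-based classifier (assumption)."""
    text = transcript.lower()
    if "customer" in text or "pricing" in text:
        return "sales"
    if "prototype" in text or "experiment" in text:
        return "rd"
    return "strategy"


def router(transcript: str) -> dict:
    """Send the transcript to exactly one summarizer, chosen by its label."""
    label = classify_meeting_intent(transcript)
    return ROUTES[label](transcript)
```

The point of the sketch is the shape of the pipeline: classification yields a single label, and the router dispatches to one of the three summarizers rather than running all of them.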
diff --git a/examples/conversation_extraction/docker-compose.yml b/examples/conversation_extraction/docker-compose.yml
@@ -0,0 +1,48 @@
+networks:
+  server:
+services:
+  indexify:
+    image: tensorlake/indexify-server
+    ports:
+      - 8900:8900
+    networks:
+      server:
+        aliases:
+          - indexify-server
+    volumes:
+      - data:/app
+
+  audio-processor:
+    image: tensorlake/audio-processor:latest
+    command: ["indexify-cli", "executor", "--server-addr", "indexify:8900"]
+    networks:
+      server:
+    volumes:
+      - data:/app
+
+  transcriber:
+    image: tensorlake/transcriber:latest
+    command: ["indexify-cli", "executor", "--server-addr", "indexify:8900"]
+    networks:
+      server:
+    volumes:
+      - data:/app
+
+  router:
+    image: tensorlake/router:latest
+    command: ["indexify-cli", "executor", "--server-addr", "indexify:8900"]
+    networks:
+      server:
+    volumes:
+      - data:/app
+
+  llama-cpp:
+    image: tensorlake/llama-cpp:latest
+    command: ["indexify-cli", "executor", "--server-addr", "indexify:8900"]
+    networks:
+      server:
+    volumes:
+      - data:/app
+
+volumes:
+  data:
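
Note on the compose file above: the `indexify` service publishes port 8900, and the README runs `python main.py --mode remote-deploy` right after `docker-compose up`. One way to avoid deploying before the server is listening is a small readiness check. The sketch below is an assumption, not part of this PR: `wait_for_port` is a hypothetical helper built only on the standard library, and it checks TCP reachability only, not that the server is fully initialized.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds,
    or False if none succeeds before `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is not accepting.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, calling `wait_for_port("localhost", 8900)` and proceeding only when it returns `True` would gate the deploy step on the published port being reachable.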