onoja123/knowledge-sweeper-backend

Knowledge Sweeper Backend

Overview

Knowledge Sweeper is a backend service for syncing, organizing, searching, and analyzing messages from Discord and Slack. It uses NLP and AI for message summarization and categorization, and Elasticsearch for semantic search. The backend supports user OAuth, auto-sync, collection management, and robust error handling. It is built with Node.js, Express, and MongoDB, and integrates with the Discord, Slack, and Elasticsearch APIs.

Features

  • Discord and Slack OAuth authentication
  • Auto-sync messages after integration
  • Auto-population of message collections
  • Advanced search, filtering, and faceted navigation using Elasticsearch
  • NLP and AI-powered message summarization, categorization, sentiment analysis, and keyword extraction
  • Semantic search and "more like this" recommendations
  • RESTful API endpoints for integrations, message sync, and collection management
  • Robust logging and error handling
  • Pagination, search, and filtering for messages
  • Role-based access and permission checks

Technologies

  • Node.js
  • Express.js
  • MongoDB (Mongoose)
  • TypeScript
  • Discord API (user OAuth, bot OAuth)
  • Slack API (OAuth)
  • Elasticsearch (search, analytics, recommendations)
  • Natural Language Processing (NLP) and AI (summarization, categorization, sentiment)
  • Docker (optional)

Setup

Prerequisites

  • Node.js >= 16
  • MongoDB
  • Discord and Slack developer credentials
  • Docker (optional)

Installation

git clone https://github.com/onoja123/knowledge-sweeper-backend.git
cd knowledge-sweeper-backend
npm install

Environment Variables

Create a .env file in the root directory and set:

# Server
PORT=
NODE_ENV=

# Database
MONGODB_URI=

# JWT
JWT_SECRET_KEY=
JWT_COOKIE_EXPIRES_IN=
JWT_EXPIRES_IN=

# Slack
SLACK_CLIENT_ID=
SLACK_CLIENT_SECRET=
SLACK_SIGNING_SECRET=

# Discord
DISCORD_CLIENT_ID=
DISCORD_CLIENT_SECRET=
DISCORD_BOT_TOKEN=

# Stripe
STRIPE_SECRET_KEY=
STRIPE_WEBHOOK_SECRET=

# Email
SMTP_HOST=
SMTP_PORT=
SMTP_USER=
SMTP_PASS=

# Frontend URL
FRONTEND_URL=

# AI providers
OPENAI_API_KEY=
ANTHROPIC_API_KEY=

# Redis
REDIS_HOST=
REDIS_PORT=
REDIS_PASSWORD=

# Google OAuth
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GOOGLE_CALLBACK_URL=
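
A missing variable is easier to diagnose at startup than at the first failing request. The following is a minimal sketch of such a check, not the repository's actual code: the helper names (findMissingEnvVars, assertEnv) are hypothetical, and the choice of which variables count as required is an assumption.

```typescript
// Hypothetical startup check; the repository may validate configuration differently.
// Which variables are strictly required is an assumption made for this sketch.
const REQUIRED_ENV_VARS: readonly string[] = [
  "PORT",
  "MONGODB_URI",
  "JWT_SECRET_KEY",
  "SLACK_CLIENT_ID",
  "SLACK_CLIENT_SECRET",
  "DISCORD_CLIENT_ID",
  "DISCORD_CLIENT_SECRET",
];

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(
  env: Env,
  required: readonly string[] = REQUIRED_ENV_VARS,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws one error that lists every missing variable at once.
function assertEnv(env: Env, required: readonly string[] = REQUIRED_ENV_VARS): void {
  const missing = findMissingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling assertEnv(process.env) once, before the server starts listening, would surface an incomplete .env file immediately.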

Running Locally

npm run dev

Docker

docker-compose up --build

API Endpoints

Discord

  • POST /api/v1/discord/auth-url — Get Discord OAuth URL
  • POST /api/v1/discord/callback — Handle Discord OAuth callback
  • POST /api/v1/discord/connect-server — Connect Discord server
  • GET /api/v1/discord/synced-messages — Get synced Discord messages
  • POST /api/v1/discord/disconnect — Disconnect Discord server

Slack

  • POST /api/v1/slack/auth-url — Get Slack OAuth URL
  • POST /api/v1/slack/callback — Handle Slack OAuth callback
  • POST /api/v1/slack/connect-workspace — Connect Slack workspace
  • GET /api/v1/slack/synced-messages — Get synced Slack messages
  • POST /api/v1/slack/disconnect — Disconnect Slack workspace

Collections

  • GET /api/v1/collections — List user collections
  • POST /api/v1/collections — Create a new collection
  • PUT /api/v1/collections/:id — Update a collection
  • DELETE /api/v1/collections/:id — Delete a collection

Search & AI

  • POST /api/v1/search — Advanced search with filters, semantic relevance, and highlighting
  • GET /api/v1/suggest — Autocomplete and phrase suggestions
  • GET /api/v1/messages/:id/similar — Find similar messages ("more like this")

Message Sync & Collections

  • After OAuth, messages are auto-synced and added to a default collection for each platform.
  • Collections can be managed via API (CRUD).
  • Messages are tagged, searchable, and paginated.
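
Pagination of this kind is commonly implemented by translating a page number and page size into skip and limit values for the database query. The sketch below shows one way to do that safely; the parameter names (page, limit), the defaults, and the maximum page size are assumptions rather than the API's documented behavior.

```typescript
interface PageParams {
  page: number;
  limit: number;
}

interface SkipLimit {
  skip: number;
  limit: number;
}

// Converts 1-based page parameters into skip/limit values,
// clamping out-of-range or non-numeric input to sane defaults.
function toSkipLimit({ page, limit }: PageParams, maxLimit = 100): SkipLimit {
  const rawLimit = Number.isFinite(limit) ? Math.floor(limit) : 20;
  const rawPage = Number.isFinite(page) ? Math.floor(page) : 1;
  const safeLimit = Math.min(Math.max(1, rawLimit), maxLimit);
  const safePage = Math.max(1, rawPage);
  return { skip: (safePage - 1) * safeLimit, limit: safeLimit };
}
```

The resulting values map directly onto the skip and limit options of a Mongoose query.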

NLP, AI & Elasticsearch

  • NLP and AI modules extract summaries, categories, sentiment, keywords, and action items from messages.
  • Elasticsearch powers fast, faceted, and semantic search, autocomplete, and recommendations.
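
"More like this" recommendations map naturally onto Elasticsearch's more_like_this query. The sketch below builds such a request body; it is an illustration under assumptions, not the project's actual query. The content field name and the tuning values are hypothetical, and the index name is supplied by the caller.

```typescript
interface MoreLikeThisQuery {
  query: {
    more_like_this: {
      fields: string[];
      like: Array<{ _index: string; _id: string }>;
      min_term_freq: number;
      max_query_terms: number;
    };
  };
  size: number;
}

// Builds a search body that finds documents similar to an already indexed message.
function buildMoreLikeThisQuery(
  index: string,
  messageId: string,
  size = 5,
): MoreLikeThisQuery {
  return {
    query: {
      more_like_this: {
        fields: ["content"], // assumed name of the message text field
        like: [{ _index: index, _id: messageId }],
        min_term_freq: 1, // chat messages are short, so a term may appear only once
        max_query_terms: 25,
      },
    },
    size,
  };
}
```

Lowering min_term_freq to 1 is a deliberate choice for short texts; the Elasticsearch default of 2 tends to discard most terms in a one-line message.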

Logging & Error Handling

  • Uses Winston logger for info, warn, and error logs.
  • Handles permission errors (Discord 50001, Slack not_in_channel) gracefully.
  • Skips inaccessible channels and continues syncing.
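
The skip-and-continue behavior described above can be sketched as a loop that treats permission errors as non-fatal. This is a simplified illustration: the error shape, the SyncResult type, and the fetchMessages callback are assumptions, and the real services may detect these errors differently.

```typescript
// Assumed error shape: Discord errors carry a numeric code (50001 = Missing Access),
// Slack errors carry a string code such as "not_in_channel".
interface PlatformError {
  code?: number | string;
}

function isPermissionError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const { code } = err as PlatformError;
  return code === 50001 || code === "not_in_channel";
}

interface SyncResult {
  synced: string[];
  skipped: string[];
}

// Syncs channels one by one, skipping those the bot cannot read.
async function syncChannels(
  channelIds: string[],
  fetchMessages: (channelId: string) => Promise<unknown[]>,
): Promise<SyncResult> {
  const result: SyncResult = { synced: [], skipped: [] };
  for (const id of channelIds) {
    try {
      await fetchMessages(id);
      result.synced.push(id);
    } catch (err) {
      // Unexpected failures still surface; only permission errors are skipped.
      if (!isPermissionError(err)) throw err;
      result.skipped.push(id);
    }
  }
  return result;
}
```

Returning the skipped channel IDs lets the caller log them or report them to the user instead of failing the whole sync.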

Architecture

  • Modular services for Discord, Slack, NLP/AI, Elasticsearch, and collections
  • Extensible for new platforms and analytics features

Contributing

  1. Fork the repo
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Commit your changes (git commit -am 'Add some fooBar')
  4. Push to the branch (git push origin feature/fooBar)
  5. Create a new Pull Request

License

MIT

Maintainers

Troubleshooting

  • If you see "Missing Access" errors, ensure the bot/user has the correct permissions in Discord/Slack.
  • Only members with the "Manage Server" permission (typically server admins) can add bots to a Discord server.
  • For a full sync, an admin must grant the bot the "Read Message History" and "View Channels" permissions.

Contact

For support, open an issue or contact the maintainers.
