
Browser Recall

Browser Recall is a browser history and bookmark management system that captures, processes, and stores web page content in a searchable format. It consists of a browser extension and a FastAPI backend server that work together to let you search the content of your browsing history and bookmarks.

Features

  • 🔍 Full-text search across browsing history and bookmarks
  • 📝 Automatic conversion of web pages to markdown format
  • 🔄 Real-time page content capture via WebSocket
  • Optimized SQLite database with FTS5 search
  • 🛡️ Configurable domain exclusions
  • 📊 Efficient content processing and storage

System Architecture

Backend Components

  • FastAPI Server: Main application server handling WebSocket connections and HTTP endpoints
  • SQLite Database: Stores history and bookmarks with full-text search capabilities
  • Page Reader: Converts HTML content to markdown format
  • History Scheduler: Background task for updating browser history
  • Configuration System: Manages domain exclusions and reader settings
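
To make the Page Reader's role concrete, here is a minimal sketch of HTML-to-markdown conversion using only the Python standard library. It is not the project's actual reader: the class name `MarkdownSketch` and the function `html_to_markdown` are hypothetical, and only headings, paragraphs, and links are handled.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical, tiny HTML-to-markdown converter (headings, paragraphs, links)."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.parts = []   # markdown fragments in document order
        self.href = None  # target of the <a> tag currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append(self.HEADINGS[tag])
        elif tag == "a":
            # A missing or valueless href becomes an empty link target.
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.HEADINGS or tag == "p":
            self.parts.append("\n\n")  # blank line after each block element
        elif tag == "a" and self.href is not None:
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html: str) -> str:
    """Convert a small HTML fragment to markdown text."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


print(html_to_markdown('<h1>Title</h1><p>See <a href="https://example.com">docs</a>.</p>'))
```

A real reader would also need to handle lists, code blocks, images, and script/style removal, which this sketch deliberately leaves out.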

Browser Extension

  • Content Script: Captures page content and sends to backend
  • Background Script: Manages WebSocket connection and message handling
  • Manifest: Extension configuration and permissions

Setup

Prerequisites

  • Python 3.8+
  • Firefox Browser (for the extension)
  • SQLite3

Installation

  1. Clone the repository:
git clone <repository-url>
cd browser-recall
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Install the browser extension:
    • Open Firefox
    • Navigate to about:debugging
    • Click "This Firefox"
    • Click "Load Temporary Add-on"
    • Select the manifest.json file from the extension directory

Configuration

  1. Configure domain exclusions in app/config.yaml:
ignored_domains:
  - "localhost"
  - "127.0.0.1"
  - "*.local"
  # Add more patterns as needed
  2. Configure the server port in main.py (default: 8523)
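
As an illustration of how patterns such as "*.local" in ignored_domains could be applied, the sketch below matches a URL's hostname against shell-style wildcards. This assumes glob semantics; the project's actual matching rules may differ, and the helper name `is_ignored` is hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Patterns copied from the example config above.
IGNORED_DOMAINS = ["localhost", "127.0.0.1", "*.local"]


def is_ignored(url: str, patterns=IGNORED_DOMAINS) -> bool:
    """Return True if the URL's hostname matches any ignore pattern (assumed glob semantics)."""
    # urlparse lowercases the hostname; it is None for URLs without a host.
    host = urlparse(url).hostname or ""
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


print(is_ignored("http://localhost:8523/search"))   # True
print(is_ignored("https://printer.local/status"))   # True
print(is_ignored("https://example.com/"))           # False
```

Note that under these semantics "*.local" matches "printer.local" but not the bare host "local".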

Usage

  1. Start the server:
python main.py
  2. The extension will automatically:

    • Capture page content as you browse
    • Send content to the backend server
    • Update history and bookmarks
  3. Access the web interface:

    • Home page: http://localhost:8523/
    • Search interface: http://localhost:8523/search
    • Bookmarks page: http://localhost:8523/bookmarks
  4. Access the API endpoints:

    • Search history: GET /history/search
    • Search bookmarks: GET /bookmarks/search
    • Advanced search: GET /history/search/advanced
    • Manage ignored domains: GET/POST/DELETE /config/ignored-domains

Web Interface

Browser Recall includes a basic web interface for viewing and searching your browsing history and bookmarks:

  • Home Page: Displays recent browsing history
  • Search Page: Provides a form interface for searching history with filters
  • Bookmarks Page: Shows your browser bookmarks

The interface is built with:

  • Tailwind CSS for styling
  • Responsive design for mobile and desktop
  • Dark mode for comfortable viewing

API Documentation

The API documentation is available through FastAPI's interactive interface at http://localhost:8523/docs. This provides a complete API reference with:

  • Interactive endpoint testing
  • Request/response examples
  • Schema documentation

History Endpoints

  • GET /history/search

    • Query parameters:
      • domain: Filter by domain
      • start_date: Filter by start date
      • end_date: Filter by end date
      • search_term: Full-text search
      • include_content: Include markdown content
  • GET /history/search/advanced

    • Advanced full-text search using SQLite FTS5 syntax
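
For readers unfamiliar with FTS5 query syntax, the following self-contained sketch runs a few typical queries against an in-memory table. The table name `pages` and its columns are hypothetical and do not reflect the project's real schema; the example also assumes your Python's sqlite3 module was built with FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(title, content)")
conn.executemany(
    "INSERT INTO pages (title, content) VALUES (?, ?)",
    [
        ("FastAPI WebSockets", "Handling websocket connections in FastAPI"),
        ("SQLite FTS5", "Full-text search with the fts5 extension"),
        ("Tailwind CSS", "Utility classes for styling a web interface"),
    ],
)


def search(query: str) -> list:
    """Return titles of rows matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT title FROM pages WHERE pages MATCH ? ORDER BY rank", (query,)
    )
    return [title for (title,) in rows]


print(search("websocket*"))              # prefix query
print(search("search AND fts5"))         # boolean AND
print(search("title:sqlite"))            # restrict the match to one column
print(search("fastapi NOT websocket*"))  # exclusion
```

Strings like these are the kind of input an FTS5-backed search accepts; how /history/search/advanced names its query parameter is best checked in the interactive docs.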

Bookmark Endpoints

  • GET /bookmarks/search
    • Query parameters:
      • domain: Filter by domain
      • folder: Filter by folder
      • search_term: Search in titles

Configuration Endpoints

  • GET /config/ignored-domains: List ignored domains
  • POST /config/ignored-domains: Add domain pattern
  • DELETE /config/ignored-domains/{pattern}: Remove domain pattern
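
Because patterns such as "*.local" contain characters that are better percent-encoded in a URL path, a client might build the list and delete requests as sketched below. The helper names are hypothetical, and the POST request is omitted because its payload shape is not documented in this README (see the interactive docs at /docs).

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8523"  # default port from the Configuration section


def list_patterns_request() -> Request:
    """Build the GET request that lists ignored domains."""
    return Request(f"{BASE_URL}/config/ignored-domains", method="GET")


def remove_pattern_request(pattern: str) -> Request:
    """Build the DELETE request for one pattern, percent-encoding it for the path."""
    encoded = quote(pattern, safe="")  # "*.local" becomes "%2A.local"
    return Request(f"{BASE_URL}/config/ignored-domains/{encoded}", method="DELETE")


req = remove_pattern_request("*.local")
print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` sends it to a running server; nothing is sent by the code above.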

Development

  • Logs are stored in the logs directory
  • Database file: browser_history.db
  • WebSocket endpoint: ws://localhost:8523/ws