@agent-infra/browser-search

A tiny stealth-mode web search and content extraction library built on top of Puppeteer, inspired by EGOIST's local-web-search.

Features

🔍 Multi-Engine Search - Support for Google, Bing, and Baidu search engines
📑 Content Extraction - Extract readable content from web pages using Readability
🚀 Concurrent Processing - Built-in queue system for efficient parallel processing
🛡️ Stealth Mode - Advanced browser fingerprint spoofing
📝 Markdown Output - Automatically converts HTML content to clean Markdown
⚡ Performance Optimized - Smart request interception for faster page loads

Installation

npm install @agent-infra/browser-search
# or
yarn add @agent-infra/browser-search
# or
pnpm add @agent-infra/browser-search

Quick Start

import { BrowserSearch } from '@agent-infra/browser-search';
import { ConsoleLogger } from '@agent-infra/logger';

// Create a logger (optional)
const logger = new ConsoleLogger('[BrowserSearch]');

// Initialize the search client
const browserSearch = new BrowserSearch({
  logger,
  browserOptions: {
    headless: true,
  },
});

// Perform a search
const results = await browserSearch.perform({
  query: 'climate change solutions',
  count: 5,
});

console.log(`Found ${results.length} results`);
results.forEach((result) => {
  console.log(`Title: ${result.title}`);
  console.log(`URL: ${result.url}`);
  console.log(`Content preview: ${result.content.substring(0, 150)}...`);
});

API Reference

BrowserSearch

Constructor

constructor(config?: BrowserSearchConfig)

Configuration options:

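interface BrowserSearchConfig {
  logger?: Logger; // Custom logger
  browser?: BrowserInterface; // Custom browser instance
  // Note: the Quick Start example also passes `browserOptions` to the
  // constructor; its fields are listed under perform(options).
}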

perform(options)

Performs a search and extracts content from result pages.

async perform(options: BrowserSearchOptions): Promise<SearchResult[]>

Search options:

interface BrowserSearchOptions {
  query: string | string[]; // Search query or array of queries
  count?: number; // Maximum results to fetch
  concurrency?: number; // Concurrent requests (default: 15)
  excludeDomains?: string[]; // Domains to exclude
  truncate?: number; // Truncate content length
  browserOptions?: {
    headless?: boolean; // Run in headless mode
    proxy?: string; // Proxy server
    executablePath?: string; // Custom browser path
    profilePath?: string; // Browser profile path
  };
}
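
The concurrency option caps how many result pages are processed at once. The library's internal queue is not shown in this README, so the following is only a minimal sketch of the general technique, under the assumption that a fixed number of workers pull items from a shared list. The helper name mapWithConcurrency is hypothetical and is not part of the package.

```typescript
// Hypothetical helper (not part of @agent-infra/browser-search):
// run `worker` over `items` with at most `limit` tasks in flight.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function runner(): Promise<void> {
    while (next < items.length) {
      // Claiming the index is synchronous, so two runners never share an item.
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Never start more runners than there are items, and always at least one.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // same order as `items`
}
```

Results come back in input order regardless of which task finishes first, which keeps downstream ranking stable.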

Response Type

interface SearchResult {
  title: string; // Page title
  url: string; // Page URL
  content: string; // Extracted content in Markdown format
}
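
As an illustration of consuming this shape, the sketch below turns a list of results into a short numbered digest. The formatResults helper and its previewLength parameter are hypothetical names introduced here; the SearchResult interface is repeated only so the snippet stands alone.

```typescript
// Mirrors the SearchResult shape documented above.
interface SearchResult {
  title: string;
  url: string;
  content: string;
}

// Hypothetical helper: build a numbered plain-text digest of results.
function formatResults(results: SearchResult[], previewLength = 150): string {
  return results
    .map((result, i) => {
      // Only append "..." when the content was actually cut.
      const preview =
        result.content.length > previewLength
          ? `${result.content.slice(0, previewLength)}...`
          : result.content;
      return `${i + 1}. ${result.title}\n   ${result.url}\n   ${preview}`;
    })
    .join('\n\n');
}
```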

Advanced Usage

Multiple Queries

const results = await browserSearch.perform({
  query: ['renewable energy', 'solar power technology'],
  count: 10, // Will fetch approximately 5 results per query
  concurrency: 5,
});
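
Related queries can surface the same page more than once. This README does not say whether the library removes such duplicates, so the following is a caller-side sketch that assumes it might not. The names dedupeByUrl and normalizeUrl are hypothetical helpers, not package exports.

```typescript
// Mirrors the SearchResult shape documented above.
interface SearchResult {
  title: string;
  url: string;
  content: string;
}

// Hypothetical helper: drop the fragment and a trailing slash so that
// trivially different URLs for the same page compare equal.
function normalizeUrl(raw: string): string {
  try {
    const parsed = new URL(raw);
    parsed.hash = '';
    return parsed.toString().replace(/\/$/, '');
  } catch {
    return raw; // not a parseable URL: compare it verbatim
  }
}

// Hypothetical helper: keep the first result seen for each normalized URL.
function dedupeByUrl(results: SearchResult[]): SearchResult[] {
  const seen = new Set<string>();
  const unique: SearchResult[] = [];
  for (const result of results) {
    const key = normalizeUrl(result.url);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(result);
    }
  }
  return unique;
}
```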

Domain Exclusion

const results = await browserSearch.perform({
  query: 'artificial intelligence',
  excludeDomains: ['reddit.com', 'twitter.com', 'youtube.com'],
  count: 10,
});
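
How the library matches a URL against excludeDomains is not specified here. One common approach, sketched below as an assumption, is to compare the hostname against each entry and also treat subdomains as matches. The isExcluded function is a hypothetical name used only for illustration.

```typescript
// Hypothetical helper: true when `url` belongs to one of `excludeDomains`
// or to any of their subdomains (e.g. old.reddit.com matches reddit.com).
function isExcluded(url: string, excludeDomains: string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are not treated as excluded
  }
  return excludeDomains.some((entry) => {
    const domain = entry.toLowerCase();
    // The leading dot prevents "notreddit.com" from matching "reddit.com".
    return hostname === domain || hostname.endsWith(`.${domain}`);
  });
}
```

Matching on the parsed hostname rather than on a substring of the whole URL avoids false positives from paths or query strings that merely mention an excluded domain.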

Content Truncation

const results = await browserSearch.perform({
  query: 'machine learning',
  count: 5,
  truncate: 1000, // Limit content to 1000 characters
});

Using with Custom Browser Instance

import { ChromeBrowser } from '@agent-infra/browser';
import { BrowserSearch } from '@agent-infra/browser-search';

const browser = new ChromeBrowser({
  // Custom browser configuration
});

const browserSearch = new BrowserSearch({
  browser,
});

const results = await browserSearch.perform({
  query: 'typescript best practices',
});

Examples

See examples.

Credits

Thanks to:

EGOIST, for creating ChatWise, a great AI chatbot product from which we drew much of our inspiration for local-browser-based search.
The Puppeteer project, which helps us operate the browser more effectively.

License

Licensed under the Apache License, Version 2.0.
