@agent-infra/browser-search
TypeScript icon, indicating that this package has built-in type declarations

0.0.2 • Public • Published

@agent-infra/browser-search

npm version downloads node version license

English | 简体中文

A tiny stealth-mode web search and content extraction library built on top of Puppeteer, inspired by EGOIST's local-web-search.

Features

  • 🔍 Multi-Engine Search - Support for Google, Bing, and Baidu search engines
  • 📑 Content Extraction - Extract readable content from web pages using Readability
  • 🚀 Concurrent Processing - Built-in queue system for efficient parallel processing
  • 🛡️ Stealth Mode - Advanced browser fingerprint spoofing
  • 📝 Markdown Output - Automatically converts HTML content to clean Markdown
  • Performance Optimized - Smart request interception for faster page loads

Installation

npm install @agent-infra/browser-search
# or
yarn add @agent-infra/browser-search
# or
pnpm add @agent-infra/browser-search

Quick Start

import { BrowserSearch } from '@agent-infra/browser-search';
import { ConsoleLogger } from '@agent-infra/logger';

// Create a logger (optional)
const logger = new ConsoleLogger('[BrowserSearch]');

// Initialize the search client
const browserSearch = new BrowserSearch({
  logger,
  browserOptions: {
    headless: true,
  },
});

// Perform a search
const results = await browserSearch.perform({
  query: 'climate change solutions',
  count: 5,
});

console.log(`Found ${results.length} results`);
results.forEach((result) => {
  console.log(`Title: ${result.title}`);
  console.log(`URL: ${result.url}`);
  console.log(`Content preview: ${result.content.substring(0, 150)}...`);
});

API Reference

BrowserSearch

Constructor

constructor(config?: BrowserSearchConfig)

Configuration options:

interface BrowserSearchConfig {
  logger?: Logger; // Custom logger
  browser?: BrowserInterface; // Custom browser instance
}

perform(options)

Performs a search and extracts content from result pages.

async perform(options: BrowserSearchOptions): Promise<SearchResult[]>

Search options:

interface BrowserSearchOptions {
  query: string | string[]; // Search query or array of queries
  count?: number; // Maximum results to fetch
  concurrency?: number; // Concurrent requests (default: 15)
  excludeDomains?: string[]; // Domains to exclude
  truncate?: number; // Truncate content length
  browserOptions?: {
    headless?: boolean; // Run in headless mode
    proxy?: string; // Proxy server
    executablePath?: string; // Custom browser path
    profilePath?: string; // Browser profile path
  };
}

Response Type

interface SearchResult {
  title: string; // Page title
  url: string; // Page URL
  content: string; // Extracted content in Markdown format
}

Advanced Usage

Multiple Queries

const results = await browserSearch.perform({
  query: ['renewable energy', 'solar power technology'],
  count: 10, // Will fetch approximately 5 results per query
  concurrency: 5,
});

Domain Exclusion

const results = await browserSearch.perform({
  query: 'artificial intelligence',
  excludeDomains: ['reddit.com', 'twitter.com', 'youtube.com'],
  count: 10,
});

Content Truncation

const results = await browserSearch.perform({
  query: 'machine learning',
  count: 5,
  truncate: 1000, // Limit content to 1000 characters
});

Using with Custom Browser Instance

import { ChromeBrowser } from '@agent-infra/browser';
import { BrowserSearch } from '@agent-infra/browser-search';

const browser = new ChromeBrowser({
  // Custom browser configuration
});

const browserSearch = new BrowserSearch({
  browser,
});

const results = await browserSearch.perform({
  query: 'typescript best practices',
});

Examples

See examples.

Credits

Thanks to:

  • EGOIST for creating a great AI chatbot product ChatWise from which we draw a lot of inspiration for local-browser based search.
  • The puppeteer project which helps us operate the browser better.

License

Copyright (c) 2025 ByteDance, Inc. and its affiliates.

Licensed under the Apache License, Version 2.0.

Readme

Keywords

none

Package Sidebar

Install

npm i @agent-infra/browser-search

Weekly Downloads

73

Version

0.0.2

License

none

Unpacked Size

198 kB

Total Files

63

Last publish

Collaborators

  • ulivz
  • ycjcl868