Architecture Overview

The Company Research Agent is built with CrewAI, a framework for orchestrating multiple AI agents that collaborate on complex tasks. The architecture combines specialized agents, custom tools, and the Apify platform to deliver comprehensive company research.

High-Level Architecture

  • CrewAI Framework: Coordinates multiple AI agents working together to research and analyze company data
  • Apify Integration: Provides web scraping and data collection capabilities through specialized actors
  • Custom Tools: Purpose-built tools for interacting with various data sources
  • LLM Integration: Uses Google's Gemini model for analysis and report generation

Core Components

1. CrewAI Crew

The CompanyResearchCrew class wires the specialized agents and their tasks into a crew that runs them sequentially:

from crewai import Crew, Process
from crewai.project import CrewBase, crew

@CrewBase
class CompanyResearchCrew:
    @crew
    def crew(self) -> Crew:
        # Agents and tasks are assembled into a crew that runs them in order
        return Crew(
            agents=[
                self.researcher(),
                self.data_analyst(),
            ],
            tasks=[
                self.research_company(),
                self.analyze_data(),
            ],
            process=Process.sequential
        )

2. Specialized Agents

Each agent has a specific role in the research process (a sketch of a typical agent definition follows the list):

  • Researcher: Gathers raw company information
  • Data Analyst: Processes and analyzes the collected data
  • Content Compiler: Formats the findings into structured reports
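
The agents themselves are declared as methods on the crew class. The following is a minimal sketch of what such a declaration could look like, assuming CrewAI's Agent class and the @agent decorator from crewai.project; the role, goal, and backstory strings and the research_tools attribute are placeholders rather than the project's actual prompts or names.

@agent
def researcher(self) -> Agent:
    # Placeholder prompts; the real project defines its own role, goal, and backstory
    return Agent(
        role="Company Researcher",
        goal="Gather raw information about the target company",
        backstory="An analyst who digs through public web sources",
        tools=self.research_tools,  # hypothetical attribute holding the tool list shown below
        llm=self.llm,
        verbose=True
    )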

3. Custom Tools

The system includes specialized tools for different data sources:

# Data-source tools; each receives the same Apify actor reference when constructed
tools=[
    CompanyNewsSearchTool(actor=self.actor),
    ProfessionalProfilesTool(actor=self.actor),
    LinkedInScraperTool(actor=self.actor),
    CrunchbaseScraperTool(actor=self.actor),
    PitchBookScraperTool(actor=self.actor),
    GoogleSearchTool(actor=self.actor)
]
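
For illustration, one of these tools could be implemented roughly as follows. This is a sketch under stated assumptions: it uses CrewAI's BaseTool interface and calls the apify-client package directly instead of the actor handle the real tools receive, and the actor ID, input fields, and token variable are illustrative rather than the project's actual configuration.

import os

from apify_client import ApifyClient
from crewai.tools import BaseTool

class GoogleSearchTool(BaseTool):
    name: str = "google_search"
    description: str = "Search Google for pages that mention the target company."

    def _run(self, query: str) -> str:
        # Run a search actor and return its dataset items as plain text
        client = ApifyClient(os.getenv("APIFY_TOKEN"))
        run = client.actor("apify/google-search-scraper").call(
            run_input={"queries": query, "maxPagesPerQuery": 1}
        )
        items = client.dataset(run["defaultDatasetId"]).iterate_items()
        return "\n".join(str(item) for item in items)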

4. Task Flow

The research process follows a sequential workflow (a sketch of how one step can be declared as a task follows the list):

  1. Company research using domain name
  2. Data analysis and insight extraction
  3. Report compilation and formatting
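
Each step maps to a CrewAI task. As a minimal sketch, assuming the @task decorator from crewai.project, the first step could be declared like this; the description and expected_output strings are placeholders, and the {domain} input key is an assumption rather than a verified name from the project.

@task
def research_company(self) -> Task:
    # Placeholder wording; the real project defines its own task prompts
    return Task(
        description="Research the company behind the domain {domain}.",
        expected_output="Raw notes covering news, funding, and key people.",
        agent=self.researcher()
    )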

Integration Points

Apify Platform

  • Actors: Uses specialized Apify actors for web scraping
  • Dataset: Stores and manages research results
  • Key-Value Store: Handles intermediate data storage
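
As a rough sketch of these integration points, and assuming the Python apify SDK, results could be persisted from inside the Actor like this; the key name and payload shapes are illustrative.

from apify import Actor

async def save_results(report: dict, raw_notes: str) -> None:
    # Push the structured report to the run's default dataset
    await Actor.push_data(report)
    # Keep intermediate material in the key-value store
    await Actor.set_value("RAW_NOTES", raw_notes)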

LLM Integration

import os

from crewai import LLM

# Gemini is configured once through CrewAI's LLM wrapper
self.llm = LLM(
    model="gemini/gemini-2.0-flash-lite",
    temperature=0.7,
    api_key=os.getenv("GOOGLE_API_KEY")
)
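
Because the API key is read from the environment, GOOGLE_API_KEY must be set before the crew starts; the same LLM instance is typically shared by every agent in the crew.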

Data Flow

  1. Input: Company domain name
  2. Research Phase: Multiple tools gather data in parallel
  3. Analysis Phase: Data is processed and analyzed
  4. Output: Structured report in JSON format
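
End to end, a run could be triggered roughly as follows; the "domain" input key and the way the result is read are assumptions for illustration, not verified project code.

# Illustrative invocation of the crew defined above
crew = CompanyResearchCrew().crew()
result = crew.kickoff(inputs={"domain": "example.com"})
print(result.raw)  # the compiled report produced by the final task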

Error Handling

Error handling is layered across the stack (a tool-level sketch follows the list):

  • Tool-level error handling for each data source
  • Agent-level task retry mechanisms
  • Crew-level process management
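
As one illustration of the tool-level layer, a tool's _run method can catch a failed scrape and hand the agent a readable message instead of raising, so one unavailable source does not abort the whole task. The _search helper here is hypothetical.

def _run(self, query: str) -> str:
    try:
        # Hypothetical helper that performs the actual actor call
        return self._search(query)
    except Exception as exc:
        # Degrade gracefully: the agent sees a note rather than a crashed task
        return f"google_search failed for '{query}': {exc}"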

Performance Optimization

  • Concurrent data collection where possible
  • Rate limiting controls
  • Memory management for large datasets
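
As a generic sketch of the first point (not the project's actual scheduling code), independent data-source calls can be dispatched through a thread pool:

from concurrent.futures import ThreadPoolExecutor

def collect_sources(fetchers, domain):
    # Run independent fetchers side by side; each fetcher is a callable taking the domain
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fetch, domain) for fetch in fetchers]
        return [future.result() for future in futures]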

The architecture is designed to be modular and extensible. New agents, tools, or data sources can be added without major modifications to the core system.