[Bug]: Deep crawl fails in the Docker environment because Logger objects are converted into dictionaries. #1357

@takeaship

Description

crawl4ai version

0.7.0+

Expected Behavior

Logger objects should remain as logging.Logger instances throughout the application lifecycle, allowing normal logging operations like logger.info(), logger.error(), etc. to function properly in all environments including Docker containers.

Current Behavior

When running crawl4ai in Docker environments, logger objects are being serialized and deserialized incorrectly, causing them to become dictionary objects instead of proper logging.Logger instances. This results in AttributeError exceptions when attempting to call logging methods.
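The failure mode can be demonstrated in isolation. This is a minimal sketch (not crawl4ai's actual serializer; `to_serializable` is a hypothetical stand-in for the Docker API's object serialization) showing how reducing a `logging.Logger` to a dict breaks subsequent logging calls:

```python
import logging

def to_serializable(obj):
    # Hypothetical serializer mimicking the observed behavior:
    # a Logger is reduced to a {"type", "params"} dict and never
    # rebuilt into a logging.Logger on the other side.
    if isinstance(obj, logging.Logger):
        return {"type": "Logger", "params": {"name": obj.name}}
    return obj

logger = logging.getLogger("crawl4ai.deep_crawling.bfs_strategy")
payload = to_serializable(logger)  # now a plain dict

try:
    payload.info("this fails")  # dicts have no .info method
except AttributeError as exc:
    print(exc)  # 'dict' object has no attribute 'info'
```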

Is this reproducible?

Yes

Inputs Causing the Bug

- Environment: Docker container
- Components affected: Deep crawling strategies (BFSDeepCrawlStrategy, BestFirstCrawlingStrategy)
- Trigger: Object serialization/deserialization during Docker API operations
- Any crawling option that uses logging

Steps to Reproduce

1. Set up crawl4ai in a Docker environment
2. Initialize any deep crawling strategy (e.g., BFSDeepCrawlStrategy)
3. Execute a crawling operation via API that triggers the deep crawling strategy
4. Observe that `logger.info()` or other logging method calls fail with AttributeError
5. Inspect the logger object to find it has become a dictionary: `{'type': 'Logger', 'params': {'name': 'crawl4ai.deep_crawling.bfs_strategy'}}`
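Given the dict shape observed in step 5, one hedged workaround is to rehydrate the logger after deserialization. The helper below (`rehydrate_logger` is an illustrative name, not part of crawl4ai's API) rebuilds a real `logging.Logger` from the serialized form via `logging.getLogger`:

```python
import logging

def rehydrate_logger(obj):
    # Illustrative workaround: if the deserializer handed us the
    # {"type": "Logger", "params": {...}} dict, look the real logger
    # back up by name; pass anything else through unchanged.
    if isinstance(obj, dict) and obj.get("type") == "Logger":
        return logging.getLogger(obj["params"]["name"])
    return obj

serialized = {"type": "Logger", "params": {"name": "crawl4ai.deep_crawling.bfs_strategy"}}
logger = rehydrate_logger(serialized)
logger.info("logging works again")  # a real Logger instance once more
```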

Code snippets

import asyncio

# Imports added for a runnable reproduction; exact import paths
# may vary by crawl4ai version.
from crawl4ai import BrowserConfig, CrawlerRunConfig, LXMLWebScrapingStrategy
from crawl4ai.deep_crawling import BFSDeepCrawlStrategy
from crawl4ai.docker_client import Crawl4aiDockerClient


async def main():
    async with Crawl4aiDockerClient(
        base_url="http://crawl4ai-endpoint.com:11235", verbose=True
    ) as client:
        config = CrawlerRunConfig(
            deep_crawl_strategy=BFSDeepCrawlStrategy(
                max_depth=10,
                include_external=False,
                max_pages=50,
            ),
            scraping_strategy=LXMLWebScrapingStrategy(),
            verbose=False,
        )
        results = await client.crawl(
            ["https://example.com/"],
            browser_config=BrowserConfig(headless=True),
            crawler_config=config,
        )


if __name__ == "__main__":
    asyncio.run(main())

OS

Linux docker

Python version

3.12

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

crawl4ai-1 | Traceback (most recent call last):
crawl4ai-1 | File "/app/api.py", line 439, in handle_crawl_request
crawl4ai-1 | results = await partial_func()
crawl4ai-1 | ^^^^^^^^^^^^^^^^^^^^
crawl4ai-1 | File "/app/crawl4ai/deep_crawling/base_strategy.py", line 24, in wrapped_arun
crawl4ai-1 | result_obj = await config.deep_crawl_strategy.arun(
crawl4ai-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
crawl4ai-1 | File "/app/crawl4ai/deep_crawling/base_strategy.py", line 105, in arun
crawl4ai-1 | return await self._arun_batch(start_url, crawler, config)
crawl4ai-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
crawl4ai-1 | File "/app/crawl4ai/deep_crawling/bfs_strategy.py", line 188, in _arun_batch
crawl4ai-1 | await self.link_discovery(result, url, depth, visited, next_level, depths)
crawl4ai-1 | File "/app/crawl4ai/deep_crawling/bfs_strategy.py", line 131, in link_discovery
crawl4ai-1 | self.logger.info(f"Limiting to {remaining_capacity} URLs due to max_pages limit")
crawl4ai-1 | ^^^^^^^^^^^^^^^^
crawl4ai-1 | AttributeError: 'dict' object has no attribute 'info'
crawl4ai-1 | 2025-08-03 02:54:07,012 - api - ERROR - Error closing crawler during exception handling: 'dict' object has no attribute 'info'
