CVE-2024-0243

Server-Side Request Forgery in Recursive URL Loader

Severity Score

3.7

*CVSS v3

Decision

Track*

*SSVC

Descriptions

With the following crawler configuration:

```python
from bs4 import BeautifulSoup as Soup
from langchain_community.document_loaders import RecursiveUrlLoader

url = "https://example.com"
loader = RecursiveUrlLoader(
    url=url,
    max_depth=2,
    # Reduce each fetched HTML page to its plain text.
    extractor=lambda x: Soup(x, "html.parser").text,
)
docs = loader.load()
```

An attacker who controls the contents of `https://example.com` could place a malicious HTML file there containing links such as "https://example.completely.different/my_file.html", and the crawler would download that file as well, even though `prevent_outside=True`.

Affected code: https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51

Resolved in https://github.com/langchain-ai/langchain/pull/15559
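
The description does not say how the `prevent_outside` check decided that a link was external. One plausible reading of the example is that "https://example.completely.different" shares the literal string prefix "https://example.com", so a check based on string prefixes would accept it. The sketch below illustrates that assumption only: `naive_is_inside` and `same_host` are hypothetical helpers written for this note, not LangChain code and not the change made in the pull request.

```python
from urllib.parse import urlsplit


def naive_is_inside(base_url: str, link: str) -> bool:
    """Hypothetical prefix check: any link that starts with base_url counts as internal."""
    return link.startswith(base_url)


def same_host(base_url: str, link: str) -> bool:
    """Compare the parsed scheme and hostname rather than raw string prefixes."""
    base, target = urlsplit(base_url), urlsplit(link)
    return (base.scheme, base.hostname) == (target.scheme, target.hostname)


base = "https://example.com"
evil = "https://example.completely.different/my_file.html"

print(naive_is_inside(base, evil))  # True: the prefix check is fooled
print(same_host(base, evil))        # False: the hostnames differ
```

Comparing parsed components is stricter than comparing prefixes, but a real crawler may also need to restrict the path, handle redirects, and block internal address ranges; none of that is covered here.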

*Credits: N/A

CVSS v3

Attack Vector: Local
Attack Complexity: High
Privileges Required: High
User Interaction: Required
Scope: Changed
Confidentiality: Low
Integrity: Low
Availability: None
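
The 3.7 severity score can be reproduced from the CVSS v3 metric values above (Local, High, High, Required, Changed, Low, Low, None). The following is a minimal sketch of the CVSS v3.0 base score equations for the scope-changed case, using the numeric weights from the FIRST specification; the function and constant names are illustrative.

```python
import math

# CVSS v3.0 weights for the metric values listed in this record.
AV_LOCAL = 0.55
AC_HIGH = 0.44
PR_HIGH_SCOPE_CHANGED = 0.50  # Privileges Required: High, when Scope is Changed
UI_REQUIRED = 0.62
C_LOW = 0.22
I_LOW = 0.22
A_NONE = 0.0


def roundup(value: float) -> float:
    """Round up to one decimal place, as the v3.0 specification requires."""
    return math.ceil(value * 10) / 10


def base_score_scope_changed(av, ac, pr, ui, c, i, a) -> float:
    """Base score for a vector whose Scope metric is Changed."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(1.08 * (impact + exploitability), 10))


score = base_score_scope_changed(
    AV_LOCAL, AC_HIGH, PR_HIGH_SCOPE_CHANGED, UI_REQUIRED, C_LOW, I_LOW, A_NONE
)
print(score)  # 3.7
```

The unrounded value is roughly 3.61, which rounds up to the 3.7 shown at the top of the record.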

CVSS v2

Attack Vector: Network
Attack Complexity: Medium
Authentication: None
Confidentiality: None
Integrity: Partial
Availability: None

* Common Vulnerability Scoring System

SSVC

Decision: Track*

Exploitation

PoC

Automatable

Tech. Impact

Partial

* Organization's Worst-case Scenario

Timeline

2024-01-04 CVE Reserved
2024-02-24 CVE Published
2025-04-22 CVE Updated
2025-05-05 EPSS Updated
---------- Exploited in Wild
---------- KEV Due Date
---------- First Exploit

CWE

CWE-918: Server-Side Request Forgery (SSRF)

References (3)

URL	Tag	Source
https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22
https://github.com/langchain-ai/langchain/pull/15559
https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861

Affected Vendors, Products, and Versions

Vendor	Product	Version	Other	Status
Langchain-ai	Langchain-ai/langchain	*	-	Affected