// For flags

CVE-2024-0243

Server-side Request Forgery In Recursive URL Loader

Severity Score

3.7
*CVSS v3

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

0
*Multiple Sources

Exploited in Wild

-
*KEV

Decision

Track*
*SSVC
Descriptions

With the following crawler configuration: ```python
from bs4 import BeautifulSoup as Soup url = "https://example.com"
loader = RecursiveUrlLoader( url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()
``` An attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like "https://example.completely.different/my_file.html" and the crawler would proceed to download that file as well even though `prevent_outside=True`. https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51 Resolved in https://github.com/langchain-ai/langchain/pull/15559

Con la siguiente configuración del rastreador: ```python de bs4 import BeautifulSoup as Soup url = "https://example.com" loader = RecursiveUrlLoader( url=url, max_ Depth=2, extractor=lambda x: Soup(x, "html .parser").text ) docs = loader.load() ``` Un atacante que controle el contenido de `https://example.com` podría colocar un archivo HTML malicioso allí con enlaces como "https:/example.completely.different/my_file.html" y el rastreador procedería a descargar ese archivo también aunque `prevent_outside=True`. https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51 Resuelto en https://github.com/langchain-ai/langchain/pull /15559

With the following crawler configuration: ```python from bs4 import BeautifulSoup as Soup url = "https://example.com" loader = RecursiveUrlLoader( url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text ) docs = loader.load() ``` An attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like "https://example.completely.different/my_file.html" and the crawler would proceed to download that file as well even though `prevent_outside=True`. https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51 Resolved in https://github.com/langchain-ai/langchain/pull/15559

*Credits: N/A
CVSS Scores
Attack Vector
Local
Attack Complexity
High
Privileges Required
High
User Interaction
Required
Scope
Changed
Confidentiality
Low
Integrity
Low
Availability
None
Attack Vector
Network
Attack Complexity
Medium
Authentication
None
Confidentiality
None
Integrity
Partial
Availability
None
* Common Vulnerability Scoring System
SSVC
  • Decision:Track*
Exploitation
Poc
Automatable
No
Tech. Impact
Partial
* Organization's Worst-case Scenario
Timeline
  • 2024-01-04 CVE Reserved
  • 2024-02-24 CVE Published
  • 2024-03-14 EPSS Updated
  • 2024-08-01 CVE Updated
  • ---------- Exploited in Wild
  • ---------- KEV Due Date
  • ---------- First Exploit
CWE
  • CWE-918: Server-Side Request Forgery (SSRF)
CAPEC
Affected Vendors, Products, and Versions
Vendor Product Version Other Status
Vendor Product Version Other Status <-- --> Vendor Product Version Other Status
Langchain-ai
Search vendor "Langchain-ai"
Langchain-ai/langchain
Search vendor "Langchain-ai" for product "Langchain-ai/langchain"
*-
Affected