footprint.db

How Are Data Collected?

Data in Footprint DB are collected through honeypots strategically deployed across the internet. These honeypots are designed to attract bots and automated scanners that probe web systems for exploitable vulnerabilities.

The Process

  1. Attracting Bots: Honeypots simulate common endpoints that malicious bots often target, such as vulnerable routes, sensitive files, and other entry points that appear to expose critical data.
  2. Logging Requests: Every request made to the honeypots is recorded, including details such as:
    • The requested endpoint.
    • The HTTP method used (GET, POST, etc.).
    • The origin IP address.
    • The User-Agent header sent by the bot or client.
  3. Analysis and Classification: Logs are processed by the system, which analyzes behavior patterns, request origin, and transmitted data. This process determines:
    • If the request is legitimate (e.g., from a recognized crawler).
    • Or if it is an automated attack (e.g., footprint enumeration, XSS attempts, or sensitive file exploitation).
  4. Cataloging: When a threat is identified, it is cataloged in the Footprint DB with a risk classification and detailed information about the nature of the attack. These records are then used for research and to improve security systems.
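The logging, classification, and cataloging steps above can be sketched as follows. This is a minimal illustration, not the actual Footprint DB pipeline: the field names, the crawler allow-list, the suspicious-path patterns, and the risk levels are all hypothetical placeholders chosen for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical allow-list of recognized crawlers (step 3, "legitimate").
KNOWN_CRAWLERS = ("Googlebot", "bingbot")

# Hypothetical patterns for commonly probed targets (step 3, "automated attack").
SUSPICIOUS_PATHS = (r"\.env$", r"wp-login\.php", r"\.\./")


@dataclass
class LogEntry:
    """One logged honeypot request (step 2)."""
    endpoint: str
    method: str
    ip: str
    user_agent: str


def classify(entry: LogEntry) -> str:
    """Step 3: decide whether a request looks legitimate or automated."""
    if any(c in entry.user_agent for c in KNOWN_CRAWLERS):
        return "legitimate"
    if any(re.search(p, entry.endpoint) for p in SUSPICIOUS_PATHS):
        return "automated-attack"
    return "unclassified"


def catalog(entry: LogEntry) -> dict:
    """Step 4: build a record with a coarse risk classification."""
    verdict = classify(entry)
    risk = {"automated-attack": "high",
            "unclassified": "low",
            "legitimate": "none"}[verdict]
    return {"endpoint": entry.endpoint,
            "method": entry.method,
            "ip": entry.ip,
            "user_agent": entry.user_agent,
            "verdict": verdict,
            "risk": risk}
```

For example, a probe for a sensitive file such as `/.env` from an unrecognized client would be cataloged as an automated attack with high risk, while the same path requested by a recognized crawler User-Agent would be marked legitimate. A real pipeline would of course use far richer signals (request timing, payload contents, IP reputation) than this string matching.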

Motivation

The primary goal of Footprint DB is to study, understand, and predict malicious behaviors in digital environments. Many of these attacks involve exploitation attempts, such as XSS attacks, directory traversal, and targeted probes for sensitive endpoints. By analyzing patterns in automated threats, Footprint DB aims to provide valuable insights to developers, security researchers, and organizations, helping them identify vulnerabilities and strengthen their defenses. This project is driven by a commitment to fostering a safer and more secure digital ecosystem.