Common Crawl Foundation

Team
non-profit
Verified
https://commoncrawl.org
commoncrawl

AI & ML interests

Crawled data and metadata

Recent Activity

lfoppiano published a dataset 6 days ago: commoncrawl/web-graph-testing-v1
lfoppiano updated a dataset 7 days ago: commoncrawl/web-graph-testing-v1
malteos updated a bucket 8 days ago: commoncrawl/commoncrawl

Team members: Thom Vaughan, Pedro Ortiz Suarez, Paul Lazar, Greg Lindahl, Ford H, Jen English, Sebastian Nagel, Laurie Burchell, Hande Celikkanat, malteos, Thijs Dalhuijsen, Luca, Catherine Arnett, Michael Paris

commoncrawl's datasets (9)

commoncrawl/web-graph-testing-v1

Updated 7 days ago • 12

commoncrawl/statistics

Viewer • Updated 11 days ago • 631k • 451 • 27

commoncrawl/commonlid-results

Preview • Updated 29 days ago • 982 • 1

commoncrawl/citations

Viewer • Updated Apr 2 • 9.18k • 126 • 3

commoncrawl/CommonLID

Viewer • Updated Feb 10 • 373k • 203 • 53

commoncrawl/gneissweb-annotation-host-testing-v1

Viewer • Updated Dec 11, 2025 • 617M • 57

commoncrawl/gneissweb-annotation-url-testing-v1

Viewer • Updated Dec 10, 2025 • 11.5B • 6.04k

commoncrawl/host-index-testing-v2

Preview • Updated Nov 10, 2025 • 35.7k

commoncrawl/eot2024_hostlevel_logs

Viewer • Updated Oct 9, 2024 • 271k • 7 • 1