Listing top PyPI keywords | BigQuery Datasets - PyPI Docs

15 October 2025 at 11:09

With Google's bq CLI, the following command lists the most frequent PyPI keywords in the bigquery-public-data.pypi.distribution_metadata table:

bq query --use_legacy_sql=false 'SELECT keyword, COUNT(*) as keyword_count FROM `bigquery-public-data.pypi.distribution_metadata`, UNNEST(SPLIT(keywords, ", ")) as keyword GROUP BY keyword ORDER BY keyword_count DESC LIMIT 100'

Results for the top 15 keywords:

  • python : 128555 appearances
  • DuckDB Database SQL OLAP : 70739 appearances
  • ai : 64997 appearances
  • tensorflow tensor machine learning : 51144 appearances
  • pulumi : 50076 appearances
  • api : 47986 appearances
  • probabilities probabilistic-graphical-models inference diagnosis : 46552 appearances
  • rust : 45607 appearances
  • cli : 39512 appearances
  • OpenAPI : 38814 appearances
  • sdk : 38060 appearances
  • llm : 37487 appearances
  • OpenAPI-Generator : 36734 appearances
  • database : 35578 appearances
  • automation : 34393 appearances

Note that this is a very basic query: it does not take into account that some packages have many more versions published on PyPI than others, so keywords from frequently released packages are counted many times.
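
One possible way to reduce the weight of packages with many published versions is to count each package only once per keyword. The variant below is a minimal sketch, not a tested result: it assumes the table stores the package name in a column called name (this column does not appear in the query above), and the package_count alias is chosen here for illustration. It keeps the same split on ", " as the original command.

```shell
# Sketch: count distinct packages per keyword instead of counting rows.
# Assumption: the package name is stored in a column called `name`.
QUERY='SELECT keyword, COUNT(DISTINCT name) AS package_count
FROM `bigquery-public-data.pypi.distribution_metadata`,
UNNEST(SPLIT(keywords, ", ")) AS keyword
GROUP BY keyword
ORDER BY package_count DESC
LIMIT 100'

# Run the query only when the bq CLI is available on this machine.
if command -v bq >/dev/null 2>&1; then
  bq query --use_legacy_sql=false "$QUERY"
fi
```

With this variant, a package that publishes hundreds of versions with the same keyword contributes one count for that keyword, so the ranking should reflect how many packages use a keyword instead of how many uploads carry it.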

