Listing top PyPI keywords | BigQuery Datasets - PyPI Docs
October 15, 2025 at 11:09
Using the Google bq CLI, the following command retrieves the top PyPI keywords from the bigquery-public-data.pypi.distribution_metadata table:
bq query --use_legacy_sql=false 'SELECT keyword, COUNT(*) as keyword_count FROM `bigquery-public-data.pypi.distribution_metadata`, UNNEST(SPLIT(keywords, ", ")) as keyword GROUP BY keyword ORDER BY keyword_count DESC LIMIT 100'
Result for the top 15 keywords:
python: 128555 appearances
DuckDB Database SQL OLAP: 70739 appearances
ai: 64997 appearances
tensorflow tensor machine learning: 51144 appearances
pulumi: 50076 appearances
api: 47986 appearances
probabilities probabilistic-graphical-models inference diagnosis: 46552 appearances
rust: 45607 appearances
cli: 39512 appearances
OpenAPI: 38814 appearances
sdk: 38060 appearances
llm: 37487 appearances
OpenAPI-Generator: 36734 appearances
database: 35578 appearances
automation: 34393 appearances
Note that this is a very basic query: it does not take into account that some packages have many more versions published on PyPI than others, so the keywords of frequently released packages are counted many times.
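One possible refinement is to count distinct packages per keyword rather than rows, so that a package contributes at most once to each keyword however many versions it has published. The sketch below assumes that the table's `name` column identifies the package. The `QUERY` variable and the `top_keywords_by_package` helper are hypothetical names introduced for illustration.

```shell
# Assumption: `name` holds the package name in distribution_metadata.
# COUNT(DISTINCT name) counts each package once per keyword,
# regardless of how many versions or files it has published.
QUERY='SELECT keyword, COUNT(DISTINCT name) AS package_count
FROM `bigquery-public-data.pypi.distribution_metadata`,
  UNNEST(SPLIT(keywords, ", ")) AS keyword
GROUP BY keyword
ORDER BY package_count DESC
LIMIT 100'

# Hypothetical helper: runs the query with the same bq options as the basic command.
top_keywords_by_package() {
  bq query --use_legacy_sql=false "$QUERY"
}
```

Calling `top_keywords_by_package` runs the query. Keywords are still split on ", " only, so space-separated keyword strings such as "DuckDB Database SQL OLAP" remain a single entry, and the counts will differ from the list above.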