Kaggle API

🍉 What is the Kaggle API?

The Kaggle API (Application Programming Interface) is an API provided by Kaggle that lets us interact with Kaggle datasets and notebooks. It is accessed through a command-line tool (CLI), which can be installed with:

pip install kaggle

🫑 Authentication

Before we can use the Kaggle API, we first need to authenticate. The authentication token, kaggle.json, can be downloaded from kaggle.com/username/account. Then save or copy kaggle.json to ~/.kaggle/kaggle.json, that is, into a .kaggle folder inside the home directory, which is where the CLI looks for it.
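
The copy step can also be scripted. The following is a minimal Python sketch, not part of the Kaggle tooling: the helper name install_kaggle_token and its config_dir parameter are hypothetical, and it assumes the token was downloaded somewhere on the local disk. It also restricts the file to owner read/write, since the token is a secret.

```python
import shutil
from pathlib import Path


def install_kaggle_token(downloaded, config_dir=None):
    """Copy a downloaded kaggle.json into the Kaggle config folder.

    `install_kaggle_token` and `config_dir` are illustrative names only.
    By default the token goes to ~/.kaggle/kaggle.json.
    """
    source = Path(downloaded).expanduser()
    target_dir = Path(config_dir).expanduser() if config_dir else Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "kaggle.json"
    shutil.copyfile(source, target)
    # Owner read/write only; this has limited effect on Windows.
    target.chmod(0o600)
    return target
```

For example, install_kaggle_token("~/Downloads/kaggle.json") would place the token in the default location, assuming the browser saved it to ~/Downloads.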

🦩 Interacting with Dataset

  • Downloading Dataset
kaggle datasets download -d [DATASET]

usage: kaggle datasets download [-h] [-f FILE_NAME] [-p PATH] [-w] [--unzip] [-o] [-q] [dataset]

optional arguments:
  -h, --help            show this help message and exit
  dataset               Dataset URL suffix in format <owner>/<dataset-name> (use "kaggle datasets list" to show options)
  -f FILE_NAME, --file FILE_NAME
                        File name, all files downloaded if not provided
                        (use "kaggle datasets files -d <dataset>" to show options)
  -p PATH, --path PATH  Folder where file(s) will be downloaded, defaults to current working directory
  -w, --wp              Download files to current working path
  --unzip               Unzip the downloaded file. Will delete the zip file when completed.
  -o, --force           Skip check whether local version of file is up to date, force file download
  -q, --quiet           Suppress printing information about the upload/download progress
# example
kaggle datasets download --unzip --force -d didiruh/nuscene-mini 
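
When the download has to run from a script rather than a terminal, the same command can be assembled in Python and passed to subprocess. This is only a sketch: the function names build_download_command and download_dataset are hypothetical, and it uses nothing beyond the flags listed in the help text above.

```python
import subprocess


def build_download_command(dataset, path=None, file_name=None,
                           unzip=False, force=False, quiet=False):
    """Assemble the argument list for `kaggle datasets download`.

    Hypothetical helper; `dataset` is the <owner>/<dataset-name> suffix.
    """
    cmd = ["kaggle", "datasets", "download", "-d", dataset]
    if file_name:
        cmd += ["-f", file_name]   # download a single file only
    if path:
        cmd += ["-p", str(path)]   # target folder
    if unzip:
        cmd.append("--unzip")      # extract and delete the zip afterwards
    if force:
        cmd.append("--force")      # skip the "is it up to date" check
    if quiet:
        cmd.append("--quiet")
    return cmd


def download_dataset(dataset, **options):
    """Run the command; requires the kaggle CLI and a valid token."""
    subprocess.run(build_download_command(dataset, **options), check=True)
```

Calling download_dataset("didiruh/nuscene-mini", unzip=True, force=True) would run the same command as the example above.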
  • Initiate Dataset
usage: kaggle datasets init [-h] [-p FOLDER]

optional arguments:
  -h, --help            show this help message and exit
  -p FOLDER, --path FOLDER
                        Folder for upload, containing data files and a special datasets-metadata.json file (https://github.com/Kaggle/kaggle-api/wiki/Dataset-Metadata). Defaults to current working directory
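
The init command writes a metadata template into the folder that then has to be filled in by hand. As an alternative, the file can be written directly. The sketch below rests on two assumptions: that the file is named dataset-metadata.json (singular, as on the linked wiki page; the help text above spells it datasets-metadata.json), and that the minimal fields are title, id, and licenses. The helper name write_dataset_metadata is hypothetical.

```python
import json
from pathlib import Path


def write_dataset_metadata(folder, owner, slug, title, license_name="CC0-1.0"):
    """Write a minimal dataset metadata file into `folder`.

    Hypothetical helper. The field names follow the Dataset-Metadata wiki
    page as far as I recall it; verify them against the template that
    `kaggle datasets init` produces.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    metadata = {
        "title": title,
        "id": f"{owner}/{slug}",            # <owner>/<dataset-name>
        "licenses": [{"name": license_name}],
    }
    target = folder / "dataset-metadata.json"
    target.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return target
```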
  • Upload Dataset
usage: kaggle datasets create [-h] [-p FOLDER] [-u] [-q] [-t] [-r {skip,zip,tar}]

optional arguments:
  -h, --help            show this help message and exit
  -p FOLDER, --path FOLDER
                        Folder for upload, containing data files and a special datasets-metadata.json file (https://github.com/Kaggle/kaggle-api/wiki/Dataset-Metadata). Defaults to current working directory
  -u, --public          Create publicly (default is private)
  -q, --quiet           Suppress printing information about the upload/download progress
  -t, --keep-tabular    Do not convert tabular files to CSV (default is to convert)
  -r {skip,zip,tar}, --dir-mode {skip,zip,tar}
                        What to do with directories: "skip" - ignore; "zip" - compressed upload; "tar" - uncompressed upload
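
Before uploading, it can help to check that the folder actually contains the metadata file and that the --dir-mode value is one of the three accepted choices. The sketch below does both and returns the argument list for kaggle datasets create. The name build_create_command is hypothetical, and the metadata file name dataset-metadata.json is an assumption (the help text above spells it datasets-metadata.json).

```python
from pathlib import Path

# Choices listed for -r/--dir-mode in the help text.
DIR_MODES = ("skip", "zip", "tar")


def build_create_command(folder, public=False, keep_tabular=False,
                         dir_mode=None, quiet=False):
    """Assemble the argument list for `kaggle datasets create`.

    Hypothetical helper; raises early instead of letting the upload fail.
    """
    folder = Path(folder)
    if not (folder / "dataset-metadata.json").is_file():
        raise FileNotFoundError(f"no dataset-metadata.json in {folder}")
    if dir_mode is not None and dir_mode not in DIR_MODES:
        raise ValueError(f"dir_mode must be one of {DIR_MODES}")

    cmd = ["kaggle", "datasets", "create", "-p", str(folder)]
    if public:
        cmd.append("--public")         # default is private
    if keep_tabular:
        cmd.append("--keep-tabular")   # do not convert tabular files to CSV
    if dir_mode:
        cmd += ["--dir-mode", dir_mode]
    if quiet:
        cmd.append("--quiet")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) once the kaggle CLI is installed and authenticated.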