Storage Unlimited and for Free: How to Upload Files and Folders to Hugging Face Spaces in an Optimized Way

This quick guide shows how to upload files or folders to the Hugging Face Hub with huggingface_hub, and how to turn on the hf_transfer feature for faster uploads.

Setup

  1. Installation: Install the Hugging Face Hub library together with the hf_transfer extra (the quotes keep shells such as zsh from interpreting the brackets):

    pip install "huggingface_hub[hf_transfer]"

    Then set the environment variable that turns the feature on:

    export HF_HUB_ENABLE_HF_TRANSFER=1
  2. Authenticate: Before uploading files, log in with your Hugging Face access token. login() is a module-level function of huggingface_hub, not a method of HfApi:

    from huggingface_hub import login

    login(token="your_hf_token")  # Uploading requires a token with write access
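Both setup steps can also be driven from Python. The following is a minimal sketch, assuming the token is stored in an environment variable named HF_TOKEN; the helper names enable_hf_transfer, read_token, and login_from_env are hypothetical and not part of huggingface_hub. The flag is set before the library is imported because the library reads it at import time.

```python
import os


def enable_hf_transfer() -> None:
    """Set the flag from step 1; call this before importing huggingface_hub."""
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"


def read_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token from the environment instead of hardcoding it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token first.")
    return token


def login_from_env() -> None:
    """Enable hf_transfer, then authenticate with the token from the environment."""
    enable_hf_transfer()
    # Imported here, after the flag is set, so the library sees it on import.
    from huggingface_hub import login

    login(token=read_token())
```

Calling login_from_env() once at the start of a script keeps the token out of the source code, which matters if the script is ever committed to a public repo.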
    

     

Uploading a Single File

To upload a file to the Hub, use the upload_file() function:

from huggingface_hub import HfApi

api = HfApi()
api.upload_file(
    path_or_fileobj="/path/to/your/file.txt",  # Local file path
    path_in_repo="file.txt",  # Target path in the repo
    repo_id="username/your-repo-name",  # Your repo name
    repo_type="dataset"  # Can be "model", "dataset", or "space"
)

 

Uploading an Entire Folder

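To upload a folder (especially one with many files), use upload_folder(). The multi_commits arguments shown here are experimental and exist only in older huggingface_hub releases; newer releases provide upload_large_folder() for very large uploads instead.

# Reuses the api = HfApi() object created in the previous example
api.upload_folder(
    folder_path="/path/to/your/folder",  # Local folder path
    repo_id="username/your-repo-name",  # Target repo
    path_in_repo="folder-in-repo",  # Optional, where in the repo to upload
    repo_type="dataset",  # Type of repo
    multi_commits=True,  # Experimental: split a large upload into several commits
    multi_commits_verbose=True  # Print progress for each commit
)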

 

You can also add filters using the allow_patterns and ignore_patterns arguments to upload specific files:

api.upload_folder(
    folder_path="/path/to/your/folder",
    repo_id="username/your-repo-name",
    allow_patterns="*.json",  # Only upload .json files
    ignore_patterns="*.log"  # Ignore log files
)
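Before starting a long upload, it can help to preview which files a pair of patterns would select. The sketch below approximates the matching with Python's standard fnmatch module; the function names preview_upload and list_relative_files are hypothetical, and the library's own filtering may differ in edge cases, so treat the result as a rough check only.

```python
from fnmatch import fnmatch
from pathlib import Path


def _as_list(patterns):
    """Accept a single pattern string or a list of patterns."""
    if patterns is None:
        return None
    return [patterns] if isinstance(patterns, str) else list(patterns)


def preview_upload(paths, allow_patterns=None, ignore_patterns=None):
    """Return the relative paths that pass the allow filter and miss the ignore filter."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        if allow is not None and not any(fnmatch(path, p) for p in allow):
            continue  # not matched by any allow pattern
        if ignore is not None and any(fnmatch(path, p) for p in ignore):
            continue  # matched by an ignore pattern
        selected.append(path)
    return selected


def list_relative_files(folder):
    """List every file under folder as a POSIX-style path relative to it."""
    root = Path(folder)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

For example, preview_upload(list_relative_files("/path/to/your/folder"), allow_patterns="*.json", ignore_patterns="*.log") lists the JSON files that the call above would be expected to pick up.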

 

Creating Commits for Custom Operations

If you need more granular control over the upload process (like uploading multiple files and performing deletions in one commit), you can use the create_commit() function:

from huggingface_hub import CommitOperationAdd, CommitOperationDelete, HfApi

api = HfApi()
operations = [
    CommitOperationAdd(path_in_repo="new_file.txt", path_or_fileobj="local_path/new_file.txt"),
    CommitOperationDelete(path_in_repo="old_file.txt")
]

api.create_commit(
    repo_id="username/your-repo-name",
    operations=operations,
    commit_message="Update with new files"
)
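The same idea extends to a whole directory: build one CommitOperationAdd per local file, append the deletions, and send everything in a single commit. This is a sketch under stated assumptions; plan_additions and commit_folder are hypothetical helper names, and only CommitOperationAdd, CommitOperationDelete, and create_commit() come from the library.

```python
from pathlib import Path


def plan_additions(folder, prefix=""):
    """Return (path_in_repo, local_path) pairs for every file under folder."""
    root = Path(folder)
    clean_prefix = prefix.strip("/")
    pairs = []
    for local in sorted(root.rglob("*")):
        if local.is_file():
            relative = local.relative_to(root).as_posix()
            repo_path = f"{clean_prefix}/{relative}" if clean_prefix else relative
            pairs.append((repo_path, str(local)))
    return pairs


def commit_folder(repo_id, folder, stale_paths=(), prefix="", message="Update files"):
    """Add every file in folder and delete stale_paths in one commit."""
    # Imported lazily so plan_additions() works without the library installed.
    from huggingface_hub import CommitOperationAdd, CommitOperationDelete, HfApi

    operations = [
        CommitOperationAdd(path_in_repo=repo_path, path_or_fileobj=local_path)
        for repo_path, local_path in plan_additions(folder, prefix)
    ]
    operations += [CommitOperationDelete(path_in_repo=path) for path in stale_paths]
    return HfApi().create_commit(
        repo_id=repo_id, operations=operations, commit_message=message
    )
```

A call such as commit_folder("username/your-repo-name", "local_path", stale_paths=["old_file.txt"]) would then add the folder's contents and remove the old file together, so the repo never sits in a half-updated state.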

 

Performance Optimization

For large files or datasets, two things help. hf_transfer speeds up the transfer itself, and splitting a folder upload across several commits limits how much work is lost when a transfer is interrupted, so the upload can continue without starting over.
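Interrupted transfers can also be handled on the calling side by simply trying again. The sketch below is a generic retry wrapper with exponential backoff; upload_with_retries is a hypothetical helper, not a huggingface_hub function. It only re-invokes the upload call, so whether data that was already sent gets reused depends on the library and the upload method.

```python
import time


def upload_with_retries(upload, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call upload() until it succeeds, doubling the wait after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except Exception as error:  # in real code, catch only network-related errors
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({error}); retrying in {delay:.0f}s")
            sleep(delay)
```

Usage would look like upload_with_retries(lambda: api.upload_folder(folder_path="/path/to/your/folder", repo_id="username/your-repo-name", repo_type="dataset")), with api being the HfApi() object from the earlier examples.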

These methods cover the essentials of uploading files and managing repositories on the Hugging Face Hub efficiently.