Dropbox To Google Drive

#Cloud #Automation #Business #Dropbox #GoogleDrive #DataMigration

Dropbox to Google Drive Migration: Automated, Secure, and Metadata-Accurate

Data gravity increases as organizations scale—often, so do cloud storage constraints. A typical scenario: Dropbox served well at 20 users, but now, with a cross-functional team standardized on Google Workspace, the legacy file estate blocks integration, search, and DLP efforts. Manual migration? Unrealistic for anything beyond toy-scale directories.

Below are the methodology and practical scripts for a custom Dropbox-to-Google Drive migration. This approach eliminates drag-and-drop errors and avoids exposing data to third-party migration vendors. It is designed for teams concerned about chain of custody, metadata preservation, and predictable cutover.


Key Objectives

  • Preserve file/folder structure, timestamps, and (as feasible) permissions.
  • Minimize business-impacting downtime.
  • Avoid unnecessary data leakage via SaaS “helpers”.

Some edges remain unsolved: full version history and exact sharing-permission parity require bespoke mapping. The scriptable process below delivers roughly 95% of what most organizations expect.


Recommended Toolchain & Versions

Component                                                | Version (tested) | Purpose
Python                                                   | 3.10.x           | Scripting and orchestration
dropbox (pip)                                            | >=11.34.0        | Dropbox API access
google-api-python-client (pip)                           | >=2.110.0        | Google Drive API integration
google-auth, google-auth-oauthlib, google-auth-httplib2  | latest           | Auth for Google APIs

OAuth 2.0 configuration is required on both vendor sides.


1. API Credentials: Setup

Dropbox

  • Navigate: https://www.dropbox.com/developers/apps.
  • Create app with “Full Dropbox” access (scoped).
  • Generate a long-lived OAuth access token.
  • Store in a secure secret backend or environment variable (DROPBOX_TOKEN).

Google Drive

  • Google Cloud Console: create a dedicated project (don’t piggyback infra under “default”).
  • Enable the Drive API.
  • Under “APIs & Services > Credentials”: create an OAuth client ID of type “Desktop app”.
  • Download credentials.json and store it alongside the script, or point GOOGLE_APPLICATION_CREDENTIALS at it.
  • Assign the correct OAuth 2.0 scopes:
    • https://www.googleapis.com/auth/drive
  • If using a service account: note that a service account cannot access user Drives without domain-wide delegation (a minimal delegation sketch follows).
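
For the service-account path, a minimal sketch of domain-wide delegation is below; the key file name and the impersonated user are placeholders, and a Workspace admin must first authorize the service account's client ID for the Drive scope.

from google.oauth2 import service_account
from googleapiclient.discovery import build

GOOGLE_SCOPES = ['https://www.googleapis.com/auth/drive']
SA_KEY_FILE = 'service-account.json'   # placeholder path to the downloaded key file
DELEGATED_USER = 'user@example.com'    # placeholder Workspace user to impersonate

sa_creds = service_account.Credentials.from_service_account_file(
    SA_KEY_FILE, scopes=GOOGLE_SCOPES)
delegated_creds = sa_creds.with_subject(DELEGATED_USER)  # domain-wide delegation
drive_service = build('drive', 'v3', credentials=delegated_creds)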

2. Authentication Initialization

Install dependencies:

pip install dropbox google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client

Minimal Dropbox client instantiation:

import dropbox
import os

dbx = dropbox.Dropbox(os.environ['DROPBOX_TOKEN'])

For Google Drive:

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
import pickle
import os

GOOGLE_SCOPES = ['https://www.googleapis.com/auth/drive']

def drive_auth():
    creds = None
    # Reuse cached credentials between runs
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            # Refresh silently instead of forcing a new browser flow
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file('credentials.json', GOOGLE_SCOPES)
            creds = flow.run_local_server(port=8580)
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    return build('drive', 'v3', credentials=creds)

drive_service = drive_auth()

Note: If “redirect_uri_mismatch” occurs on a headless VM, older google-auth-oauthlib releases offered run_console() as a substitute for run_local_server(); newer releases removed it along with the OOB flow, so generate token.pickle on a workstation and copy it to the VM (or switch to a service account).


3. Enumerate Dropbox Objects and Extract Metadata

Fetching full metadata and paths is crucial for reconstructing the Drive folder tree.

def list_all_dropbox_files(path=''):
    files = []
    res = dbx.files_list_folder(path, recursive=True)
    while True:
        for entry in res.entries:
            if isinstance(entry, dropbox.files.FileMetadata):
                files.append(entry)
        if not res.has_more:
            break
        res = dbx.files_list_folder_continue(res.cursor)
    return files

dropbox_files = list_all_dropbox_files()
print(f"Dropbox files discovered: {len(dropbox_files)}")

Side note: For accounts with more than ~50,000 entries, Dropbox may intermittently return HTTP 429 rate-limit errors (surfaced by the Python SDK as dropbox.exceptions.RateLimitError). Throttle or chunk fetches accordingly, as in the sketch below.
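
A minimal throttled variant of the enumeration above, assuming the SDK's RateLimitError and its backoff hint (which may be absent, hence the default wait):

import time
import dropbox

def list_all_dropbox_files_throttled(path='', default_wait=5):
    # Same enumeration as above, but sleeps and retries on HTTP 429.
    files = []
    res = None
    while True:
        try:
            if res is None:
                res = dbx.files_list_folder(path, recursive=True)
            else:
                res = dbx.files_list_folder_continue(res.cursor)
        except dropbox.exceptions.RateLimitError as err:
            # err.backoff is the server-suggested wait in seconds (may be None)
            time.sleep(err.backoff or default_wait)
            continue
        files.extend(e for e in res.entries
                     if isinstance(e, dropbox.files.FileMetadata))
        if not res.has_more:
            break
    return files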


4. Download from Dropbox (Efficiently)

Local download as intermediate buffer:

import io

def get_dropbox_file_buffer(metadata):
    _, response = dbx.files_download(metadata.path_lower)
    return io.BytesIO(response.content)

For larger data volumes, avoid buffering whole files in memory; spool them to disk (or stream to an intermediate cloud storage bucket) instead, as in the sketch below.
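
One low-memory approach is to spool large files to a temporary file and let the Drive client upload them in chunks; a sketch using files_download_to_file and MediaFileUpload (the 10 MB chunk size is an arbitrary choice, and parent_id is assumed to come from the folder logic in the next section):

import os
import tempfile
from googleapiclient.http import MediaFileUpload

def upload_large_file(dropbox_file, parent_id):
    # Spool a large Dropbox file to disk, then resumable-upload it to Drive.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        local_path = tmp.name
    try:
        dbx.files_download_to_file(local_path, dropbox_file.path_lower)
        media = MediaFileUpload(local_path,
                                mimetype='application/octet-stream',
                                resumable=True,
                                chunksize=10 * 1024 * 1024)  # 10 MB chunks
        body = {'name': os.path.basename(dropbox_file.path_display),
                'parents': [parent_id]}
        return drive_service.files().create(
            body=body, media_body=media, fields='id').execute()['id']
    finally:
        os.remove(local_path)  # clean up the temporary spool file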


5. Reconstructing Folder Structure and Uploading to Drive

Key pitfall: Google Drive folder “names” are not globally unique—tree context matters.

Folder creation/lookup logic should walk the path component by component and cache Drive folder IDs to reduce API quota consumption:

folder_cache = {}

def get_or_create_drive_folder(path, parent_id=None):
    # Resolve a Dropbox folder path to a Drive folder ID, creating missing
    # folders level by level so the nested structure is preserved.
    path = path.strip('/')
    if not path:
        return parent_id  # Drive root (or an explicit destination folder)
    if path in folder_cache:
        return folder_cache[path]
    parent_path, _, name = path.rpartition('/')
    parent = get_or_create_drive_folder(parent_path, parent_id)
    escaped = name.replace("'", "\\'")  # escape quotes for the Drive query
    query = ("mimeType='application/vnd.google-apps.folder'"
             f" and name='{escaped}' and trashed=false"
             + (f" and '{parent}' in parents" if parent else ""))
    results = drive_service.files().list(
        q=query, spaces='drive', fields='files(id, name)').execute()
    files = results.get('files', [])
    if files:
        folder_id = files[0]['id']
    else:
        folder_metadata = {'name': name,
                           'mimeType': 'application/vnd.google-apps.folder'}
        if parent:
            folder_metadata['parents'] = [parent]
        folder_id = drive_service.files().create(
            body=folder_metadata, fields='id').execute()['id']
    folder_cache[path] = folder_id
    return folder_id

The actual upload, preserving timestamps:

from googleapiclient.http import MediaIoBaseUpload
import os

def upload_to_drive(dropbox_file):
    folder_path, filename = os.path.split(dropbox_file.path_display)
    parent_id = get_or_create_drive_folder(folder_path)
    file_meta = {
        'name': filename,
        # Dropbox client_modified is naive UTC; append 'Z' for RFC 3339
        'modifiedTime': dropbox_file.client_modified.isoformat() + 'Z',
    }
    if parent_id:  # files from the Dropbox root land directly in My Drive
        file_meta['parents'] = [parent_id]
    bytes_stream = get_dropbox_file_buffer(dropbox_file)
    media = MediaIoBaseUpload(bytes_stream, mimetype='application/octet-stream', resumable=True)
    uploaded = drive_service.files().create(
        body=file_meta, media_body=media, fields='id').execute()
    print(f"Uploaded: {filename} ({uploaded['id']})")
    return uploaded['id']

Known issue: MIME type autodetection is simplistic. If files should be editable as Google Docs/Sheets after migration, request conversion by setting the target Google mimeType on the file metadata (Drive API v3); the older v2 convert=true parameter does not apply here. A sketch follows.
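
A sketch of requesting conversion for a Word document; the source and target MIME types shown are the standard ones for .docx and Google Docs, and upload_as_google_doc is an illustrative helper rather than part of the main pipeline:

from googleapiclient.http import MediaIoBaseUpload

def upload_as_google_doc(bytes_stream, filename, parent_id):
    # Upload a .docx buffer and ask Drive to convert it to a native Google Doc.
    file_meta = {
        'name': filename,
        'parents': [parent_id],
        # Target type triggers conversion in Drive API v3
        'mimeType': 'application/vnd.google-apps.document',
    }
    media = MediaIoBaseUpload(
        bytes_stream,
        mimetype='application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        resumable=True)
    return drive_service.files().create(
        body=file_meta, media_body=media, fields='id').execute()['id']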


6. Optional: Sharing ACL Alignment

Dropbox and Drive use incompatible ACL models. For high-value shares, enumerate the existing Dropbox sharing state (shared links, and folder member lists where applicable):

shared_links = dbx.sharing_list_shared_links(path=dropbox_file.path_lower)

Apply corresponding members and permissions with Drive's permissions().create.

Trade-off: Dropbox "Viewer" and "Editor" map directly to Drive's reader and writer roles, and "Can comment" maps to the commenter role; other Dropbox options (link expiry, download restrictions) do not translate cleanly. Audit post-migration.
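
A sketch of granting a roughly equivalent Drive role per member via permissions().create; the role mapping is an assumption to adjust to your policy, and sendNotificationEmail=False avoids mailing every collaborator during the run:

# Rough Dropbox -> Drive role mapping (assumption; adjust to your policy)
ROLE_MAP = {'viewer': 'reader', 'editor': 'writer', 'commenter': 'commenter'}

def grant_drive_permission(file_id, email, dropbox_role):
    # Grant a single user a roughly equivalent role on the migrated file.
    body = {
        'type': 'user',
        'role': ROLE_MAP.get(dropbox_role, 'reader'),
        'emailAddress': email,
    }
    drive_service.permissions().create(
        fileId=file_id, body=body, sendNotificationEmail=False).execute()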


7. Orchestration and Logging

Automate sequentially or with bounded concurrency (e.g., asyncio, multiprocessing, or a thread pool); Google Drive rate-limits aggressively, so keep the worker count low.
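
A minimal bounded-concurrency sketch with a thread pool; note that the discovery client built above is not guaranteed thread-safe, so a production version would build one Drive service per worker:

from concurrent.futures import ThreadPoolExecutor, as_completed

def migrate_concurrently(files, max_workers=4):
    # Run uploads in parallel with a small, bounded worker pool.
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(upload_to_drive, f): f for f in files}
        for future in as_completed(futures):
            src = futures[future]
            try:
                future.result()
            except Exception as exc:
                failures.append((src.path_lower, repr(exc)))
    return failures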

Error handling example:

for dbx_file in dropbox_files:
    try:
        upload_to_drive(dbx_file)
    except Exception as exc:
        print(f"Failed: {dbx_file.path_lower} | {exc!r}")
        # Optional: write to dead-letter CSV for later retry

For real-world runs, persist migration state to disk (sqlite3, CSV, etc.)—Google API's occasional 500/429 errors will otherwise force time-consuming reruns.
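
A minimal sketch of that state persistence with sqlite3, so interrupted runs skip already-migrated files (the database name and schema are arbitrary):

import sqlite3

state = sqlite3.connect('migration_state.db')
state.execute("""CREATE TABLE IF NOT EXISTS migrated
                 (dropbox_path TEXT PRIMARY KEY, drive_id TEXT)""")

def already_migrated(path):
    row = state.execute(
        "SELECT drive_id FROM migrated WHERE dropbox_path = ?", (path,)).fetchone()
    return row is not None

def mark_migrated(path, drive_id):
    state.execute("INSERT OR REPLACE INTO migrated VALUES (?, ?)", (path, drive_id))
    state.commit()

for dbx_file in dropbox_files:
    if already_migrated(dbx_file.path_lower):
        continue  # skip work already completed in a previous run
    try:
        mark_migrated(dbx_file.path_lower, upload_to_drive(dbx_file))
    except Exception as exc:
        print(f"Failed: {dbx_file.path_lower} | {exc!r}")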


Non-Obvious Considerations

  • Shared Drives (formerly Team Drives): target them only if your Workspace edition provides them and membership is provisioned; Drive API calls then need supportsAllDrives=True.
  • Google Drive item limits: a Shared Drive is capped at 400,000 items, and My Drive has its own item ceiling; partition folders accordingly for large-scale orgs.
  • Timestamps: API quotas for modifiedTime set operations can be restrictive—schedule jobs during off-peak.
  • Large files: For files >150MB, use resumable uploads.
  • File ID re-use: Drive’s “duplicate” detection is inconsistent—dedupe source data if Dropbox holds many similarly-named flat files.

Example Error Output

Google API quota exceeded during batch upload:

googleapiclient.errors.HttpError: <HttpError 403 when requesting ... returned "User rate limit exceeded.">

Mitigation: Implement exponential backoff with jitter on failure.
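
A sketch of such a backoff wrapper around any Drive call; the retry count, retryable status list, and base delay are arbitrary choices:

import random
import time
from googleapiclient.errors import HttpError

def with_backoff(call, *args, retries=6, base_delay=1.0, **kwargs):
    # Retry on rate-limit and transient server errors with exponential backoff + jitter.
    for attempt in range(retries):
        try:
            return call(*args, **kwargs)
        except HttpError as err:
            # Note: some 403s are hard permission errors and will never succeed on retry.
            if err.resp.status not in (403, 429, 500, 502, 503):
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("Drive call still failing after retries")

# Usage: with_backoff(upload_to_drive, dbx_file)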


Final Notes

Perfect migration parity isn’t always the goal—document what is not retained (e.g., version history) for compliance records. For edge scenarios (thousands of shared links, terabyte repositories), evaluate cloud-to-cloud migration vendors (e.g., CloudM, Mover.io) but vet for GDPR/SOC2 requirements.

Test initial runs against a subset of folders; validate structure and permissions before scaling.


For further automation (scheduled/incremental syncs, migration monitoring dashboards), see Drive API push notifications and Dropbox webhooks. Questions, or need a sample repo? Contact via email—public repositories with migration logic often require per-org customization.


No migration is invisible—track, document, audit, and expect a few rough edges.