DOC: Add Google Colab data loading section #63354
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1613,6 +1613,53 @@ a permanent store. | |
| .. _fsimpl1: https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations | ||
| .. _fsimpl2: https://filesystem-spec.readthedocs.io/en/latest/api.html#other-known-implementations | ||
|
|
||
| Loading data in Google Colab notebooks | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| Google Colab is a hosted Jupyter notebook environment. Since it runs remotely, | ||
|
Contributor
Let's link to it. |
||
| files must be explicitly uploaded or mounted before they can be read by pandas. | ||
|
Contributor
This is useful to clarify for beginners, thanks. |
||
|
|
||
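A notebook that is shared between Colab and a local Jupyter setup may need to know which environment it is in before deciding whether an upload or mount step is required. The following is a minimal sketch, assuming the ``google.colab`` package is importable only inside a Colab runtime; ``running_in_colab`` and ``data_hint`` are hypothetical names used for illustration.

```python
def running_in_colab():
    """Return True when the google.colab package can be imported."""
    try:
        # Assumption: this package is normally present only in Colab runtimes.
        import google.colab  # noqa: F401
    except ImportError:
        return False
    return True


IN_COLAB = running_in_colab()

# Pick a hint for the reader depending on where the notebook runs.
data_hint = (
    "upload or mount files first" if IN_COLAB else "read local paths directly"
)
print(data_hint)
```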
| Uploading local files | ||
| ^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Files can be uploaded directly to the Colab runtime using ``google.colab.files``: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| from google.colab import files | ||
| uploaded = files.upload() | ||
|
|
||
| import pandas as pd | ||
| df = pd.read_csv("data.csv") | ||
|
|
||
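``files.upload()`` returns a dictionary that maps each uploaded file name to its contents as bytes, so the data can also be parsed without hard-coding ``"data.csv"``. The sketch below assumes CSV uploads; ``frames_from_upload`` is a hypothetical helper, and the fallback dictionary only stands in for a real upload when the code runs outside Colab.

```python
import io

import pandas as pd


def frames_from_upload(uploaded):
    """Parse each uploaded CSV; `uploaded` maps file name -> raw bytes."""
    return {
        name: pd.read_csv(io.BytesIO(content))
        for name, content in uploaded.items()
        if name.lower().endswith(".csv")
    }


try:
    from google.colab import files  # available inside Colab only
except ImportError:
    # Stand-in for a real upload so the sketch also runs outside Colab.
    uploaded = {"data.csv": b"a,b\n1,2\n3,4\n"}
else:
    uploaded = files.upload()

frames = frames_from_upload(uploaded)
for name, frame in frames.items():
    print(name, frame.shape)
```

Reading from the returned bytes avoids depending on the working directory the runtime saved the file into.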
|
|
||
| Using Google Drive | ||
| ^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Google Drive can be mounted to make files available to the runtime: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| from google.colab import drive | ||
| drive.mount("/content/drive") | ||
|
|
||
| import pandas as pd | ||
| df = pd.read_csv("/content/drive/MyDrive/data.csv") | ||
|
|
||
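If the mount step was skipped or the file sits in a different Drive folder, ``pd.read_csv`` fails with a generic file-not-found error. A small wrapper can make that failure easier to diagnose. This is a sketch, not part of the proposed docs: ``read_drive_csv`` is a hypothetical helper, and the temporary directory only stands in for the ``/content/drive`` mount point so the example runs outside Colab.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_drive_csv(relative_path, mount_point="/content/drive"):
    """Read a CSV stored under MyDrive, failing early with a clear message."""
    path = Path(mount_point) / "MyDrive" / relative_path
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; check that Drive is mounted at {mount_point}"
        )
    return pd.read_csv(path)


# Demonstration outside Colab: a temporary directory stands in for the mount.
with tempfile.TemporaryDirectory() as fake_mount:
    my_drive = Path(fake_mount) / "MyDrive"
    my_drive.mkdir()
    (my_drive / "data.csv").write_text("x,y\n1,2\n")

    demo = read_drive_csv("data.csv", mount_point=fake_mount)

    try:
        read_drive_csv("missing.csv", mount_point=fake_mount)
    except FileNotFoundError as exc:
        missing_message = str(exc)
```

In a Colab session the call would simply be ``read_drive_csv("data.csv")`` after ``drive.mount("/content/drive")``.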
| Loading data from a URL | ||
|
Contributor
I think we can cut this section, as it isn't really specific to Colab. |
||
| ^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Data hosted remotely can be read directly using a URL: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| import pandas as pd | ||
|
|
||
| url = ( | ||
| "https://raw.githubusercontent.com/pandas-dev/pandas/main/" | ||
| "doc/data/air_quality_no2.csv" | ||
|
Contributor
Is this split enforced by pre-commit? I know you're trying to keep the lines short, but especially given this section is oriented to beginners, I think this split may confuse them. |
||
| ) | ||
| df = pd.read_csv(url) | ||
|
|
||
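On the split URL in the example above: Python joins adjacent string literals at compile time, so the parenthesized form and a single-line string produce the same value. The short check below illustrates this; ``url_split`` and ``url_single`` are illustrative names, and no network access is involved.

```python
# Two adjacent string literals inside parentheses are concatenated by Python.
url_split = (
    "https://raw.githubusercontent.com/pandas-dev/pandas/main/"
    "doc/data/air_quality_no2.csv"
)

# The same URL written as one (longer) line.
url_single = "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/air_quality_no2.csv"

same = url_split == url_single
print(same)
```

Either spelling can be passed to ``pd.read_csv``; the choice only affects line length in the docs source.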
| Writing out data | ||
| '''''''''''''''' | ||
|
|
||
|
|
@@ -6520,3 +6567,4 @@ The files ``test.pkl.compress``, ``test.parquet`` and ``test.feather`` took the | |
| 24009288 Oct 10 06:43 test_fixed_compress.hdf | ||
| 24458940 Oct 10 06:44 test_table.hdf | ||
| 24458940 Oct 10 06:44 test_table_compress.hdf | ||
|
|
||
Being more consistent with the other headings:
Can we also move this section to be right above Google BigQuery?