Fix #5627: SFT example notebook references inaccessible S3 dataset URI #5704

Open
JiwaniZakir wants to merge 1 commit into aws:master from JiwaniZakir:fix/5627-sft-example-notebook-references-inaccess

Conversation

@JiwaniZakir

Closes #5627

Motivation

The SFT finetuning example notebook hardcoded an internal S3 URI (s3://mc-flows-sdk-testing/...) that external users cannot access, causing an immediate 403 Forbidden error when running the dataset registration cell.

Changes

File: v3-examples/model-customization-examples/sft_finetuning_example_notebook_pysdk_prod_v3.ipynb

  • Dataset registration cell (~line 85): Replaced the hardcoded s3://mc-flows-sdk-testing/input_data/sft/sample_data_256_final.jsonl source with a named placeholder variable MY_DATASET_S3_URI = "s3://<your-bucket>/<path-to-your-dataset>.jsonl" marked with a # TODO comment. Added an explanatory comment block describing the required JSONL format (prompt/completion fields per line) and linking to the SageMaker SFT documentation.
  • Training job cell (~line 169): Replaced s3_output_path="s3://mc-flows-sdk-testing/output/" with "s3://<your-bucket>/output/" and a # TODO comment.
  • Second training job cell (~line 384): Same s3_output_path substitution as above.
  • Nova training job cell (~line 445): Replaced s3_output_path="s3://mc-flows-sdk-testing-us-east-1/output/" with the same placeholder pattern.
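The placeholder pattern described above can be paired with a fail-fast check so that an unedited URI is caught before any S3 call is made. The following is a minimal sketch, not code from the notebook: only `MY_DATASET_S3_URI` and the prompt/completion field names come from this PR, while `has_unfilled_placeholder` and `invalid_jsonl_lines` are hypothetical helper names introduced for illustration.

```python
import json

# Placeholder as described in this PR; replace before running the notebook.
MY_DATASET_S3_URI = "s3://<your-bucket>/<path-to-your-dataset>.jsonl"  # TODO


def has_unfilled_placeholder(uri):
    """Return True if the URI still contains a <...> template token."""
    return "<" in uri or ">" in uri


def invalid_jsonl_lines(text, required=("prompt", "completion")):
    """Return 1-based line numbers of records that are not JSON objects
    containing every required field. Blank lines are skipped."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            bad.append(lineno)
            continue
        if not isinstance(record, dict) or any(k not in record for k in required):
            bad.append(lineno)
    return bad
```

Calling `has_unfilled_placeholder(MY_DATASET_S3_URI)` before dataset registration lets a notebook raise a readable error instead of surfacing a 403 from S3. The JSONL check assumes the file contents have already been read into a string locally.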

Testing

Manual verification: open the notebook and confirm no cells reference mc-flows-sdk-testing — all four occurrences are replaced with <your-bucket> placeholders. A user following the notebook will now see the TODO markers before executing any cell that requires S3 access, preventing the 403 error. Substituting a valid bucket and a JSONL file with prompt/completion fields allows the notebook to run end-to-end successfully.
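The manual check could also be scripted. The sketch below assumes the standard notebook JSON layout (a top-level `cells` list whose entries carry a `source` string or list of strings); `load_notebook` and `cells_referencing` are hypothetical helper names and not part of this PR. It inspects cell sources only, so a stale bucket name left in saved cell outputs would not be reported.

```python
import json


def load_notebook(path):
    """Read an .ipynb file and return its parsed JSON as a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cells_referencing(notebook, needle):
    """Return 0-based indices of cells whose source contains `needle`."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # Notebook sources may be stored as a list of lines or a single string.
        if isinstance(source, list):
            source = "".join(source)
        if needle in source:
            hits.append(index)
    return hits
```

Running `cells_referencing(load_notebook(path), "mc-flows-sdk-testing")` against the updated notebook should return an empty list if all four occurrences were replaced.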

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>