Fix #5632: HyperparameterTuner drops content_type when converting Inp...#5703

Open
JiwaniZakir wants to merge 1 commit into aws:master from JiwaniZakir:fix/5632-hyperparametertuner-drops-content-type-w

Conversation

@JiwaniZakir

Closes #5632

Motivation

HyperparameterTuner._build_training_job_definition() converts InputData objects to Channel objects but omits content_type during that conversion. As a result, the training container cannot determine the data format, and built-in algorithms (e.g., XGBoost) fail with validate_data_file_path errors.

Changes

sagemaker-train/src/sagemaker/train/tuner.py

  • Line 1433: Added content_type=inp.content_type to the Channel(...) constructor call inside the isinstance(inp, InputData) branch of _build_training_job_definition(). This is the sole change required to propagate the field that was silently dropped.
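
As a rough illustration of the change above, the sketch below uses simplified stand-in dataclasses rather than the SDK's real InputData and Channel classes, which have more fields and a nested data source. The helper name to_channel and the example S3 URI are hypothetical; only the content_type pass-through reflects what this PR describes.

```python
from dataclasses import dataclass
from typing import Optional


# Simplified stand-ins for illustration only; not the SDK's real classes.
@dataclass
class InputData:
    channel_name: str
    data_source: str
    content_type: Optional[str] = None


@dataclass
class Channel:
    channel_name: str
    data_source: str
    content_type: Optional[str] = None


def to_channel(inp: InputData) -> Channel:
    """Hypothetical helper mirroring the InputData branch of the conversion."""
    return Channel(
        channel_name=inp.channel_name,
        data_source=inp.data_source,
        content_type=inp.content_type,  # the field that was previously omitted
    )


channel = to_channel(
    InputData("train", "s3://example-bucket/train/", content_type="text/csv")
)
print(channel.content_type)
```

Without the content_type argument, the stand-in Channel would fall back to its default of None, which corresponds to the silently dropped field described in the issue.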

sagemaker-train/tests/unit/train/test_tuner.py

  • Added test_build_training_job_definition_preserves_content_type() to TestHyperparameterTunerStaticMethods. The test constructs an InputData with content_type="text/csv", calls tuner._build_training_job_definition(), and asserts that the resulting Channel for the "train" channel carries content_type == "text/csv". This directly exercises the previously broken code path.
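
The shape of that regression test can be sketched roughly as follows. This is not the test added in the PR: convert is a hypothetical stand-in for the conversion under test, and SimpleNamespace objects stand in for InputData and Channel so the snippet runs without the SDK installed.

```python
from types import SimpleNamespace


def convert(inp):
    # Hypothetical stand-in for the InputData -> Channel conversion under test.
    return SimpleNamespace(
        channel_name=inp.channel_name,
        content_type=inp.content_type,
    )


def test_conversion_preserves_content_type():
    # Build an input that carries an explicit content type.
    inp = SimpleNamespace(channel_name="train", content_type="text/csv")
    channels = [convert(inp)]

    # Locate the "train" channel and check that the field survived conversion.
    train = next(c for c in channels if c.channel_name == "train")
    assert train.content_type == "text/csv"


test_conversion_preserves_content_type()
```

Looking the channel up by name rather than by position keeps the assertion valid if the conversion ever reorders or adds channels.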

Testing

The new unit test covers the regression:

tests/unit/train/test_tuner.py::TestHyperparameterTunerStaticMethods::test_build_training_job_definition_preserves_content_type PASSED

Manually verified against XGBoost 1.7-1 using the reproduction case from the issue report: training jobs now complete successfully when InputData(content_type="csv") is passed to tuner.tune(), without requiring the Channel-based workaround.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

HyperparameterTuner drops content_type when converting InputData to Channel
