Research project exploring the performance gains of using the ARK foundation model for aspect-based stance detection and sentiment analysis.
Below are selected performance plots summarizing model comparisons (AUC / accuracy) from the project. These give a quick visual overview of results.
Key dataset visualizations are shown below to illustrate class distribution, polarization and aspect composition.
- `Dataset/` — data files (csv, json, etc.).
- `Narratives/` — narrative files and tweet id lists.
- `dataset_analysis.py` — small script to inspect the dataset and generate `Dataset/unique_words.txt`.
- `dataset.py` — PyTorch Dataset wrapper `maskedABSA_Dataset` for the weakly labeled race dataset.
- `Ark_Aspect_Masking/` — model code and helpers for image/text models (if present).
- Install minimal dependencies (example):

  ```bash
  pip install pandas torch timm transformers
  ```

- Inspect the dataset and generate the list of unique words:

  ```bash
  python3 dataset_analysis.py
  ```

  This will print a sample and write `Dataset/unique_words.txt` (now stored as a Python list literal).
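The unique-word step can be sketched roughly as below; the column name `text` and the helper name `write_unique_words` are illustrative assumptions, not the actual internals of `dataset_analysis.py`.

```python
# Hypothetical sketch of the unique-word step: read a text column, collect
# lowercase tokens, and write the sorted vocabulary as a Python list literal
# (so it can later be read back with ast.literal_eval).
import pandas as pd

def write_unique_words(csv_path: str, out_path: str, text_column: str = "text") -> list[str]:
    df = pd.read_csv(csv_path)
    vocab = sorted({tok for line in df[text_column].astype(str)
                    for tok in line.lower().split()})
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(repr(vocab))  # a list literal, e.g. ['again', 'hello', ...]
    return vocab
```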
- Use the PyTorch dataset in `dataset.py`:

  ```python
  from dataset import maskedABSA_Dataset

  train_ds = maskedABSA_Dataset("Dataset/weakly_labeled_race_dataset.csv", split='train')
  ```

  `maskedABSA_Dataset` supports an optional `split` (`'train' | 'val' | 'test'`) and `annotation_percent` to subsample.
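For orientation, a minimal sketch of what such a wrapper could look like; the 80/10/10 split ratios, the class name, and the column handling here are assumptions for illustration, not the project's actual `maskedABSA_Dataset` implementation.

```python
# Minimal map-style dataset sketch: implements __len__/__getitem__, which is
# all a torch DataLoader requires of a map-style dataset, so torch itself is
# not needed for the sketch. Split ratios (80/10/10) are an assumption.
import random
import pandas as pd

class MaskedABSADatasetSketch:
    def __init__(self, csv_path, split="train", annotation_percent=100, seed=0):
        df = pd.read_csv(csv_path)
        n = len(df)
        bounds = {"train": (0, int(0.8 * n)),
                  "val": (int(0.8 * n), int(0.9 * n)),
                  "test": (int(0.9 * n), n)}
        lo, hi = bounds[split]
        rows = list(range(lo, hi))
        if split == "train" and annotation_percent < 100:
            # Subsample the training rows to annotation_percent of the split.
            k = max(1, int(len(rows) * annotation_percent / 100))
            rows = random.Random(seed).sample(rows, k)
        self.records = df.iloc[rows].to_dict("records")

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        return self.records[idx]
```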
- `Dataset/unique_words.txt` is written as a Python list literal by `dataset_analysis.py`.
- Large binary/model files should be tracked with Git LFS if you plan to push them to a remote.
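Because the file is a list literal rather than newline-separated words, it should be parsed rather than read line by line; one safe way (the helper name is ours, not the project's):

```python
# Read Dataset/unique_words.txt back into a Python list.
# ast.literal_eval parses the list literal safely (no arbitrary code execution).
import ast

def load_unique_words(path: str) -> list[str]:
    with open(path, encoding="utf-8") as f:
        return ast.literal_eval(f.read())
```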
Add your preferred license here.
The project includes a `run_command.txt` file with example commands used to run experiments. Below are the same commands with short explanations so you can reproduce common runs.
- Resume example (image backbone resume run)

  ```bash
  python -u main_ark.py \
      --data_set MIMIC --model convnext_base --individual --pretrain_epochs 5 --batch_size 32 --test_epoch 1 --anno_percent 100 \
      --exp_name ark_convnext_ImageNet1k_MIMIC_test \
      --pretrained_weights /path/to/Models/convnext_base_ImageNet_1k/Ark_MIMIC/ark_convnext_ImageNet1k_MIMIC_test/Ark_MIMIC.pth.tar \
      > Logs/ark_convnext_ImageNet1k_MIMIC_test.log 2>&1
  ```

  What it does:
  - Runs `main_ark.py` on the `MIMIC` dataset using the `convnext_base` backbone.
  - `--individual` indicates an individual training configuration per the project's CLI.
  - `--pretrained_weights` resumes from a checkpoint; update the path to your local checkpoint.
  - Output (stdout/stderr) is redirected to `Logs/....log`.
- MaskedABSA with stance_bert (dev run)

  ```bash
  python3 -u main_ark.py --data_set MaskedABSA --model stance_bert --exp_name ark_stancebert_MaskedABSA_dev > Logs/ark_stancebert_MaskedABSA_dev.log 2>&1
  ```

  What it does:
  - Runs the `stance_bert` model on the `MaskedABSA` dataset and logs output to `Logs/ark_stancebert_MaskedABSA_dev.log`.
- Unmasked ABSA run (example)

  ```bash
  python3 -u main_ark.py --data_set unMaskedABSA_race --model stance_bert --unmasked
  ```

  What it does:
  - Runs the `stance_bert` model on an unmasked ABSA race dataset. The `--unmasked` flag toggles the dataset variant.
- MaskedABSA_race baseline training

  ```bash
  python3 -u main_ark.py --data_set MaskedABSA_race --individual --exp_name MaskedABSA_race_individual_baseline --pretrain_epochs 50 --test_epoch 1
  ```

  What it does:
  - Trains a baseline model on `MaskedABSA_race` for 50 pretrain epochs and runs evaluation with `--test_epoch 1`.
- MaskedABSA_politic baseline training

  ```bash
  python3 -u main_ark.py --data_set MaskedABSA_politic --individual --exp_name MaskedABSA_politic_individual_baseline --pretrain_epochs 50 --test_epoch 1
  ```

  What it does:
  - Same as above, but for the `MaskedABSA_politic` dataset.
- Joint training over two datasets

  ```bash
  python3 -u main_ark.py --data_set MaskedABSA_race --data_set MaskedABSA_politic --exp_name Joint_Training --pretrain_epochs 50 --test_epoch 1
  ```

  What it does:
  - Runs joint training across both the `MaskedABSA_race` and `MaskedABSA_politic` datasets.
- Testing / evaluation with a pretrained checkpoint

  ```bash
  python3 -u main_ark.py --data_set MaskedABSA_race --data_set MaskedABSA_politic --exp_name Model_Testing --pretrain_epochs 0 --test_epoch 1 --pretrained_weights /path/to/Models/stance_bert_masked_Random/Ark_MaskedABSA_race_MaskedABSA_politic.pth.tar
  ```

  What it does:
  - Loads a pretrained checkpoint and runs evaluation (`--pretrain_epochs 0` means no training; `--test_epoch 1` runs the test pass).
  - Update `--pretrained_weights` to the correct local path for your checkpoint.
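If you need to inspect or reuse such a checkpoint outside `main_ark.py`, the usual PyTorch pattern looks like the sketch below; the `"state_dict"` key follows the common `torch.save` convention, and the project's actual checkpoint layout may differ.

```python
# Hedged sketch of loading a .pth.tar checkpoint for evaluation.
import torch

def load_checkpoint(path, model=None, map_location="cpu"):
    ckpt = torch.load(path, map_location=map_location)
    # Checkpoints are often dicts wrapping the weights under "state_dict".
    state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
    if model is not None:
        model.load_state_dict(state, strict=False)
    return ckpt
```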
Notes

- Replace `/path/to/...` with the actual paths on your machine.
- Redirecting output to `Logs/*.log` (`> file.log 2>&1`) is optional but useful for long runs.
- If you run experiments on GPU servers, ensure `CUDA_VISIBLE_DEVICES` or other environment variables are set as needed.
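As a concrete example of the GPU note above (device index `0` is an assumption; list whichever GPUs you want visible):

```shell
# Make only GPU 0 visible to the process; inside the process that device is
# then addressed as cuda:0. Set this before launching any of the example runs.
export CUDA_VISIBLE_DEVICES=0
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```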






