This tool extracts phonological features from sign language videos for the purpose of sign language recognition. The extracted features can be used by a variety of machine learning algorithms to recognize sign language gestures, and they provide a more linguistically coherent latent space: because extraction takes the linguistic properties of sign language into account (such as the relationship between handshapes, movements, and meaning), recognition models can better distinguish signs that are similar in appearance but differ in meaning.
demo-v0.mp4
The following feature tree, defined in van der Hulst (2001), is extracted by this tool:
```mermaid
graph TD
Manual-Features-->Articulator;
Manual-Features-->Manner;
Manual-Features-->Place;
Articulator-->Orient;
Articulator-->Handshape;
Orient-->palm/back;
Orient-->tips/wrist;
Orient-->ulnar/radial;
Handshape-->finger-selection;
Handshape-->finger-coordination;
finger-coordination-->open/close;
finger-coordination-->curved;
Manner-->Path;
Manner-->Temporal;
Manner-->Spatial;
Path-->straight;
Path-->arc;
Path-->circle;
Path-->zigzag;
Temporal-->repetition;
Temporal-->alternation;
Temporal-->sync/async;
Temporal-->contact;
Spatial-->above;
Spatial-->in_front;
Spatial-->interlock;
Place-->Major_Place;
Place-->Setting;
Setting-->high_low;
```
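As a rough illustration of how this tree maps to per-frame output, one extracted frame can be summarized as a flat record of phonological feature values. The field names and values below are illustrative only; the tool's actual output columns may differ:

```python
# Hypothetical per-frame feature record covering the tree above.
# Keys and values are illustrative; the actual CSV schema may differ.
frame_features = {
    "handshape.finger_selection": "index",
    "handshape.finger_coordination": "open",
    "orientation": "palm",
    "manner.path": "arc",
    "manner.temporal": "repetition",
    "place.major_place": "head",
    "place.setting": "high",
}

def to_sequence(records):
    """Flatten a list of per-frame records into a 1D feature sequence
    (one list of values per frame, in a fixed key order)."""
    keys = sorted(records[0])
    return [[r[k] for k in keys] for r in records]

# A video then becomes a sequence of such records, one per frame.
seq = to_sequence([frame_features])
print(len(seq[0]))  # number of feature dimensions per frame
```

A whole video is thus encoded as a frame-by-frame sequence of these feature values, which is the representation the recognition models consume.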
Running this tool requires Python >= 3.9. Set up the environment with the commands below:

```bash
git clone git@github.com:karahan-sahin/Automated-Sign-Language-Feature-Extraction.git
conda create --name asl-fe python=3.9
conda activate asl-fe
conda install pip
pip install -r requirements.txt
```

After setting up the environment, the module can be used via the user interface:

```bash
streamlit run main.py
```

Before you start running, you need to add your video files to the `data/samples` directory:
```
├── data
│   ├── output
│   │   ├── feature-v2.csv
│   │   └── live-features.csv
│   ├── samples
│   │   ├── v1.mp4
│   │   ├── v2.mp4
│   │   └── v4.mp4
```
Or via the linguistic-feature extraction library:

```python
import os
import sys

import cv2
import mediapipe as mp

# Make the library importable before importing from it.
current_path = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(current_path, "Automated-Sign-Language-Feature-Extraction/lib"))

from lib.phonology.temporal import TemporalPhonology
from lib.phonology.handshape import HandshapePhonology

mp_drawing = mp.solutions.drawing_utils
mp_holistic = mp.solutions.holistic

# PARAMS (selected features, camera, topK) and CLASSIFIER are assumed
# to be defined elsewhere in your application.
LEFT = HandshapePhonology('Left')
RIGHT = HandshapePhonology('Right')
TEMPORAL = TemporalPhonology(PARAMS['selected'])

cam = PARAMS['camera']  # e.g. cv2.VideoCapture(0)
ret, frame = cam.read()
with mp_holistic.Holistic(min_detection_confidence=0.1, min_tracking_confidence=0.1) as holistic:
    while cam.isOpened() and ret:
        # MediaPipe expects RGB input; OpenCV captures BGR.
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = holistic.process(image)

        # Extract per-hand phonological features from the landmarks.
        RIGHT.getJointInfo(results.right_hand_landmarks, results.pose_landmarks, PARAMS['selected'])
        LEFT.getJointInfo(results.left_hand_landmarks, results.pose_landmarks, PARAMS['selected'])

        # Combine both hands into temporal feature information.
        INFO = TEMPORAL.getTemporalInformation(LEFT, RIGHT)

        # Classify the accumulated feature history.
        CLASSIFIER.predict(CLASSIFIER.transform(TEMPORAL.HISTORY), topK=PARAMS['topK'])

        ret, frame = cam.read()
```

**Work in Progress**
demo-classification.mp4
- Baseline dense retrieval is implemented
- Vector representations are extracted from the 1D phonological feature sequence
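The dense retrieval step above can be sketched as cosine similarity between pooled feature-sequence embeddings. This is a minimal sketch under stated assumptions, not the repository's implementation: the `embed` and `retrieve` helpers are hypothetical, and mean pooling stands in for whatever encoder the tool actually uses over the 1D phonological feature sequence:

```python
import numpy as np

def embed(feature_seq: np.ndarray) -> np.ndarray:
    """Pool a (frames x features) encoded phonological sequence into a
    single vector. Mean pooling is an illustrative choice."""
    return feature_seq.mean(axis=0)

def retrieve(query: np.ndarray, corpus: list[np.ndarray], top_k: int = 3):
    """Rank corpus sequences by cosine similarity to the query embedding."""
    q = embed(query)
    scores = []
    for i, seq in enumerate(corpus):
        v = embed(seq)
        sim = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        scores.append((sim, i))
    scores.sort(reverse=True)
    return scores[:top_k]

# Toy example: three short sequences of 4-dimensional frame features.
rng = np.random.default_rng(0)
corpus = [rng.random((10, 4)) for _ in range(3)]

# Querying with an item from the corpus should rank that item first.
top = retrieve(corpus[0], corpus, top_k=1)
print(top)
```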
This tool requires the following Python libraries:
- OpenCV
- MediaPipe
- NumPy
- SciPy
- Streamlit
This tool is released under the Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.