
Combinatorial Generalization and Interactivity

Code that implements the task described in Randy O'Reilly's paper Generalization in Interactive Networks: The Benefits of Inhibitory Competition and Hebbian Learning. The package can be installed for general reuse and includes unit tests to verify that it works as intended.

Task Description

The majority of the text below is taken directly from section 2 of the paper.

  • Combinatorial structure is implemented by having four different input-output slots.
  • The output mapping for a given slot depends only on the corresponding input pattern for that slot (see Brousse, 1993; Noelle & Cottrell, 1996, for similar tasks).
  • Each slot has a vocabulary of input-output mappings.
  • Input vocabulary consists of all 45 pairwise combinations of 5 horizontal and 5 vertical bars in a 5x5 grid.
  • Output mapping is a localist identification of the two input bars (similar to bar tasks used by Foldiak, 1990; Saund, 1995; Zemel, 1993; Dayan & Zemel, 1995).
  • Total number of distinct input patterns is approximately 4.1 million.
  • Models are trained on only 100 randomly constructed examples and then tested on an arbitrarily large testing set (500 in the paper).
  • Error criterion: an output counts as correct only if every output unit is on the correct side of 0.5 relative to the target pattern.
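
The counts and the error criterion in the list above can be checked with a short sketch. This is not the package's implementation: the bar indexing (0-4 horizontal, 5-9 vertical), the 10-unit localist target, and the helper name `is_correct` are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np

N_SLOTS = 4         # four independent input-output slots
GRID = 5            # each slot is a 5x5 grid
N_BARS = 2 * GRID   # 5 horizontal + 5 vertical bars per slot
N_LINES = 2         # two bars are active in each slot

# Enumerate every pair of bars for one slot (assumed indexing:
# bars 0-4 are horizontal, bars 5-9 are vertical).
slot_vocabulary = list(combinations(range(N_BARS), N_LINES))
n_slot_patterns = len(slot_vocabulary)           # 45
n_total_patterns = n_slot_patterns ** N_SLOTS    # 4100625, about 4.1 million


def is_correct(output, target, threshold=0.5):
    """Return True when every output unit is on the same side of
    `threshold` as the corresponding target unit (hypothetical helper)."""
    output = np.asarray(output, dtype=float)
    target = np.asarray(target, dtype=float)
    return bool(np.all((output > threshold) == (target > threshold)))


# Assumed localist target for one slot: bars 0 and 7 are active.
target = np.zeros(N_BARS)
target[[0, 7]] = 1.0

good_output = np.where(target == 1.0, 0.8, 0.2)  # every unit on the right side
bad_output = good_output.copy()
bad_output[7] = 0.4                              # one active unit falls below 0.5

print(n_slot_patterns, n_total_patterns)
print(is_correct(good_output, target), is_correct(bad_output, target))
```

Under this criterion a single unit on the wrong side of 0.5 makes the whole pattern count as an error, which is why `bad_output` fails even though nine of its ten units are correct.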

Desiderata

The paper lists several desiderata for the task.

  • It has a simple combinatorial structure that allows for novel inputs to be composed from a small vocabulary of features.
  • There is some interesting substructure to the vocabulary mapping at each slot.
  • The structure of the task should be apparent in the weight patterns of the models.

Getting Started

Installation

To use the package, first clone it:

git clone https://github.com/APRashedAhmed/combinatorial-generalization.git

Install the requirements:

# For pip
pip install -r requirements.txt

# For conda
conda install `cat requirements.txt` -c conda-forge

And then install the repo using pip:

# Local install
pip install .

Usage

Once installed, the package behaves like any other package installed by pip or conda and can be imported:

# Import and alias the whole package
import combinatorial_generalization as cg

# Import a specific module
from combinatorial_generalization import combigen

# Import a specific function
from combinatorial_generalization.make_datasets import generate_combigen_x_y_dataset

API

Generating Datasets

The main way to use the package is through the high-level data generation functions in make_datasets.py.

  1. generate_combigen_x_y_dataset - Generates sample (X) and label (y) pairs according to the desired task structure and statistics.
from combinatorial_generalization.make_datasets import generate_combigen_x_y_dataset

# Outputs the data and labels
X, y = generate_combigen_x_y_dataset() 
  2. generate_combigen_datasets - Wraps the above function to return training, validation, and testing datasets.
from combinatorial_generalization.make_datasets import generate_combigen_datasets

# Outputs the training, validation, and testing splits
(x_train, y_train), (x_val, y_val), (x_test, y_test) = generate_combigen_datasets()

See the docstrings for each of the functions for more details including their call signatures.
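
One way to read a function's call signature and docstring without opening the source is the standard library's `inspect` module. The sketch below is a generic helper, not part of this package: `describe` and the stand-in function `example` are hypothetical names, and the stand-in's parameters are made up so that the snippet runs without the package installed.

```python
import inspect


def describe(func):
    """Return the call signature and first docstring line of `func`."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    summary = doc.splitlines()[0] if doc else "(no docstring)"
    return f"{func.__name__}{signature}: {summary}"


def example(a, b=2):
    """Placeholder function used only for this sketch."""
    return a, b


print(describe(example))
# prints: example(a, b=2): Placeholder function used only for this sketch.
```

With the package installed, passing `generate_combigen_x_y_dataset` or `generate_combigen_datasets` to `describe` (or to the built-in `help`) should show their actual arguments and defaults.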

Visualization

There are two main ways to visualize the task in visualize.py:

  1. heatmap - A wrapper for sns.heatmap that plots the given y and X as heatmaps.
import matplotlib.pyplot as plt
from combinatorial_generalization.make_datasets import generate_combigen_x_y_dataset
from combinatorial_generalization.visualize import heatmap

# Generate a two sample dataset of X and y pairings
X, y = generate_combigen_x_y_dataset(n_samples=2)

# Plot the sample
heatmap(y, X)
plt.show()

  2. visualize_combigen - Plots some number of randomly generated X, y pairs of the combinatorial generalization task.
import matplotlib.pyplot as plt
from combinatorial_generalization.visualize import visualize_combigen

# Plot two randomly generated sample, label pairs
visualize_combigen(n_pairs=2)
plt.show()


Example Use-Cases

Different Numbers of Lines

The task is set by default to use two lines per sample (as in the paper), but this is controllable by the user through the n_lines argument:

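# Imports, so the snippet can be run on its own
import matplotlib.pyplot as plt
from combinatorial_generalization.make_datasets import generate_combigen_x_y_dataset
from combinatorial_generalization.visualize import heatmap

# Generate four samples with four lines each
X, y = generate_combigen_x_y_dataset(n_samples=4, n_lines=4)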

# Plot the sample
heatmap(y, X)
plt.show()

Nonuniform Line Statistics

The statistics governing how likely a particular line is to appear on a specific axis can be fully controlled using the line_stats argument:

# First and last elements only
line_stats = [[1,0,0,0,0],[0,0,0,0,1]]

# Pass the line_stats arg
X, y = generate_combigen_x_y_dataset(n_samples=2, line_stats=line_stats)

# Plot the sample
heatmap(y, X)
plt.show()

The values in line_stats will be normalized, so values can be defined relative to each other. For example, to define positions that are three times more likely than others:

# First and last elements are 3 times as likely as other elements
line_stats = [[3,1,1,1,1],[1,1,1,1,3]]

# Pass the line_stats arg
X, y = generate_combigen_x_y_dataset(n_samples=4, line_stats=line_stats)

# Plot the sample
heatmap(y, X)
plt.show()
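
To make the normalization of line_stats concrete, the following sketch shows one plausible reading: each row is divided by its sum to give a probability distribution over the five positions on one axis. How the package normalizes internally is not shown in this README, so the helper `normalize_line_stats` and the sampling step are assumptions for illustration only.

```python
import numpy as np


def normalize_line_stats(line_stats):
    """Scale each row so it sums to 1 (hypothetical helper, assumed behavior)."""
    stats = np.asarray(line_stats, dtype=float)
    return stats / stats.sum(axis=1, keepdims=True)


line_stats = [[3, 1, 1, 1, 1], [1, 1, 1, 1, 3]]
probs = normalize_line_stats(line_stats)
# The first row becomes [3/7, 1/7, 1/7, 1/7, 1/7]

# Draw one line position per axis using the normalized probabilities
rng = np.random.default_rng(0)
first_axis_line = rng.choice(5, p=probs[0])
second_axis_line = rng.choice(5, p=probs[1])

print(probs)
print(first_axis_line, second_axis_line)
```

Under this reading, only the ratios between entries in a row matter, so [[3,1,1,1,1],[1,1,1,1,3]] and [[6,2,2,2,2],[2,2,2,2,6]] would describe the same distribution.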

