- mnist_loader: MNIST data repository. Handles downloading, loading, and organizing MNIST digit images
- image_utils: MNIST digit image utilities used to aggregate, transform, and combine digit images into a single image
- Designates ROOT_DIR, where all MNIST dataset I/O operations occur
- Creates "dataset" directory where MNIST datasets are stored
pip install "MNIST_number_sequence @ git+https://github.com/fksato/MNIST_number_sequence"
git clone https://github.com/fksato/MNIST_number_sequence.git
cd MNIST_number_sequence
pip install .
## or
python setup.py install
- python >= 3.6
- numpy >= 1.16.4
- pillow >= 6.1.0
All requirements are installed automatically when the package is installed with pip or setup.py
Loading MNIST dataset
from MNIST_number_sequence.mnist_loader import MNISTLoader
MNIST_dataset = MNISTLoader(segment="train")
segment can be either "train" or "test" and selects which MNIST image and label set is loaded
The MNISTLoader object builds a hash table that can be used to query the MNIST dataset for specific digits
MNIST_dataset.get_digit_image(digit=2)
Users may also choose to get an image from the dataset at an arbitrary index
MNIST_dataset.get_image_at(idx=532)
The return type of these functions is a uint8 numpy ndarray of shape [28, 28]
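The digit lookup described above can be pictured as an index from each label to the positions that hold it. The following is a minimal sketch of that idea, not the package's actual implementation: `build_digit_index` and the short `labels` list are hypothetical stand-ins for whatever MNISTLoader does internally.

```python
import random
from collections import defaultdict

def build_digit_index(labels):
    """Map each digit label to the list of dataset positions holding it."""
    index = defaultdict(list)
    for position, label in enumerate(labels):
        index[int(label)].append(position)
    return index

# Hypothetical stand-in for the MNIST label array.
labels = [5, 0, 4, 1, 9, 2, 1, 3, 1, 4]
index = build_digit_index(labels)

# Querying by digit picks one of the positions stored under that label.
position = random.choice(index[1])
```

With an index like this, a query by digit costs one dictionary lookup plus a random choice, while a query by position needs no index at all.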
Users can generate an image of a sequence of digit images from the MNIST dataset either by using the image_utils library or by using the generate_sequence.py convenience script.
Users can use generate_sequence by importing it into any project:
from MNIST_number_sequence import generate_sequence
combined_image = generate_sequence([1, 2, 3, 4, 5], (-10,10), 145)
generate_sequence saves a copy of the generated image to the current working directory of the calling script and returns the image as a float32 numpy ndarray. The returned array has shape [28, image_width], where image_width is either set by the user or computed from the arguments passed to the function.
generate_sequence(digits, spacing_range, image_width, dataset_regime, image_save_name)
- digits: a desired sequence of integers between 0-9 as a list
- spacing_range: a tuple that represents the minimum/maximum allowable spacing between each digit in the final combined image
- image_width: final width of the combined image. If omitted, the width is computed from the number of digit images and the allowable spacing between digits (default=None)
**care should be taken when specifying image_width: the width must be large enough to accommodate the number of digits and the spacing between each digit
- dataset_regime: an optional parameter that specifies which MNIST dataset regime ("train" or "test") to pull digit images from. (default="train")
- image_save_name: an optional parameter to specify the file name of the produced digit image (default='combined_sequence.png')
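To judge whether a given image_width is large enough, it can help to work out the range of widths a sequence can occupy. The sketch below is an illustration rather than the package's own calculation. It assumes that each digit keeps its full 28-pixel MNIST width and that N digits are separated by N-1 gaps. `width_bounds` is a hypothetical helper.

```python
MNIST_DIGIT_WIDTH = 28  # each MNIST image is 28 pixels wide

def width_bounds(n_digits, spacing_range, digit_width=MNIST_DIGIT_WIDTH):
    """Return the (narrowest, widest) total width for a digit sequence."""
    min_gap, max_gap = spacing_range
    n_gaps = max(n_digits - 1, 0)       # one gap between each adjacent pair
    base = n_digits * digit_width       # width with all gaps equal to zero
    return base + n_gaps * min_gap, base + n_gaps * max_gap

narrowest, widest = width_bounds(5, (-10, 10))   # (100, 180)
reachable = narrowest <= 145 <= widest           # True
```

Under these assumptions, a width of 145 for five digits with spacing (-10, 10) lies inside the reachable range, but only for some choices of gaps. A width of at least `widest` would fit every choice.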
If installed via pip or built from source using setup.py, generate_sequence can be invoked from a command-line terminal.
To get help with the available options call:
generate_sequence -h
generate_sequence
- -d, --digits: the sequence of digits to render. Digits are separated by spaces, and each must be a single digit found in MNIST (no number less than 0 or larger than 9).
- -s, --spacing_range: two numbers that represent the minimum and maximum allowable spacing between digits. The two numbers must be separated by a space, minimum first, then maximum.
- -w, --image_width: an optional flag that explicitly sets the width of the final combined image. (default=None)
**care should be taken to specify an image width large enough to accommodate the number of digits and the spacing between each digit
- --dataset_regime: an optional string parameter that specifies which MNIST dataset to retrieve from. Options are "train" or "test". (default="train")
- --image_save_name: an optional string parameter that specifies the file name of the final combined image. (default="combined_sequence.png")
Example:
generate_sequence -d 1 2 3 4 5 -s -10 10 -w 150 --dataset_regime "test" --image_save_name "combined_12345.png"
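For readers who want to see how options like these can be parsed, here is a minimal argparse sketch that mirrors the flags listed above. It is an illustration under assumptions, not the script's actual parser. The integer types, the `required` settings, and the `build_parser` name are guesses made for this example.

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="generate_sequence")
    parser.add_argument("-d", "--digits", type=int, nargs="+",
                        choices=range(10), required=True,
                        help="digits 0-9, separated by spaces")
    parser.add_argument("-s", "--spacing_range", type=int, nargs=2,
                        metavar=("MIN", "MAX"), required=True,
                        help="minimum then maximum spacing between digits")
    parser.add_argument("-w", "--image_width", type=int, default=None,
                        help="width of the combined image")
    parser.add_argument("--dataset_regime", choices=["train", "test"],
                        default="train")
    parser.add_argument("--image_save_name", default="combined_sequence.png")
    return parser

# Negative spacing values such as -10 parse as numbers because no option
# in this parser looks like a negative number.
args = build_parser().parse_args(
    ["-d", "1", "2", "3", "4", "5", "-s", "-10", "10", "-w", "150"]
)
```

Restricting `--dataset_regime` with `choices` means an unsupported value is rejected at parse time rather than later during loading.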
The user can use the generate_sequence.py convenience script directly by invoking it.
python <path to generate_sequence>/generate_sequence.py [OPTIONS]
The available arguments are the same as above and a detailed argument list can be printed to screen using:
python <path to generate_sequence>/generate_sequence.py -h
Users can retrieve a list of digit images from the MNISTLoader by calling the get_digits function
from MNIST_number_sequence.image_utils import get_digits, combine_images
image_extents, digit_images = get_digits([1, 2, 3, 4, 5], MNIST_dataset)
get_digits(digits, mnist_images_ds, tight=False, apply=None)
- digits: a desired sequence of integers between 0-9 as a list
- mnist_images_ds: MNISTLoader object
- tight: (default=False) controls how the extents of each digit image are calculated. By default, the extents are taken from the bounding box of the digit image. Otherwise, the minimum/maximum x position of the non-zero pixel values is used.
- apply: (default=None) an optional field that applies an image transformation to the digit image. Currently, skew is the only image transformation that is implemented.
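The difference between the two extent modes can be illustrated with a few lines of numpy. This is a sketch under assumptions, not the package's own code: it treats the bounding box as the full image frame and the tight extent as the first and last columns containing a non-zero pixel. `horizontal_extent` is a hypothetical helper.

```python
import numpy as np

def horizontal_extent(image, tight=False):
    """Return the inclusive (x_min, x_max) extent of a 2D digit image."""
    full_frame = (0, image.shape[1] - 1)
    if not tight:
        return full_frame
    # Columns that contain at least one non-zero pixel.
    cols = np.flatnonzero(image.any(axis=0))
    if cols.size == 0:
        return full_frame   # blank image: fall back to the full frame
    return int(cols[0]), int(cols[-1])

# Synthetic 28x28 "digit" occupying columns 9 through 18.
digit = np.zeros((28, 28), dtype=np.uint8)
digit[4:24, 9:19] = 255

loose = horizontal_extent(digit)               # (0, 27)
snug = horizontal_extent(digit, tight=True)    # (9, 18)
```

Tight extents let neighbouring digits sit closer together, since the empty margins of each MNIST frame no longer count toward the spacing.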
With the sequence of digit images and their extents, users can generate a single image by invoking the combine_images function.
min_max_spacing = (-10, 10)
image_width = 145
combine_images(digit_images, image_extents, min_max_spacing, image_width)
combine_images(image_array, image_extents, min_max_spacing, image_width=None)
- image_array: a list of digit images
- image_extents: a list of tuples that correspond to the minimum/maximum extents of each digit image in the image array
- min_max_spacing: a tuple that represents the minimum/maximum allowable spacing between each digit in the final combined image
- image_width: (default=None) an optional parameter which the user may use to specify the final width of the combined image
**care should be taken to specify an image width large enough to accommodate the number of digits and the spacing between each digit
combine_images will return a single uint8 numpy ndarray that represents the sequence of digit images as a single image.
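To make the combining step concrete, the sketch below pastes cropped digit images onto one canvas, with a gap drawn from the spacing range between neighbours. It is one possible approach under stated assumptions, not the package's implementation. `combine_sketch` is a hypothetical name, gaps are assumed to be integers drawn uniformly, inputs are assumed to be uint8, and a too-small width raises an error here, which may differ from what combine_images does.

```python
import numpy as np

def combine_sketch(images, extents, spacing_range, image_width=None, rng=None):
    """Paste digit images left to right with a random gap between neighbours."""
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = spacing_range
    # One gap per adjacent pair, drawn uniformly from [lo, hi] inclusive.
    gaps = rng.integers(lo, hi + 1, size=len(images) - 1)

    # Crop each image to its horizontal extent (inclusive on both ends).
    crops = [img[:, x0:x1 + 1] for img, (x0, x1) in zip(images, extents)]

    # Left edge of each crop; a negative gap makes neighbours overlap.
    starts = [0]
    for crop, gap in zip(crops[:-1], gaps):
        starts.append(max(0, starts[-1] + crop.shape[1] + int(gap)))

    needed = max(s + c.shape[1] for s, c in zip(starts, crops))
    width = needed if image_width is None else image_width
    if width < needed:
        raise ValueError(f"image_width={width} is too small, need {needed}")

    canvas = np.zeros((images[0].shape[0], width), dtype=np.uint8)
    for start, crop in zip(starts, crops):
        region = canvas[:, start:start + crop.shape[1]]
        # Keep the brighter pixel wherever two digits overlap.
        np.maximum(region, crop, out=region)
    return canvas

# Three synthetic 28x28 images with a fixed gap of 2 pixels.
images = [np.full((28, 28), 200, dtype=np.uint8) for _ in range(3)]
extents = [(0, 27)] * 3
demo = combine_sketch(images, extents, (2, 2))
```

Taking the pixel-wise maximum keeps both strokes visible when a negative gap makes two digits overlap, instead of letting the later digit overwrite the earlier one.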
At this step, the user may choose to do as they wish with the image array. To save the image, the user can use a library such as pillow, which is included in this package.
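Saving the array with pillow takes only a couple of calls. The sketch below uses a synthetic uint8 array in place of a real combine_images result, and the output path in the temporary directory is chosen only for this example.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Synthetic stand-in for the array returned by combine_images.
combined = np.zeros((28, 145), dtype=np.uint8)
combined[:, 40:60] = 255

# A 2D uint8 array becomes an 8-bit grayscale (mode "L") image.
image = Image.fromarray(combined)

out_path = os.path.join(tempfile.gettempdir(), "combined_sequence_demo.png")
image.save(out_path)

# PNG is lossless, so reading the file back gives the same pixel values.
reloaded = np.asarray(Image.open(out_path))
```

A float array would need converting to uint8 first, because pillow cannot write its floating-point image mode to PNG.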
The image transformations currently available are found under MNIST_number_sequence.image_utils.image_transformations. An implementation of generate_sequence that uses the skew transformation is provided by generate_skewed_sequence.py.
Invoking by command line is possible using:
generate_skewed_sequence <OPTIONS>
A full list of the available arguments can be printed to screen by:
generate_skewed_sequence -h
generate_skewed_sequence
- -d, --digits: the sequence of digits to render. Digits are separated by spaces, and each must be a single digit found in MNIST (no number less than 0 or larger than 9).
- -s, --spacing_range: two numbers that represent the minimum and maximum allowable spacing between digits. The two numbers must be separated by a space, minimum first, then maximum.
- -w, --image_width: an optional flag that explicitly sets the width of the final combined image. (default=None)
**care should be taken to specify an image width large enough to accommodate the number of digits and the spacing between each digit
- --dataset_regime: an optional string parameter that specifies which MNIST dataset to retrieve from. Options are "train" or "test". (default="train")
- --image_save_name: an optional string parameter that specifies the file name of the final combined image. (default="combined_sequence.png")
- --remove_skew: an optional flag that tells the program to turn off the skew image transformation (default=False)
- --make_tight: an optional flag that tells the program to calculate the extents of the digit images from the minimum/maximum significant pixel values, rather than the MNIST image bounding box. (default=False)
Like generate_sequence, generate_skewed_sequence can also be called from a Python script in the same manner as above. The main difference is the two optional flags, --remove_skew and --make_tight.
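For readers unfamiliar with skew, the sketch below shows one simple way to shear a digit image horizontally with numpy, by shifting each row by an amount that varies linearly from top to bottom. It only illustrates the general idea. The package's own skew transformation may work differently, and `shear_rows` and `max_shift` are hypothetical names.

```python
import numpy as np

def shear_rows(image, max_shift=4):
    """Shear a 2D image: top rows move right, bottom rows move left."""
    height, width = image.shape
    out = np.zeros_like(image)
    for row in range(height):
        # Shift runs linearly from +max_shift (top row) to -max_shift (bottom row).
        shift = int(round(max_shift * (1 - 2 * row / (height - 1))))
        if shift > 0:
            out[row, shift:] = image[row, :width - shift]
        elif shift < 0:
            out[row, :shift] = image[row, -shift:]
        else:
            out[row] = image[row]
    return out

# A vertical stroke at column 14 becomes a slanted stroke.
stroke = np.zeros((28, 28), dtype=np.uint8)
stroke[:, 14] = 255
slanted = shear_rows(stroke)
```

Pixels shifted past the image edge are dropped, so a large `max_shift` can clip digits that sit close to the border.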
Future development of more image transformation may be considered.
For pull requests, please send an email to [email protected]
Continuous integration brought to you by Travis CI