SDSIA (Synthetic Data Set for Image Analysis)

SDSIA is a Python module to generate synthetic data sets for image analysis, and a C library to manipulate these data sets.

The goal of the Python module is to provide a way to easily generate data sets of images used by image analysis softwares or algorithms, in a context of learning or development.

The generation of the images is made using the 3D rendering software POV-Ray. This software uses text scripts to defined the scene to render, so it's very easy to parameterize a script to create variations of a given scene. The Python module then render as many samples as needed in the set, each time using different parameters' value.

By creating its own POV-Ray script, the user can create any needed data set, selecting precisely which aspect of the scene is constant and which is variable. The level of difficulty of the data set, from the point of view of the software or algorithm used for analysis, can then be controlled at will. By choosing the appropriate variable in the scene, the user could also study a particular property of a given software or algorithm (e.g, is it more sensitive to variation in shape or variation in color).

Once the user has written the script to render images, the python module takes care of the tedious tasks of rendering (only needed data sets), files and folders management, creation of a description file in JSON format for later use. It also generates perfect masks (black and white image) by modifying the texture properties of the objects in the scene.

The number of images per data set, the dimensions and format of each image are also defined by the user. Then, one can create sets corresponding to its needs, in particular memory, disk storage, processing time limits.

The current version of SDSIA is designed for image segmentation (localization of pixels corresponding to an object in a scene). However it has been developped with the view to be extended to other kind of data sets.

The C library offers the following functionalities: loading a data set from its description file, spitting the data set into user defined categories (e.g. training, validation, test), shuffling the data set, looping through the samples of the data set. It provides the samples of the data set as GenBrush objects to manipulate them easily through this library's functionalities.

How to install this repository

Python module

The Python module doesn't require any particular operation, but you will need to have the following Python module installed on your machine: os, sys, json, glob, subprocess, re, shutil

C library

Create a directory which will contains this repository and all the repositories it is depending on. Lets call it "Repos"
Download the master branch of this repository into "Repos". Unzip it if necessary.
The folder's name for the repository should be followed by "-master", rename the folder by removing "-master".
cd to the repository's folder
If wget is not installed: sudo apt-get update && sudo apt-get install wget on Ubuntu 16.04, or brew install wget on Mac OSx
If gawk is not installed: sudo apt-get update && sudo apt-get install gawk on Ubuntu 16.04, brew install gawk on Mac OSx
If this repository is the first one you are installing in "Repos", run the command make -k pbmake_wget
Run the command make to compile the repository.
Eventually, run the command main to run the unit tests and check everything is ok.
Refer to the documentation to learn how to use this repository.

The dependancies to other repositories should be resolved automatically and needed repositories should be installed in the "Repos" folder. However this process is not completely functional and some repositories may need to be installed manually. In this case, you will see a message from the compiler saying it cannot find some headers. Then install the missing repository with the following command, e.g. if "pbmath.h" is missing: make pbmath_wget. The repositories should compile fine on Ubuntu 16.04. On Mac OSx, there is currently a problem with the linker. If you need assistance feel free to contact me with my gmail address: at bayashipascal.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
DataSets		DataSets
Doc		Doc
POV		POV
UnitTestIn		UnitTestIn
UnitTestOut		UnitTestOut
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
dataset.json		dataset.json
dataset.pov		dataset.pov
exampleUse.py		exampleUse.py
generateDataSet.py		generateDataSet.py
listing.txt		listing.txt
unitTestPythonRef.txt		unitTestPythonRef.txt
unitTestRef.txt		unitTestRef.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SDSIA (Synthetic Data Set for Image Analysis)

How to install this repository

Python module

C library

About

Releases

Packages

Contributors 2

Languages

License

BayashiPascal/SDSIA

Folders and files

Latest commit

History

Repository files navigation

SDSIA (Synthetic Data Set for Image Analysis)

How to install this repository

Python module

C library

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages