Project. Scope

Scope of the project

Here are some thoughts about the scope and long-term objectives of the project. There are a lot of crucial scope-level decisions that need to be made right now.

What problems will vispy solve? There are a few possible answers (the list is neither exhaustive nor mutually exclusive):

Provide a fast, multi-platform, flexible, object-oriented, hardware-accelerated, multi-level canvas for interactive visualization in Python.
Provide a domain-specific language for fast hardware-accelerated interactive visualization that is implemented in Python and possibly other languages (like Javascript).
Provide a hardware-accelerated 2D plotting library in Python capable of handling very large datasets.
Provide an OpenGL backend for Matplotlib or a high-level plotting language like GoG/ggplot2.
Provide a lightweight intermediate-level platform for 2D/3D visualization (including volume rendering) in Python.

Each of these objectives is ambitious and challenging on its own, let alone all of them together. We need a much more precise scope for the project in the very long term. In addition, we need to tackle the highest-priority features first.

The following contains my (CR) opinions about these questions. Anyone is welcome to contribute here.

Data and non-data visualization

We all agree that hardware-accelerated OpenGL-based interactive visualization in Python is the general scope of the project, but we need to be more precise than that.

There are two kinds of interactive visualization applications:

data visualization,
non-data visualization.

Data visualization is essentially a scientific activity. A scientist has some data, represented as a set of numerical arrays, and wants to visualize them. He or she doesn't care about low-level details that would affect rendering. Ideally, this scientist would use a domain-specific language expressing what to display about the data, no matter how large it is.

Non-data visualization generally concerns non-scientists: developers, teachers, hackers, etc. Applications include demos, presentations, video games, and interactive animations. Here, the how is generally the appropriate level of thinking, since these people are used to switching easily between implementation-level and visual-level thinking. This level is also well suited because it allows much more flexibility and customization.

For both kinds of visualization, the final outcome is the same: a sequence of OpenGL commands reacting to user events. However, the end-user does not want to mess around with raw OpenGL commands. So we need to provide a common level that targets both kinds.

So, to me, here is a first, major goal of the project: provide an intermediate-level abstraction layer for high-performance hardware-accelerated interactive visualization in Python that can be used either for data visualization or non-data visualization applications. More precisely, we want a common module to share between different libraries (visvis, pyqtgraph, galry, glumpy...). This is the highest priority objective of the project.

The main challenge here is to find the appropriate level of abstraction so that data and non-data visualizations can be implemented equally easily. We do not rule out having multiple levels here (e.g. an object-oriented OpenGL wrapper and low-level visuals).

Achieving this first goal would be a major success. Yet, we don't necessarily have to stop here. Our different visualization libraries have some overlap and in the very long run, I think we do want to merge most of them.

Here are further features we might want to have in the long run, but which are probably quite challenging. They are merely vague ideas at this point.

Domain-Specific Language

Let's first imagine that we're only interested in the Python language. Then, we could come up with a library offering different APIs at different levels of abstraction. A scientist could use the high-level data visualization API, a game developer a lower-level layer, etc.

Now, if we are even more ambitious, we could think that maybe Python is not enough. It is not the only language on Earth. In particular, web-based languages are very "trendy" (though perhaps not that much yet in the scientific world). Browsers are widespread and run on desktop computers, laptops, tablets and smartphones, which is not the case for Python. They are cross-platform, generally work out of the box, and support hardware-accelerated graphics through WebGL. We spend more and more time in browsers. Even Python scientists now use the browser with the IPython notebook. So I think that targeting browser-based visualizations makes complete sense.

But how can this be achieved? Python and Javascript are very different languages. And, pursuing this reasoning, we could even imagine support in yet other languages (C++, Java, Objective-C: think about iOS, Android, etc.). Imagine a scientist who needs to create a particular visualization application (e.g. a volume rendering-based application for visualizing medical scans in 3D). Requiring this application to work on iPads and other tablets is totally legitimate. But implementing such an application from scratch for multiple desktop and mobile platforms is overwhelmingly complicated and expensive. One would basically need to write raw OpenGL in different languages.

One way of simplifying this task would be to create a domain-specific language for hardware-accelerated interactive visualization. One could argue that OpenGL is already such a language, but it is a quite low-level one. A higher-level DSL implemented in different languages would be ideal, but the appropriate level needs to be found. For example, Grammar of Graphics would not fit at all in this example. The appropriate level is probably one where low-level visuals can be used, extended and created, defined by OpenGL objects and GLSL code. Interactivity would also need to be in scope. Designing and implementing such a language from scratch is totally out of reach.

A simpler solution would be to design and implement an embedded DSL within other languages. Basically, an API that could easily be translated from a generic language (like Python) to another (like Javascript). Using a combination of object-oriented and functional paradigms could be part of a solution.
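As a rough illustration of the embedded-DSL idea, here is a minimal sketch of a scene description held as plain data in Python and serialized to JSON, so that a hypothetical renderer written in another language could consume it. The names `Visual`, `Scene`, `add` and `to_json` are invented for this example and are not an existing vispy API; the shader strings are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Visual:
    """Hypothetical low-level visual: GLSL sources plus named data buffers."""
    name: str
    vertex_shader: str
    fragment_shader: str
    primitive: str = "points"
    # Maps an attribute name to a flat list of floats
    attributes: dict = field(default_factory=dict)


@dataclass
class Scene:
    """Hypothetical container describing what to draw, not how to draw it."""
    visuals: list = field(default_factory=list)

    def add(self, visual):
        self.visuals.append(visual)
        return self  # returning self allows chained calls

    def to_json(self):
        # asdict() recurses into nested dataclasses, lists and dicts,
        # so the result contains only JSON-compatible values.
        return json.dumps(asdict(self), sort_keys=True)


scene = Scene().add(
    Visual(
        name="dots",
        vertex_shader="/* GLSL vertex shader source */",
        fragment_shader="/* GLSL fragment shader source */",
        attributes={"position": [0.0, 0.0, 0.5, 0.5]},
    )
)
description = scene.to_json()
```

Because the description is only data, the same structure could in principle be built from a Javascript API and handed to a WebGL renderer; whether this is the right level of abstraction remains an open question.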

High-level 2D plotting

High-level 2D plotting is a subpart of data visualization (I include some 3D visualization features in data visualization). So this should be seen as a particular application on top of a common intermediate level of abstraction. That is to say, there should not be any notion of axes, ticks, grids at this intermediate level, since this level is shared with non-data visualization applications.
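The layering could look like the following minimal sketch, in which the intermediate level only knows about line vertices and the plotting layer builds ticks out of ordinary visuals. `LineVisual`, `normalize` and `plot` are hypothetical names chosen for illustration, and the tick positions are fixed for simplicity.

```python
from dataclasses import dataclass


@dataclass
class LineVisual:
    """Hypothetical intermediate-level visual: 2D vertices only, no axes."""
    vertices: list  # (x, y) pairs in normalized [-1, 1] coordinates


def normalize(values, lo, hi):
    """Map values from [lo, hi] to [-1, 1]; assumes hi > lo."""
    span = hi - lo
    return [2.0 * (v - lo) / span - 1.0 for v in values]


def plot(x, y):
    """Hypothetical high-level call living above the intermediate level.

    Returns the data curve followed by three tick marks, all expressed
    as plain LineVisual objects.
    """
    xn = normalize(x, min(x), max(x))
    yn = normalize(y, min(y), max(y))
    curve = LineVisual(list(zip(xn, yn)))
    # Ticks are just short vertical segments along the bottom edge
    ticks = [LineVisual([(t, -1.0), (t, -0.95)]) for t in (-1.0, 0.0, 1.0)]
    return [curve] + ticks


visuals = plot([0, 5, 10], [0, 1, 4])
```

In this sketch, a non-data application could reuse `LineVisual` directly without ever depending on the plotting layer.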

There are existing DSLs for high-level 2D plotting, and GoG is a well-known and current one. I think that creating a new one here is out of the question. What we could do is create a hardware-accelerated renderer for this DSL. The idea is that scientists could describe their data quite naturally and visualize it interactively without thinking about how the implementation works. The amazing D3 library should be an example to follow here. I encourage everyone to take a serious look at it.

Links

Here are a few very interesting links I've found during these reflections.

This paper by J. Heer and Mike Bostock (D3's creator) describes a DSL for interactive visualization (more specifically, a Java implementation of Protovis, the ancestor of D3). It is a bit outdated now but still interesting. It also contains some (older) references to existing DSLs and libraries for interactive visualization that we should check out:
Superconductor, a language for big data visualization.
This research group appears to be one of the leading groups in visualization research.
Shadie, a DSL for volume rendering.

Conclusion

I see two major objectives for the project.

First, come up with an intermediate-level programming interface for OpenGL-based interactive visualization.
- Python is the primary target language.
- The interface should be designed such that it can be easily translated into another object-oriented language (especially Javascript).
- The same interface could be used for data or non-data visualization applications.
Then, implement data visualization APIs/DSLs on top of the intermediate level for:
- High-level 2D plotting, inspired by GoG and D3.
- Volume rendering and polygon meshes.
