30 Oct 2013
Evolution and Revolution in Geometric Kernels
How can performance of modern geometric kernels be improved? Can it be achieved using an already written code, or revolutionary approaches are needed – in the algorithms and software architecture? The paper analyses the issues of supporting and introducing parallel computations in commercial kernels based on information from open sources, the author’s personal experience and opinions of LEDAS Labs experts.
Kernels and Cores
Nearly half a century ago one of the Intel co-founders Gordon Moore made his famous observation, known as the Moore's Law, that the number of transistors on integrated-circuits chips doubles approximately every two years. For a long time it meant exponential improvement of computer performance, thanks to which software developers were able to create computationally complex applications geared to the “tomorrow’s computing power”. Today the situation is different: although the Moore’s Law is still valid, due to technological reasons producers prefer to increase the number of transistors by increasing the number of processor cores.
It goes without saying that the focus of developing computationally complex software has shifted considerably. Nowadays, to target computer hardware of the near future, it is necessary to develop scalable parallel algorithms, when performance increases proportionally to the increased number of processors or processor cores. Ideally the algorithms should be suitable not only for desktops but also for cluster (supercomputer) systems.
In the real world (as well as in computational mathematics) by no means all tasks can be parallelized efficiently. For instance, it would be strange to hire a dozen of porters to drag a single bag. And it would be even stranger to engage two women to give birth to a single baby hoping that the process will take much less than the routine nine months. But this is a different story when there are several hundred bags to move or more than one baby is planned. Here parallelism may help.
In reality tasks are a set of sub-tasks, some of which can be parallelized and some not. In this case the overall efficiency is described by the Amdahl’s Law, which states that speedup coefficient for a whole task is limited by a non-parallelized sub-task. In practice it means that if, say, 20% of an algorithm execution time is spent for sequential computation, and the remaining 80% can be easily parallelized, speedup with 4 processor cores can be no more than 2.5 times. And with 16 cores – no more than 4 times.
How can this be projected to geometric kernels? Majority of the most time-consuming kernel functions is a complex set of dozens of sequential algorithms closely linked to each other. To achieve a noticeable effect each of them must implement its own parallel version. Roughly speaking, the situation reminds of a struggle against traffic jams in big cities: the problems must be carefully eliminated at every crossroad. Otherwise a bottleneck is guaranteed. Unfortunately, this is not so simple, for example, in a real Russian urban environment because when roads were designed their creators did not think at all about the motorization boom that then has happened over the past 15 years. Just the same, designing modern commercial geometric kernels developers did not think about the advent of parallel computations (e.g. see the paper by Evan Yares, and post in DS Spatial blog).
The market leaders - ACIS and Parasolid declare thread safety and support for parallel computations. However, marketing materials do not explicitly show specific examples of efficiency in complex single operations (such as Boolean or blending). So if we look critically to the carefully selected examples in those materials (which, frankly speaking, are simply ideal for parallelism), we will see a decent quality of parallelism, but still far from the ideal state. The situation with CGM kernel is especially interesting: although developers stated clearly that the kernel is not thread-safe, it did not prevent them from implementing the "multi-process" Boolean operations (function using a set of processes in the operational system). It means that several kernel copies are loaded in the memory and they interact with each other passing messages rather than by sharing memory as threads..
As for PTC Granite One and Autodesk Shape Manager, information is scarce (see, for instance, about Creo Parametric and about Autodesk Inventor) – most probably it means that companies have nothing to be proud of in the field of kernel multithreading so far. Developers of the C3D kernel used in KOMPAS-3D (provided by ASCON, Russia-based company) recently mentioned the start of the works for expanding use of parallel computations. But no detailed information is available yet.
Kernels and Clouds
Is your kernel ready for a cloud CAD?
At the first glance, the answer is trivial. What difference does it make where a kernel library (binary code) is located: on a remote server or a local desktop? To provide a user’s interface (through web-client, virtualization tools or a usual window application) is a task for a CAD-shell rather than a CAD kernel. But here’s the real issue: is a kernel ready to employ all computational resources available in a cloud and at the same time instantaneously give a solution for a task of any complexity?
If a local desktop with an installed “native” desktop application executes complex operations extremely slowly – it is unpleasant but can be excused by a user. But if a remote server with dozens of processors gets stuck then users will certainly be frustrated.
It is unlikely that modern geometric kernels are ready for such scalability of parallel computations to announce their capability to instantaneously solve any tasks. This fact, however, does not at all embarrass Autodesk and Dassault Systemes. This year the companies have released cloud CADs - Inventor Fusion 360 on Shape Manager kernel (already available to anybody) and SolidWorks Mechanical Conceptual, built on 3DExperience platform (probably, CGM kernel) and currently is used by a dozen of DS customers. It looks like the third one will be a start-up - Belmont Technology with its at this point secret CAD and announced use of Parasolid kernel.
Where Does the Revolution Come From?
Vendors of commercial kernels and CAD developers are hardly able to be in a revolutionary mood: to “jump” from one kernel to another or to launch a new CAD to the market, they need to make colossal and risky investment. It is a different story when, as the classics said “there is nothing to lose but one's chains”, and there is no heavy heritage in the form of millions lines of a code and thousands end users. Then there is a room for radical maneuvers and innovations.
We already published detailed technological information about RGK geometric kernel developed in 2011 - 2013. I must say without false modesty, that RGK has perhaps the most well-thought algorithmic fundamental principle and architecture including thread safety and extended support to multithread calculations, and gives a unique combination of the best practices and methods developed in geometric modeling.
Nevertheless, talking about a truly revolutionary approach with the paradigm shift, this role most fits Gen6 kernel, devised by a Californian start-up - TinkerCAD (currently – AirStone Labs), which forms the basis for a same-name could CAD - TinkerCAD, recently acquired by Autodesk.
Although TinkerCAD and its geometric kernel constitute a typical niche solution (initially the product was created as a modeling tool for 3D printing), and its functionality cannot compete with general-purpose CAD geometric kernels; but at least two technological ideas are very interesting. These ideas enable Boolean operations on supercomputers with thousand of processors, at the same time allowing a great number of users to work simultaneously.
The first idea is a cluster management approach to control processes and processors on a supercomputer. As a result users can employ nearly all available resources of the supercomputer (processors) to execute time-consuming operations. Indeed it is possible, because cloud CAD users will not start those operations simultaneously (if it’s not a flash-mob or a DDoS-attack).
The second is to use volumetric (voxel) representation for solid body modeling (well known from computer graphics) as an alternative to B-rep used in all other modern kernels. The advantage is that voxel representation helps efficiently parallelize Boolean operations, so they all are completed instantaneously. Curiously, this representation is much less efficient in sequential one-processor computations than B-rep, but may have chances on a supercomputer.
(For the technical aspects one may see a presentation. Although it discusses an earlier version of the kernel - Gen4, that was improved to Gen6 since then). However, there are a couple of significant drawbacks of voxel representation:
- To achieve reasonable accuracy enormous volumes of data must be stored: a non-optimized approach requires O(N^3) but with optimization (for instance, using octrees) their number can be decrease to O(N^2). Here number N is proportional to 1/eps, where eps is required accuracy. It means that providing relative accuracy eps = 10^-5 (1 mm per 100 m) it is necessary to store and operate several hundred GB of data. This is unattainable luxury even for the modern, medium size supercomputers. (Just to compare: ACIS and Parasolid declare 10-11 orders of magnitude for accuracy.)
- Absence of robust algorithms for converting a voxel representation in B-rep and back. Creating a general-purpose kernel, it will be necessary to integrate the kernel with other CAD and implement export/import functionality for the existing models; it also includes a problem of identification and parameterization of B-rep model elements.
But let’s imagine - what if there is an alternative approach to storing and processing volumetric representation enabling to get rid of the “squared” data volume, maintain scalability of parallel calculations and seamlessly convert into B-rep? LEDAS Labs experts confidently confirm that such technology has the right to exist and they have been studying it intensively for some time. Looks like it will be useful not only for the purely “kernel” purposes – as a 3D modeling tool, but also for sister technologies, including geometric treatment, geometric model comparison, integrating a digital model with reverse engineering applications, and much more.
Is Evolutional Improvement Still Possible?
It is not a gross exaggeration to say that all software developers dream passionately about implementing something innovative. But apart from innovations, majority of them must protect the interests of their current customers and increase the income of their company. As a result, they cannot consider “revolution” as a serious option. So their R&D has to evolve around the existing practice.
There is an interesting question in the context of geometric kernels: is it possible to imagine efficient evolutionary performance improvement? Or, in other words, what are the chances for creating scalable parallel algorithms integrated with the current kernel's code?
The theory for work with B-rep structures, NURBS, constructing the surfaces intersection and other operations is not only well-known but, one can say, is pretty standard. In fact, geometric kernels differ from each other by the methods of implementation, algorithm details (although this is the place where the devil is), and how these operations are composed from numerous base functions – building blocks. Is it possible to develop an algorithms for complex operations, like Booleans, which can be constructed from those base functions (without re-writing them) and can be efficient with multi-core processors? In this case it will give significant competitive advantages to the vendor, justifying the costs of developing an alternative “parallel” function.
This approach is also attractive for some other reasons: it gives the best consistency of the results of a parallel operation with its sequential variant and does not require re-writing code of the most of basic functions. Already mentioned CGM experience is illustrative: according to the developers, they managed to support parallel computations by a relatively small team without providing thread safety at all.
Admittedly, marketing materials on CATIA V62012 give no specific figures and only declare that "some internal operations within the Boolean are done in parallel, improving performance on some scenarios". However LEDAS experts are optimistic about the potentiality of this (evolutionary) technology: with the right organizations of parallel computations the pay-off for large models can be particularly significant.
What Will Happen Then?
Hopefully in the next few years we will see improvements in both directions: completely new geometric kernel technology on the market, and development of the existing technologies, which will be able to deal with previously unsolved problems, especially in cloud CAD environment. Without any doubts users will benefit from the diversity and competition. And here at LEDAS Labs we like to solve those challenging problems - both in evolutionary and revolutionary ways.
Permanent link :: http://isicad.net/articles.php?article_num=16544