Guiding choices

In the introduction, we introduced SLAP, the grounding idea that guides our developments. Below is a list of the various choices we have made so far.

C/C++

We use C++ to stay close to C, which is the language of operating systems, but also because we need object orientation (encapsulation, namespaces, classes, inheritance and virtuality) to help organize big software.

Compiled and run local

We promote a "compiled attitude" to stay close to the silicon and have efficient apps. Our experience with environments promoting virtual machines (for interpreters or for operating systems: JVM, VirtualBox, Docker, etc.) or promoting "remote running", as on the web, shows that interactive applications built or run in this way lose reactivity, which degrades the user experience, and we do not want that. We now have the hardware to cope with local graphics on efficient devices close to people, and we promote running in this way. (For the moment we can run natively on the five interactive platforms of today (2018): macOS, Windows-10, Linux, iOS and Android, and we are very happy about that.)

Not a C++ extremist

We definitely do not jump to the latest brand new version of the C++ standard. As portability is an issue for us, we wait until a feature of the language (or a library) is firmly available on all our platforms before using it.

For the moment (2018), C++98 covers all our needs and we stick to it for the core (inlib/exlib) of our things. But our build system (bush) can cope with C++11, and we have been able to build and run the Geant4 related apps (g4view, g4exa, MEMPHYS_vis) with the Geant4-10.x series, which needs C++11, and this on all platforms. (But well, we are not sure that having "auto" everywhere is a great idea for readability...)

Pure header

As explained in the introduction.

Layered

Our classes are layered in the sense that there are no bidirectional relationships between them. In particular there are no forward declarations and the friend keyword is not used. A class uses other classes, but always in "one way only". In fact, coding pure header more or less compels us to do that, and experience shows that it greatly simplifies the overall organization of the code and therefore its overall readability. (Technically, it pushes us to scratch our heads to find the right set of base classes and/or interfaces, which is fine.)
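
As a minimal sketch of what "one way only" can look like: an interface sits at the bottom, an implementation and a user sit above it, and nothing at the bottom refers back to the upper layers, so no forward declaration or friend is needed. The names iwriter, string_writer, reporter and layered_demo are hypothetical and are not taken from inlib/exlib.

```cpp
#include <cassert>
#include <string>

// Layer 0 : an interface. It knows nothing about the classes using it.
class iwriter {
public:
  virtual ~iwriter() {}
  virtual void write(const std::string& a_text) = 0;
};

// Layer 1 : an implementation. It depends on iwriter only.
class string_writer : public iwriter {
public:
  virtual void write(const std::string& a_text) { m_buffer += a_text; }
  const std::string& buffer() const { return m_buffer; }
private:
  std::string m_buffer;
};

// Layer 2 : a user. It depends on iwriter; iwriter does not depend on it.
class reporter {
public:
  void report(iwriter& a_writer, int a_value) const {
    a_writer.write(a_value < 0 ? "negative" : "positive or zero");
  }
};

// Hypothetical usage : the relationships are all visible in the signatures.
inline std::string layered_demo() {
  string_writer writer;
  reporter rep;
  rep.report(writer, -3);
  assert(!writer.buffer().empty());
  return writer.buffer();
}
```

A main() would simply call layered_demo(). Since every class is fully defined before it is used, the whole thing can live in headers included in dependency order.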

STL style

We have chosen the style of the STL because xxx_yyy is more readable than XxxYyy.

No writable statics

We have no writable statics: they break multi-threading.
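
To illustrate, here is a small sketch, under our own assumptions, of how state that is often put in a writable static can instead live in an object owned by the caller. The names id_generator, worker and statics_demo are hypothetical and are not taken from inlib/exlib.

```cpp
#include <cassert>

// To avoid : a writable static, shared by every caller and every thread.
//   inline int next_id() { static int s_id = 0; return ++s_id; }

// Alternative in this sketch : the state lives in an object owned by the caller.
class id_generator {
public:
  id_generator() : m_id(0) {}
  int next() { return ++m_id; }
private:
  int m_id;
};

class worker {
public:
  // The dependency on id_generator is visible in the signature.
  int tag(id_generator& a_ids) const { return a_ids.next(); }
};

// Hypothetical usage : two independent generators, for example one per thread.
inline int statics_demo() {
  id_generator ids_a;
  id_generator ids_b;
  worker w;
  w.tag(ids_a);               // ids_a is now at 1.
  w.tag(ids_a);               // ids_a is now at 2.
  int first_b = w.tag(ids_b); // ids_b is untouched by the calls on ids_a.
  assert(first_b == 1);
  (void)first_b;              // avoid an unused warning if assert is disabled.
  return w.tag(ids_a);        // ids_a goes to 3.
}
```

Each generator can be owned by a single thread, so there is no shared writable state to protect.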

No configure and config.h

If you write ANSI C/C++ code you do not need "configure stuff". In inlib/exlib, there is no "configure" because there is no config.h to produce. Moreover the ourex logic, which consists in embedding the needed "externals", bypasses the need for a tricky configure script.

No "source setup" for apps

When running an application (for example ioda on a laptop), you do not have to "source setup" a shell script to set, for example, some environment variables. We have arranged to avoid environment variables, since they complicate an installation. To run an application, you just launch the binary.

Build with Bourne shell scripts

When you think of it, a make system is not needed to "build for install" because, at installation, compilations are done once. Moreover (strong) experience showed us that at installation, in case of problems, it is easier to deal with a "human readable" Bourne shell script than with various other third party tools (each coming with its own scripting syntax). Since sh is introduced very early in any UNIX training, it is a kind of "universal" that we can assume to be familiar to anyone attempting a "build from source". When developing, if you do a maximum of things "header only", a make logic is not really needed either. We have been building with "sh only" (including on Windows) since 2010; we find that it simplifies things a lot, so we stick to this choice for the moment.

To help, we recall a "Bourne shell minimum" :

  variable :
     my_variable=value
     echo "my_variable value is ${my_variable}"
  conditional :
     if [ "${my_variable}" = "hello" ] ; then
       echo "my_variable is hello"
     elif [ "${my_variable}" != "bye" ] ; then
       echo "my_variable is not bye"
     else
       echo "my_variable is not hello and is bye"
     fi
  loop :
     list='aa bb'
     for item in ${list} ; do echo "item ${item}"; done

With that in mind, you have a good chance of being able to read our build scripts.

No singletons

There is a BIG pitfall with the singleton pattern. If you are not careful, you can quickly and heavily break the OO principle of encapsulation with it. How? On a singleton class there is in general some instance() class method. The first time it is invoked, it creates internally the lonely object; it then returns the pointer to this object each time it is called. Then to use A, instead of doing :

     A* a = new A();
 you have to do :
     A* a = A::instance();  

(a correct singleton pattern should enforce a private constructor to prevent a user from doing a new). So far all is ok, but things start to go wrong if you want to use the A object in a class B. Here you are going to be highly tempted to use A::instance() directly within some method use_A() of B, and then to write :

     class B {
       void use_A() {
         ...
         A* a = A::instance();
         a->do_something();
         ...
       }
     };

And then? Then here you have broken the OO encapsulation principle in B::use_A()! Why? Because in OO, if you have to establish a relationship between B and A, you should do it by passing an A to the use_A method. A nasty point with the above is that there is now a "hidden" relationship between B and A that can't be traced by looking at the signatures of the methods of B. It is then also a relationship that can't be traced by tools that use the method signatures to draw class diagrams.

In fact all would be ok if you had used A::instance() to create the lonely A and had written B as :

     class B {
       void use_A(A& a) {
         a.do_something();
         ...
       }
     };
 and for example done in the main() :
     ...
     A* a = A::instance();
     B b;
     b.use_A(*a);
     ...

In the above you are guaranteed to have one instance of A, but moreover you can trace the relationship of class B toward A through its methods.
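
The two fragments above can be assembled into a compilable sketch. The body of A (a call counter and a calls() accessor) and the function injection_demo() are assumptions made only for illustration. Note also that the function-local static used here is the usual way to write instance(), and that its first call is not guaranteed to be thread safe in C++98.

```cpp
#include <cassert>

class A {
public:
  static A* instance() {
    static A s_instance; // the lonely object, created at the first call.
    return &s_instance;
  }
  void do_something() { m_calls++; }
  int calls() const { return m_calls; }
private:
  A() : m_calls(0) {}      // private : a user can't do a new A().
  A(const A&);             // declared but not implemented : no copy.
  A& operator=(const A&);  // declared but not implemented : no assignment.
  int m_calls;
};

class B {
public:
  // The relationship toward A appears in the signature.
  void use_A(A& a) { a.do_something(); }
};

// Hypothetical usage : what a main() would do.
inline int injection_demo() {
  A* a = A::instance(); // the only place where instance() is called.
  assert(a != 0);
  int before = a->calls();
  B b;
  b.use_A(*a);
  b.use_A(*a);
  return a->calls() - before; // number of calls done through b.
}
```

Here instance() appears once, at the top level, and every other class receives the A through a method argument.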

So the point we don't like is not so much enforcing only one instance of A. This could be ok in some situations. No, the point is the indiscriminate usage of instance(), which establishes hidden relationships between classes.

Someone may answer that by doing :

     class B {
       void use_A() {
         A a;
         a.do_something();
         ...
       }
     };

we also establish a relationship between B and A. Right, but here the object a is local, and by applying a.do_something() you do not influence other objects.

Related to the usage of singletons there is also a problem of "design laziness". It is clear that transforming a class into a singleton and using instance() everywhere spares you from scratching your head to establish relationships through methods in the right way...

Geant4 uses a lot of singletons now. Ok, why not. But now the "instance() hidden pattern" is used in a lot of places. This is bad. Is Geant4 still OO?

CERN-ROOT uses singletons too (TROOT, TApplication, etc...). But here the situation is worse, since you have to access the lonely instances through... global pointers! (gROOT, gApplication). And the situation is really much worse, since global pointers are also used for things that are not singletons! gDirectory, gEnv, gStyle, gPad, etc... (Around one hundred in v5-18-00, a disaster). The encapsulation principle is definitely trampled here. Then CERN-ROOT can't be claimed to be OO. It is "something in C++" but that's all. (Something in C++ that g-entangles everything with everything). To drive the nail home, let us take for example the lines of pseudo code :

     Histogram h("my histo",10,1,2)      //line 1
     h.fill(10)                          //line 2
     Function f("my function")           //line 3
     // after the construction of f.     //line 4
     h.fill(5)                           //line 5

In the above we expect line 2 to change the state of the object h, since we use a method of the Histogram class on the h object. But since we do not pass a pointer or a reference to h to the constructor of f at line 3, we expect to find the object h at line 4 in the same state as after line 2. But this is not the case with CERN-ROOT! Because in CERN-ROOT the constructor of f may use a bunch of "g" global pointers that are also seen, in a hidden way, by the object h! Seen in a hidden way because they do not appear in any method of the Histogram or Function class. And then in the above case you have NO guarantee that the state of h at line 4 is the same as after line 2! And this is highly misleading. After a dozen lines of CERN-ROOT programming, you simply do not know in which state your objects are! And this would not happen if we followed the encapsulation rule, which says that relationships have to be established through the methods. For example in :

     Histogram h("my histo",10,1,2)        //line 1
     h.fill(10)                            //line 2
     Function f(h,"my function")           //line 3
     // after the construction of f.       //line 4
     h.fill(5)                             //line 5

at line 3 we explicitly establish a relationship between the histogram and the function, and so we expect that at line 4 the state of h may have been changed by line 3. Here things are much clearer, and a big code written in this way is much more understandable.
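
The second pseudo code can be turned into a small compilable sketch. The bodies below are our own assumptions, made only to have something to run: the histogram merely counts its entries (binning is omitted), and the constructor of Function is supposed to reset the histogram it receives. The names entries(), reset() and explicit_relation_demo() are hypothetical.

```cpp
#include <cassert>
#include <string>

class Histogram {
public:
  // The binning arguments are accepted but not used in this sketch.
  Histogram(const std::string& a_title, unsigned int, double, double)
  : m_title(a_title), m_entries(0) {}
  void fill(double) { m_entries++; }
  void reset() { m_entries = 0; }
  unsigned int entries() const { return m_entries; }
  const std::string& title() const { return m_title; }
private:
  std::string m_title;
  unsigned int m_entries;
};

class Function {
public:
  // A non-const reference : the signature announces that h may be modified.
  Function(Histogram& a_histo, const std::string& a_name) : m_name(a_name) {
    a_histo.reset(); // assumed side effect, visible through the signature.
  }
  const std::string& name() const { return m_name; }
private:
  std::string m_name;
};

// Hypothetical usage, following the five lines of the pseudo code.
inline unsigned int explicit_relation_demo() {
  Histogram h("my histo", 10, 1, 2);  //line 1
  h.fill(10);                         //line 2
  assert(h.entries() == 1);
  Function f(h, "my function");       //line 3
  assert(h.entries() == 0);           //line 4 : changed, but not by surprise.
  h.fill(5);                          //line 5
  return h.entries();
}
```

The change of state of h at line 3 can be anticipated by reading the constructor signature of Function alone, without looking for globals.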

In inlib/exlib and in the code of our apps, we avoid writable statics (and therefore singletons), so there are no hidden relationships in this code. You can have a look at the methods to see the relationships; what you see is what you get.

Master the externals

Besides the STL, it is hard to build a substantial application without some code not written at home. We call these "external packages". In general we are interested in an external package because we need a piece of code with "high added value" on a given problem, for example reading a jpeg file, parsing an XML file, decompressing a file in gzip format, etc... In short, any problem for which rewriting the algorithms would take us a lot of time, because these algorithms embed a strong expertise on the problem at hand. In softinex we try to master our externals. Under ourex, we keep a copy of the externals we need, and we give priority to using these instead of the ones coming with the system or installable in some other way (apt-get on some Linuxes, etc...). This first permits us to have the same overall code on all platforms, and then to be sure to have the same behaviour of the applications on all platforms. Moreover, since we arrange to build the ourex externals with the same Bourne shell build system and without using any "config stuff", it permits us to have in general a straightforward "build and install".

There is also the case in which we need only a subpart of an external package. This is the case for dcmtk, used to read a medical dicom file. dcmtk itself is rather large (around 700 files to compile) and it brings code to do other things than reading a file. In ourex/dcmtk we bring only the 180 .cc files we need, which makes life easier.

This way of doing things comes from having observed what happened around the software for the LHC experiments. There we now have an overall upsetting "code inflation" coming in particular (but not only) from untamed externals. An inflation that led to a general loss of portability; this software can now be built only on one given platform: clones of the Linux lxplus. Even macOS is out of reach, not to mention iOS and Android...

Documentation

We document the behaviour of our apps by demos (on YouTube) and examples, and through the web only. We do not intend to have paper docs. For the code, there is no reference manual. Being pure header permits the code to be "what you see is what you get", and this is sufficient for most things, especially knowing that our software is targeted at a limited number of educated persons. Anyway, what is sure is that we definitely have no time to document everything, and we consider ourselves happy when we can reach the functionalities that we want. In particular, you will not find in our code "doc" of the kind :

     ~X(); //the destructor.
     size_t get_size() const {return m_size;} //get the size. const method.

About comments in the code: hell, there is nothing more upsetting than a comment not in sync with the line of code that it is supposed to document. Something like :

     ~X(); //the constructor.

You may find comments in our code, but we limit them to coarse-grained explanations or to documenting the overall logic of a method having a large number of lines of code related to a tricky algorithm. (Tricky algorithms that we avoid anyway as much as possible; be sure that a couple of years later, even with good comments, you will no longer be able to understand your tricky algorithm.)

All this being said, we have the deep conviction that a tool to navigate in source code is sorely lacking, especially when a large number of classes are around. And this is not a question of static documentation and comments. The class diagrams of doxygen are nice, but they are produced in a static way and are often not in sync with releases (yes, yes; we stopped delivering them in our web pages). What would be great would be to have some kind of super editor able to show, on large screens, class relationships and then pieces of code in a highly dynamic and interactive way. This would help a lot to understand and improve large software (whatever the language).