Uncertain parameterization as a universal method for building application architecture in C++ and Java at minimum cost

C++ is a confusing language, and one of its major drawbacks is how hard it is to create isolated blocks of code. In a typical project, everything depends on everything. This article shows how to write highly isolated code that depends minimally on specific libraries (including the standard ones) and implementations, reducing the dependencies of any piece of code to a set of interfaces. It also proposes architectural solutions for parameterizing code that may interest not only C++ programmers but Java programmers as well. Importantly, the proposed solution is very economical in terms of development time.

Disclaimer: In this article I have gathered my ideas about an ideal architecture. Some of the ideas are not mine (but I don't remember whose they are), and some are commonplace and known to everyone. It doesn't matter, because what I offer is not my ideas about good architecture but specific code that lets you approach this architecture at a minimum price.

Disclaimer N2: I will be glad to receive constructive feedback expressed in words. If you understand the subject less well than I do and scold me, it means I have not explained something clearly enough, and it makes sense to rework the text. If you understand it better than I do, I will gain valuable experience. Thanks in advance.

Disclaimer N3: I have written large applications from scratch, but not server-side or client-side enterprise applications. Everything is different there, and my experience will probably seem strange to specialists in that field. The article is not about that anyway; scalability issues, for example, are not considered here at all.

Disclaimer N4 (Upd., based on comments): Some commenters have suggested that I am reinventing Fowler and offering long-known design patterns. This is definitely not the case. I propose a very small parameterization tool that lets you implement these patterns with a minimum of boilerplate. That includes Fowler's Dependency Injection and Service Locator, but not only those: with the TypedSet class you can also implement a set of strategies economically. Fowler's Service Locator is accessed through strings, which is expensive, whereas my tool is nearly zero-cost. Strictly speaking, the cost is log(N) instead of 2M * log(N), where M is the length of the parameter string for the Service Locator; after the appearance of constexpr typeid in C++20, the price should become completely zero. Therefore, I ask you not to extend the meaning of the article to design patterns. Here you will find only a method for implementing these patterns cheaply.

The examples are in C++, but everything said here can also be realized in Java. Perhaps, over time, I will provide working Java code if you ask for it in the comments.

Part 1. Spherical architecture in a vacuum


Before brilliantly resolving all the difficulties, you need to create them correctly. By masterfully creating difficulties for yourself in the right place, you can make solving them much easier. To that end, let us formulate the goal that our methods will serve: the minimal principles of good architecture.

In fact, the magic of good architecture comes down to just two principles, and everything written below merely spells them out. The first principle is testability. Testability is like Ariadne's thread leading you to good architecture. If you do not know how to write a test for a piece of functionality, you have spoiled the architecture. If you do not know how to create a good architecture, think about what the test for the planned functionality would look like, and you will automatically set yourself a fairly high bar of architectural quality. Thinking about tests automatically increases modularity, reduces coupling, and makes the architecture more logical.

And I do not mean TDD. A typical disease of many programmers is the religious worship of technologies they have read about somewhere, without understanding the limits of their effectiveness. TDD is good when several programmers work on the code, when there is a testing department, and when management understands why good coding practices are needed and is ready to pay not just for some code that solves the problem but also for its reliability. If your superiors are not ready to pay, you will have to work more economically. Nevertheless, you still have to test the code, if, of course, you have a sense of self-preservation.

The second principle is modularity. More precisely, highly isolated modularity, without libraries or hardcoded calls that are unrelated to the module itself. It is now fashionable, when designing server architectures, to split a monolith into microservices. I'll tell you a terrible secret: each module in a monolith should be like a microservice, in the sense that it should be easy to extract from the common code into a test environment with a minimum of included headers. If that is not yet clear, here is an example: have you ever tried to extract shared_ptr from Boost? If you managed to drag along only half of Boost's sources rather than the whole of it, it means you killed three to five days cutting off unnecessary dependencies! And you are still dragging along things that definitely have nothing to do with shared_ptr!

And that is worse than a mistake: it is an architectural crime.

With a good architecture, you should be able to tear out shared_ptr painlessly and quickly, replacing everything unrelated to shared_ptr with test versions, for example a test version of the allocator. Or forget about Boost. Say you are writing an XML/HTML parser. The parser needs to work with strings and with files. If we are talking about an ideal architecture, not tied to the needs of a particular production or software company, then a parser with an ideal architecture has no right to use std::istream, std::filesystem or std::string, or to hardcode string search operations. We must provide a stream interface, an interface for file operations (perhaps divided into subinterfaces, although access to the subinterfaces still has to go through the interface of the file operations module), an interface for string operations, an allocator interface, and ideally an interface for the string itself. As a result, we can painlessly replace everything unrelated to parsing with test stubs, or plug in a test version of the allocator, file operations or string search with additional checks. The solution also becomes more versatile: tomorrow the stream interface may be backed not by a file but by a site somewhere on the Internet, and nobody will notice. You can replace the standard library with Qt, then switch to Visual C++, then start using Linux-only facilities, and the rework will be minimal. As a spoiler, I'll say that with this approach the question of price rises to its full height: closing everything, including elements of the standard library, behind interfaces is expensive. But that is a question about the solution, not about the goal.
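As a minimal sketch of the parser idea above: the names `ICharSource`, `StringSource` and `CountOpeningTags` are hypothetical and do not come from the article's code. The toy parser sees its input only through an interface, so a test can feed it from memory. For brevity the test stub itself still uses std::string, which a strict reading of the principle would also hide behind an interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stream interface: the only view of the input the parser has.
class ICharSource {
public:
    virtual ~ICharSource() = default;
    // Writes the next character to `out`; returns false when input is exhausted.
    virtual bool Next(char& out) = 0;
};

// Test stub: serves characters from memory, no file system involved.
class StringSource : public ICharSource {
public:
    explicit StringSource(std::string text) : text_(std::move(text)) {}
    bool Next(char& out) override {
        if (pos_ >= text_.size()) return false;
        out = text_[pos_++];
        return true;
    }
private:
    std::string text_;
    std::size_t pos_ = 0;
};

// Toy "parser": counts opening tags (<a>), skipping closing ones (</a>).
std::size_t CountOpeningTags(ICharSource& src) {
    std::size_t count = 0;
    bool after_lt = false;
    char c = 0;
    while (src.Next(c)) {
        if (after_lt && c != '/') ++count;
        after_lt = (c == '<');
    }
    return count;
}

// Usage: the parser is exercised without touching any real file.
void ExampleParserTest() {
    StringSource src("<a><b></b></a>");
    assert(CountOpeningTags(src) == 2);
}
```

A file-backed or network-backed implementation of the same interface could later be swapped in without changing `CountOpeningTags`.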

In general, the radical module-as-microservice principle proclaimed in this article is a sore spot of C++ and of typical C++ code. If you create declaration files and keep interfaces separate from implementations, you can still make .cpp files independent of one another, though only relatively, not 100%; the headers, however, are usually woven into a solid monolith from which nothing can be torn out without taking flesh with it. It has a terrible effect on compilation time, yet that is how things are. Moreover, even if the headers are made independent, this automatically means you cannot aggregate classes. In fact, the only way to achieve independence of both .cpp files and headers in C++ is to forward-declare the classes you use (without defining them) and then use only pointers to them. As soon as you use the class itself instead of a pointer to it in a header file (that is, aggregate it), you tie together every .cpp file that includes this header and the .cpp file that contains the class definition. There is also fast pimpl, but it is guaranteed to create dependencies at the .cpp level.
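The forward-declaration technique described above can be sketched in a single file as follows. The class names `Widget` and `Engine` are hypothetical, and the comments mark which part would live in which file of a real project.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// ---- would live in widget.h ----
class Engine;  // forward declaration: widget.h does not include engine.h

class Widget {
public:
    explicit Widget(std::shared_ptr<Engine> engine);
    int Run() const;
private:
    // Only a pointer is stored, so Engine may remain an incomplete type here.
    std::shared_ptr<Engine> engine_;
};

// ---- would live in engine.h ----
class Engine {
public:
    int Power() const { return 42; }
};

// ---- would live in widget.cpp, the only place that needs engine.h ----
Widget::Widget(std::shared_ptr<Engine> engine) : engine_(std::move(engine)) {}
int Widget::Run() const { return engine_->Power(); }

// Usage sketch.
void ExampleForwardDeclaration() {
    Widget w(std::make_shared<Engine>());
    assert(w.Run() == 42);
}
```

If `Widget` held an `Engine` by value instead, widget.h would have to include engine.h, and every file including widget.h would depend on the definition of `Engine`.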

So, for a good architecture, isolation of modules is important: the ability to pull out a single module together with one header that brings in the library's macros and basic types, a second header with forward declarations, and a few includes that bring in the set of interfaces. The module should contain only what relates to its own functionality; everything else should live in other modules and be accessible only through interfaces.

Let us formulate the main features of a good architecture, including the points made above, item by item.

First, let's define the term “Module”. A module is a set of logically related pieces of functionality: for example, working with streams, working with files, or an HTML parser.

The “File Work” module can combine many pieces of functionality: opening a file, closing it, positioning, reading properties, reading the file size. A folder scanner can be designed either as part of the “File Work” interface or as a separate module, while working with streams clearly belongs in a separate module. That does not prevent all other modules from reaching streams and the folder scanner indirectly, through “File Work”. This is not required, but it is quite logical.

  1. Modularity: the “module-as-microservice” imperative.
  2. Moving the 20% of the code that runs 80% of the time into a separate library, the program core.
  3. Testability of every piece of functionality in every module.
  4. Interface-based access, that is, no hardcoded calls. A module may directly call only the code that belongs to its own functionality; all other direct library calls must be moved into a separate module and accessed through an interface.
  5. Complete isolation of the module from its environment by interfaces. No “nailing down” of implementations unrelated to the class's functionality and, more radically, isolation of libraries (including the standard ones) behind interfaces, adapters or decorators.
  6. Aggregating a class, creating a member variable of class type, or using fast pimpl only where it is critical to performance.

Of course, we will figure out how to achieve all this quickly and at a lower price, but I would like to draw attention to another problem whose solution comes as a bonus: passing platform-dependent parameters. For example, if you need code that works equally well on Android and Windows, it is logical to move platform-dependent algorithms into separate modules. The Android implementation may then require a reference to the Java (JNI) environment, JNIEnv *, and possibly a couple of Java objects, while the Windows implementation may require the program's working folder (which on Android can be requested from the system, given a JNIEnv *). The trick is that JNIEnv * does not exist in the Windows context, so even a typed union, or its C++ alternative std::variant, is impossible. You can, of course, use a vector of void * or a vector of std::any as the parameter, but honestly, that is an untyped crutch. Untyped, because it throws away the main advantage of C++: strong typing. And this is more dangerous than SARS.
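To illustrate why a vector of std::any is called a crutch here, consider the following small sketch. The function name `TryGetWorkingFolder` and the convention "working folder is parameter 0" are assumptions made up for the illustration. The compiler cannot check that the caller put the right type in the right slot, so a mismatch is discovered only at run time.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical platform module that expects its working folder as params[0].
// Nothing in the signature says so: the contract lives only in this comment.
bool TryGetWorkingFolder(const std::vector<std::any>& params, std::string& out) {
    if (params.empty()) return false;
    // any_cast on a pointer yields nullptr when the stored type differs.
    const std::string* folder = std::any_cast<std::string>(&params.front());
    if (folder == nullptr) return false;
    out = *folder;
    return true;
}

// Usage sketch: both calls compile, only one of them works.
void ExampleUntypedParams() {
    std::string folder;

    std::vector<std::any> good;
    good.emplace_back(std::string("C:/work"));
    assert(TryGetWorkingFolder(good, folder));

    std::vector<std::any> bad;
    bad.emplace_back(42);  // wrong type in slot 0, accepted by the compiler
    assert(!TryGetWorkingFolder(bad, folder));
}
```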

Below we will look at how to solve this problem in a strictly typed manner.

Part 2. Magic bullets and their price tag


So, let's say we have a large amount of code that needs to be written from scratch, and the result will be a very large project.

How can it be assembled in accordance with the principles we have determined?

The classic way, approved by all the manuals, is to divide everything into interfaces and strategies. With enough interfaces and strategies, any subproblem of our project can be isolated to the point where the “module-as-microservice” principle starts working for it. My personal experience is that if you divide the project into 20-30 parts isolated to the “module-as-microservice” level, you will succeed. But the main feature of good architecture is the ability to test any class outside the project context. If you isolate every class, you already have more than 500 modules, and in my experience this increases development time by a factor of 3-5. In “combat conditions” you will not do this; you will compromise between price and quality.

Some may doubt this, and they have every right to. Let's make a rough estimate. Let an average class have 3-5 members, 20 functions and 3 constructors, plus 6-10 getters and setters (mutators) for access to the members: about 40 units per class in total. In a typical project, each “central” class needs access to five pieces of functionality on average, and a non-central one to three. For example, a great many classes need an allocator, a file system, string operations, stream operations, and database access.

Each strategy or interface requires one member of type std::shared_ptr<CreateStreamStrategy> m_create_stream;, two mutators, and initialization in each of the three constructors. In addition, somewhere during the initialization of our class you will need to call something like myclass->SetCreateStreamStrategy( my_create_stream_strategy ) a couple of times. That makes 8 units per interface or strategy, and since we have about five of them, 40 units in all. In other words, we have made the original class twice as bulky. The loss of simplicity will inevitably hurt readability, and debugging as well, by perhaps one and a half times, even though nothing seems to have changed in essence.
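The count above can be made concrete with a sketch of what a single strategy costs a class. `CreateStreamStrategy`, `m_create_stream` and `SetCreateStreamStrategy` are the names used in the text. The `Open` method, the `EchoStreamStrategy` stub and the rest of `MyClass` are hypothetical placeholders.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Strategy interface named as in the text; its method is a placeholder.
class CreateStreamStrategy {
public:
    virtual ~CreateStreamStrategy() = default;
    virtual std::string Open(const std::string& name) = 0;
};

// Trivial stand-in implementation, used only to make the sketch runnable.
class EchoStreamStrategy : public CreateStreamStrategy {
public:
    std::string Open(const std::string& name) override { return name; }
};

class MyClass {
public:
    // Units 1-3: every constructor has to initialise the strategy member.
    MyClass() : m_create_stream() {}
    explicit MyClass(int value) : m_create_stream(), value_(value) {}
    MyClass(const MyClass& other)
        : m_create_stream(other.m_create_stream), value_(other.value_) {}

    // Units 4-5: a setter and a getter per strategy.
    void SetCreateStreamStrategy(std::shared_ptr<CreateStreamStrategy> s) {
        m_create_stream = std::move(s);
    }
    std::shared_ptr<CreateStreamStrategy> GetCreateStreamStrategy() const {
        return m_create_stream;
    }

private:
    // Unit 6: the member itself.
    std::shared_ptr<CreateStreamStrategy> m_create_stream;
    int value_ = 0;
};

// Units 7-8: the calling code must wire the strategy in, typically in a
// couple of places.
void ExampleWiring() {
    MyClass object(7);
    object.SetCreateStreamStrategy(std::make_shared<EchoStreamStrategy>());
    assert(object.GetCreateStreamStrategy()->Open("log.txt") == "log.txt");
}
```

Multiply this by five strategies and the bookkeeping rivals the size of the class's own logic.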

So the question is how to do the same at a minimum price. The first thing that comes to mind is static parameterization with templates, in the style of Alexandrescu and the Loki library.

We write a class in this style:

    template <class Traits>
    class MyClass
    {
    public:
        void DoMainTaskFunction()
        {
            ...
            MyStream stream = Traits::streamwork::Open( stream_name );
            ...
        }
    };

This solution has all the architectural advantages that we identified in the first part. But it also has many disadvantages.

I love writing templates myself, but I regretfully admit that in ordinary code templates are loved only by template magicians. A significant share of programmers wince slightly at the word “template”. Moreover, the vast majority of C++ programmers in the industry are in fact not C++ programmers but C programmers slightly retrained for C++, who have no deep knowledge of C++ and who drop to the floor and play dead at the word “template”.

Translated into the language of production: maintaining code built on static parameterization is more expensive and more complicated.

At the same time, if for the sake of readability we carefully move function bodies outside the class, we end up writing the template names and template parameters over and over. And in case of a compilation error we get long, barely readable walls of causes and problem locations involving a pile of complex nested templates.
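For reference, here is a compilable sketch of the static parameterization discussed above. Only the `Traits::streamwork::Open` shape is taken from the article's snippet. `MyStream`, `FakeStreamWork` and `TestTraits` are hypothetical names invented for the illustration. The out-of-class member definition shows the repeated template header the text complains about.

```cpp
#include <cassert>
#include <string>

// Hypothetical stream type returned by the policy.
struct MyStream {
    std::string name;
};

// Hypothetical test policy: "opens" a stream without touching the file system.
struct FakeStreamWork {
    static MyStream Open(const std::string& stream_name) {
        return MyStream{"fake:" + stream_name};
    }
};

// Traits bundle: selects the concrete policies at compile time.
struct TestTraits {
    using streamwork = FakeStreamWork;
};

template <class Traits>
class MyClass {
public:
    std::string DoMainTaskFunction(const std::string& stream_name);
};

// Moving the body out of the class repeats the template header and arguments
// for every member function.
template <class Traits>
std::string MyClass<Traits>::DoMainTaskFunction(const std::string& stream_name) {
    MyStream stream = Traits::streamwork::Open(stream_name);
    return stream.name;
}

// Usage sketch: the test build picks TestTraits, production would pick another.
void ExampleStaticParameterization() {
    MyClass<TestTraits> object;
    assert(object.DoMainTaskFunction("log") == "fake:log");
}
```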

But there is a simple way out. As a template magician, I declare that almost everything that can be done with static parameterization (static polymorphism) can be moved to dynamic polymorphism. Of course, we will not eradicate the template evil completely. But instead of scattering it with a generous hand across every class for parameterization, we will confine it to a couple of tool classes.

Part 3. The proposed solution and its code


So, here it is! Meet the TypedSet class. It associates a type with one smart pointer to an object of that type. For any given type it may or may not hold an object. I don't like the name, so I will be grateful if you suggest a better one in the comments.

One type, one object. But the number of types is unlimited! Therefore, such a class can be passed around as a parameterizer.

I want to draw your attention to one point. It may seem that at some point you will need two objects under one interface. In my opinion, if such a need arises, it signals an architectural error. If you have two objects under one interface, they are no longer interfaces for accessing functionality: either they are input variables of a function, or you need access to two pieces of functionality rather than one, in which case it is better to split the interface in two.

We will provide three basic functions: Create, Get and Has, which respectively create an element, retrieve it, and check whether it is present.

    #include <cstddef>
    #include <map>
    #include <memory>
    #include <typeinfo>
    #include <utility>

    // LogError is assumed to be a logging helper defined elsewhere in the project.

    /// @brief A set of typed elements: at most one std::shared_ptr
    ///        is stored for each type.
    class TypedSet
    {
    public:
        template <class TypedElement>
        void Create( const std::shared_ptr<TypedElement> & value );

        template <class TypedElement>
        std::shared_ptr<TypedElement> Get() const;

        template <class TypedElement>
        bool Has() const;

        size_t GetSize() const { return storage_.size(); }

    protected:
        typedef std::map< size_t, std::shared_ptr<void> > Storage;

        Storage const & storage() const { return storage_; }
        Storage       & get_storage()   { return storage_; }

    private:
        Storage storage_;
    };

    template <class TypedElement>
    void TypedSet::Create( const std::shared_ptr<TypedElement> & value )
    {
        size_t hash = typeid(TypedElement).hash_code();
        if ( storage().count( hash ) > 0 )
        {
            // An element of this type already exists: refuse to overwrite it.
            LogError( "Access Violation" );
            return;
        }
        std::shared_ptr<void> to_add ( value );
        get_storage().insert( std::make_pair( hash, to_add ) );
    }

    template <class TypedElement>
    bool TypedSet::Has() const
    {
        size_t hash = typeid(TypedElement).hash_code();
        return storage().count( hash ) > 0;
    }

    template <class TypedElement>
    std::shared_ptr<TypedElement> TypedSet::Get() const
    {
        size_t hash = typeid(TypedElement).hash_code();
        if ( storage().count( hash ) > 0 )
        {
            std::shared_ptr<void> ret( storage().at(hash) );
            return std::static_pointer_cast<TypedElement>( ret );
        }
        else
        {
            // Nothing was registered for this type: report and return null.
            LogError( "Access Violation" );
            return std::shared_ptr<TypedElement> ();
        }
    }

By the way, I have seen an alternative solution from colleagues writing in Qt. There, the desired interface was accessed through a singleton that mapped a text string (!!!) to the interface packed into a Variant; after casting this variant, the result could be used.

 GlobalConfigurator()["FileSystem"].Get().As<FileSystem>() 

It certainly works, but the overhead of computing the string length and then hashing the string somewhat frightens my optimizing soul. Here the overhead is close to zero, because the key for the desired interface comes from its type, which is known at compile time.

Based on TypedSet, we can build a more advanced class, StrategiesSet. It stores not only one object per access interface for each piece of functionality, but also, for each interface (hereinafter, a strategy), an additional TypedSet with the parameters of that strategy. To clarify: parameters, unlike function variables, are set once during program initialization or once for a long program run. Parameters are what make the code truly cross-platform: it is into them that we push all the platform-dependent machinery.

Here there are more basic functions: Create, Get, CreateParamsSet and GetParamsSet. Has is not provided, because it is architecturally redundant: if your code relies on the file system functionality and the calling code did not provide it, all you can do is throw an exception, fire an assert, or have the program commit seppuku by calling abort().

    // Requires the TypedSet class; LogError is assumed to be defined elsewhere.
    class StrategiesSet
    {
    public:
        template <class Strategy>
        void Create( const std::shared_ptr<Strategy> & value );

        template <class Strategy>
        std::shared_ptr<Strategy> Get();

        template <class Strategy>
        void CreateParamsSet();

        template <class Strategy>
        std::shared_ptr<TypedSet> GetParamsSet();

        template <class Strategy, class ParamType>
        void CreateParam( const std::shared_ptr<ParamType> & value );

        template <class Strategy, class ParamType>
        std::shared_ptr<ParamType> GetParam();

    protected:
        TypedSet const & strategies() const { return strategies_; }
        TypedSet       & get_strategies()   { return strategies_; }

        TypedSet const & params() const { return params_; }
        TypedSet       & get_params()   { return params_; }

        // Wrapper type that gives every Strategy its own key in params_.
        template <class Type>
        struct ParamHolder
        {
            ParamHolder( ) : param_ptr( std::make_shared<TypedSet>() ) {}
            std::shared_ptr<TypedSet> param_ptr;
        };

    private:
        TypedSet strategies_;
        TypedSet params_;
    };

    template <class Strategy>
    void StrategiesSet::Create( const std::shared_ptr<Strategy> & value )
    {
        get_strategies().Create<Strategy>( value );
    }

    template <class Strategy>
    std::shared_ptr<Strategy> StrategiesSet::Get()
    {
        return get_strategies().Get<Strategy>();
    }

    template <class Strategy>
    void StrategiesSet::CreateParamsSet( )
    {
        typedef ParamHolder<Strategy> Holder;
        std::shared_ptr< Holder > ptr = std::make_shared< Holder >( );
        ptr->param_ptr = std::make_shared< TypedSet >();
        get_params().Create< Holder >( ptr );
    }

    template <class Strategy>
    std::shared_ptr<TypedSet> StrategiesSet::GetParamsSet()
    {
        typedef ParamHolder<Strategy> Holder;
        if ( get_params().Has< Holder >() )
        {
            return get_params().Get< Holder >()->param_ptr;
        }
        else
        {
            LogError("StrategiesSet::GetParamsSet : get unexisting!!!");
            return std::shared_ptr<TypedSet>();
        }
    }

    template <class Strategy, class ParamType>
    void StrategiesSet::CreateParam( const std::shared_ptr<ParamType> & value )
    {
        typedef ParamHolder<Strategy> Holder;
        // Create the parameter set for this strategy on first use.
        if ( !params().Has<Holder>() )
            CreateParamsSet<Strategy>();
        if ( params().Has<Holder>() )
        {
            std::shared_ptr<TypedSet> params_set = GetParamsSet<Strategy>();
            params_set->Create<ParamType>( value );
        }
        else
        {
            LogError( "Param creating error: Access Violation" );
        }
    }

    template <class Strategy, class ParamType>
    std::shared_ptr<ParamType> StrategiesSet::GetParam()
    {
        typedef ParamHolder<Strategy> Holder;
        if ( params().Has<Holder>() )
        {
            // The 'template' keyword is required here: Get is a member template
            // called on a type-dependent expression.
            return GetParamsSet<Strategy>()->template Get<ParamType>();
        }
        else
        {
            LogError( "Access Violation" );
            return std::shared_ptr<ParamType> ();
        }
    }

An additional plus: at the prototyping stage you can create one very large parameterizing object, cram access to all modules into it, and pass it to every module as a parameter. This lets you put a prototype together quickly and later, without haste, split the object into the minimal pieces each module needs.

And now a small and (so far) overly simplified usage example. I hope you will tell me in the comments what you would like to see as a simple example, and I will give the article a small upgrade. As popular programming wisdom says, “release as early as possible and improve using feedback after release.”
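How a module might consume such a parameter object can be sketched as follows. To keep the block self-contained it uses `MiniTypedSet`, a deliberately cut-down stand-in for the article's TypedSet (no logging, no duplicate check). `INameSource`, `FixedName` and `Greeter` are hypothetical names. This is an illustration of the idea, not the author's code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <typeinfo>

// Cut-down stand-in for TypedSet: one shared_ptr per type, keyed by hash_code().
class MiniTypedSet {
public:
    template <class T>
    void Create(const std::shared_ptr<T>& value) {
        storage_[typeid(T).hash_code()] = value;
    }
    template <class T>
    std::shared_ptr<T> Get() const {
        auto it = storage_.find(typeid(T).hash_code());
        if (it == storage_.end()) return nullptr;
        return std::static_pointer_cast<T>(it->second);
    }
private:
    std::map<std::size_t, std::shared_ptr<void>> storage_;
};

// Hypothetical interface that some module needs.
class INameSource {
public:
    virtual ~INameSource() = default;
    virtual std::string Name() const = 0;
};

class FixedName : public INameSource {
public:
    std::string Name() const override { return "world"; }
};

// The module takes one parameter object and pulls out only what it needs.
// No setter, getter or extra constructor argument per dependency is required.
class Greeter {
public:
    explicit Greeter(const MiniTypedSet& env) : names_(env.Get<INameSource>()) {
        assert(names_ && "caller must provide INameSource");
    }
    std::string Greet() const { return "hello, " + names_->Name(); }
private:
    std::shared_ptr<INameSource> names_;
};

// Usage sketch: the same env object could be handed to many modules.
void ExampleModuleWiring() {
    MiniTypedSet env;
    env.Create<INameSource>(std::make_shared<FixedName>());
    Greeter greeter(env);
    assert(greeter.Greet() == "hello, world");
}
```

In a test, `FixedName` would simply be replaced by a stub registered under the same interface type.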

    #include <cstdio>
    #include <memory>
    #include <string>

    // Requires TypedSet and StrategiesSet; Log is assumed to be a logging
    // helper defined elsewhere in the project.

    class Interface1
    {
    public:
        virtual void Fun() { printf("\niface1\n"); }
        virtual ~Interface1() {}
    };

    class Interface2
    {
    public:
        virtual void Fun() { printf("\niface2\n"); }
        virtual ~Interface2() {}
    };

    class Interface3
    {
    public:
        virtual void Fun() { printf("\niface3\n"); }
        virtual ~Interface3() {}
    };

    class Implementation1 : public Interface1
    {
    public:
        virtual void Fun() override { printf("\nimpl1\n"); }
    };

    class Implementation2 : public Interface2
    {
    public:
        virtual void Fun() override { printf("\nimpl2\n"); }
    };

    // Platform-dependent parameters for the printing strategy.
    class PrintParams
    {
    public:
        virtual ~PrintParams() {}
        virtual std::string GetOs() = 0;
    };

    class PrintParamsUbuntu : public PrintParams
    {
    public:
        virtual std::string GetOs() override { return "Ubuntu"; }
    };

    class PrintParamsWindows : public PrintParams
    {
    public:
        virtual std::string GetOs() override { return "Windows"; }
    };

    class PrintStrategy
    {
    public:
        virtual ~PrintStrategy() {}
        virtual void operator() ( const TypedSet& params, const std::string & str ) = 0;
    };

    class PrintWithOsStrategy : public PrintStrategy
    {
    public:
        virtual void operator()( const TypedSet& params, const std::string & str ) override
        {
            auto os = params.Get< PrintParams >()->GetOs();
            printf(" Printing: %s (OS=%s)", str.c_str(), os.c_str() );
        }
    };

    void TestTypedSet()
    {
        using namespace std;
        TypedSet a;
        a.Create<Interface1>( make_shared<Implementation1>() );
        a.Create<Interface2>( make_shared<Implementation2>() );
        a.Get<Interface1>()->Fun();
        a.Get<Interface2>()->Fun();

        Log("Double creation:");
        a.Create<Interface1>( make_shared<Implementation1>() );

        Log("Get unexisting:");
        a.Get<Interface3>();
    }

    void TestStrategiesSet()
    {
        using namespace std;
        StrategiesSet printing;
        printing.Create< PrintStrategy >( make_shared<PrintWithOsStrategy>() );
        printing.CreateParam< PrintStrategy, PrintParams >( make_shared<PrintParamsWindows>() );

        auto print_strategy_ptr = printing.Get< PrintStrategy >();
        auto & print_strategy = *print_strategy_ptr;
        auto & print_params = *printing.GetParamsSet< PrintStrategy >();
        print_strategy( print_params, "Done!" );
    }

    int main()
    {
        TestTypedSet();
        TestStrategiesSet();
        return 0;
    }

Summary


Thus, we have solved an important problem: the class keeps only the interface directly related to its own functionality. Everything else has been pushed into a StrategiesSet, which avoids both cluttering the class with unnecessary elements and “nailing” particular implementations of the required functionality to the class. This lets us not only write highly isolated code with zero dependencies on implementations and libraries, but also save a huge amount of time.

The code for the example and tool classes can be found here.

Upd. from 11/13/2019
In fact, the code shown above is just an example simplified for readability. The thing is that typeid().hash_code() is implemented slowly and inefficiently in modern compilers, and using it defeats much of the purpose. Moreover, as the respected 0xd34df00d pointed out, the standard does not guarantee that types can be distinguished by hash code (in practice, however, the approach works). On the other hand, the example reads well. I have rewritten TypedSet without typeid().hash_code() and, in addition, replaced the map with an array (with the ability to switch quickly between map and array by changing one digit in an #if). The result is more complex, but more interesting for practical use.
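As an aside on the uniqueness concern: one standard-library way to avoid relying on hash codes is to key the map by std::type_index, which compares the underlying type_info objects rather than their hashes. The sketch below is an alternative that is not used in the article. The names `IndexedStorage`, `KeyOf`, `FileSystem` and `Allocator` are hypothetical, and the sketch does nothing about the speed concern.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>

// Storage keyed by std::type_index instead of a raw hash code.
using IndexedStorage = std::map<std::type_index, std::shared_ptr<void>>;

// Builds the map key for type T.
template <class T>
std::type_index KeyOf() {
    return std::type_index(typeid(T));
}

// Hypothetical interface types used as keys.
struct FileSystem {};
struct Allocator {};

// Usage sketch: distinct types always produce distinct keys.
void ExampleTypeIndexKeys() {
    IndexedStorage storage;
    storage[KeyOf<FileSystem>()] = std::make_shared<FileSystem>();
    assert(storage.count(KeyOf<FileSystem>()) == 1);
    assert(storage.count(KeyOf<Allocator>()) == 0);
}
```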
The rewritten code (also posted on Coliru):
    #include <array>
    #include <cassert>
    #include <cstddef>
    #include <map>
    #include <memory>

    // LogError is assumed to be a logging helper defined elsewhere in the project.

    namespace metatype
    {
        // Hands out consecutive numbers; the static counter is shared by all instances.
        struct Counter
        {
            size_t GetAndIncrease() { return counter_++; }
        private:
            static inline size_t counter_ = 1;
        };

        // Takes the next free number when constructed.
        template <typename Type>
        struct HashGetterBody
        {
            HashGetterBody() : hash_( counter_.GetAndIncrease() ) { }
            size_t GetHash() { return hash_; }
        private:
            Counter counter_;
            size_t hash_;
        };

        // Holds one static HashGetterBody per Type, so each type gets one number.
        template <typename Type>
        struct HashGetter
        {
            size_t GetHash() { return hasher_.GetHash(); }
        private:
            static inline HashGetterBody<Type> hasher_;
        };
    } // namespace metatype

    template <typename Type>
    size_t GetTypeHash()
    {
        return metatype::HashGetter<Type>().GetHash();
    }

    namespace details
    {
    #if 1
        // Array-based storage (std::array): indexed directly by the type number.
        class TypedSetStorage
        {
        public:
            static constexpr size_t kMaxTypes = 100;
            typedef std::array< std::shared_ptr<void>, kMaxTypes > Storage;

            void Set( size_t hash_index, const std::shared_ptr<void> & value )
            {
                ++size_;
                assert( hash_index < kMaxTypes ); // too many types
                data_[hash_index] = value;
            }
            std::shared_ptr<void> & Get( size_t hash_index )
            {
                assert( hash_index < kMaxTypes );
                return data_[hash_index];
            }
            const std::shared_ptr<void> & Get( size_t hash_index ) const
            {
                if ( hash_index >= kMaxTypes )
                    return empty_ptr_;
                return data_[hash_index];
            }
            bool Has( size_t hash_index ) const
            {
                if ( hash_index >= kMaxTypes )
                    return false;
                return (bool)data_[hash_index];
            }
            size_t GetSize() const { return size_; }

        private:
            Storage data_;
            size_t size_ = 0;
            static const inline std::shared_ptr<void> empty_ptr_;
        };
    #else
        // Map-based storage (std::map): no limit on the number of types.
        class TypedSetStorage
        {
        public:
            typedef std::map< size_t, std::shared_ptr<void> > Storage;

            void Set( size_t hash_index, const std::shared_ptr<void> & value )
            {
                data_[hash_index] = value;
            }
            std::shared_ptr<void> & Get( size_t hash_index )
            {
                return data_[hash_index];
            }
            const std::shared_ptr<void> & Get( size_t hash_index ) const
            {
                return data_.at(hash_index);
            }
            bool Has( size_t hash_index ) const
            {
                return data_.count(hash_index) > 0;
            }
            size_t GetSize() const { return data_.size(); }

        private:
            Storage data_;
        };
    #endif
    } // namespace details

    /// @brief A set of typed elements: at most one std::shared_ptr
    ///        is stored for each type.
    class TypedSet
    {
    public:
        template <class TypedElement>
        void Create( const std::shared_ptr<TypedElement> & value );

        template <class TypedElement>
        std::shared_ptr<TypedElement> Get() const;

        template <class TypedElement>
        bool Has() const;

        size_t GetSize() const { return storage_.GetSize(); }

    protected:
        typedef details::TypedSetStorage Storage;

        Storage const & storage() const { return storage_; }
        Storage       & get_storage()   { return storage_; }

    private:
        Storage storage_;
    };

    template <class TypedElement>
    void TypedSet::Create( const std::shared_ptr<TypedElement> & value )
    {
        size_t hash = GetTypeHash<TypedElement>();
        if ( storage().Has( hash ) )
        {
            LogError( "Access Violation" );
            return;
        }
        std::shared_ptr<void> to_add ( value );
        get_storage().Set( hash, to_add );
    }

    template <class TypedElement>
    bool TypedSet::Has() const
    {
        size_t hash = GetTypeHash<TypedElement>();
        return storage().Has( hash );
    }

    template <class TypedElement>
    std::shared_ptr<TypedElement> TypedSet::Get() const
    {
        size_t hash = GetTypeHash<TypedElement>();
        if ( storage().Has( hash ) )
        {
            std::shared_ptr<void> ret( storage().Get( hash ) );
            return std::static_pointer_cast<TypedElement>( ret );
        }
        else
        {
            LogError( "Access Violation" );
            return std::shared_ptr<TypedElement> ();
        }
    }

Here, with the array-based storage, access takes constant time, type hashes are computed before main() runs, and the only remaining cost is the validity checks, which can be removed if desired.
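The core of the counter trick can also be shown in isolation. The sketch below is a simplified variant, not the author's code: it uses function-local statics, so a number is assigned on the first call for each type rather than before main(). The names `NextTypeId` and `TypeIdOf` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hands out consecutive numbers, starting from 0.
inline std::size_t NextTypeId() {
    static std::size_t counter = 0;
    return counter++;
}

// Each instantiation owns one static id, assigned on the first call for that T.
template <class T>
std::size_t TypeIdOf() {
    static const std::size_t id = NextTypeId();
    return id;
}

// Usage sketch: ids are distinct per type and stable within one run.
void ExampleTypeIds() {
    const std::size_t a = TypeIdOf<int>();
    const std::size_t b = TypeIdOf<double>();
    assert(a != b);
    assert(a == TypeIdOf<int>());
}
```

The concrete numbers depend on the order of first use, so they should not be stored or compared across runs. As written, the counter increment is also not synchronized, so first calls from several threads at once would need extra care.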

Source: https://habr.com/ru/post/475268/

