Once again about the Liskov substitution principle, or the semantics of inheritance in OOP

Inheritance is one of the pillars of OOP, and it is used to reuse common code. But common code does not always need to be reused, and inheritance is not always the best way to reuse it. It often turns out that two different pieces of code (classes) contain similar code but have different requirements, so the classes probably should not inherit from each other.

This problem is usually illustrated with the example of inheriting a Square class from a Rectangle class, or vice versa.

Suppose we have a rectangle class:

    class Rectangle:
        def __init__(self, width, height):
            self._width = width
            self._height = height

        def set_width(self, width):
            self._width = width

        def set_height(self, height):
            self._height = height

        def get_area(self):
            return self._width * self._height

        ...

Now we want to write a Square class, and in order to reuse the area calculation code it seems logical to inherit Square from Rectangle:

    class Square(Rectangle):
        def set_width(self, width):
            self._width = width
            self._height = width

        def set_height(self, height):
            self._width = height
            self._height = height

It seems that the code of the Square and Rectangle classes is consistent. It seems that Square preserves the mathematical properties of a square, and therefore of a rectangle. That would mean we can pass Square objects wherever a Rectangle is expected.

But if we do this, we can violate the behavior of the Rectangle class. For example, consider this client code:

    def client_code(rect):
        rect.set_height(10)
        rect.set_width(20)
        assert rect.get_area() == 200

If you pass an instance of the Square class to this function, the function will behave differently. This violates the behavioral contract of the Rectangle class, because operations on an object of the descendant class must give exactly the same result as operations on an object of the base class.

If the square class is a descendant of the rectangle class, then when we work with a square through the methods of the rectangle, we should not even notice that it is not a plain rectangle.
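The failure can be reproduced with a minimal runnable sketch. The classes and client_code are taken from the text above; the breaks_contract helper is a hypothetical name added here only to report the outcome.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def set_width(self, width):
        self._width = width

    def set_height(self, height):
        self._height = height

    def get_area(self):
        return self._width * self._height


class Square(Rectangle):
    def set_width(self, width):
        self._width = width
        self._height = width

    def set_height(self, height):
        self._width = height
        self._height = height


def client_code(rect):
    rect.set_height(10)
    rect.set_width(20)
    assert rect.get_area() == 200


def breaks_contract(rect):
    """Hypothetical helper: True if client_code fails for this object."""
    try:
        client_code(rect)
    except AssertionError:
        return True
    return False
```

For a Rectangle the helper returns False. For a Square it returns True, because the second setter overwrites both sides and the area becomes 400 instead of 200.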

You can work around this problem, for example, like this:

  1. add an assert that checks for the exact class, or add an if that behaves differently for different classes
  2. in Square, add a set_size() method and override the set_height and set_width methods so that they throw exceptions
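A rough sketch of both workarounds together might look like this. The Square constructor taking a single size, the choice of NotImplementedError, and the return values of client_code are assumptions made for illustration; the text only says that the setters should throw.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def set_width(self, width):
        self._width = width

    def set_height(self, height):
        self._height = height

    def get_area(self):
        return self._width * self._height


class Square(Rectangle):
    def __init__(self, size):
        # Assumed constructor: a square needs only one side length.
        super().__init__(size, size)

    def set_size(self, size):
        self._width = size
        self._height = size

    # Workaround 2: the inherited setters refuse to work on a square.
    def set_width(self, width):
        raise NotImplementedError("use set_size() on a Square")

    def set_height(self, height):
        raise NotImplementedError("use set_size() on a Square")


def client_code(rect):
    # Workaround 1: the client branches on the concrete class.
    if isinstance(rect, Square):
        rect.set_size(10)
        return rect.get_area()
    rect.set_height(10)
    rect.set_width(20)
    return rect.get_area()
```

The code runs, but client_code now has to know about Square by name, and any other client written only against Rectangle will hit the exception.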

Such code and such classes will work, in the sense that the program will run.

The catch is that client code that uses the Square class or the Rectangle class now needs to know about both the base class and the descendant class, and about how each of them behaves.

Over time, client code written for the base class becomes dependent on the implementation of both the base class and the descendant class. This greatly complicates development in the long run. Yet OOP was created precisely so that the base class and the descendant class could be edited independently of each other.

Back in the 1980s it was noticed that for class inheritance to work well for code reuse, we must know for sure that the descendant class can be used in place of the base class. That is, the semantics of inheritance concerns not only, and not so much, data as behavior. Descendants should not "break" the behavior of the base class.

This is exactly the Liskov substitution principle, or the principle of defining a subtype by its behavior (strong behavioral subtyping): if you can write at least some meaningful code that breaks when a base class object is replaced with a descendant class object, then these classes should not inherit from each other. Descendants should extend the behavior of the base class, not significantly change it. Functions that use the base class should be able to use subclass objects without knowing it. In fact, this is the semantics of inheritance in OOP.

In real industrial code it is highly recommended to follow this principle and respect the described semantics of inheritance. The principle comes with several subtleties.

The principle must hold not for domain-level abstractions but for code abstractions, that is, classes. From a geometric point of view, a square is a rectangle. From the point of view of the class hierarchy, whether the square class should be a descendant of the rectangle class depends on the behavior we require from these classes, that is, on how and in what situations we use this code.

If the Rectangle class has only two methods, computing the area and drawing, with no way to redraw or resize, then a Square with an overridden constructor will satisfy the Liskov substitution principle.

That is, classes like these satisfy the substitution principle:

    class Rectangle:
        def draw(self):
            ...

        def get_area(self):
            ...


    class Square(Rectangle):
        pass

Of course, this is not very good code, and is probably even a class design antipattern, but from a formal point of view it satisfies the Liskov principle.
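A filled-in version of this case could look roughly like the sketch below. The method bodies, the string returned by draw, and the describe client are assumptions added for illustration. With no setters available, nothing a client can do through the Rectangle interface distinguishes a Square from a Rectangle with equal sides.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def get_area(self):
        return self._width * self._height

    def draw(self):
        # Assumed rendering: a text description instead of real drawing.
        return f"Rectangle {self._width}x{self._height}"


class Square(Rectangle):
    def __init__(self, size):
        # Only the constructor is overridden; behavior is inherited as is.
        super().__init__(size, size)


def describe(rect):
    """Hypothetical client that relies only on draw() and get_area()."""
    return f"{rect.draw()} with area {rect.get_area()}"
```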

Another example: a set is a subtype of a multiset. That is the relationship between the domain abstractions. But the code can be written so that the Set class inherits from Bag and the substitution principle is violated, or so that the principle holds, with the same domain semantics.
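One way the violation could arise is sketched below. The Bag interface (add, count, len) and the add_twice client are hypothetical; the text does not specify them. Here the client assumes that every add grows the collection by one, which holds for Bag but not for Set.

```python
from collections import Counter


class Bag:
    """Multiset: remembers how many times each item was added."""

    def __init__(self):
        self._items = Counter()

    def add(self, item):
        self._items[item] += 1

    def count(self, item):
        return self._items[item]

    def __len__(self):
        return sum(self._items.values())


class Set(Bag):
    """Reuses Bag's storage but keeps at most one copy of each item."""

    def add(self, item):
        self._items[item] = 1


def add_twice(bag, item):
    """Hypothetical client: returns how much the collection grew."""
    before = len(bag)
    bag.add(item)
    bag.add(item)
    return len(bag) - before
```

If the contract of Bag promised only that an item is present after add (count(item) >= 1), the same Set class would satisfy it. Whether the principle is violated depends on the contract that clients rely on, not on the domain relationship.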

In general, class inheritance can be viewed as an implementation of the "IS-A" relationship, not between entities of the subject area but between classes. Whether the descendant class is a subtype of the base class is determined by the restrictions and behavioral contracts that client code uses (and can in principle use).

The constraints, invariants, and contract of the base class are not fixed in the code; they are fixed in the heads of the developers who edit and read the code. What counts as "breaking", and what counts as violating the "contract", is determined not by the code but by the semantics of the class in the developer's head.

Any code that is meaningful for an object of the base class must not break if we replace that object with an object of a descendant class. Meaningful code is any client code that uses an object of the base class (and its descendants) within the semantics and restrictions of the base class.

It is extremely important to understand that the restrictions of the abstraction implemented in the base class are usually not contained in the program code. These restrictions are understood, known, and maintained by the developer. It is the developer who keeps the abstraction and the code consistent, so that the code expresses what it is meant to express.

For example, suppose the rectangle has another method that returns a JSON-style representation:

    class Rectangle:
        def to_dict(self):
            return {"height": self._height, "width": self._width}

And in Square we redefine it:

    class Square(Rectangle):
        def to_dict(self):
            return {"size": self._height}

If we take the base contract of the Rectangle class to be that to_dict returns height and width, then the code

    r = rect.to_dict()
    log(r['height'], r['width'])

will be meaningful for an object of the base class Rectangle. When the base class object is replaced with an object of the descendant class Square, the code changes its behavior and violates the contract, and thereby violates the Liskov substitution principle.

If instead we take the base contract of the Rectangle class to be that to_dict returns a dictionary that can be serialized, without relying on specific fields, then such a to_dict method is fine.
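The two readings of the contract can be put side by side in a small sketch. The constructors and the strict_client and loose_client functions are hypothetical names added here; only to_dict comes from the text.

```python
import json


class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def to_dict(self):
        return {"height": self._height, "width": self._width}


class Square(Rectangle):
    def __init__(self, size):
        super().__init__(size, size)

    def to_dict(self):
        return {"size": self._height}


def strict_client(rect):
    """Assumes the contract 'to_dict has height and width'."""
    r = rect.to_dict()
    return r["height"], r["width"]


def loose_client(rect):
    """Assumes only 'to_dict returns something serializable'."""
    return json.dumps(rect.to_dict(), sort_keys=True)
```

The loose client works with both classes. The strict client raises KeyError for a Square, even though neither class has a single mutating method.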

Incidentally, this example dispels the myth that immutability protects against violations of the principle.

Formally, any method override in a descendant class is dangerous, and so is any change to the logic of the base class. For example, descendant classes quite often adapt to "incorrect" behavior of the base class, and then break when the bug is fixed in the base class.

It is possible to move many of the contract conditions and invariants into the code, but in the general case the semantics of behavior still lies outside the code, in the problem domain, and is maintained by the developer. The to_dict example is one where the contract can be described in code. But it is impossible, for example, to verify that a get_hash method really returns a hash with all the properties of a hash, and not just some string.
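One common way to move part of a contract into code is a shared check that is run against the base class and every descendant. The sketch below assumes a hypothetical check_rectangle_contract function and two illustrative clauses; it covers only what its author thought to write down, which is exactly the limitation described above.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def set_width(self, width):
        self._width = width

    def set_height(self, height):
        self._height = height

    def get_area(self):
        return self._width * self._height


class Square(Rectangle):
    def set_width(self, width):
        self._width = width
        self._height = width

    def set_height(self, height):
        self._width = height
        self._height = height


def check_rectangle_contract(make_rect):
    """Run the written-down clauses; return the ones that are violated."""
    violations = []
    rect = make_rect()
    rect.set_width(20)
    rect.set_height(10)
    if rect.get_area() != 200:
        violations.append("set_width and set_height must be independent")
    rect.set_width(0)
    if rect.get_area() != 0:
        violations.append("zero width must give zero area")
    return violations
```

Calling check_rectangle_contract(lambda: Square(1, 1)) reports the independence clause, while the same call for Rectangle returns an empty list.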

When a developer uses code written by other developers, the only way to understand the semantics of a class is from the code itself, method names, documentation, and comments. In any case, semantics belongs largely to the human realm and is therefore error-prone. The most important consequence is that compliance with the Liskov principle cannot be verified from the code alone, syntactically; you have to rely on (often) vague semantics. There is no formal (mathematical) means of verifying strong behavioral subtyping in a checkable and guaranteed way.

Therefore, instead of the Liskov principle, the formal rules for preconditions and postconditions from contract programming are often used: a descendant must not strengthen preconditions, must not weaken postconditions, and must preserve the invariants of the base class.

For example, in a descendant class method we cannot add a required parameter that was not in the base class, because that strengthens the preconditions. Nor can we throw exceptions in an overridden method that the base class method does not throw, because that violates the guarantees of the base class. And so on.
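The rule about required parameters can be illustrated with a short sketch. The scale method, the anchor parameter, and the class names AnchoredRectangle and CompatibleRectangle are all hypothetical and exist only for this illustration.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def scale(self, factor):
        self._width *= factor
        self._height *= factor

    def get_area(self):
        return self._width * self._height


class AnchoredRectangle(Rectangle):
    # Strengthened precondition: callers must now also supply an anchor.
    def scale(self, factor, anchor):
        super().scale(factor)
        self._anchor = anchor


class CompatibleRectangle(Rectangle):
    # The extra parameter is optional, so existing calls keep working.
    def scale(self, factor, anchor="center"):
        super().scale(factor)
        self._anchor = anchor


def double(rect):
    """Client written against the base class signature."""
    rect.scale(2)
    return rect.get_area()
```

double works for Rectangle and CompatibleRectangle, but raises TypeError for AnchoredRectangle because the call does not supply the new required argument.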

What matters is not only the current behavior of the class, but also which changes the responsibility or semantics of the class implies.

Code is constantly corrected and changed. So the fact that the code satisfies the substitution principle right now does not mean that future changes will keep it that way.

Say there is a library developer who owns the Rectangle class, and an application developer who inherits Square from Rectangle. At the moment the application developer inherited Square from Rectangle, everything was fine: the classes satisfied the substitution principle.

Then at some point the library developer added a reshape method, or set_width / set_height, to the Rectangle base class. From the library developer's point of view, the base class was simply extended. But in fact the semantics and contracts on which the descendant class relied have changed. Now the classes no longer satisfy the principle.
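The scenario might play out as in the sketch below. The get_side method and the is_still_square helper are hypothetical additions; the comment marks the method assumed to arrive in a later library release.

```python
class Rectangle:
    """Library class."""

    def __init__(self, width, height):
        self._width = width
        self._height = height

    def get_area(self):
        return self._width * self._height

    # Assumed to be added in a later library release.
    def set_width(self, width):
        self._width = width


class Square(Rectangle):
    """Application class, written when Rectangle had no setters."""

    def __init__(self, size):
        super().__init__(size, size)

    def get_side(self):
        # Relies on the invariant width == height.
        return self._width


def is_still_square(square):
    """Hypothetical check of the invariant Square depends on."""
    return square._width == square._height
```

Square did not change at all, yet after the library update any caller can run square.set_width(5) and silently break the invariant that get_side depends on.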

In general, with inheritance in OOP, a change in the base class that looks like an extension of the interface (one more method or field) can violate previous "natural" contracts and thereby actually change the semantics or responsibilities. Therefore, adding any method to the base class is dangerous: you can inadvertently change the contract.

From a practical point of view, in the rectangle and square example, what matters is not whether a reshape or set_width / set_height method exists right now. What matters is how likely such changes are in the library code, and whether the semantics or the boundaries of the class's responsibility imply such changes. If they do, the likelihood of errors and of further refactoring increases significantly. And if there is even a small possibility, it is probably better not to inherit such classes from each other.

Maintaining behavior-based subtype definitions is difficult even for simple classes with clear semantics, let alone enterprise code with complex business logic. Although the base class and the descendant class are different pieces of code, their interfaces and responsibilities have to be thought through very carefully. And with even a slight change in the semantics of a class, which cannot be avoided, we have to look at the code of the related classes and check whether the new contract or invariant violates what has already been written (!) and is in use. With almost any change in a branching class hierarchy, we need to review and check a lot of other code.

This is one of the reasons why some people do not really like classical inheritance in OOP, and why they often prefer composition of classes, inheritance of interfaces, and so on, to the classical inheritance of behavior.
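As a rough sketch of that alternative, Square can reuse the area calculation by holding a Rectangle instead of inheriting from it, while both classes satisfy a small shared interface. The Shape protocol and the total_area client are hypothetical names introduced for this illustration.

```python
from typing import Iterable, Protocol


class Shape(Protocol):
    """Interface only: no behavior is inherited through it."""

    def get_area(self) -> float: ...


class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def set_width(self, width):
        self._width = width

    def set_height(self, height):
        self._height = height

    def get_area(self):
        return self._width * self._height


class Square:
    """Reuses Rectangle through composition, not inheritance."""

    def __init__(self, size):
        self._rect = Rectangle(size, size)

    def set_size(self, size):
        self._rect.set_width(size)
        self._rect.set_height(size)

    def get_area(self):
        return self._rect.get_area()


def total_area(shapes: Iterable[Shape]) -> float:
    return sum(shape.get_area() for shape in shapes)
```

Square is no longer a Rectangle, so code written against Rectangle's setters can never receive one, while code written against Shape works with both.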

In fairness, there are rules that, if followed, will most likely keep you from violating the substitution principle. You can protect yourself as much as possible by prohibiting all dangerous constructs. For example, Oleg wrote about this for C++. But in general, such rules turn classes into something that is not quite classes in the classical sense.

Administrative measures do not solve the problem very well either. Here you can read how Uncle Martin tried this in C++ and how it did not work.

In real industrial code, however, the Liskov principle is violated quite often, and that is not a disaster. The principle is difficult to follow because 1) the responsibility and semantics of a class are often implicit and not expressed in the code, and 2) the responsibility of a class can change, both in the base class and in the descendant class. But this does not always lead to truly terrible consequences. The most common, simplest, and most basic violation is an overridden method that modifies behavior. For example:

    class Task:
        def close(self):
            self.status = CLOSED
            ...


    class ProjectTask(Task):
        def close(self):
            if self.status == STARTED:
                raise Exception("Cannot close a started Project Task")
            ...

The close method of ProjectTask throws an exception in cases where objects of the Task class work fine. In general, overriding base class methods very often violates the substitution principle without becoming a problem.
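A runnable version of this example is sketched below. The status constants, the constructor, the call to super().close(), and the close_all client are assumptions filled in for illustration; the original elides these parts.

```python
# Assumed status values; the original does not define them.
NEW = "new"
STARTED = "started"
CLOSED = "closed"


class Task:
    def __init__(self, status=NEW):
        self.status = status

    def close(self):
        self.status = CLOSED


class ProjectTask(Task):
    def close(self):
        if self.status == STARTED:
            raise Exception("Cannot close a started Project Task")
        super().close()


def close_all(tasks):
    """Hypothetical client written against Task.close()."""
    for task in tasks:
        task.close()
    return [task.status for task in tasks]
```

close_all succeeds for any list of plain Task objects, but raises as soon as the list contains a started ProjectTask. Whether that matters depends on whether code like close_all actually exists and ever receives a ProjectTask.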

In fact, in this case the developer treats inheritance NOT as an implementation of the "IS-A" relationship but simply as a way to reuse code. That is, a subclass is just a subclass, not a subtype. In this case, from a pragmatic point of view, a different question matters more: how likely is it that client code exists, or will exist, that notices the different semantics of the methods of the descendant class and the base class?

Is there much code that expects an object of the base class but is passed an object of the descendant class? For many tasks, such code will never exist at all.

When does an LSP violation lead to big problems? When, because of differences in behavior, client code has to be rewritten whenever the descendant class changes, and vice versa. This is especially a problem if the client code is library code that cannot be changed. If code reuse cannot create dependencies between client code and class code in the future, then even code that violates the Liskov substitution principle may not cause big problems.

In general, during development, inheritance can be viewed from two perspectives: subclasses as subtypes, with all the restrictions of contract programming and the Liskov principle, and subclasses as a way to reuse code, with all its potential problems. That is, you can either design class responsibilities and contracts carefully and not worry about client code, or think about what the client code might be and how the classes will be used, be prepared for potential problems, and care less about observing the substitution principle. The decision, as usual, is up to the developer. What matters most is that the choice in a particular situation is conscious, and that there is an understanding of the pros, cons, and pitfalls that come with each option.

Source: https://habr.com/ru/post/463385/
