The DevOops conference took place in St. Petersburg on October 29-30. In this article, I will share my impressions and insights, along with brief notes on the talks I attended. A small disclaimer: since I am a developer, some of my thoughts and comments may lean more toward Dev than Ops, but I will try to be as objective as possible.
DevOops is one of the events organized by JUG Ru Group, and I must admit that both the organization and the quality of the talks were excellent. The conference ran for two days in three parallel tracks. In addition, there were discussion areas for talking with speakers, workshops, and lightning talks: shorter, lighter presentations, open also to those who have never spoken before and want to try themselves as speakers.
The overarching theme of DevOops 2019 was cloud native. Most of the talks were directly or indirectly devoted to the cloud. The topic is far from new, but many non-obvious difficulties come up when using cloud technologies, and many attendees came specifically to find answers. This was especially noticeable at the Q&A sessions after the presentations. The speakers were asked practical questions that people genuinely care about. Almost every question was followed by remarks from other participants, “We have the same problem!”, and a lively discussion began.
The first day
Characters, community, and culture: Important factors for prosperity (Timothy Lister, The Atlantic Systems Guild Inc.)
The conference was opened by Timothy Lister, who back in 1987 (!) wrote a book about DevOps practices. Tim talked a lot about what distinguishes a strong, successful team with a healthy and pleasant atmosphere from a mediocre and toxic one. One thought stuck with me in particular:
“A good company is full of people who say ‘I don't know.’”
This is about openness and trust within the team. An atmosphere of openness is essential if you have an ambitious goal, and it requires that everyone on the team feel comfortable and free. This does not mean that a team member immediately escalates every problem to everyone. Rather, an atmosphere of openness means being able, at any time, to approach a specific person (regardless of position) or the whole team and say clearly what needs to be improved or changed. Such a culture creates a calm background for work, because everyone knows that if questions arise, they will be heard.
In my experience, this factor matters a great deal for productive and stable work. After all, there will always be questions, friction and changes of direction, and an atmosphere of openness is a universal tool for coping with both personal and team challenges.
Another thought that seems right to me: there is no one true formula for building a culture in a company.
“No culture can be called ideal. And no culture can be called a complete failure.”
A recent trend: more and more conferences include talks on management, communication, team building and culture, and DevOops was no exception. I think this is a positive trend, because these factors have an even greater impact on the final result than technical difficulties.
“The leader does not manage the team, but grows it.”
Do it in code (not YAML)! Unlock power of Kotlin DSL for Kubernetes (Victor Gamow, Confluent, and Fedor Korotkov, Cirrus Labs)
A half-joking talk, but one that fully conveys the pain of writing endless YAML files, maintaining them and (oh, horror!) debugging them after an error or typo. This has even given rise to job titles such as YAML Engineer.
How did it all begin? Once there were scripts. Then there were more of them. And more. A need arose to unify, simplify and scale the solutions. That is how the YAML format appeared in the DevOps world and became the standard in many tools.
The authors of the talk thought it over and said: "Something here is not right."
- It is not clear how to test YAML files.
- It is easy to make a mistake. Moreover, some errors are almost impossible to catch and very painful to fix. For example, you can easily specify a dependency version as a number instead of a string, and then spend a long time figuring out why the version actually used is not the one written in the config. It all comes down to type coercion and rounding.
- If the error is syntactic, it will probably be caught fairly quickly, in CI. But even that is not guaranteed.
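The version pitfall from the list above is easy to reproduce. The sketch below is a deliberately simplified stand-in for a YAML loader, not a real parser: the function name `resolve_plain_scalar` is made up for illustration, and real loaders follow more detailed typing rules. It only shows the core effect: an unquoted scalar is tried as a number first, so `1.10` silently becomes `1.1`, while a quoted value stays a string.

```python
def resolve_plain_scalar(text: str):
    """Roughly mimic how a YAML loader types an unquoted scalar.

    Simplified assumption: try int, then float, otherwise keep the string.
    """
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


# version: 1.10      (unquoted in the config, so it is typed as a number)
unquoted_version = resolve_plain_scalar("1.10")

# version: "1.10"    (quoted in the config, so it stays a string)
quoted_version = "1.10"

print(repr(unquoted_version))  # the trailing zero is gone
print(repr(quoted_version))    # the version survives intact
```

Anything with two dots, such as `1.10.2`, cannot be read as a number and stays a string, which is why the problem only shows up for some versions and is so hard to spot.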
The highlight of the talk was uploading an invalid config to Kubernetes, to which it calmly replied:
Too many errors.
Victor and Fedor propose writing configurations in a Kotlin DSL, which helps with all of these problems. The solution is interesting and convenient, but it is not universal and works only for k8s. In addition, whenever the API is updated, the library has to be updated as well.
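The talk itself used Kotlin, and I do not reproduce that library's API here. The general idea of "configuration as typed code" can be roughly sketched in Python instead. The classes `Container` and `Deployment` below are hypothetical and not part of any library; only the field layout of the resulting dictionary follows the standard Kubernetes `apps/v1` Deployment manifest.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    image: str
    tag: str

    def __post_init__(self):
        # Reject a numeric tag up front instead of letting it be rounded later.
        if not isinstance(self.tag, str):
            raise TypeError(f"tag must be a string, got {type(self.tag).__name__}")

    def to_manifest(self) -> dict:
        return {"name": self.name, "image": f"{self.image}:{self.tag}"}


@dataclass
class Deployment:
    name: str
    replicas: int
    containers: list = field(default_factory=list)

    def to_manifest(self) -> dict:
        # Build the same structure that would otherwise be written by hand in YAML.
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": self.name},
            "spec": {
                "replicas": self.replicas,
                "selector": {"matchLabels": {"app": self.name}},
                "template": {
                    "metadata": {"labels": {"app": self.name}},
                    "spec": {"containers": [c.to_manifest() for c in self.containers]},
                },
            },
        }


web = Deployment("web", replicas=2, containers=[Container("web", "nginx", "1.10")])
manifest = web.to_manifest()
```

Because the config is ordinary code, a mistake such as `Container("web", "nginx", 1.10)` fails immediately with a `TypeError`, and the generated manifest can be checked with plain unit tests.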
Pipelines & pods: DevOps with Kubernetes (Burr Sutter, Red Hat)
A light talk on the general concept and main components of Kubernetes, as well as other fashionable Ops tools and practices. For a newcomer to the subject, it is exactly what is needed. It would fit perfectly into the program of a conference for developers, but it was strange to hear a talk of this level at a specialized DevOps conference. Nevertheless, the overview turned out to be good, simple and clear.
But building JSON in Java code with StringBuilder is a bit too much, even for a demo project.
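To show why hand-built JSON is fragile, here is a minimal sketch in Python rather than the Java from the demo. The two helper functions are hypothetical names of my own; `json.dumps` and `json.loads` are the standard library calls. String concatenation works until a value contains a quote, while a serializer escapes it correctly.

```python
import json


def build_json_by_hand(name: str) -> str:
    # The StringBuilder-style approach: glue the pieces together as text.
    return '{"name": "' + name + '"}'


def build_json_with_serializer(name: str) -> str:
    # Let the library handle quoting and escaping.
    return json.dumps({"name": name})


tricky = 'say "hi"'

try:
    json.loads(build_json_by_hand(tricky))
    hand_built_is_valid = True
except json.JSONDecodeError:
    hand_built_is_valid = False

round_tripped = json.loads(build_json_with_serializer(tricky))["name"]
```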
Patterns and antipatterns of continuous updates in DevOps practice (Baruch Sadogursky, JFrog)
It is hard to single out one idea or direction in Baruch's talk. Rather, it is a collection of personal experience, life stories, examples of “how to do it right and how not to”, anecdotes, life hacks and small screw-ups.
The story I remember most from this talk is how, due to an error in the deployment process, Knight Capital Group lost $440,000,000 in 45 minutes and was brought to the brink of bankruptcy.
At the end, Baruch told a story about a bug in the Airbus A350 software. Because of this bug, airlines were required to restart the aircraft every 149 hours, which can only be done on the ground. If someone forgets to do this, the plane will freeze. A rather unpleasant bug. The cause is simple: an overflow occurs in the code. The fix is simple too. But suppose the crew did forget to restart the plane, it took off on a Los Angeles → London flight, and 3 hours before landing the pilots realized that in an hour the plane would freeze. “Houston, we have a problem.” “We'll fix it right away!” the dispatchers replied, gathered the Airbus programmers, fixed the bug, and everything works. What next? Deploy the new version to a plane in mid-air? Or not take the risk? Baruch was adamant: “Deploy. It can't get any worse.”
The second day
CDK and infrastructure as code (Sergey Kurson, AWS)
Sergey talked about the AWS CDK (Cloud Development Kit). This is a set of libraries that lets you manage your infrastructure from code. The approach is debatable, since managing infrastructure in an imperative style is something of a step backward: all modern automation tools let you describe the infrastructure declaratively (i.e., as the desired end state). However, the approach has its advantages. For example, testing the deployed infrastructure becomes much simpler, and deciding what to deploy and how becomes much more flexible. In addition, it opens up great possibilities for dynamic infrastructure management driven by events, attributes or metrics.
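To make the trade-off concrete, here is a minimal sketch of the general idea, not of the CDK API itself. The function `synthesize` and the resource fields are hypothetical and exist only for illustration. Ordinary imperative code (loops and conditions) produces a declarative description of the desired state, and that description can then be checked with plain assertions before anything is deployed.

```python
def synthesize(env: str, service_names: list) -> dict:
    """Generate a declarative resource description from imperative code."""
    resources = {}
    for name in service_names:
        # Every service gets a queue; retention depends on the environment.
        resources[f"{name}-queue"] = {
            "type": "queue",
            "retention_days": 14 if env == "prod" else 1,
        }
        # Alarms are only worth having in production.
        if env == "prod":
            resources[f"{name}-alarm"] = {"type": "alarm", "watches": f"{name}-queue"}
    return {"environment": env, "resources": resources}


prod = synthesize("prod", ["orders", "billing"])
dev = synthesize("dev", ["orders", "billing"])
```

The flexibility comes from the loop and the condition; the testability comes from the fact that `prod` and `dev` are plain data that a unit test can inspect.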
Why do we need a service sieve? (Anton Weiss, Otomato Software)
Perhaps one of the best and deepest talks of the conference. By “service sieve” the speaker means a service mesh: a separate layer of cloud infrastructure that controls how services communicate with each other. The service mesh pattern moves many tasks from the service level (i.e., from the application developer) to the mesh level (i.e., to DevOps):
- security management;
- traffic monitoring;
- traffic management.
Although service mesh as a technology has been around for several years, the speaker went deep: he traced where it came from, how it evolved and how it is likely to develop in the coming years.
The talk is especially useful for developers, since the tasks a service mesh solves are still often handled inside the services themselves. This costs developers extra time and makes it harder to focus on business problems.
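As a rough in-process analogy (a real mesh runs separate network proxies next to each service, and nothing below is the API of any actual mesh product), the three tasks can be pictured as a wrapper around the service. The names `SidecarProxy` and `make_flaky_service` are hypothetical. The business function knows nothing about access control, metrics or retries; the proxy takes care of all three.

```python
class SidecarProxy:
    """Toy stand-in for a mesh sidecar placed in front of a service."""

    def __init__(self, upstream, allowed_callers, max_retries=2):
        self.upstream = upstream
        self.allowed_callers = set(allowed_callers)
        self.max_retries = max_retries
        self.metrics = {"requests": 0, "failures": 0}

    def call(self, caller: str, request: str) -> str:
        # Security management: only known callers may reach the service.
        if caller not in self.allowed_callers:
            raise PermissionError(f"{caller} is not allowed to call this service")
        # Traffic management: retry transient failures.
        for _ in range(self.max_retries + 1):
            # Traffic monitoring: count every attempt and every failure.
            self.metrics["requests"] += 1
            try:
                return self.upstream(request)
            except ConnectionError:
                self.metrics["failures"] += 1
        raise ConnectionError("upstream unavailable after retries")


def make_flaky_service(failures_before_success: int):
    """Build a service that fails a given number of times, then recovers."""
    state = {"left": failures_before_success}

    def service(request: str) -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("temporary network failure")
        return f"handled {request}"

    return service


proxy = SidecarProxy(make_flaky_service(1), allowed_callers={"orders"})
reply = proxy.call("orders", "ping")
```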
Speeding up Internet requests and sleeping peacefully (Sergey Fedorov, Netflix)
Sergey's job at Netflix is to make the service as fast as possible for end users. All requests fall into two large groups:
- cloud requests (dynamic content);
- CDN requests (static content).
In many cases, a client has to make requests to both the dynamic and the static part of the infrastructure at the same time. But this scheme has overhead: you need to establish at least two connections, perform the TLS handshake twice, and so on.
The idea was that client requests would get faster if the client talked only to the static infrastructure, with a smart proxy installed there that makes the dynamic requests to the cloud on the client's behalf. The Netflix team implemented this scheme and tested it on real users. However, it turned out that it does not always work and does not work for everyone: some requests are actually processed more slowly.
So the team decided to go another way. They came up with a scheme in which each client decides individually which is more advantageous for its dynamic requests: sending them directly, or having the static part of the infrastructure proxy them.
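The talk summary above does not describe how the client makes this decision, so the following is only one possible sketch under my own assumptions: each client keeps recent latency samples for both paths and picks the one with the lower median. The function `choose_route`, its parameters and the fallback rule are hypothetical and are not Netflix's algorithm.

```python
from statistics import median


def choose_route(direct_ms: list, proxied_ms: list, min_samples: int = 3) -> str:
    """Pick the faster path for dynamic requests based on this client's own samples."""
    # Without enough measurements for both paths, stay on the default direct path.
    if len(direct_ms) < min_samples or len(proxied_ms) < min_samples:
        return "direct"
    # The median is less sensitive to a single slow outlier than the mean.
    return "proxied" if median(proxied_ms) < median(direct_ms) else "direct"


far_from_cloud = choose_route([120, 130, 125], [90, 95, 300])
close_to_cloud = choose_route([80, 85, 90], [100, 110, 120])
```

The point of the sketch is that the decision is made per client from that client's own measurements, so users for whom the proxy is slower are simply not routed through it.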
This is a good example of not backing down in the face of a technical challenge. It takes some courage to choose a compromise solution, but it is worth it if the product gets better and users' lives get easier.
Why the IT industry is going through dark times, how DevOps is to blame, and why Capital can help (Roman Shaposhnik, ZEDEDA Inc.)
The most “visionary” talk of the conference. In it, Roman spoke at length about how technology and capital are interconnected and inseparable. I think this thesis is very important for engineers, as is the understanding that technologies are created to solve specific problems of people and businesses. With this mindset, it becomes easier to prioritize tasks and see what matters and what does not. Roman also explained why closed policies and corporations, whose influence keeps growing, could lead to a global crisis in the IT industry, and talked more generally about the history and philosophy of information technology.
DevOops is about development
Several speakers asked the audience who works in operations and infrastructure and who works in development. The results surprised me: the split was about 50/50. It's great that more and more developers want to understand what happens to their code after it is written, how applications are deployed and how they communicate with each other. With this understanding, you start thinking, while you write the code, about how it will behave in production and where you can build in a safety net.