Hi everyone! My name is Gazimagomed, and I work on Vision, Ozon's internal distributed profiler. In this article, I'll explain what a profile is, describe distributed profiling, and show how automatic profile collection differs from manual collection. We'll also look at the problems that come up when building a profiler. Make yourselves comfortable, and let's begin.
Distributed systems
Nuances of designing distributed systems
Thoughts and short notes (in Go) after reading «Clean Code»
Clean Go
Hey guys, I recently dove into 'Clean Code' by Robert C. Martin and found some valuable insights. The book is originally in Java, but I decided to reinterpret the principles in Go. Here's my take on the clean code concepts and how they can improve our coding practices.
1. Clean Code
The gist: Clean code is more than just working code; it's code that other developers can easily read, understand, and modify.
Elasticsearch as NoSQL Database
In this article, I will introduce NoSQL concepts and show how they are related to Elasticsearch, and we will consider this search engine as a NoSQL document store.
The journey of scaling up a production Elasticsearch cluster
In this article, I will tell you about the several-year journey of scaling our production Elasticsearch cluster, which is one of the vital elements of the iPrice technology stack.
I will describe the challenges we encountered and how we approached them.
The Ideal Economy
I am not an economist, but in light of current events with cryptocurrencies and the economy in general, I would like to share my thoughts on some kind of ideal economy, around which everything is happening now.
Distributed Artificial Intelligence with InterSystems IRIS
Author: Sergey Lukyanchikov, Sales Engineer at InterSystems
What is Distributed Artificial Intelligence (DAI)?
Attempts to find a “bullet-proof” definition have not produced a result: it seems the term is slightly “ahead of its time”. Still, we can analyze the term itself semantically and derive that distributed artificial intelligence is the same AI (see our effort to suggest an “applied” definition), only partitioned across several computers that are not clustered together (neither data-wise, nor via applications, nor by providing access to particular computers in principle). That is, ideally, distributed artificial intelligence should be arranged so that none of the computers participating in the “distribution” has direct access to the data or applications of another computer: the only alternative is to transmit data samples and executable scripts via “transparent” messaging. Any deviation from that ideal leads to “partially distributed artificial intelligence”, for example distributed data with a central application server, or the inverse. One way or another, we end up with a set of “federated” models (i.e., models each trained on their own data sources, or each trained by their own algorithms, or both at once).
Distributed AI scenarios “for the masses”
We will not be discussing edge computations, confidential data operators, scattered mobile searches, or similar scenarios that are fascinating but not yet widely or deliberately applied. We will be much “closer to life” if we consider, for instance, the following scenario (its detailed demo can and should be watched here): a company runs a production-level AI/ML solution whose quality is systematically checked by an external data scientist (i.e., an expert who is not an employee of the company). For a number of reasons, the company cannot grant the data scientist access to the solution, but it can send him a sample of records from a required table on a schedule or on a particular event (for example, the end of a training session for one or several models in the solution). We also assume that the data scientist owns some version of the AI/ML mechanisms already integrated into the production-level solution that the company runs, and that the data scientist himself is likely developing, improving, and adapting them to the concrete use cases of that company. Deployment of those mechanisms into the running solution, monitoring of their functioning, and other lifecycle aspects are handled by a data engineer (a company employee).
Decentralized Torrent storage in DHT
The DHT system has existed for many years now, and torrents along with it, which we successfully use to get any information we want.
Together with this system, there are commands to interact with it. There are not many of them, but only two are needed to create a decentralized database: put and get.
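To make the put/get idea concrete, here is a minimal in-memory sketch. It is not the real DHT protocol: there is no networking, and the class name `ToyDHT`, the node names, and the choice of SHA-1 content addressing with XOR distance are all assumptions made for illustration only.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a DHT: each 'node' is just a dict."""

    def __init__(self, node_names):
        # Hypothetical nodes; a real DHT node is a remote peer.
        self.nodes = {name: {} for name in node_names}

    @staticmethod
    def _hash_int(data: bytes) -> int:
        return int(hashlib.sha1(data).hexdigest(), 16)

    def _closest_node(self, key_hex: str) -> str:
        # Pick the node whose hashed name is nearest to the key by XOR distance.
        key_int = int(key_hex, 16)
        return min(
            self.nodes,
            key=lambda name: self._hash_int(name.encode()) ^ key_int,
        )

    def put(self, value: bytes) -> str:
        # The key is derived from the value itself (content addressing).
        key = hashlib.sha1(value).hexdigest()
        self.nodes[self._closest_node(key)][key] = value
        return key

    def get(self, key_hex: str):
        # Returns the stored bytes, or None if nothing is stored under the key.
        return self.nodes[self._closest_node(key_hex)].get(key_hex)


dht = ToyDHT(["node-a", "node-b", "node-c"])
key = dht.put(b"hello, swarm")
print(key, dht.get(key))
```

Because the key is a hash of the value, anyone who knows the key can verify that the data returned by `get` was not altered.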
VeChain Has Introduced Blockchain-Based Healthcare Data Management Platform At Cyprus Hospital
Blockchain has enormous potential, which is why blockchain systems are often described as a paradise for data. Over the years, blockchain has been valued for introducing distributed systems that secure data with cryptography.
From the creation of cryptocurrency to distributed ledger systems and mobile applications, this technology is being welcomed by every business vertical, and its adoption has become complementary to how companies operate. The healthcare industry in particular stands to benefit from it.
A number of IT institutions are engaged in finding the most promising usage of blockchain technology in healthcare. Let's take a brief look at Blockchain adoption in 2020.
The Global Blockchain Adoption
In 2020, worldwide spending on blockchain systems stands at USD 4.3 billion. According to Statista, the blockchain market will be worth USD 20 billion by 2025. Its most significant adoption can be seen in the healthcare industry, where the technology is considered the biggest game-changer.
The truth is that this technology has opened a path to distributed systems coupled with unmatched security measures that secure data in a chain of blocks sealed with cryptographic locks. Top-notch security and resistance to tampering by any external entity boost its adoption in many cases.
Deploying Tarantool Cartridge applications with zero effort (Part 2)
We have recently talked about how to deploy a Tarantool Cartridge application. However, an application's life doesn't end with deployment, so today we will update our application and figure out how to manage topology, sharding, and authorization, and change the role configuration.
Feeling interested? Please continue reading under the cut.
Blockchain Is Changing The Way Rail Industry Works
Railways have made transportation much easier since 1830, when the first railway began operating in England. From 1830 to 2020, railways have developed significantly. The concept of blockchain is spreading widely, and public interest in it is growing on a vast scale. The biggest blockchain enthusiasts are investors and businesspeople who want transparency and fairness in transactions. Now that blockchain is no longer just a concept, its application in railways is expected to make transportation smoother.
Deploying Tarantool Cartridge applications with zero effort (Part 1)
We have already presented Tarantool Cartridge, which allows you to develop and package distributed applications. Now let's learn how to deploy and control these applications. Don't panic, it's all under control! We have brought together all the best practices of working with Tarantool Cartridge and written an Ansible role that deploys the package to servers, starts instances and joins them into replica sets, configures authorization, bootstraps vshard, enables automatic failover, and patches the cluster configuration.
Interesting, huh? Dive in, check details under the cut.
How to Write a Smart Contract with Python on Ontology? Part 4: Native API
Earlier, I introduced the Ontology smart contract in:
- Part 1: Blockchain & Block API
- Part 2: Storage API
- Part 3: Runtime API
Today, let’s talk about how to invoke an Ontology native smart contract through the Native API. One of the most typical uses of invoking a native contract is asset transfer.
Tarantool Cartridge: Sharding Lua Backend in Three Lines
At Mail.ru Group, we have Tarantool, a Lua-based application server and database in one. It's fast and classy, but the resources of a single server are always limited. Vertical scaling is not a panacea either. That is why Tarantool has tools for horizontal scaling, namely the vshard module [1]. It allows you to spread data across multiple servers, but you'll have to tinker with it for a while to configure it and bolt on the business logic.
Good news: we got our share of bumps (for example, [2], [3]) and created another framework, which significantly simplifies the solution to this problem.
Tarantool Cartridge is a new framework for developing complex distributed systems. It allows you to concentrate on writing business logic instead of solving infrastructure problems. Under the cut, I will tell you how this framework works and how it can help in writing distributed services.
How to Write a Smart Contract with Python on Ontology? Part 2: Storage API
This is an official tutorial published earlier on the Ontology Medium blog.
Excited to publish it for Habr readers. Feel free to ask any related questions and suggest a better format for tutorial materials.
Foreword
Earlier, in Part 1, we introduced the Blockchain & Block API of Ontology’s smart contract. Today we will discuss how to use the second module: Storage API. The Storage API has five related APIs that enable addition, deletion, and changes to persistent storage in blockchain smart contracts. Here’s a brief description of the five APIs:
How to Write a Smart Contract with Python on Ontology? Part 1: the Blockchain & Block API
This is an official tutorial published earlier on the Ontology Medium blog.
Excited to publish it for Habr readers. Feel free to ask any related questions and suggest a better format for tutorial materials.
Foreword
In this article, we begin introducing Ontology’s smart contract API, which is divided into 7 modules:
- Part 1: Blockchain & Block API
- Part 2: Storage API
- Part 3: Runtime API
- Part 4: Native API
- Part 5: Upgrade API
- Part 6: Execution Engine API
- Part 7: Static & Dynamic Call API
In this article, we will introduce the Blockchain & Block API, which is the most basic part of the Ontology smart contract system. The Blockchain API supports basic blockchain query operations, such as obtaining the current block height, whereas the Block API supports basic block query operations, such as querying the number of transactions for a given block.
Let’s get started!
First, create a new contract in SmartX and then follow the instructions below.
1. How to Use Blockchain API
References to smart contract functions are identical to Python’s references. Developers can introduce the appropriate functions as needed. For example, the following statement introduces GetHeight, the function to get the current block height, and GetHeader, the function to get the block header.
Qrator filtering network configuration delivery system
TL;DR: Client-server architecture of our internal configuration management tool, QControl.
At its core, there’s a two-layered transport protocol working with gzip-compressed messages without decompression between endpoints. Distributed routers and endpoints receive the configuration updates, and the protocol itself makes it possible to install intermediary localized relays. It is based on a differential backup (“recent-stable,” explained further) design and employs the JMESPath query language and Jinja templating for configuration rendering.
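The idea of passing gzip-compressed messages through intermediaries without decompressing them can be sketched as follows. This is only an illustration and not QControl's actual protocol: the function names, the JSON payload, and the sample configuration are hypothetical.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def encode_update(config: dict) -> bytes:
    """Sender side: serialize the configuration and compress it once."""
    return gzip.compress(json.dumps(config, sort_keys=True).encode("utf-8"))


def relay(payload: bytes) -> bytes:
    """Intermediate relay: checks the header and forwards the bytes untouched."""
    if payload[:2] != GZIP_MAGIC:
        raise ValueError("not a gzip-compressed message")
    return payload  # no decompression on the way


def decode_update(payload: bytes) -> dict:
    """Final endpoint: the only place where the message is decompressed."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


update = {"pop": "example-pop", "rules": ["allow 10.0.0.0/8", "drop all"]}
wire = encode_update(update)
received = decode_update(relay(relay(wire)))  # two relay hops
print(received == update)
```

Keeping relays oblivious to the payload means they spend no CPU on compression and cannot accidentally alter the configuration in transit.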
Qrator Labs operates on and maintains a globally distributed mitigation network. Our network is anycast, based on announcing our subnets via BGP. Being a BGP anycast network physically located in several regions across the Earth makes it possible for us to process and filter illegitimate traffic closer to the Internet backbone — Tier-1 operators.
On the other hand, being a geographically distributed network brings its own difficulties. Communication between the network points of presence (PoPs) is essential for a security provider to keep a coherent configuration across all network nodes and to update it in a timely and consistent manner. So, to provide the best possible service for customers, we had to find a way to reliably synchronize configuration data between different continents.
In the beginning, there was the Word… which quickly became a communication protocol in need of an upgrade.
The big interview with Martin Kleppmann: “Figuring out the future of distributed data systems”
Dr. Martin Kleppmann is a researcher in distributed systems at the University of Cambridge, and the author of the highly acclaimed «Designing Data-Intensive Applications» (O'Reilly Media, 2017).
Kevin Scott, CTO at Microsoft, once said: «This book should be required reading for software engineers. Designing Data-Intensive Applications is a rare resource that connects theory and practice to help developers make smart decisions as they design and implement data infrastructure and systems.»
Martin’s main research interests include collaboration software, CRDTs, and formal verification of distributed algorithms. Previously he was a software engineer and an entrepreneur at several Internet companies including LinkedIn and Rapportive, where he worked on large-scale data infrastructure.
Vadim Tsesko (@incubos) is a lead software engineer at Odnoklassniki who works on the Core Platform team. Vadim’s scientific and engineering interests include distributed systems, data warehouses and verification of software systems.
Contents:
- Moving from business to academic research;
- Discussion of «Designing Data-Intensive Applications»;
- Common sense against artificial hype and aggressive marketing;
- Pitfalls of CAP theorem and other industry mistakes;
- Benefits of decentralization;
- Blockchains, Dat, IPFS, Filecoin, WebRTC;
- New CRDTs. Formal verification with Isabelle;
- Event sourcing. Low level approach. XA transactions;
- Apache Kafka, PostgreSQL, Memcached, Redis, Elasticsearch;
- How to apply all those tools to real life;
- Expected target audience of Martin’s talks and the Hydra conference.
Flightradar24 — how does it work? Part 2, ADS-B protocol
The first part described the basic ideas of operation. Now let's go further and figure out exactly what data is transmitted and received between the aircraft and a ground station. We'll also decode this data using Python.
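As a small taste of what such decoding looks like, here is a sketch that reads the header fields of a 112-bit ADS-B (Mode S extended squitter) frame. The sample frame is one commonly used in ADS-B tutorials, not data from this article, and the function name is hypothetical. The sketch relies on the standard field layout: 5 bits of downlink format, 3 bits of capability, a 24-bit ICAO address, a 56-bit message, and a 24-bit parity field.

```python
def parse_adsb_header(hex_frame: str) -> dict:
    """Extract header fields from a 112-bit ADS-B frame given as hex."""
    data = bytes.fromhex(hex_frame)
    if len(data) != 14:
        raise ValueError("expected a 112-bit (14-byte) frame")
    return {
        "df": data[0] >> 3,                # downlink format, 17 for ADS-B
        "ca": data[0] & 0x07,              # transponder capability
        "icao": data[1:4].hex().upper(),   # 24-bit aircraft address
        "type_code": data[4] >> 3,         # first 5 bits of the message field
    }


sample = "8D4840D6202CC371C32CE0576098"
print(parse_adsb_header(sample))
# {'df': 17, 'ca': 5, 'icao': '4840D6', 'type_code': 4}
```

The type code tells the decoder how to interpret the rest of the message field; type codes 1 to 4 carry aircraft identification, so a fuller decoder would go on to extract the callsign from this frame.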
Building a Private Currency Service Using Exonum
Potential applications for zero-knowledge proofs (ZKPs) include, but are not limited to:
- Inter-bank transfer systems (see a research paper by Narula et al.)
- Privacy-focused management of digital assets (see a proof of concept by J.P. Morgan and zCash)
- KYC (see a proof of concept by ING)
- Self-sovereign identity (see an attribute-based credentials EU project)
- Voting (see a proxy voting prototype by Russian National Security Depository)
Another application for zero-knowledge proofs is helping blockchains scale. ZKPs allow for the “compressing” of computations for blockchain transactions without sacrificing security.
In this article, we describe how zero-knowledge proofs (specifically, Bulletproofs) can be applied to build a privacy-focused service using Bitfury’s Exonum platform.
Authors' contribution
- olegchir 356.4
- kovalensky 173.0
- Bright_Translate 163.2
- andreyka26 159.0
- clubadm 159.0
- jirfag 152.0
- ph_piter 144.6
- sergepetrenko 141.0
- bitec 129.0
- nezhibitskiy 128.0