Xuanxingbao shares: how to build a unified, shared master data platform and create truly clean data governance capabilities

This content was originally created by EEWORLD forum user Xuanxingbao. To reprint it or use it for commercial purposes, you must obtain the author's consent and indicate the source.

Today, the business environment is changing rapidly and competition is becoming increasingly fierce.

No matter what industry you are in, a keyword you cannot avoid is "digital transformation." Making enterprises agile through digital transformation has become the spirit of the times and the mission of our generation of IT people.

However, both the innovation needs at the business level and the data analysis needs at the decision-making level must be supported by clean and accurate business data. Only with a standardized and clean data foundation can we talk about innovation, make scientific decisions in a complex and changing business environment, and implement digital transformation strategies.

Within an enterprise's complex data, there is a category that affects the business as a whole, such as customer data, product data, and employee data. This data is reused frequently across systems, and it has become a difficulty and pain point in data governance.

Master data management takes this shared, relatively static data as its starting point and, through governance and standardization, aims to establish a unified, shared management system and create truly clean data governance capabilities.

However, master data management is an implementation-heavy undertaking. It is not simple: it involves a great deal of dirty, tiring work, and the implementation risk is high.

What are the risks of implementing a master data management project?
What are the key considerations in selecting a master data product?

Xuanxingbao: In your opinion, what type of data belongs to master data, and what is its relationship with other data?

Zhang Jinliang: There are three criteria for master data. The first is uniqueness. This is easy to understand: master data must be unique and cannot be duplicated.

The second is sharing. Master data must be able to circulate throughout the enterprise's business systems and be used by all of them.

The third is being static. This data does not change very frequently; unlike transaction data, which may change dozens of times a minute, master data is relatively static.

Generally speaking, we use these three criteria to sort out the traditional concept of master data. Of course, the way master data is managed may be extended or adjusted, but we still define it mainly at the data level.

As for the relationship between business data and master data: master data is the basis of business data. When master data arrives in the various business systems, each system supplements it with its own business attributes, making the data richer.

At the same time, business data and transaction data are generated during operations using master data as their foundation. Master data is therefore the most basic and core part of all data.
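
To make the relationship concrete, here is a minimal, hypothetical Python sketch (the entities and field names are invented for illustration): the master record carries the shared, system-agnostic attributes, and each business system references it while adding its own attributes.

```python
from dataclasses import dataclass

# Shared master record: the attributes every system agrees on.
@dataclass(frozen=True)
class CustomerMaster:
    customer_id: str   # globally unique master data key
    legal_name: str
    tax_number: str

# Each business system keeps a reference to the master record
# and supplements it with its own business attributes.
@dataclass
class CrmCustomer:
    master: CustomerMaster
    sales_owner: str     # CRM-specific attribute
    lead_source: str

@dataclass
class ErpCustomer:
    master: CustomerMaster
    credit_limit: float  # ERP-specific attribute
    payment_terms: str

golden = CustomerMaster("C-0001", "Acme Ltd", "TAX-123456")
crm_view = CrmCustomer(golden, sales_owner="alice", lead_source="trade fair")
erp_view = ErpCustomer(golden, credit_limit=500_000.0, payment_terms="NET 30")
```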

Xuanxingbao: How should we understand the relationship between the two concepts of master data management and data governance?

Zhang Jinliang: In fact, our simple understanding of master data and data governance is that they work together to help enterprises improve data quality.

In fact, data governance is a part of data management, and master data management is the core part of data management.

When an enterprise does data governance, it must first have master data management, data standards and specifications, and a mature master data management process in place. Only on that basis can it do data governance, including data cleansing, because then there are rules to follow. What standard should you use to clean and govern the data? If the standards are not clear, the data will only become more and more chaotic: today one department works to one standard, and tomorrow another department uses a different one. There must be a unified standard.

Master data management implements that whole set of standards and processes, including the definitions, at the master data level to guarantee the quality of this data. With that in place, broader data quality work becomes much easier.

Stibo believes that master data management is the cornerstone of data management as a whole.

Xuanxingbao: Generally speaking, what kind of process does an enterprise go through when implementing a master data system?

Zhang Jinliang: Generally speaking, it is divided into the following steps:

Step 1: Definition of master data

We need to do some education and communication, and discuss with the customer what data they consider to be master data. This process is called master data definition, or master data identification. The criteria are the uniqueness, sharing, and static nature mentioned earlier.

Step 2: Determine the maintenance process and standard specifications for master data

After the interviews, we determine the standard specifications for maintaining this data. Where existing practices are reasonable, we build on them; where something can be changed or optimized, we give suggestions, down to the field level: what the data type is, its size, its length, and so on. These are the master data standards.

At the same time, we also help the customer sort out the data maintenance process: who will be involved in it going forward, what roles are needed at each node in the process, and what kind of people should be recommended to take up those positions and be responsible for this area.

Because data standards are not set in stone; even once they are defined, changes will come up, and there must be a dedicated person or organization to handle them.
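
As a rough illustration of what "down to the field level" can mean, the sketch below declares a few field standards (type, maximum length, whether required) and checks one record against them. The field names and limits are assumptions for illustration, not a real customer's specification.

```python
# Hypothetical field-level master data standard: name -> (type, max length, required)
FIELD_STANDARDS = {
    "supplier_code": (str, 20, True),
    "supplier_name": (str, 100, True),
    "phone":         (str, 11, False),
}

def check_record(record: dict) -> list[str]:
    """Return a list of violations of the field standards for one record."""
    problems = []
    for field, (ftype, max_len, required) in FIELD_STANDARDS.items():
        value = record.get(field)
        if value in (None, ""):
            if required:
                problems.append(f"{field}: required field is missing")
            continue
        if not isinstance(value, ftype):
            problems.append(f"{field}: expected {ftype.__name__}, got {type(value).__name__}")
        elif len(value) > max_len:
            problems.append(f"{field}: length {len(value)} exceeds limit {max_len}")
    return problems

print(check_record({"supplier_code": "S" * 40, "supplier_name": "Acme"}))
# ['supplier_code: length 40 exceeds limit 20']
```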

Step 3: Clean historical data and enter the master data system

Data cleaning is a big part of master data implementation. If the data is low quality, dirty, or messy, it will still be dirty and messy after it enters the master data system. Without data cleaning, you are merely copying the dirty data from one place to another, which does not solve the fundamental problem.

The historical data is cleaned against the standard specifications determined earlier, so that only clean data enters the master data management system.
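
A minimal sketch of what such a cleaning pass might do before load, assuming hypothetical customer records: normalize obvious formatting noise and drop exact duplicates on the business key, so only one standardized copy of each record reaches the master data system.

```python
import re

def normalize(record: dict) -> dict:
    """Standardize one historical record before it is loaded (illustrative rules only)."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()               # remove stray whitespace
            value = re.sub(r"\s+", " ", value)  # collapse internal whitespace
        cleaned[key] = value
    # Illustrative format rule: keep only digits in phone numbers.
    if isinstance(cleaned.get("phone"), str):
        cleaned["phone"] = re.sub(r"\D", "", cleaned["phone"])
    return cleaned

def deduplicate(records: list[dict], key: str) -> list[dict]:
    """Keep the first occurrence of each business key; later exact repeats are dropped."""
    seen, unique = set(), []
    for record in records:
        k = record.get(key)
        if k not in seen:
            seen.add(k)
            unique.append(record)
    return unique

history = [
    {"customer_id": "C001", "name": "  Acme   Ltd ", "phone": "021-1234 5678"},
    {"customer_id": "C001", "name": "Acme Ltd",      "phone": "02112345678"},
]
print(deduplicate([normalize(r) for r in history], key="customer_id"))
# [{'customer_id': 'C001', 'name': 'Acme Ltd', 'phone': '02112345678'}]
```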

Step 4: Data mapping

After cleaning, the master data system stores only reliable data, while the business systems may still hold duplicate or very poor-quality data. This is where mapping is required.

The master data management system pushes the cleaned data back to the business systems and retains the mapping relationship. Because transactions are still running, completely replacing the data could make the original system's documents and historical data unusable, so a mapping relationship and a transition process are needed.
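
One common way to retain that mapping, sketched here with invented identifiers, is a simple cross-reference table linking the golden-record key to each legacy key that the business systems still use, so old documents keep resolving to the cleaned record during the transition.

```python
# Hypothetical cross-reference: one golden record key -> legacy keys per business system.
CROSS_REFERENCE = {
    "MDM-10001": {"erp": "ERP-88231", "crm": "CRM-00417"},
    "MDM-10002": {"erp": "ERP-90102", "crm": "CRM-00552"},
}

# Reverse index so a legacy document can still be resolved to the golden record.
LEGACY_TO_MASTER = {
    (system, legacy_id): master_id
    for master_id, legacy_ids in CROSS_REFERENCE.items()
    for system, legacy_id in legacy_ids.items()
}

def resolve(system: str, legacy_id: str):
    """Find the master data key behind a legacy business-system identifier."""
    return LEGACY_TO_MASTER.get((system, legacy_id))

print(resolve("erp", "ERP-88231"))  # MDM-10001
```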

Xuanxingbao: After the project goes online, what mechanism will ensure that the newly generated data complies with the specifications?

Zhang Jinliang: Usually, we think in terms of three stages: before, during, and after the event.

Beforehand, before the data comes in, it needs to be verified. If the quality is poor or there are problems, it is rejected. That is one line of defense.

During the process, people doing maintenance will make human errors; no one can guarantee that everything they do is correct. So there is also monitoring and management while the data is being maintained.

Afterwards, when the master data system pushes data to the business systems, it must be pushed in accordance with each business system's requirements and specifications.

All three stages, before, during, and after the event, require support from a data management system, and our product, Stibo, provides these functions.

For example, there are verification interfaces. Even if your own business department does the maintenance, the data still has to go through the master data system's verification interface to ensure that what comes in is correct.

At the same time, there are data quality analysis reports that can be run regularly. If there are problems, the reports tell you directly what is wrong with the data. This is very important for data maintenance and management, because you can see at a glance which data has problems and fix it directly.

Another function is business rules and process verification mechanisms. While you are doing maintenance, the system can remind you that you have made a mistake or that your work does not meet the standards. Taken together, the data standards, specifications, and processes form a complete system that keeps the data clean.
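
As a rough sketch of the kind of regularly run quality report described above (this is not Stibo's actual reporting feature; the rules and field names are invented), the snippet scans a batch of records and lists, per record, which rules it breaks, so a data steward can see at a glance what needs fixing.

```python
# Illustrative quality rules: name -> function that returns True when the record passes.
RULES = {
    "name is filled in":        lambda r: bool(r.get("name", "").strip()),
    "phone is 11 digits":       lambda r: r.get("phone", "").isdigit() and len(r.get("phone", "")) == 11,
    "code is at most 20 chars": lambda r: len(r.get("code", "")) <= 20,
}

def quality_report(records):
    """Return {record code: [failed rule names]} for records that break any rule."""
    report = {}
    for record in records:
        failures = [name for name, passes in RULES.items() if not passes(record)]
        if failures:
            report[record.get("code", "<no code>")] = failures
    return report

batch = [
    {"code": "MAT-001", "name": "Bearing 6204", "phone": "13800138000"},
    {"code": "MAT-002", "name": "",             "phone": "123"},
]
print(quality_report(batch))
# {'MAT-002': ['name is filled in', 'phone is 11 digits']}
```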

Xuanxingbao: What kind of logic is behind the data verification mechanism? Can you give some examples?

Zhang Jinliang: We actually run into this kind of verification all the time. To put it simply, when filling in a form online, a given box may be a text field into which you cannot enter numbers.

Simply put, if the code is defined as 20 characters long, you cannot enter 40. Or if your mobile phone number exceeds 11 digits, the system assumes you have entered it wrong.

A more complicated example: when your ID number comes in, the system verifies whether it was simply made up, because the ID number carries a check digit. The remaining digits and the area code cannot just be entered at random either.

It gets more complicated still. For example, after data comes in, duplicate detection checks its uniqueness against the existing master data. If a record was entered before and someone enters something very similar two days later, the system issues a reminder. Mechanisms like these keep newly added data continuously clean.
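
The two kinds of check mentioned here can be sketched in a few lines of Python. The check-digit rule below is the commonly documented MOD 11-2 scheme for 18-digit ID numbers, and the duplicate check uses plain string similarity from the standard library; both are illustrative simplifications rather than a vendor's built-in rules.

```python
import difflib

# Weights and check characters of the commonly documented MOD 11-2 scheme
# for 18-digit ID numbers (illustrative; real products ship richer rule sets).
WEIGHTS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
CHECK_CHARS = "10X98765432"

def id_check_digit_ok(id_number: str) -> bool:
    """Reject ID numbers whose final check digit does not match the first 17 digits."""
    if len(id_number) != 18 or not id_number[:17].isdigit():
        return False
    total = sum(int(d) * w for d, w in zip(id_number[:17], WEIGHTS))
    return id_number[17].upper() == CHECK_CHARS[total % 11]

def looks_like_duplicate(new_name: str, existing_names: list[str], threshold: float = 0.9) -> list[str]:
    """Flag existing master records whose name is suspiciously similar to the new entry."""
    return [
        name for name in existing_names
        if difflib.SequenceMatcher(None, new_name.lower(), name.lower()).ratio() >= threshold
    ]

print(looks_like_duplicate("Acme Trading Co., Ltd", ["ACME Trading Co. Ltd", "Globex Corp"]))
# ['ACME Trading Co. Ltd']
```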

Xuanxingbao: As a project with relatively high implementation risks, what factors do you think may lead to the failure of the master data management project?

Zhang Jinliang: In fact, from the perspective of master data management, the implementation difficulties mainly lie in the following aspects:

1. How to drive business departments to implement new management standards

The situation in many companies is that the business side wants to use the business systems, but believes that maintaining and managing the data is IT's responsibility.

The reality is that much of the data actually originates in the business departments. As a result, when the data maintenance process is defined, it is hard to push the new norms down into the organization: everyone is happy to enjoy the convenience of standardized data, but not necessarily willing to bear the constraints the norms impose.

2. Cleaning historical data is a dirty and tiring job

The other very important point is data cleaning. Which historical data can be taken into the master data? It must be cleaned before it goes in, and this step is critical.

In theory, every field of every record has to be checked, so it is difficult and the workload is large. In our words, it is dirty, tiring work.

This step is also a major source of risk and is key to the success or failure of the project.

Xuanxingbao: What strategies can reduce implementation risk?

Zhang Jinliang: First of all, we must ensure strong leadership

This project definitely requires a higher-level leader to promote it. Only a senior leader can coordinate the resources or manpower among various departments.

For example, it takes experts and team leaders from the various departments to standardize the data, formulate the data standards, and lead the process.

If a data change is disputed, this leader can arbitrate, so the project must be driven by someone at a relatively senior level.

Second, provide incentives

Throughout project implementation, we define the data and its ownership, and determine who is in charge of which data and which department is responsible for which part.

Our system keeps a complete audit trail that records who performed which maintenance, when, and what data was changed. There is also an assessment of data quality: KPI indicators can measure the timeliness and accuracy of data maintenance. Tallying these KPIs per person and linking them to performance provides the corresponding positive or negative incentives, and encourages users to contribute while they enjoy the benefits of high data quality.
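
To make this concrete, here is a toy sketch of how such KPIs might be computed from an audit trail. The log format, field names, and two-day timeliness threshold are assumptions made for illustration, not a description of the product's actual assessment module.

```python
from datetime import datetime, timedelta

# Hypothetical audit trail entries: who changed what, when, and whether the
# change was later flagged as an error by a quality check.
AUDIT_LOG = [
    {"user": "alice", "record": "C001", "requested": "2024-03-01", "completed": "2024-03-02", "flagged": False},
    {"user": "alice", "record": "C002", "requested": "2024-03-01", "completed": "2024-03-05", "flagged": True},
    {"user": "bob",   "record": "C003", "requested": "2024-03-02", "completed": "2024-03-02", "flagged": False},
]

def maintenance_kpis(log, on_time_days: int = 2):
    """Per user: accuracy rate and share of changes completed within the deadline."""
    totals = {}
    for entry in log:
        stats = totals.setdefault(entry["user"], {"changes": 0, "errors": 0, "on_time": 0})
        stats["changes"] += 1
        stats["errors"] += entry["flagged"]
        delay = datetime.fromisoformat(entry["completed"]) - datetime.fromisoformat(entry["requested"])
        stats["on_time"] += delay <= timedelta(days=on_time_days)
    return {
        user: {
            "accuracy": 1 - stats["errors"] / stats["changes"],
            "timeliness": stats["on_time"] / stats["changes"],
        }
        for user, stats in totals.items()
    }

print(maintenance_kpis(AUDIT_LOG))
# {'alice': {'accuracy': 0.5, 'timeliness': 0.5}, 'bob': {'accuracy': 1.0, 'timeliness': 1.0}}
```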

Third, use product and technical means to preserve users' existing data maintenance habits as much as possible

For example, embed the master data system's pages directly into the business system and follow the user's existing maintenance habits, so that the user does not feel like they are maintaining the master data system. They feel they are maintaining data in the business system, when in fact the data is going into the master data system.

Minimizing changes to habits lowers business users' resistance and reduces implementation risk.

Xuanxingbao: From the customer's perspective, what dimensions do you think should be considered when choosing a master data management platform?

Zhang Jinliang: From a customer's perspective, when choosing a master data platform, one should consider the following aspects:

First, ease of use

Is it easy to use and easy to get started with? Can the business departments use it themselves? If the product is easy enough to use, it is easier to reduce business-side resistance.

Second, scalability

This is what I mentioned earlier; it could also be called business responsiveness. If the data structures and standards change, can we respond to the business requirement immediately, instead of going back to the original vendor for redevelopment, rebuilding the structure, redeploying, and so on? By the time all of that is done, the window of opportunity may already have passed.

Third, the longevity of the vendor and the product

Because master data is so important, the vendor must clearly be a company that will still be around in the future. It cannot be a company that disappears in a few years, leaving no one to maintain or update such an important system and data. This is also critical.

Fourth, the professional capabilities of the implementation staff

It may look like pure data work, but the implementers need a deeper understanding of the business. They can share experience with the customer, such as how data is typically maintained in a given industry, what data standards are usually defined, and what the typical quality attributes of the data are. That gives the customer guidance.

Fifth, project cycle

This part is put last, but it is actually very important.

Many customers want a short implementation cycle; once they discover data quality problems, they want them addressed as quickly as possible. So the project implementation cycle matters a great deal: can the data be brought under management within, say, half a year?

Xuanxingbao: What do you think are the key differences between you and your competitors?

Zhang Jinliang: In fact, ease of use and strong scalability are the product advantages of Stibo.

For example, validation rules can usually be configured to meet the individual needs of customers.

In addition, the interface is fully graphical, and the data model can be modified graphically. When changes come up or new fields are needed, the user only needs a few clicks on the page to add the field directly; the existing data is not affected at all and can continue to be used.

The advantage of this is that the implementation cycle is short and future expansion is relatively easy.

To give a simple example, some of our customers abroad are in retail, handling product information and product launches.

A new product may need to go live within just a few days. If something needs adjusting and you cannot do it yourself, having to go back to the original vendor to do it for you is very troublesome.

So in a sense, this is not a cost issue, but a question of agility. Today, competition is extremely fierce, so the business department will make extremely high demands on your IT department, asking, "Why haven't you entered the data for me yet? I need to sell it quickly."

So in such a situation, you must have a highly flexible and agile tool that can help you achieve this capability.

Xuanxingbao: What deployment methods are supported?

Zhang Jinliang: Both on-premises deployment and cloud deployment are supported.

Many customers still use on-premises deployment, which may be more common in China. Cloud deployment is also supported, whether on a public cloud such as Amazon's cloud, Microsoft's cloud, or Huawei Cloud in China (we have many customers deployed on Huawei Cloud), or on the customer's own private cloud.

Xuanxingbao: Who are your typical customers at home and abroad?

Zhang Jinliang: Our typical cases around the world fall into several major industries, including retail, distribution, and manufacturing. Among them are familiar brands such as McDonald's, Walmart, Carrefour, and Hermès.

Domestic clients include companies such as Vinda and Lesso, as well as clients in the manufacturing, automotive, finance, tourism, medical and other industries.

 
 
