Principles of entity matching for business credit report information

The credit investigation report is both the output of credit investigation and the product used in it. It is a written report that a business credit-checking agency prepares for a client: the agency gathers data and records reflecting the credit status of the investigated party through information collection, field investigation, or interviews, then verifies and processes them.
Once the similarity of two entity records has been calculated on their shared attribute fields, entity matching is carried out using both the similarity of each attribute field and that field's ability to identify an entity. The goal is to judge whether two inconsistent credit records describe the behavior of the same real-world entity, so that credit information can be collected, stored, and displayed per entity.
There are two common entity matching methods:
(1) Manual weighted matching: A weight is assigned by hand to each attribute field according to its ability to identify the entity and its importance in the matching operation, and a matching threshold is also set by hand. The overall similarity of an entity record pair is then calculated as a linear weighted sum of the attribute similarities. If the overall similarity exceeds the matching threshold, the pair is judged to be a match; otherwise it is not.
(2) Machine learning matching: Entity record pairs are randomly selected from the target data sources involved in the matching operation, and the matching result of each pair is labeled by hand to form the training samples. A learner is trained on these samples to build a model that captures the matching function implicit in them. In the entity matching operation on the target data source, this matching function is then used to determine the matching relationship between entity records.
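The two methods can be sketched side by side in a few lines of Python. This is only an illustration under stated assumptions: the field names, weights, threshold, and labeled samples are all made up, and the per-field similarities are assumed to be already computed and to lie in [0, 1]. The text does not name a particular learner, so plain logistic regression trained by gradient descent stands in for the "learner" here.

```python
import math

# Hypothetical shared attribute fields; per-field similarities are assumed
# to have been computed already and to lie in [0, 1].
FIELDS = ["name", "registration_no", "address"]


def weighted_match(sims, weights, threshold):
    """Manual weighted matching: linear weighted sum compared with a threshold."""
    score = sum(weights[f] * sims[f] for f in FIELDS)
    return score, score > threshold


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_matcher(samples, labels, epochs=2000, lr=0.5):
    """Learn a matching function from manually labeled record pairs.

    Uses logistic regression trained by batch gradient descent, which is
    an illustrative choice of learner, not one prescribed by the text.
    """
    w = [0.0] * len(FIELDS)
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(FIELDS)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def learned_match(x, w, b):
    """Apply the learned matching function to one record pair."""
    p = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return p, p > 0.5


# (1) Manual weighted matching with hand-picked (made-up) weights.
weights = {"name": 0.3, "registration_no": 0.5, "address": 0.2}
pair_a = {"name": 0.9, "registration_no": 1.0, "address": 0.6}
pair_b = {"name": 0.8, "registration_no": 0.0, "address": 0.5}
print(weighted_match(pair_a, weights, threshold=0.7)[1])  # True
print(weighted_match(pair_b, weights, threshold=0.7)[1])  # False

# (2) Machine learning matching on a tiny, made-up labeled sample.
# Each row holds similarities in FIELDS order; label 1 means "same entity".
train_x = [
    [0.90, 1.0, 0.6], [0.95, 1.0, 0.9], [0.70, 1.0, 0.4], [0.85, 1.0, 0.8],
    [0.80, 0.0, 0.5], [0.30, 0.0, 0.2], [0.60, 0.0, 0.9], [0.20, 0.0, 0.1],
]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]
w, b = train_matcher(train_x, train_y)
print(learned_match([0.9, 1.0, 0.7], w, b)[1])
print(learned_match([0.55, 0.0, 0.5], w, b)[1])
```

In the manual method, the weights and threshold encode the analyst's judgment directly. In the learned method, the same kind of linear score is fitted from the labeled pairs, so its quality depends on how representative the labeled sample is of the target data source.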

