Data Cleansing
The data used for business analysis are often of poor quality. They contain duplicates, conflicts, gaps, outliers and other errors. Such problems cannot be avoided completely, so the data must be cleansed. Both organizational and software methods are used to improve the quality of the input information.
When building analytical solutions, low data quality is one of the most serious problems, because incorrect information leads to wrong conclusions. Even the most advanced analytical methods fail on such data, so special data cleansing mechanisms are required. Loginom addresses data cleansing with the following built-in tools:
- Error detection. Loginom has built-in algorithms for detecting different kinds of errors. These include finding gaps in ranked or unranked data, detecting abnormal deviations, searching for duplicates and conflicts, removing noise and applying different types of filtering. Once an error detection scenario has been prepared, the user can apply it to new input data.
- Error correction. Loginom not only detects errors but also corrects them, for example by filling in gaps or adjusting outliers. Various algorithms can be used for this, selecting the correct value based on statistics or on data from an external source.
- Data deduplication. Virtually every company faces the problem of data duplication, when one object (a company, a product or a person) is stored in the database under different names. For correct processing, such data need to be deduplicated, which means merging the records that refer to the same object. The names may contain typos, transposed words and other problems which prevent deduplication based on exact matching. Loginom resolves this problem by finding matching and similar names and assessing their degree of similarity.
- Integration. Once the data cleansing scenarios have run, the integration tools built into Loginom load the corrected data into various systems, log the operations and raise an alert if errors occur.
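To make the detection, correction and deduplication steps above more concrete, here is a minimal Python sketch using only the standard library. It does not show Loginom's own algorithms or interface. All function names, the sample data, the Tukey-fence outlier rule, the median fill and the 0.85 similarity threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from statistics import median, quantiles


def find_gaps(values):
    """Return the indices of missing (None) entries."""
    return [i for i, v in enumerate(values) if v is None]


def find_outliers(values, k=1.5):
    """Return indices of values outside the Tukey fences (quartiles +/- k * IQR)."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v is not None and not low <= v <= high]


def correct(values, bad_indices):
    """Replace gaps and flagged entries with the median of the remaining values."""
    bad = set(bad_indices)
    good = [v for i, v in enumerate(values) if i not in bad and v is not None]
    fill = median(good)
    return [fill if i in bad or v is None else v for i, v in enumerate(values)]


def normalize(name):
    """Lowercase, drop punctuation, and sort words so word order does not matter."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(a, b):
    """Similarity score in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def deduplicate(names, threshold=0.85):
    """Greedily group names that are similar enough to a group's first member."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical input: one gap (None) and one abnormal value (990).
sales = [102, 98, None, 105, 990, 101, 97, 103, 99]
gaps = find_gaps(sales)
outliers = find_outliers(sales)
cleaned_sales = correct(sales, gaps + outliers)

# Hypothetical names: transposed words, stray punctuation and a typo.
companies = ["Acme Trading Ltd", "Trading Acme Ltd.", "Acme Tradng Ltd", "Globex Corp"]
groups = deduplicate(companies)
```

With this sample data the gap and the value 990 are both replaced by the median of the remaining values, and the three "Acme" variants end up in one group while "Globex Corp" stays separate. A production scenario would tune the threshold and choose the fill rule per field.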
Data cleansing is one of the most important tasks in analytics. It is the most time-consuming part of developing a solution, but it is essential in any project. Loginom provides everything necessary for data cleansing, and most of the operations can be performed automatically.


