Data Normalizer

Normalize your data and correct inconsistent spelling with AI in seconds

I spend hours at my job manually fixing data entries because of inconsistent spelling and abbreviations, so I built a tool that automates this process using large language models.

The idea is a tool that takes CSV exports and:
> Corrects inconsistent spelling (Coop vs co-op)
> Harmonizes abbreviations (Limited vs Ltd.)
> Fixes spelling mistakes (serbices vs services)

This is how the tool works:
– You upload a CSV file and specify which column you want to extract and harmonize.
– The model automatically consolidates data by combining similar-looking phrases.
– You can edit the proposed phrase names or further consolidate entries if there are some groups the model has missed.
– In the end, you can download your CSV file again and push it to the database.
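
The steps above can be sketched in a few lines of Python. This is only an illustration of the workflow, not the tool's actual implementation: the tool uses a large language model to propose groups, while this sketch swaps in plain character similarity from the standard library (`difflib.SequenceMatcher`). The function names, the 0.8 threshold, and the rule that the first spelling seen becomes the group name are all assumptions made for the example.

```python
import csv
import io
from difflib import SequenceMatcher


def _clean(text):
    """Lowercase and drop punctuation so 'Co-op' and 'Coop' compare equal."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()


def similar(a, b, threshold=0.8):
    """Return True when two phrases look alike (assumed threshold: 0.8)."""
    return SequenceMatcher(None, _clean(a), _clean(b)).ratio() >= threshold


def group_phrases(values, threshold=0.8):
    """Greedily put each value into the first group whose name it resembles.

    The first spelling seen becomes the group name (an assumption; in the
    tool, a person can edit the proposed name afterwards).
    """
    groups = {}  # group name -> list of original values
    for value in values:
        for name in groups:
            if similar(value, name, threshold):
                groups[name].append(value)
                break
        else:
            groups[value] = [value]
    return groups


def harmonize_column(csv_text, column, threshold=0.8):
    """Read CSV text, harmonize one column, and return the new CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    groups = group_phrases([row[column] for row in rows], threshold)
    # Map every original value to the name of the group it landed in.
    mapping = {value: name for name, members in groups.items() for value in members}
    for row in rows:
        row[column] = mapping[row[column]]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    data = "id,company\n1,Acme Co-op\n2,Acme Coop\n3,Acme services\n4,Acme serbices\n"
    print(harmonize_column(data, "company"))
```

Character similarity handles spelling variants and typos, but it will not pair abbreviations such as Limited and Ltd. at this threshold. Cases like that need an explicit lookup table or, as in the tool, a language model.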

Features and Benefits of Data Normalizer

  • Normalize data in Excel, Python, R, SQL, and CSV formats.
  • Use fuzzy matching, fuzzy search, and Levenshtein distance techniques.
  • Organize data entries to ensure consistency across fields and records.

Data Normalizer normalizes data across these formats using techniques such as fuzzy matching and Levenshtein distance, keeping entries consistent across fields and records.
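
For readers unfamiliar with Levenshtein distance, here is a minimal sketch of the standard dynamic-programming version in Python. It counts the fewest single-character insertions, deletions, and substitutions needed to turn one string into another. The `closest` helper is a hypothetical example of fuzzy search built on top of it. Neither function is taken from the tool itself.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def closest(query, candidates):
    """Hypothetical fuzzy-search helper: the candidate with the smallest distance."""
    return min(candidates, key=lambda c: levenshtein(query.lower(), c.lower()))


if __name__ == "__main__":
    print(levenshtein("serbices", "services"))  # 1 (one substitution)
    print(closest("serbices", ["services", "Limited", "co-op"]))  # services
```

A small distance relative to the string length suggests a typo or spelling variant, which is why this measure suits cases like serbices vs services.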
