
5 Tips to Keep Your Salesforce Data Duplicate-Free

Duplicate management is one way to rein in the inaccuracies in your CRM. It should be straightforward, but it rarely is. Salesforce administrators can easily become overwhelmed by the seemingly endless ways that a single account can generate duplicate records in the database. And although Salesforce provides a built-in duplicate management function, it often falls short of eliminating duplicates completely.

We’ve assembled 5 tips to help you keep your data clean and duplicate-free.  

1) Assess Your Data Quality

The first step toward increasing data quality is understanding the state of your data. A data quality assessment or data profile will allow you to document the current state of your data by reviewing the following conditions: 

  • Validity: Does the data clearly and accurately represent the intended result?
  • Integrity: Are there safeguards in place that reduce the risk of data entry errors corrupting the data?
  • Precision: Does the data contain enough detail to serve as a basis for business decisions?
  • Reliability: Does the data reflect consistent collection processes and analysis methods over the course of its lifetime?
  • Timeliness: How soon can the data be made available? Is the data current? Can it be available for real-time decision making?

An effective data assessment identifies the actions needed to improve your data quality and offers insight into how best to maintain data integrity in the future.
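
To make a few of these checks concrete, here is a minimal Python sketch of a data profile. It assumes your records have been exported to a list of dictionaries; the field names (`email`, `last_modified`), the simple email pattern, and the 365-day freshness threshold are all illustrative assumptions, not Salesforce defaults.

```python
import re
from datetime import date

# Hypothetical sample of exported contact records; field names are illustrative.
records = [
    {"name": "Ann Lee", "email": "ann.lee@example.com", "last_modified": date(2024, 5, 1)},
    {"name": "Bob Ray", "email": "bob.ray@example", "last_modified": date(2021, 1, 15)},
    {"name": "", "email": "", "last_modified": date(2023, 11, 30)},
]

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def profile(records, today, stale_after_days=365):
    """Return simple rates for completeness, validity, and timeliness."""
    if not records:
        raise ValueError("no records to profile")
    total = len(records)
    filled = sum(1 for r in records if r["email"])
    valid = sum(1 for r in records if EMAIL_RE.match(r["email"]))
    fresh = sum(
        1 for r in records
        if (today - r["last_modified"]).days <= stale_after_days
    )
    return {
        "email_fill_rate": filled / total,    # how many records have an email at all
        "email_valid_rate": valid / total,    # how many emails look well formed
        "fresh_rate": fresh / total,          # how many were touched recently
    }

report = profile(records, today=date(2024, 6, 1))
for metric, value in report.items():
    print(f"{metric}: {value:.0%}")
```

Tracking a handful of rates like these over time gives you a baseline to compare against after each cleanup effort.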

2) Stop Duplicates Before Importing New Records

Duplicates in your data can arise for many different reasons, and it helps to know where yours come from. Salesforce can notify users that they are about to create a duplicate record and even block them from doing so, but duplicates are stealthy: they also enter your database when you import new records. For that reason, you should always run a deduplication tool before a mass import. Deduplicate, merge, and append records while importing to Salesforce. That way, you save yourself the trouble of wrestling with duplicates after they have been introduced to your database.
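
As a rough illustration of a pre-import check, the Python sketch below compares an incoming file against existing records using a normalized match key. The field names (`name`, `email`, `company`) and the helper functions `match_key` and `split_import` are hypothetical; this is not a Salesforce API, and it only catches exact matches after normalization, not "fuzzy" ones.

```python
def match_key(record):
    """Build a normalized key: the email if present, otherwise name plus company."""
    email = (record.get("email") or "").strip().lower()
    if email:
        return ("email", email)
    # Lowercase and collapse repeated whitespace so "Bob  Ray" equals "bob ray".
    name = " ".join((record.get("name") or "").lower().split())
    company = " ".join((record.get("company") or "").lower().split())
    return ("name_company", name, company)

def split_import(existing, incoming):
    """Separate incoming records into safe-to-import and likely duplicates."""
    seen = {match_key(r) for r in existing}
    to_import, duplicates = [], []
    for record in incoming:
        key = match_key(record)
        if key in seen:
            duplicates.append(record)   # matches the database or an earlier row
        else:
            seen.add(key)               # also catches repeats within the file
            to_import.append(record)
    return to_import, duplicates

existing = [{"name": "Ann Lee", "email": "ann.lee@example.com", "company": "Acme"}]
incoming = [
    {"name": "Ann  Lee", "email": " Ann.Lee@Example.com ", "company": "ACME"},
    {"name": "Bob Ray", "email": "", "company": "Globex"},
    {"name": "bob  ray", "email": "", "company": "globex"},
]

to_import, duplicates = split_import(existing, incoming)
print(len(to_import), "to import;", len(duplicates), "held back for review")
```

Holding suspected duplicates back for review, rather than discarding them, leaves room to merge or append their data to the existing record.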

3) Maintain a Data Quality Assessment Report

Reporting is crucial to tracking and improving your data management efforts. The report should answer the questions that initiated the data quality assessment, provide actionable insights into the problems you are experiencing, and describe how to detect and locate problem data. It’s a good idea to produce the report on a regular schedule so that you can identify potential problems early. A report will also help your data team take action if critical objects are found to have fallen outside the initial scope of the data quality assessment. In addition, such monitoring can help you evaluate your current data governance process and make changes where needed. If you are migrating data to a new environment, such as in a CRM migration, a report can document the process for cleansing data before the migration starts. You can learn more about building and maintaining a report in our article, Creating a Data Quality Report.

4) Understand Data Quality Tools

Data quality tools manage the quality of your data through cleansing, integration, and master data and metadata management. It is important to understand the benefits as well as the limitations and complexities of such tools. For example, to clean up your data both for your own use and for presentation to other businesses, we recommend tools like OpenRefine, an open-source tool useful for renaming, filtering, and standardizing your data. It is especially good at addressing the structural errors that often lead to duplication. Another tool that can help with analysis and clean-up is Informatica Master Data Management, which identifies data that is unproductive or that stems from errors such as improper entry. You can read our recommendations for data management tools in this article, Staying Appy with Data Management Tools on the Salesforce AppExchange.

Selecting the right tool for your business can be overwhelming, but searching AppExchange for ‘duplicates’ makes the process easy. Most tools even offer a free version or trial. Once you have selected a tool that works for you, remember to validate the findings of your data cleansing tool to ensure that your data is accurate and can be used within your business practices and requirements.  

5) Embrace Machine Learning (ML)

No matter how accurate someone may be in deduplicating their data, the superior speed and efficiency that technology provides cannot be overlooked. And when it comes to identifying duplicates in your data, machine learning (a subset of artificial intelligence) offers a vastly superior approach. 

One of the major benefits of the machine learning approach is that, unlike with a rule-based deduplication tool, you are not required to set up matching rules and constantly keep adding new ones to account for all of the “fuzzy” duplicates. Instead, the machine learning algorithms automatically select the string metrics used to compare records, based on the examples that you “feed” to the system. This approach is called active learning. It gives you the best of both worlds: the algorithms analyze records much as a human would, but with far greater computational power.
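
To give a feel for the idea, here is a toy Python sketch that learns how much weight to give two string-similarity scores from a handful of labeled pairs, then picks the pair it is least sure about to ask a human to label next (uncertainty sampling, one common active-learning strategy). It is an illustration under simplifying assumptions, not how DataGroomr or any particular product works; the records, field names, and functions are all hypothetical.

```python
import math
from difflib import SequenceMatcher

def features(a, b):
    """Compare two records field by field with a string-similarity score (0 to 1)."""
    return [
        SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio(),
        SequenceMatcher(None, a["email"].lower(), b["email"].lower()).ratio(),
    ]

def predict(weights, bias, x):
    """Probability that a pair is a duplicate, from a small logistic model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=2000, lr=0.5):
    """Fit the model by gradient descent on (record_a, record_b, is_duplicate) examples."""
    xs = [features(a, b) for a, b, _ in examples]
    ys = [label for _, _, label in examples]
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        bias -= lr * grad_b / len(xs)
        weights = [w - lr * g / len(xs) for w, g in zip(weights, grad_w)]
    return weights, bias

def most_uncertain(weights, bias, pairs):
    """Active-learning step: return the unlabeled pair whose score is closest to 50/50."""
    return min(pairs, key=lambda p: abs(predict(weights, bias, features(*p)) - 0.5))

jon = {"name": "Jon Smith", "email": "jon.smith@acme.com"}
john = {"name": "John Smith", "email": "jon.smith@acme.com"}
maria = {"name": "Maria Garcia", "email": "mgarcia@globex.com"}
maria2 = {"name": "Maria Garcia", "email": "m.garcia@globex.com"}
ann = {"name": "Ann Lee", "email": "ann.lee@initech.com"}
bob = {"name": "Bob Ray", "email": "bob.ray@umbrella.org"}
robert = {"name": "Robert Ray", "email": "bob.ray@umbrella.org"}

# Examples a user has already labeled: 1 = duplicate, 0 = different people.
labeled = [(jon, john, 1), (maria, maria2, 1), (jon, maria, 0), (ann, bob, 0)]
weights, bias = train(labeled)

unlabeled = [(bob, robert), (ann, maria), (jon, bob)]
ask_next = most_uncertain(weights, bias, unlabeled)
print("Ask the user about:", ask_next[0]["name"], "vs", ask_next[1]["name"])
```

Each answer the user gives becomes a new labeled example, so the model improves where it was least confident instead of relying on hand-written matching rules.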

DataGroomr is at the forefront of leveraging machine learning to identify duplicates and evaluate data quality. Our free 14-day trial demonstrates all the capabilities of the tool and gives you a quick assessment of the state of your data today. Start the free trial by logging in with your Salesforce credentials. There is no setup required. Begin deduplicating your data right away!

Steven Pogrebivsky

Steve Pogrebivsky has founded multiple successful startups and is an expert in data and content management systems with over 25 years of experience. Previously, he co-founded and was the CEO of MetaVis Technologies, which built tools for Microsoft Office 365, Salesforce and other cloud-based information systems. MetaVis was acquired by Metalogix in 2015. Before MetaVis, Steve founded several other technology companies, including Stelex Corporation, which provided compliance and technical solutions to FDA-regulated organizations. Steve holds a BS in Computer/Electrical Engineering and an MBA in Information Systems from Drexel University.